Cross-website results. ASI significantly surpasses the baselines in success rate (SR) and number of steps (|t| > 2, p < 0.05), according to our analysis in §B.3.

Source publication
Preprint
To succeed in common digital tasks such as web navigation, agents must carry out a variety of specialized tasks such as searching for products or planning a travel route. To tackle these tasks, agents can bootstrap themselves by learning task-specific skills online through interaction with the web environment. In this work, we demonstrate that prog...

Context in source publication

Context 1
... we find that agents can often spot this incompatibility and rarely attempt to use these deprecated skills. We thus allow agents to induce new skills or update previously acquired skills from experiences on the new website, denoted as +update entries in Table 6. We find that enabling skill update in both textual and program formats helps agent performance on new websites. ...