Yahoo
  • Sunnyvale, United States
Recent publications
Web page categorization has been extensively studied in the literature and has been successfully used to improve information retrieval, recommendation, personalization and ad targeting. With the new industry trend of not tracking users' online behavior without their explicit permission, using contextual targeting to accurately understand web pages in order to display ads that are topically relevant to the pages becomes more important. This is challenging, however, because an ad request only contains the URL of a web page. As a result, there is very limited available text for making accurate classifications. In this paper, we propose a unified multilingual model that can seamlessly classify web pages in 5 high-impact languages using either their full content or just their URLs with limited text. We adopt multiple data sampling techniques to increase coverage for rare categories in our training corpus, and modify the loss using class-based re-weighting to smooth the influence of frequent versus rare categories. We also propose using an ensemble of teacher models for knowledge distillation and explore different ways to create a teacher ensemble. Offline evaluation shows at least 2.6% improvement in mean average precision across 5 languages compared to a URL classification model trained with single-teacher knowledge distillation. The unified model for both full-content and URL-only input further improves the mean average precision of the dedicated URL classification model by 0.6%. We launched the proposed models, which achieve at least 37% better mean average precision than the legacy tree-based models, for contextual targeting in the Yahoo Demand Side Platform, leading to a significant ad delivery and revenue increase.
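The class-based re-weighting mentioned above can be illustrated with a minimal sketch. This is not the paper's exact formulation (which is unspecified here); it only shows the common idea of scaling the loss by smoothed inverse-frequency weights so rare categories influence training more per example than frequent ones. The category names and the log-smoothed weighting are illustrative assumptions.

```python
# Minimal sketch of class-based loss re-weighting (illustrative; the
# paper's exact scheme is not specified in the abstract).
import math
from collections import Counter

def class_weights(labels, smoothing=1.0):
    """Smoothed inverse-frequency weights: rare classes get larger weights,
    but the log dampens the boost so they do not dominate training."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {c: math.log(smoothing + total / n) for c, n in counts.items()}

def weighted_nll(prob_of_true_class, label, weights):
    """Negative log-likelihood for one example, scaled by its class weight."""
    return -weights[label] * math.log(prob_of_true_class)

# Hypothetical imbalanced training labels: 90 "sports" pages, 10 "finance".
labels = ["sports"] * 90 + ["finance"] * 10
w = class_weights(labels)
```

With this weighting, a misclassified "finance" page contributes a larger loss than an equally misclassified "sports" page, smoothing the influence of frequent versus rare categories as the abstract describes.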
Word representations like GloVe and Word2Vec encapsulate semantic and syntactic attributes and constitute the fundamental building block in diverse Natural Language Processing (NLP) applications. Such vector embeddings are typically stored in float32 format, and for a substantial vocabulary size, they impose considerable memory and computational demands due to the resource-intensive float32 operations. Thus, representing words via binary embeddings has emerged as a promising but challenging solution. In this paper, we introduce BRECS, an autoencoder-based Siamese framework for the generation of enhanced binary word embeddings (from the original embeddings). We propose the use of the novel Binary Cosine Similarity (BCS) regularisation in BRECS, which enables it to learn the semantics and structure of the vector space spanned by the original word embeddings, leading to better binary representation generation. We further show that our framework is tailored with independent parameters within the various components, thereby providing it with better learning capability. Extensive experiments across multiple datasets and tasks demonstrate the effectiveness of BRECS, compared to existing baselines for static and contextual binary word embedding generation. The source code is available at https://github.com/rajbsk/brecs.
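The Binary Cosine Similarity idea can be sketched as follows. The exact BCS regularisation in BRECS may differ; this sketch only illustrates the underlying goal of making pairwise similarities in the binary space track those of the original float32 embeddings. The sign-based binarization and the squared-gap penalty are assumptions for illustration.

```python
# Hedged sketch of a binary-cosine-similarity-style regularizer: penalize
# the gap between cosine similarity in the original embedding space and in
# the binarized space, so binary codes preserve the vector-space structure.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def binarize(v):
    # {-1, +1} codes; a production system would pack these into bits.
    return [1.0 if x >= 0 else -1.0 for x in v]

def bcs_penalty(u, v):
    """Squared gap between original-space and binary-space cosine similarity."""
    return (cosine(u, v) - cosine(binarize(u), binarize(v))) ** 2
```

Summing this penalty over word pairs gives a loss term that pushes the autoencoder toward binary codes whose similarity structure matches the original embeddings.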
Generative AI uses a large set of sources to create content. The content generated by large language models is text. Often, that text contains statements that are inaccurate or false, sometimes called hallucinations. We explore how identifying citations for the generated text can enable people to determine whether to trust the statements in the text, by allowing different users to specify different trusted sets of sources as candidates for citations. Then we propose methods to eliminate or correct untrustworthy statements. We also consider how citations can help build consensus among people who have different trusted sources of information, by using a large language model to construct text, then editing the text so that it is supported by citations drawn from multiple sets of trusted sources. By using generative AI as a go-between, such a process may allow parties with mutual distrust to discover and confirm areas of agreement. This paper is a proposal for systems that enhance large language models’ usefulness and an outline of some challenges and methods for such systems; it is not a record of system development or testing.
While a multitude of studies have been conducted on graph drawing, many existing methods focus on optimizing only a single aesthetic aspect of graph layouts, which can lead to sub-optimal results. Only a few existing methods have attempted to develop a flexible solution for optimizing different aesthetic aspects measured by different aesthetic criteria. Furthermore, thanks to significant advances in deep learning techniques, several deep learning-based layout methods have been proposed recently. These methods have demonstrated the advantages of deep learning approaches for graph drawing. However, none of these existing methods can be directly applied to optimizing non-differentiable criteria without special accommodation. In this work, we propose a novel Generative Adversarial Network (GAN) based deep learning framework for graph drawing that can optimize different quantitative aesthetic goals, regardless of their differentiability. To demonstrate the effectiveness and efficiency of the proposed framework, we conducted experiments on minimizing stress, minimizing edge crossings, maximizing crossing angle, maximizing shape-based metrics, and a combination of multiple aesthetics. Compared with several popular graph drawing algorithms, the experimental results show that the proposed framework achieves good performance both quantitatively and qualitatively.
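Of the aesthetic criteria mentioned above, stress is the classic one; a minimal sketch of the metric (not of the GAN framework itself) is below. The normalized form shown is one common variant and is an assumption here, as is the toy path-graph example.

```python
# Hedged sketch: the "stress" objective common in graph drawing -- the sum
# of squared, distance-normalized gaps between layout distances and
# graph-theoretic (shortest-path) distances. A perfect layout has stress 0.
import math

def stress(positions, graph_dist):
    """positions: {node: (x, y)}; graph_dist: {(u, v): shortest-path distance}."""
    total = 0.0
    for (u, v), d in graph_dist.items():
        (x1, y1), (x2, y2) = positions[u], positions[v]
        layout_d = math.hypot(x1 - x2, y1 - y2)
        total += ((layout_d - d) / d) ** 2  # normalized stress term
    return total

# Toy path graph a-b-c laid out on a line: layout distances match
# shortest-path distances exactly, so the stress is zero.
layout = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0)}
dists = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 2.0}
```

Criteria such as edge-crossing counts, by contrast, are non-differentiable, which is the case the GAN-based framework above is designed to handle.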
We study the multi-step Model-Agnostic Meta-Learning (MAML) framework, in which a group of n agents seeks a common point that enables “few-shot” learning (personalization) via local stochastic gradient steps on their local functions. We formulate the personalized optimization problem under the MAML framework and propose PARS-Push, a decentralized asynchronous algorithm robust to message failures, communication delays, and directed message sharing. We characterize the convergence rate of PARS-Push under arbitrary multi-step personalization for smooth strongly convex and smooth non-convex functions. Moreover, we provide numerical experiments showing its performance under heterogeneous data setups.
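The multi-step personalization idea can be sketched for a single agent. PARS-Push itself is decentralized and asynchronous; this sketch only illustrates the "few local gradient steps from a shared point" step on a toy 1-D quadratic, and the objective, step size, and step count are all illustrative assumptions.

```python
# Hedged sketch: multi-step MAML-style personalization -- starting from a
# shared point, an agent takes a few local gradient steps on its own
# objective to obtain a personalized model.
def personalize(x_shared, grad, steps=3, lr=0.1):
    """Run `steps` gradient-descent steps from the shared point x_shared."""
    x = x_shared
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Toy local objective f(x) = (x - 2)^2 with minimizer x* = 2;
# its gradient is 2 * (x - 2).
grad = lambda x: 2.0 * (x - 2.0)
x_personal = personalize(0.0, grad, steps=5, lr=0.1)
```

After five local steps the personalized point has moved most of the way from the shared point (0.0) toward the agent's own optimum (2.0), which is the effect the multi-step personalization in the paper formalizes.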
341 members
Yifan Hu
  • Scalable Machine Learning (NYC)
Mounia Lalmas
  • Advertising Sciences