Figure - available via license: Creative Commons Attribution 4.0 International
Source publication
This paper describes our zero-shot approaches for the Visual Word Sense Disambiguation (VWSD) Task in English. Our preliminary study shows that the simple approach of matching candidate images with the phrase using CLIP suffers from the many-to-many nature of image-text pairs. We find that the CLIP text encoder may have limited abilities in capturi...
Contexts in source publication
Context 1
... then rank the candidate images based on the new probability in descending order, with the highest probability candidate image being the predicted image from the ensembled model. See Table 1. ...
Context 2
... Augment-CLIP does not outperform Base-CLIP, often due to poor translation, but, interestingly, it offers sufficient complementarity to Base-CLIP or other Augment-CLIP that it improves performance through ensembling. See results in Table 1. ...
Context 3
... the organizers' baseline uses CLIP-ViT-largepatch14-336, an even larger model which improved performance in test data. See Table 1. This leads to the question of how different Base-CLIP embeddings affect performance on this task, which is outside the scope of this paper as we take the Base-CLIP embedding as a given in our systems. ...
Similar publications
The increase in the popularity of code-mixed languages has resulted in the need to engineer language models for them. Unlike pure languages, code-mixed languages lack clear grammatical structures, leading to ambiguous sentence constructions. This ambiguity presents significant challenges for natural language processing tasks, including syntact...
We evaluate a battery of recent large language models on two benchmarks for word sense disambiguation in Swedish. At present, all current models are less accurate than the best supervised disambiguators in cases where a training set is available, but most models outperform graph-based unsupervised systems. Different prompting approaches are compare...
The level and volume of automatic computerized processing of linguistic information has become one of the most important criteria for measuring whether a country has entered the information society. The study begins with statistical linguistics and aims to process complicated Chinese information. In this paper, after establishing the word database...
Natural language processing (NLP) may face the inexplicable “black-box” problem of parameters and unreasonable modeling for lack of embedding of some characteristics of natural language, while the quantum-inspired models based on quantum theory may provide a potential solution. However, the essential prior knowledge and pretrained text features are...