Yang Yu’s research while affiliated with Institute of Scientific and Technical Information of China and other places


Publications (2)


Learning for Search Results Diversification in Twitter
  • Conference Paper · June 2016 · 368 Reads · 6 Citations

Lecture Notes in Computer Science

Ying Wang · Zhunchen Luo · Yang Yu

Diversifying retrieved results is an effective approach to tackling users’ information needs in Twitter, which are typically described by query phrases that are often ambiguous and have more than one interpretation. Because tweets are often very short and lack reliable grammatical style, traditional IR and NLP techniques are less effective on them. However, Twitter, as a social medium, also presents interesting opportunities for this task (for example, author information such as the number of statuses). In this paper, we first address diversification of search results in Twitter with a learning method and explore a series of diversity features describing the relationship between tweets, which include tweet content, the sub-topic of a tweet, and Twitter-specific social information such as hashtags. The experimental results on the Tweets2013 datasets demonstrate the effectiveness of the learning approach. Additionally, the Twitter retrieval task improves when the diversity features are taken into account. Finally, we find that the sub-topic and Twitter-specific social features can help solve the diversity task, especially the post time, the hashtags of a tweet, and the location of the author.
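To make the idea of pairwise diversity features more concrete, the following is a minimal sketch, not the authors' implementation. The abstract names tweet content, hashtags, and post time as signals; the `Tweet` class, the function names, and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Tweet:
    # Hypothetical container; field names are illustrative assumptions.
    text: str
    hashtags: frozenset = field(default_factory=frozenset)
    posted_at: float = 0.0  # seconds since the epoch


def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pairwise_features(t1, t2):
    """Features describing how similar (and so how redundant) two tweets are."""
    return {
        # Word-level overlap of the lowercased, whitespace-split text.
        "content_overlap": jaccard(t1.text.lower().split(), t2.text.lower().split()),
        # Overlap of the hashtag sets.
        "hashtag_overlap": jaccard(t1.hashtags, t2.hashtags),
        # Absolute difference in post time, in hours.
        "time_gap_hours": abs(t1.posted_at - t2.posted_at) / 3600.0,
    }
```

In a learning setting, feature values like these could be fed to a ranking model so that a candidate tweet too similar to already selected ones is scored lower; how the paper actually combines its features is not stated in the abstract.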


Structuring Tweets for improving Twitter search

May 2015 · 366 Reads · 11 Citations

Journal of the Association for Information Science and Technology

Spam and wildly varying documents make searching in Twitter challenging. Most Twitter search systems treat a Tweet as plain text when modeling relevance. However, a series of conventions allows users to Tweet in structural ways using a combination of different blocks of text. These blocks include plain text, hashtags, links, mentions, etc. Each block encodes a variety of communicative intent, and the sequence of these blocks captures changing discourse. Previous work shows that exploiting structural information can improve retrieval of structured documents (e.g., web pages). In this study we utilize the structure of Tweets, induced by these blocks, for Twitter retrieval and Twitter opinion retrieval. For Twitter retrieval, a set of features, derived from the blocks of text and their combinations, is used in a learning-to-rank scenario. We show that structuring Tweets can achieve state-of-the-art performance. Our approach does not rely on social media features, but when we do add this additional information, performance improves significantly. For Twitter opinion retrieval, we explore the question of whether structural information derived from the body of Tweets and opinionatedness ratings of Tweets can improve performance. Experimental results show that retrieval using a novel unsupervised opinionatedness feature based on structuring Tweets achieves performance comparable to a supervised method using manually tagged Tweets. Topic-related specific structured Tweet sets are shown to help with query-dependent opinion retrieval.
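The block structure described in this abstract can be illustrated with a small sketch. It is an assumption-laden example rather than the paper's method: the regular expressions, the block type names, and the functions `segment_tweet` and `block_sequence` are hypothetical, and real Tweets would need more careful tokenization.

```python
import re

# Hypothetical patterns for the block types named in the abstract.
# Links are listed first so that a "#" inside a URL is not read as a hashtag.
TOKEN_PATTERN = re.compile(
    r"(?P<link>https?://\S+)"
    r"|(?P<hashtag>#\w+)"
    r"|(?P<mention>@\w+)"
)


def segment_tweet(text):
    """Split a tweet into an ordered list of (block_type, block_text) pairs."""
    blocks = []
    pos = 0
    for match in TOKEN_PATTERN.finditer(text):
        # Any text between two special tokens becomes a plain-text block.
        plain = text[pos:match.start()].strip()
        if plain:
            blocks.append(("text", plain))
        # lastgroup is the name of the alternative that matched.
        blocks.append((match.lastgroup, match.group()))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        blocks.append(("text", tail))
    return blocks


def block_sequence(text):
    """Return only the sequence of block types, e.g. as a structural feature."""
    return [block_type for block_type, _ in segment_tweet(text)]
```

For example, `block_sequence("Great talk by @alice on #IR http://example.com/x now")` yields the type sequence text, mention, text, hashtag, link, text. A sequence like this could serve as one input to a learning-to-rank model, though the specific features used in the study are not given in the abstract.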

Citations (2)


... This can be done by eliminating redundant documents in a search result list and putting forward relevant documents for the user's potential information needs. Search result diversification gained a lot of attention recently and several works tackled different search domains: legal [12], biomedical [19], social media [18], social image retrieval [11], personalization [17], medical [13] etc. ...

Reference:

Fuzzy ontologies for search results diversification: application to medical data
Learning for Search Results Diversification in Twitter
  • Citing Conference Paper · June 2016

Lecture Notes in Computer Science