Conference Paper

Social context summarization.

DOI: 10.1145/2009916.2009954
Conference: Proceeding of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011, Beijing, China, July 25-29, 2011
Source: DBLP

ABSTRACT: We study a novel problem of social context summarization for Web documents. Traditional summarization research has focused on extracting informative sentences from standard documents. With the rapid growth of online social networks, abundant user-generated content (e.g., comments) associated with the standard documents is available. Which parts of a document do social users really care about? How can we generate summaries for standard documents by considering both the informativeness of sentences and the interests of social users? This paper explores such an approach by modeling Web documents and social contexts in a unified framework. We propose a dual wing factor graph (DWFG) model, which utilizes the mutual reinforcement between Web documents and their associated social contexts to generate summaries. An efficient algorithm is designed to learn the proposed factor graph model. Experimental results on a Twitter data set validate the effectiveness of the proposed model. By leveraging the social context information, our approach obtains a significant improvement (on average +5.0% to +17.3%) over several alternative methods (CRF, SVM, LR, PR, and DocLead) in summarization performance.

  • ABSTRACT: We present a novel answer summarization method for community Question Answering services (cQAs) to address the "incomplete answer" problem, i.e., the "best answer" to a complex multi-sentence question misses valuable information that is contained in other answers. In order to automatically generate a novel and non-redundant community answer summary, we segment the complex original multi-sentence question into several sub-questions and then propose a general Conditional Random Field (CRF) based answer summary method with group L1 regularization. Various textual and non-textual QA features are explored. Specifically, we explore four different types of contextual factors, namely, the information novelty and non-redundancy modeling for local and non-local sentence interactions under question segmentation. To further unleash the potential of the abundant cQA features, we introduce the group L1 regularization for feature learning. Experimental results on a Yahoo! Answers dataset show that our proposed method significantly outperforms state-of-the-art methods on the cQA summarization task.
    Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1; 07/2012
  • ABSTRACT: With the popularity of Web 2.0, comments left by readers on web documents have drawn much attention. In this paper, we study the problem of comments-oriented document summarization, which aims to summarize a web document by considering not only its content but also the comments. Most comments convey one or a few aspects of the document. Given a sentence set from both the web document and its corresponding comments to summarize, we can divide different sentences into different clusters (named "aspects") according to the content. It is challenging and interesting to summarize the web document based on these clusters. Motivated by this, we propose a novel model, MultiAspectCoRank, for comments-oriented document summarization. First, we rank all the sentences based on the multiple aspects obtained from the whole document, and then provide each ranking list as feedback to the others until the top-N results of each ranking list are unchanged. We get the final result by integrating these different ranking lists together. Experimental results on a set of real-world blog data with manually labeled sentences show the promising performance of our approach.
    Proceedings of the 14th international conference on Web-Age Information Management; 06/2013
  • ABSTRACT: Social media responses to news have increasingly gained in importance as they can enhance a consumer's news reading experience, promote information sharing and aid journalists in assessing their readership's response to a story. Given that the number of responses to an online news article may be huge, a common challenge is that of selecting only the most interesting responses for display. This paper addresses this challenge by casting message selection as an optimization problem. We define an objective function which jointly models the messages' utility scores and their entropy. We propose a near-optimal solution to the underlying optimization problem, which leverages the submodularity property of the objective function. Our solution first learns the utility of individual messages in isolation and then produces a diverse selection of interesting messages by maximizing the defined objective function. The intuitions behind our work are that an interesting selection of messages contains diverse, informative, opinionated and popular messages referring to the news article, written mostly by users that have authority on the topic. Our intuitions are embodied by a rich set of content, social and user features capturing the aforementioned aspects. We evaluate our approach through both human and automatic experiments, and demonstrate that it outperforms the state of the art. Additionally, we perform an in-depth analysis of the annotated "interesting" responses, shedding light on the subjectivity around the selection process and the perception of interestingness.
    Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining; 08/2013
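
The two-stage selection described in the last abstract above (score each message on its own, then pick a diverse set by maximizing a submodular objective) can be illustrated with a small greedy sketch. This is not the authors' implementation. The function name `greedy_select`, the `diversity_weight` parameter, and the whitespace tokenization are all hypothetical, and the entropy term from the abstract is replaced here by a simpler word-coverage term (the number of distinct words covered). That coverage term is a standard monotone submodular function, so with non-negative utilities the greedy loop is a valid instance of greedy submodular maximization under a size limit.

```python
def greedy_select(messages, utility, k, diversity_weight=1.0):
    """Greedily pick up to k message indices.

    Objective (an assumption for illustration, not the paper's):
        f(S) = sum of utility[i] for i in S
               + diversity_weight * (number of distinct words covered by S)
    """
    # Hypothetical tokenization: lowercase, split on whitespace.
    tokens = [set(m.lower().split()) for m in messages]
    candidates = set(range(len(messages)))
    selected = []
    covered = set()

    while candidates and len(selected) < k:
        def gain(i):
            # Marginal gain of adding message i to the current selection.
            new_words = len(tokens[i] - covered)
            return utility[i] + diversity_weight * new_words

        # Sort so that ties are broken by the lowest index.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        covered |= tokens[best]
        candidates.remove(best)

    return selected


if __name__ == "__main__":
    msgs = [
        "great article on summarization",
        "great article on summarization",
        "twitter users discuss ranking",
    ]
    scores = [0.9, 0.8, 0.5]  # stand-in for learned per-message utilities
    print(greedy_select(msgs, scores, k=2))
```

In this toy input the second message duplicates the first, so after the first pick its marginal gain drops to its utility alone and the lower-utility but novel third message is chosen instead.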
