Mining Associative Meanings from the Web: from word disambiguation to the global brain

Source: CiteSeer

ABSTRACT. A general problem in all systems to process language (parsing, translating, etc.) is ambiguity: words have many, fuzzily defined meanings, and meanings shift with the context. This may be tackled by quantifying the connotative or associative meaning, which can be represented as a matrix of mutual association strengths. With many thousands of words, there are billions of possible associations, though, and there is no obvious method to measure all of them. This "knowledge acquisition bottleneck" can be tackled by mining implicit associations from the billions of documents and millions of users on the World-Wide Web. The present paper discusses two methods to achieve this: lexical co-occurrence, a measurement of the frequency with which words appear in each other's neighborhood, and web learning algorithms, an application of the Hebbian rule to create associations between subsequently "activated" words or pages. The mechanism of spreading activation can be applied to the resulting associative networks for clustering, context-driven disambiguation, and personalized recommendation. A generalization of such methods could transform the web into a "global brain", that is, an intelligent, learning network that assimilates the implicit knowledge and preferences of its users.
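The two mechanisms named in the abstract, lexical co-occurrence and spreading activation, can be illustrated with a minimal sketch. It is not the paper's implementation: the window size, the row normalization, the decay factor, and the toy corpus are all assumptions chosen for illustration, and the function names (`cooccurrence`, `normalize`, `spread`) are hypothetical. The Hebbian web-learning method is not covered here.

```python
from collections import defaultdict


def cooccurrence(sentences, window=2):
    """Count how often two words appear within `window` positions of each other.

    The window size is an assumption; the abstract only speaks of words
    appearing "in each other's neighborhood".
    """
    counts = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = sentence.lower().split()
        for i, w in enumerate(words):
            for v in words[i + 1 : i + 1 + window]:
                if v != w:
                    counts[w][v] += 1.0
                    counts[v][w] += 1.0  # keep the counts symmetric
    return counts


def normalize(counts):
    """Turn raw counts into association strengths; each row sums to 1."""
    strengths = {}
    for w, row in counts.items():
        total = sum(row.values())
        strengths[w] = {v: c / total for v, c in row.items()}
    return strengths


def spread(strengths, seeds, steps=2, decay=0.5):
    """Spread activation from seed words along association links.

    At each step, every word activated in the previous step passes
    `decay` times its activation to its neighbors, weighted by strength.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        incoming = defaultdict(float)
        for w, a in frontier.items():
            for v, s in strengths.get(w, {}).items():
                incoming[v] += decay * a * s
        for v, a in incoming.items():
            activation[v] = activation.get(v, 0.0) + a
        frontier = incoming
    return activation


# Toy corpus (invented for illustration) in which "bank" is ambiguous.
corpus = ["river bank water", "bank money loan"]
strengths = normalize(cooccurrence(corpus))

# Activating "bank" together with the context word "river" should favor
# the riverside sense over the financial one.
result = spread(strengths, {"bank": 1.0, "river": 1.0})
for word in sorted(result, key=result.get, reverse=True):
    print(word, round(result[word], 4))
```

In this toy run, the context word "river" pushes more activation to "water" than to "money" or "loan", which is the kind of context-driven disambiguation the abstract describes. A real system would need stop-word handling and a far larger corpus.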

  • Source
    ABSTRACT: The present effort employs a new archival approach to study values and value-behavior relations, which is likely to be particularly useful in applied settings. A value lexicon was developed on the basis of the Schwartz (1992) value theory to extract lexical indicators of values from texts. The convergent, discriminant, and predictive validity of this measure was established using American newspaper content from 1900 to 2000 vis-à-vis existing self-report measures of values and objective indicators of value-expressive behaviors. Results provide empirical support for the use of the value lexicon to study values and value-behavior relations. First, the value lexicon demonstrated convergence with self-report responses of values. Second, values in American newspapers were associated with objective indicators of their corresponding value-expressive behaviors compared with noncorresponding value-expressive behaviors. Third, patterns of values over this 101-year period exhibited meaningful fluctuations with major historical and political events. The discussion describes new possibilities for future research on values in many applied settings with the value lexicon. The discussion also suggests that the principles of the value lexicon could be adopted to measure other psychological constructs of interest to applied psychology.
    Journal of Applied Psychology 06/2008; 93(3):483-97. · 4.31 Impact Factor
  • Source
    ABSTRACT: Traditional Web search engines mostly adopt a keyword-based approach. When the keyword submitted by the user is ambiguous, the search results usually consist of documents related to various meanings of the keyword, while the user is probably interested in only one of them. In this paper we attempt to provide a solution to this problem using a k-nearest-neighbour approach to classify documents returned by a search engine, by building classifiers using data collected from collaborative tagging systems. Experiments on search results returned by Google show that our method is able to classify the documents returned with high precision.
  • Source
    ABSTRACT: Internetware is envisioned as a general software paradigm for the application style of resources integration and sharing in the open, dynamic and uncertain platforms such as the Internet. Continuing the agent-based Internetware model presented in a previous paper, in this paper, after an analysis of the behavioral patterns and the technical challenges of environment-driven applications, a software-structuring model is proposed for environment-driven Internetware applications. A series of explorations on the enabling techniques for the model, especially the modeling, management and utilization of context information are presented. Several prototypical systems have also been built to prove the concepts and evaluate the techniques. These research efforts make a further step toward the Internetware paradigm by providing an initial framework for the construction of context-aware and self-adaptive software application systems in the open network environment.
    Science in China Series F Information Sciences 01/2008; 51:683-721. · 0.66 Impact Factor
