Mining Associative Meanings from the Web: from word disambiguation to the global brain

Source: CiteSeer

ABSTRACT. A general problem in all systems to process language (parsing, translating, etc.) is ambiguity: words have many, fuzzily defined meanings, and meanings shift with the context. This may be tackled by quantifying the connotative or associative meaning, which can be represented as a matrix of mutual association strengths. With many thousands of words, there are billions of possible associations, though, and there is no obvious method to measure all of them. This "knowledge acquisition bottleneck" can be tackled by mining implicit associations from the billions of documents and millions of users on the World-Wide Web. The present paper discusses two methods to achieve this: lexical co-occurrence, a measurement of the frequency with which words appear in each other's neighborhood, and web learning algorithms, an application of the Hebbian rule to create associations between subsequently "activated" words or pages. The mechanism of spreading activation can be applied to the resulting associative networks for clustering, context-driven disambiguation, and personalized recommendation. A generalization of such methods could transform the web into a "global brain", that is, an intelligent, learning network that assimilates the implicit knowledge and preferences of its users.
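The three mechanisms named in the abstract (lexical co-occurrence, a Hebbian update between successively activated items, and spreading activation) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the window size, the row normalization, the learning rate, and the decay factor are all assumptions chosen for illustration.

```python
from collections import defaultdict


def cooccurrence(docs, window=3):
    """Count how often two words occur within `window` tokens of each other.
    (Assumed measure; the paper may define the neighborhood differently.)"""
    counts = defaultdict(float)
    for doc in docs:
        tokens = doc.lower().split()
        for i, w in enumerate(tokens):
            for v in tokens[i + 1 : i + 1 + window]:
                if v != w:
                    counts[(w, v)] += 1.0
                    counts[(v, w)] += 1.0  # co-occurrence is symmetric
    return counts


def normalize(counts):
    """Convert raw counts to association strengths so that each word's
    outgoing links sum to 1 (an assumed normalization)."""
    totals = defaultdict(float)
    for (w, _), c in counts.items():
        totals[w] += c
    return {(w, v): c / totals[w] for (w, v), c in counts.items()}


def hebbian_update(weights, sequence, rate=0.1):
    """Strengthen the link from each item to the one activated right after it.
    `rate` is a hypothetical learning rate."""
    for a, b in zip(sequence, sequence[1:]):
        weights[(a, b)] = weights.get((a, b), 0.0) + rate
    return weights


def spread(weights, start, steps=2, decay=0.5):
    """Propagate activation from the initially active words along weighted
    links for a fixed number of steps, attenuating by `decay` at each step."""
    activation = dict(start)
    frontier = dict(start)
    for _ in range(steps):
        nxt = defaultdict(float)
        for (w, v), strength in weights.items():
            if w in frontier:
                nxt[v] += decay * frontier[w] * strength
        for v, a in nxt.items():
            activation[v] = activation.get(v, 0.0) + a
        frontier = nxt
    return activation


# Toy usage: activate the ambiguous word "bank" together with a context word.
weights = normalize(cooccurrence(["river bank water", "bank money loan"], window=1))
print(spread(weights, {"bank": 1.0, "river": 1.0}))
```

In this toy setting, activating an ambiguous word together with a context word lets activation accumulate on the neighbors that both share, which is the intuition behind context-driven disambiguation.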

Available from: Francis Heylighen, Sep 18, 2012
  • Source
    • "It also is to discover its own categories for further usage, for example in the talking interfaces. For exploring such an area we use simple web-mining methods inspired on Heylighen et al.'s work [20]. As we described other particulars about Bacterium Lingualis before [11] we mention only that part C — the "
    ABSTRACT: In this paper we introduce some ideas for reusing cognitive science concepts whose realization was previously impossible due to technical limits. We concentrate on Schankian scripts, which could help to build plans as the basic method for achieving goals. In contrast to the authors of the classic cognitivist ideas, we can now use powerful computers and terabytes of data, which could make their concepts usable in unrestricted domains for any kind of application using commonsense knowledge. Many useful projects were abandoned because of the difficulty of manually entering large data sets. We plan to build a commonsense processing system which retrieves commonsensical data from WWW resources. This paper introduces the theoretical side of our research with some results of preliminary tests.
  • Source
    • "are not understandable for BL and the learning task is to discover them. For exploring such an area we use simple webmining methods inspired on Heylighen et al.'s work [15]. Most of the researches suggest that a machines have to be intelligent to mine knowledge for us, we suggest that they have to mine for themselves to be intelligent. "
    ABSTRACT: As most of us subconsciously feel, it is very difficult to create a program which could imitate the human way of thinking. Recently, the importance of the relation between the expressions "feel", "create" and "way of thinking" used in the previous sentence has been noticed, which gave birth to so-called "affective computing". During our experiments within the GENTA project, we have observed useful connotations between common sense information and emotional information which could be retrieved automatically from Internet resources. Those observations seem promising for language and knowledge acquisition and prompted us to investigate the subject, and also to develop some ideas which could be useful to researchers in various AI fields. We describe GENTA-related sub-projects and their preliminary experiments.
  • Source
    • "An extensive vocabulary is one of the most easy to observe results of this process (and as such one of the most commonly tested skills in IQ tests). New words are typically learned not by studying dictionaries, but by experiencing them in a context of already known words, so that associations with these words are created and the meaning can be inferred [Heylighen, 2001b]. This is something that better propagation will typically facilitate. "
    ABSTRACT: Giftedness, the potential for exceptional achievement, is characterized by high intelligence and creativity. Gifted people exhibit a complex of cognitive, perceptual, emotional, motivational and social traits. Extending neurophysiological hypotheses about the general intelligence (g) factor, a construct is proposed to explain these traits: neural propagation depth. The hypothesis is that in more intelligent brains, activation propagates farther, reaching less directly associated concepts. This facilitates problem-solving, reasoning, divergent thinking and the discovery of connections. It also explains rapid learning, perceptual and emotional sensitivity, and vivid imagination. Flow motivation is defined as the universal desire to balance skills and challenges. Gifted people, being more cognitively skilled, will seek out more difficult challenges. This explains their ambition, curiosity and perfectionism. Balance is difficult to achieve in interaction with non-gifted peers, though, explaining the gifted's autonomy, non-conformism and feeling of alienation. Together with the difficulty to find fitting challenges this constitutes a major obstacle to realizing the gifted's potential. The appendix sketches a simulation using word association networks to test the propagation depth model by answering IQ-test-like questions.

Modern society has always been fascinated by creative genius [Simonton, 2001; Ochse, 1990; Eysenck, 1995; Terman, 1925], by the great thinkers, scientists, and artists, such as Einstein, Shakespeare or da Vinci, who have laid the foundations for our present knowledge and culture. Genius is generally viewed as very valuable, but rare. Therefore, it is worth investigating how we can optimally exploit this scarce resource, and, if possible, make it more abundant. This means that we must try to understand at the deepest level the characteristics that distinguish an exceptionally creative mind from an ordinary one.
Moreover, we should try to understand the processes that produce these characteristics, whether at the biological, psychological, social or cultural level. This will allow us to see how we can foster such processes, and which obstacles we must remove in order to maximally reap the benefits from a gifted mind. That there are plenty of such obstacles becomes obvious once we note how unevenly distributed genius is: most well-known examples, such as the ones above, are European or American men, from a middle or upper class background. That world-changing creativity requires a minimum level of health, wealth, education and supporting infrastructure seems obvious, explaining why top intellectual achievements are rare in developing countries. However, it is much less obvious why such an exceedingly small number of women have reached the highest levels of eminence. None of the standard tests of intelligence and creativity find significant differences in potential achievement between men and women. The sociological observations of an "old boys' network" or "glass ceiling" in part explain this discrimination, but we need to move to a deeper, psychological level to fully understand the mechanisms that hold back women and other classes of gifted people from achieving their true potential. For that, we need to better understand what giftedness is.