Measuring Peculiarity of Text Using Relation between Words on the Web

DOI: 10.1007/978-3-642-13654-2_13
Conference: The Role of Digital Libraries in a Time of Global Change, 12th International Conference on Asia-Pacific Digital Libraries, ICADL 2010, Gold Coast, Australia, June 21-25, 2010. Proceedings
Source: DBLP


We define the peculiarity of text as a metric of information credibility: higher peculiarity means lower credibility. We extract
the theme word and the characteristic words from a text and check whether a subject-description relation holds between them.
Peculiarity is defined using the ratio of subject-description relations found between the theme word and the characteristic words.
We evaluate the extent to which peculiarity can be used to judge credibility by classifying texts from Wikipedia and Uncyclopedia
according to their peculiarity.
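
As a rough illustration of a ratio-based score of this kind, the sketch below computes peculiarity as the share of characteristic words that have no subject-description relation to the theme word. The abstract does not give the exact formula, so the form 1 - (related / total) is an assumption, as are the names peculiarity, has_relation, and toy_has_relation. The relation check is passed in as a function because the paper derives it from the Web; the lookup table used here is a hypothetical stand-in with made-up data.

```python
from typing import Callable, Iterable


def peculiarity(
    theme: str,
    characteristic_words: Iterable[str],
    has_relation: Callable[[str, str], bool],
) -> float:
    """Return the fraction of characteristic words unrelated to the theme.

    Assumption: peculiarity = 1 - (related words / all words). The abstract
    only states that the metric is defined "using the ratio", so this exact
    form is illustrative, not the authors' definition.
    """
    words = list(characteristic_words)
    if not words:
        # Assumption: with no characteristic words there is no evidence
        # of peculiarity, so the score is 0.
        return 0.0
    related = sum(1 for w in words if has_relation(theme, w))
    return 1.0 - related / len(words)


# Hypothetical stand-in for the Web-based subject-description check.
KNOWN_RELATIONS = {("penguin", "bird"), ("penguin", "Antarctica")}


def toy_has_relation(theme: str, word: str) -> bool:
    """Look the pair up in a fixed table instead of querying the Web."""
    return (theme, word) in KNOWN_RELATIONS


score = peculiarity(
    "penguin",
    ["bird", "Antarctica", "lawyer", "spaceship"],
    toy_has_relation,
)
print(score)  # 2 of 4 words are related, so this prints 0.5
```

Under this reading, a text whose characteristic words mostly lack a relation to its theme scores close to 1, which matches the statement that higher peculiarity means lower credibility.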

