Conference Paper

Relationships among category semantics, perceptions of term utility, and term length and order in a social content creation system.

DOI: 10.1145/2132176.2132280
Conference: iConference 2012, Toronto, Ontario, Canada, February 7-10, 2012
Source: DBLP

ABSTRACT: While there are increased efforts to extend existing controlled vocabularies by harvesting socially created image metadata from content creation communities (e.g., Flickr), questions remain about the quality and reuse value of this metadata. Data from a controlled experiment was used to examine relationships among categories of image tags, tag assignment order, and users' perception of the usefulness of preassigned image index terms. Preliminary findings indicate that, on average, "Group" category terms were assigned first and were also rated highest in usefulness. Other broad tag categories that were assigned earlier and rated more useful were Human Attributes and People, but others were more variable. However, the study found no correlation between tag length and assignment order, or between term length and perceived usefulness. The study's findings can inform the design of controlled vocabularies, indexing processes, and retrieval systems for images.

  • ABSTRACT: There is at present a dearth of information on the everyday image information behavior of ordinary people. Analysis of a set of 64 image-related searches provides insight into potentially useful facilities for an image digital library.
    ACM/IEEE Joint Conference on Digital Libraries, JCDL 2006, Chapel Hill, NC, USA, June 11-15, 2006, Proceedings; 01/2006
  • ABSTRACT: There have been ample suggestions in the literature that terms added to documents from Flickr and Wikipedia can complement traditional methods of indexing and controlled vocabularies. At the same time, adding new metadata to existing metadata objects may not always add value to those objects. This research examines the potential added value of using user-contributed ("social") terms from Flickr and the English Wikipedia in image indexing compared with using two expert-created controlled vocabularies: the Thesaurus for Graphic Materials and the Library of Congress Subject Headings. Our experiments confirmed that the social terms did provide added value relative to terms from the controlled vocabularies. The median rating for the usefulness of social terms was significantly higher than the baseline rating but was lower than the ratings for the terms from the Thesaurus for Graphic Materials and the Library of Congress Subject Headings. Furthermore, complementing the controlled vocabulary terms with social terms more than doubled the average coverage of participants' terms for a photograph. The study also investigated the relationships between user demographics and users' perceptions of the value of terms, as well as the relationships between user demographics and indexing quality, as measured by the number of terms participants assigned to a photograph. It was found that the participants with more tagging and indexing experience assigned a greater number of tags than did the other participants.
    Library & Information Science Research 04/2012; 34(2). · 1.63 Impact Factor
  • ABSTRACT: There is growing interest in, and an increasing number of attempts by, traditional information providers to engage social content creation and sharing communities in creating and enhancing the metadata of their photo collections to make the collections more accessible and visible. To enable and guide effective metadata creation, however, it is essential to understand the structure and patterns of the activities of the community around the photographs, resources used, and scale and quality of the socially created metadata relative to the metadata and knowledge already encoded in existing knowledge organization systems. This article presents an analysis of Flickr member discussions around the photographs of the Library of Congress photostream in Flickr. The article also reports on an analysis of the intrinsic and relational quality of the photostream tags relative to two knowledge organization systems: the Thesaurus for Graphic Materials and the Library of Congress Subject Headings. Thirty-seven percent of the original tag set and 15.3% of the preprocessed set (after the removal of tags with fewer than three characters and URLs) were invalid or misspelled terms. Nouns, named entity terms, and complex terms constituted approximately 77% of the preprocessed set. More than half of the photostream tags were not found in the TGM and LCSH, and more than a quarter of those terms were regular nouns and noun phrases. This suggests that these terms could be complementary to more traditional methods of indexing using controlled vocabularies.
    Journal of the American Society for Information Science and Technology 12/2010; 61(12):2477-2489. · 2.01 Impact Factor
