Conference Paper

Purpose tagging: Capturing user intent to assist goal-oriented social search


Abstract

The terms that users choose when tagging have been found to differ from the terms they use when searching for resources, which represents a fundamental problem for search in tagging-based systems. To address this problem, we propose purpose tagging as a novel kind of tagging that focuses on capturing aspects of intent rather than content. By capturing the different purposes a given resource can serve, purpose tags appear useful to mediate between the vocabulary of user intent on the one hand, and the vocabulary of contents and tags provided by social software applications on the other. The paper at hand makes the following contributions: 1) it extends the set of known kinds of tags with a novel type, and 2) it provides first empirical evidence, from an exploratory case study, of the principal feasibility of purpose tagging and its potential to facilitate goal-oriented social search.


... The same authors also created the so-called Taglines, an online tool demonstrating some novel contributions for expressing timescales, which makes it possible to navigate through the interesting tags for a particular period of time [5]. Alternative ways in which intent annotations can play a major role in supporting a user's understanding include the results presented in [13], which show how capturing aspects of intent rather than content can support social software. The work of [11] explores how users express their intentions in digital photo search. ...
... The algorithm described there is only one possible way in which intent annotations can be generated. The already mentioned work of [13] shows another possibility. However, this paper focuses mainly on exploring the usage and benefits of visual interfaces for intent annotations; the generation of the intent tags is not the focus of our investigations here. ...
Article
Full-text available
Getting a quick impression of the author's intention of a text is a task often performed. An author's intention plays a major role in successfully understanding a text. To support readers in this task, we present an intentional approach to visual text analysis, making use of tag clouds. The objective of tag clouds is to present meta-information in a visually appealing way. However, there is also much uncertainty associated with tag clouds, such as giving the wrong impression. It is not clear whether the author's intent can be grasped clearly while looking at a corresponding tag cloud. Therefore it is interesting to ask to what extent tag clouds can support the user in understanding the intentions expressed. In order to answer this question, we construct an intentional perspective on textual content. Based on an existing algorithm for extracting intent annotations from textual content, we present a prototypical implementation to produce intent tag clouds, and describe a formative test illustrating how intent visualizations may support readers in understanding a text successfully. With the initial prototype, we conducted user studies of our intentional tag cloud visualization and a comparison with a traditional one that visualizes frequent terms. The evaluation's results indicate that intent tag clouds have a positive effect on supporting users in grasping an author's intent.
... None of these classification approaches, however, categorizes tweets based on the intent of the user posting each single tweet. Several studies of users' intentions on microblogging platforms showed that people use microblogging for informal learning [4], business branding, organizational communication [13], as a discussion channel, for sharing information/URLs [6], etc. However, there is no work on user intentions at the level of individual tweets, only anecdotal reports on non-scientific websites or in magazines. From a different perspective, user intention in annotating resources [11] can also play a major role in supporting a user's social search. Strohmaier [11] proposes a novel idea in tag recommendation: purpose tagging, which focuses on capturing aspects of intent ("what it can be used for"). ...
... Several studies of users' intentions on microblogging platforms showed that people use microblogging for informal learning [4], business branding, organizational communication [13], as a discussion channel, for sharing information/URLs [6], etc. However, there is no work on user intentions at the level of individual tweets, only anecdotal reports on non-scientific websites or in magazines. From a different perspective, user intention in annotating resources [11] can also play a major role in supporting a user's social search. Strohmaier [11] proposes a novel idea in tag recommendation: purpose tagging, which focuses on capturing aspects of intent ("what it can be used for"). He states that the keywords or tags issued by a user exhibit his or her intent in annotating the resources. ...
... Subramanya and Liu [17] propose a system that automatically recommends tags for blogs, using similarity ranking in a manner similar to collaborative filtering techniques. Strohmaier [16] studies a novel idea in tag recommendation, which bridges the gap between the keywords issued by a user in a query and the tags actually used by a social system. He argues that the tags used by a user when performing a query exhibit his or her intent, whereas the annotations of items describe content semantics. ...
Conference Paper
Full-text available
Social network systems, like last.fm, play a significant role in Web 2.0, containing large amounts of multimedia-enriched data that are enhanced both by explicit user-provided annotations and implicit aggregated feedback describing the personal preferences of each user. It is also a common tendency for these systems to encourage the creation of virtual networks among their users by allowing them to establish bonds of friendship and thus provide a novel and direct medium for the exchange of data. We investigate the role of these additional relationships in developing a track recommendation system. Taking into account both the social annotation and friendships inherent in the social graph established among users, items and tags, we created a collaborative recommendation system that effectively adapts to the personal information needs of each user. We adopt the generic framework of Random Walk with Restarts in order to provide a more natural and efficient way to represent social networks. In this work we collected a sufficiently representative portion of the music social network last.fm, capturing explicitly expressed bonds of friendship of the user as well as social tags. We performed a series of comparison experiments between the Random Walk with Restarts model and a user-based collaborative filtering method using the Pearson Correlation similarity. The results show that the graph model system benefits from the additional information embedded in social knowledge. In addition, the graph model outperforms the standard collaborative filtering method.
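To make the Random Walk with Restarts idea in the abstract above concrete, the following is a minimal sketch only, not the authors' implementation. The toy graph, the node names, the function name `random_walk_with_restart`, and the restart probability of 0.15 are all assumptions chosen for illustration. Users, tracks and tags are nodes of one graph, and the walk repeatedly spreads probability mass along edges while returning a fixed share to the starting user.

```python
def random_walk_with_restart(graph, start, restart_prob=0.15, iterations=50):
    """Approximate RWR scores by power iteration.

    graph: dict mapping each node to a list of neighbour nodes
           (hypothetical toy format; every neighbour must also be a key).
    """
    scores = {node: 0.0 for node in graph}
    scores[start] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            neighbours = graph[node]
            if not neighbours:
                # Dangling node: send its walking mass back to the start.
                nxt[start] += (1 - restart_prob) * mass
                continue
            share = (1 - restart_prob) * mass / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        # Total mass is 1, so the restart step contributes restart_prob.
        nxt[start] += restart_prob
        scores = nxt
    return scores


# Toy social graph (assumed data): friendships, listening and tagging edges,
# each listed in both directions.
graph = {
    "user:alice": ["track:a", "user:bob"],
    "user:bob": ["user:alice", "track:a", "track:b"],
    "track:a": ["user:alice", "user:bob", "tag:rock"],
    "track:b": ["user:bob", "tag:rock"],
    "track:c": ["tag:jazz"],
    "tag:rock": ["track:a", "track:b"],
    "tag:jazz": ["track:c"],
}

scores = random_walk_with_restart(graph, "user:alice")
# Rank tracks alice has not listened to by their proximity score.
candidates = sorted(
    (n for n in graph if n.startswith("track:") and n not in graph["user:alice"]),
    key=lambda n: scores[n],
    reverse=True,
)
```

In this toy graph, `track:b` is reachable from alice through her friend and through a shared tag, while `track:c` is not connected to her at all, so `track:b` ranks first among the candidates.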
... [38] and [39] performed semantic tagging on terms lexically using the Unified Medical Language System (UMLS). [41] explores the use of "purpose tagging" to better capture the intent of the user when using tags to improve search results. The authors evaluate their work in a case study but do not provide a quantitative analysis of improvements in the search results. ...
Chapter
Full-text available
Under-specified queries often lead to undesirable search results that do not contain the information needed. This problem gets worse when it comes to medical information, a natural human demand everywhere. Existing search engines on the Web often are unable to handle medical search well because they do not consider its special requirements. Often a medical information searcher is uncertain about his exact questions and unfamiliar with medical terminology. To overcome the limitations of under-specified queries, we utilize tags to enhance information retrieval capabilities by expanding users’ original queries with context-relevant information. We compute a set of significant tag neighbor candidates based on the neighbor frequency and weight, and utilize the qualified tag neighbors to expand an entry query. The proposed approach is evaluated by using MedWorm medical article collection and results show considerable precision improvements over state-of-the-art approaches.
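The tag-neighbor expansion described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the published method: the abstract does not define how neighbor frequency and weight are computed, so here the weight is simply a tag's co-occurrence count divided by the total co-occurrences of the query term, and the function names, threshold and sample data are hypothetical.

```python
from collections import Counter
from itertools import permutations


def build_neighbor_counts(tagged_docs):
    """Count how often each pair of tags is assigned to the same document."""
    counts = {}
    for tags in tagged_docs:
        for a, b in permutations(set(tags), 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def expand_query(query_terms, counts, min_weight=0.3, max_new=3):
    """Append neighbour tags whose relative co-occurrence weight is high enough."""
    candidates = Counter()
    for term in query_terms:
        neighbours = counts.get(term, {})
        total = sum(neighbours.values())
        for tag, freq in neighbours.items():
            if tag not in query_terms:
                # Assumed weighting: share of this term's co-occurrences.
                candidates[tag] += freq / total
    qualified = [t for t, w in candidates.most_common() if w >= min_weight]
    return list(query_terms) + qualified[:max_new]


# Assumed sample data: tag sets of four medical articles.
docs = [
    ["migraine", "headache", "triptan"],
    ["migraine", "headache"],
    ["headache", "ibuprofen"],
    ["migraine", "aura"],
]

counts = build_neighbor_counts(docs)
expanded = expand_query(["headache"], counts)
```

With this data, "migraine" accounts for half of the co-occurrences of "headache" and passes the threshold, while "triptan" and "ibuprofen" (a quarter each) do not, so the under-specified query is expanded with one additional term.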
... This is done either for semantic reasons (for example, to enrich information items with additional meta data), conversational reasons (for example, for social signaling) [3] or for organizational reasons (for example, to categorize information) [21]. Regardless of why people tag [26, 29, 28], tags are typically visualized as so-called tag clouds [3]. Basically, a tag cloud is a selection of tags related to a particular resource. ...
Article
Full-text available
Recent research has shown that the navigability of tagging systems leaves much to be desired. In general, it was observed that tagging systems are not navigable if the resource lists of the tagging system are limited to a certain factor k. Hence, in this paper a novel resource list generation approach is introduced that addresses this issue. The proposed approach is based on a hierarchical network model. The paper shows, through a number of experiments based on a tagging dataset from a large online encyclopedia system called Austria-Forum, that the new algorithm is able to create tag network structures that are navigable in an efficient manner. Contrary to previous work, the method featured in this paper is completely generic, i.e. the introduced resource list generation approach could be used to improve the navigability of any tagging system. This work is relevant for researchers interested in navigability of emergent hypertext structures and for engineers seeking to improve the navigability of tagging systems.
... The three disjoint, finite sets of such a graph correspond to (1) a set of persons or users u ∈ U, (2) a set of resources or objects o ∈ O, and (3) a set of annotations or tags t ∈ T, which are used by users U to annotate objects O. A very general model of folksonomies is defined by a set of annotations F ⊆ U × T × O ([29], [30], [31], [32]). ...
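The folksonomy model in the snippet above, a set of annotations drawn from users × tags × objects, maps naturally onto a set of triples. The following minimal sketch assumes hypothetical names (`Annotation`, `objects_tagged`, `tag_cloud`) and invented sample data; it only illustrates the data structure, not any particular system.

```python
from collections import Counter
from typing import NamedTuple


class Annotation(NamedTuple):
    """One (user, tag, object) triple of the folksonomy."""
    user: str
    tag: str
    obj: str


# Assumed sample annotations.
annotations = {
    Annotation("alice", "latex", "doc1"),
    Annotation("alice", "tutorial", "doc1"),
    Annotation("bob", "latex", "doc1"),
    Annotation("bob", "latex", "doc2"),
}

# The three disjoint sets are the projections of the annotation set.
users = {a.user for a in annotations}
tags = {a.tag for a in annotations}
objects = {a.obj for a in annotations}


def objects_tagged(tag):
    """All objects that any user annotated with the given tag."""
    return {a.obj for a in annotations if a.tag == tag}


def tag_cloud(obj):
    """Tag frequencies for one object, i.e. the raw data behind a tag cloud."""
    return Counter(a.tag for a in annotations if a.obj == obj)
```

Here `objects_tagged("latex")` yields both documents, and `tag_cloud("doc1")` counts "latex" twice because two different users assigned it.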
... The study presents the idea of "tagging the tags and their relations" as a solution to the problem that tags are either depleted of accurate meanings, or they have meanings but without proper information about their contexts. In this "Extreme Tagging The research motivation behind the study in [302] is that "The terms that are used by users during tagging have been found to be different from the terms that are used when searching for resources, which represents a fundamental problem for search in tagging based systems". The solution provided in this work is to capture aspects of intent rather than content, in addition to common tagging practice, the users will have to indicate why they are tagging certain web content. ...
... Recently, social bookmarking systems emerged as an interesting alternative to search engines for finding relevant content [3,7]. These systems apply the concept of social navigation [5], i.e. users browse by means of so-called tag clouds, which are collections of keywords assigned to different online resources by different users [2] driven by different motivations [8]. ...
Conference Paper
Full-text available
Nowadays, Web encyclopedias suffer from a high bounce rate. Typically, users come to an encyclopaedia from a search engine and upon reading the first page on the site they leave it immediately thereafter. To tackle this problem in systems such as Web shops additional browsing tools for easy finding of related content are provided. In this paper we present a tool that links related content in an encyclopaedia in a usable and visually appealing manner. The tool combines two promising approaches – tag clouds and historic search queries – into a new single one. Hence, each document in the system is enriched with a tag cloud containing collections of related concepts populated from historic search queries. A preliminary implementation of the tool is already provided within a Web encyclopaedia called Austria-Forum.
... Ruch et al. [38] and SB [24] performed semantic tagging on terms lexically using the Unified Medical Language System (UMLS). Strohmaier [40] explores the use of "purpose tagging" to better capture the intent of the user when using tags to improve search results. The authors evaluate their work in a case study but do not provide a quantitative analysis of improvements in the search results. ...
Article
Medical information is a natural human demand. Existing search engines on the Web often are unable to handle medical search well because they do not consider its special requirements. Often a medical information searcher is uncertain about his exact questions and unfamiliar with medical terminology. Under-specified queries often lead to undesirable search results that do not contain the information needed. To overcome the limitations of under-specified queries, we utilize tags to enhance information retrieval capabilities by expanding users’ original queries with context-relevant information. We compute a set of significant tag neighbor candidates based on the neighbor frequency and weight, and utilize the qualified tag neighbors to expand an entry query. The proposed approach is evaluated by using MedWorm medical article collection and results show considerable precision improvements over state-of-the-art approaches.
... Similarly, better understanding of the information seeking purpose, or task at hand, would allow us to better match the search methods with the search task (e.g. McNee, 2006; Strohmaier, 2008). ...
Article
Full-text available
Vuorikari, R. (2009). Tags and self-organisation: a metadata ecology for learning resources in a multilingual context. Doctoral thesis. November, 13, 2009, Heerlen, The Netherlands: Open University of the Netherlands, CELSTEC.
... Fortunately, this involvement also provides a new kind of tool, managed directly by users, for content classification: tagging. Given the human tendency to imitate each other [1], there are studies ([2], [3], [4]) that analyze the use of tags to face this issue. ...
Conference Paper
Full-text available
In this paper we describe a model of social and collaborative search based on the use of tags. First we will introduce the issues that drove us to the definition of this model, analyzing different elements characterizing the web 2.0 paradigm and their effect on traditional search and classification systems. Afterwards we will present TAG Vision, a prototype implementation of the model, developed in order to investigate new approaches for information retrieval.
... Intent annotations could be useful, for example, to quickly grasp the main aspirations implicitly addressed by resources or to enable goal-oriented navigation of resources, such as blogs, on the web (cf. for example, [18]). Figure 1 shows an example tag cloud of intent annotations. Figure 1 aims to illustrate the notion of intent annotations by giving an example of a tag cloud revealing information about goals and intentions referenced in a textual resource. Without knowing the underlying resource, a range of interesting analyses becomes possible. ...
Conference Paper
Full-text available
Annotations represent an increasingly popular means for organizing, categorizing and finding resources on the "social" web. Yet, only a small portion of the total resources available on the web are annotated. In this paper, we describe a prototype - iTAG - for automatically annotating textual resources with human intent, a novel dimension of tagging. We investigate the extent to which the automatic analysis of human intentions in textual resources is feasible. To address this question, we present selected evidence from a study aiming to automatically annotate intent in a simplified setting, that is transcripts of speeches given by US presidential candidates in 2008.
... Imagingvis, hb1, 958). Also, Strohmaier (2008) describes purpose tags which denote non-content specific functions that relate to an information seeking task of users (e.g. learn about LaTeX, get recommendations for music, translate text). ...
Conference Paper
Full-text available
Social tagging systems allow users to upload and assign keywords to digital resources. Thus a body of user annotated resources gradually evolves: Users can share resources, re-find their own resources or use the systems as search engines for items added by the whole user population. In this paper we want to contribute towards a better understanding of usage patterns within social tagging systems by presenting results from a survey of 142 users of the systems Flickr, Youtube, Delicious and Connotea. Data was gathered partly by using the Mechanical Turk service, and partly via an announcement on the Connotea blog. Our study reveals differences of user motivation and tag usage between systems. While (resource) sharing emerges as an all-embracing intra-system motivation, users differ with respect to social spheres of sharing. Based on our results which we integrated with earlier research from Cool and Belkin (2002), we propose a model of information behaviour in social tagging systems.
Article
Full-text available
In this paper we present an approach to improving navigability of a hierarchically structured Web content. The approach is based on an integration of a tagging module and adoption of tag clouds as a navigational aid for such content. The main idea of this approach is to apply tagging for the purpose of a better highlighting of cross–references between information items across the hierarchy. Although in principle tag clouds have the potential to support efficient navigation in tagging systems, recent research identified a number of limitations. In particular, applying tag clouds within pragmatic limits of a typical user interface leads to poor navigational performance as tag clouds are vulnerable to a so-called pagination effect. In this paper, a solution to the pagination problem is discussed, implemented as a part of an Austrian online encyclopedia called Austria-Forum, and analyzed. In addition, a simulation-based evaluation of the new algorithm has been conducted. The first evaluation results are quite promising, as the efficient navigational properties are restored.
Article
Full-text available
This paper presents an approach for an automated discovery, categorization and retrieval of metadata enriched E-learning resources. This research has further been implemented on a real platform "HyperManyMedia" at Western Kentucky University. It also presents the contribution that this work has made to the Adaptive Hypermedia systems. The main research question guiding this work is whether it is feasible and beneficial to add the metadata to learning resources, while still being able to retrieve learning resources that are satisfactory and effective for the learner. The main goals of our HyperManyMedia system, as an implementation based on the suggested architecture, are to: (1) provide the learner with a metadata search engine to overcome the limitations of searching for learning objects, and (2) be an open source repository for online learning.
Conference Paper
Based on an increasing number of web resources and services, the mashup paradigm enables end users to create custom web applications consisting of several components in order to fulfill specific needs. End user development of such composite web applications poses tough challenges to composition platforms, especially with non-programmers as end users. For instance, communicating on a non-technical level is crucial. Furthermore, assistance is essential throughout the entire process, ranging from composition to usage of mashups. Amongst others, users should be supported by explaining inter-widget communication, by helping to understand a mashup’s functionality and by identifying mashups providing desired functionality. However, prevalent mashup solutions provide no or limited concepts regarding these aspects. In this paper, we introduce our proposal for formalizing and calculating the functionality of mashup compositions based on capabilities and communication relations of mashup components as well as semantic domain knowledge. It serves as a foundation for our assisted, capability-centered end user development approach within the CRUISE platform. The latter features several assistance mechanisms, like presenting the functionality of mashups and recommending composition steps. We describe a prototypical implementation of the proposed algorithm and discuss its usage in our platform. Additionally, we evaluate our modeling and algorithmic concepts by means of example applications and an expert evaluation.
Conference Paper
Full-text available
Already existing open educational resources in management have a high potential for enterprises to address the increasing training needs of their employees. However, access barriers still prevent the full exploitation of this potential. Users have to search a number of repositories with heterogeneous interfaces in order to retrieve the desired content. In addition, the use of search criteria related to skills, such as learning objectives and skill-levels is not generally supported. The demonstrator presented in this paper addresses these shortcomings by federating multiple repositories, integrating and enriching their metadata, and employing skill-based search for management related content.
Conference Paper
Social bookmarking tools are generating an enormous pool of metadata describing and categorizing web resources. The value of these metadata in the form of tags can be fully realized only when they are shared and reused for web search and retrieval. The research described in this paper proposes a facet classification mechanism, and a tag relationship ontology to organize tags into a meaningful and intuitively useful structure. We have implemented a web-based prototype system to effectively search and browse bookmarked web resources using this approach. We collected real tag data from del.icio.us for a wide range of popular domains. We analyzed, processed, and organized these tags to demonstrate the effectiveness and utility of our approach for tag organization and reuse.
Article
Today's multimedia search engines are expected to respond to queries reflecting a wide variety of information needs from users with different goals. The topical dimension ("what" the user is searching for) of these information needs is well studied; however, the intent dimension ("why" the user is searching) has received relatively less attention. Specifically, intent is the "immediate reason, purpose, or goal" that motivates a user to query a search engine. We present a thorough survey of multimedia information retrieval research directed at the problem of enabling search engines to respond to user intent. The survey begins by defining intent, including a differentiation from related, often-confused concepts. It then presents the key conceptual models of search intent. The core is an overview of intent-aware approaches that operate at each stage of the multimedia search engine pipeline (i.e., indexing, query processing, ranking). We discuss intent in conventional text-based search wherever it provides insight into multimedia search intent or intent-aware approaches. Finally, we identify and discuss the most important future challenges for intent-aware multimedia search engines. Facing these challenges will allow multimedia information retrieval to recognize and respond to user intent and, as a result, fully satisfy the information needs of users.
Chapter
Originally introduced by social bookmarking systems, collaborative tagging, or social tagging, has been widely adopted by many web-based systems like wikis, e-commerce platforms, or social networks. Collaborative tagging systems allow users to annotate resources using freely chosen keywords, so called tags. Those tags help users in finding/retrieving resources, discovering new resources, and navigating through the system. The process of tagging resources is laborious. Therefore, most systems support their users by tag recommender components that recommend tags in a personalized way. The Discovery Challenges 2008 and 2009 of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) tackled the problem of tag recommendations in collaborative tagging systems. Researchers were invited to test their methods in a competition on datasets from the social bookmark and publication sharing system BibSonomy. Moreover, the 2009 challenge included an online task where the recommender systems were integrated into BibSonomy and provided recommendations in real time. In this chapter we review, evaluate and summarize the submissions to the two Discovery Challenges and thus lay the groundwork for continuing research in this area.
Article
This article focuses on the analysis of this distributed, heterogeneous, open and continuous production of “data about data” and its possible contribution to early detection objectives. It assesses whether the related processes of open classification and free annotation could provide promising results not only for information retrieval goals, which they are meant for, but also for detecting weak signals of incoming changes.
Article
This article describes an innovative approach to reorganizing the tag space generated by social bookmarking services. The objective of this work is to enable effective search and discovery of Web content using social bookmarking tags. Tags are metadata generated by users for Web content annotation. Their potential as effective Web search and discovery tool is hindered by challenges such as, the tag space being untidy due to ambiguity, and hidden or implicit semantics. Using a novel analytics approach, we conducted network analyses on tags and discovered that tags are generated for different purposes and that there are inherent relationships among tags. Our approach can be used to extract the purposes of tags and relationships among the tags and this information can be used as facets to add structure and hierarchy to reorganize the flat tag space. The semantics of relationships and hierarchy in our proposed faceted model of tags enable searches on annotated Web content in an effective manner. We describe the implementation of a prototype system called FASTS to demonstrate feasibility and effectiveness of our approach.
Article
Software development is a social process: tasks such as implementing a requirement or fixing a bug typically spark conversations between the stakeholders of a software project, where they identify points of uncertainty in the solution space and explore proposals to resolve them. Due to the fluid nature of these interactions, it is hard for project managers to maintain an overall understanding of the state of the discussion and to know when and how to intervene. We propose an approach for extracting the uncertainty information from developer conversations in order to provide managers with analytics. Using these allows us to recommend specific actions that managers can take to better facilitate the resolution of uncertainty.
Article
The advent of high-speed Internet connections has revolutionized the way research is being carried out to obtain relevant information. Conversely, retrieving pertinent information from the copious resources available is not only difficult but also time consuming. In recent years, tagging activity has been perceived as a potential source of knowledge on personal preferences, interests, targets, goals, and other attributes. Tags allow users to effectively annotate resources using keywords to personalize their recommendations and organize the resources for easy retrieval. However, the preferences of users vary widely, which can make tagging counterproductive. These shortcomings reduce the usefulness of tagging systems for filtering as well as retrieval of information. A tag recommendation system becomes useful by suggesting a set of relevant keywords to annotate the resources. This paper presents a review of tag recommendation systems and the constraints that affect the available tag recommendation systems. Furthermore, we propose the use of a spreading activation algorithm to study the role of a constructed topic ontology for efficient tag recommendations. This approach is founded on the assumption that tags that are recommended to the user are predicted from the keywords extracted from existing blogs and the topics in the constructed topic ontology. We have also proposed a tag classification system, namely Correlation-based Feature Selection–Hybrid Genetic Algorithm and classifier HGA-SVM (support vector machine), and have compared the results with results produced by other existing feature selection methods. The results obtained from the experiments have been presented.
Article
Full-text available
The growing popularity of Social Networks raises the important issue of trust. Among many systems which have realized the impact of trust, Recommender Systems have been the most influential ones. Collaborative Filtering Recommenders take advantage of trust relations between users for generating more accurate predictions. In this paper, we propose a semantic recommendation framework for creating trust relationships among all types of users with respect to different types of items, which are accessed by unique URI across heterogeneous networks and environments. We gradually build up the trust relationships between users based on the rating information from user profiles and item profiles to generate trust networks of users. For analyzing the formation of trust networks, we employ T-index as an estimate of a user’s trustworthiness to identify and select neighbors in an effective manner. In this work, we utilize T-index to form the list of an item’s raters, called the Top-Trustee list, for keeping the most reliable users who have already shown interest in the respective item. Thus, when a user rates an item, he/she is able to find users who can be trustworthy neighbors even though they might not be accessible within an upper bound of traversal path length. An empirical evaluation demonstrates how T-index improves the Trust Network structure by generating connections to more trustworthy users. We also show that exploiting T-index results in better prediction accuracy and coverage of recommendations collected along few edges that connect users on a Social Network.
Conference Paper
Full-text available
Many activities on the web are driven by high-level goals of users, such as “plan a trip” or “buy some product”. In this paper, we are interested in exploring the role and structure of users’ goals in web search. We want to gain insights into how users express goals, and how their goals can be represented in a semi-formal way. This paper presents results from an exploratory study that focused on analyzing selected search sessions from a search engine log. In a detailed example, we demonstrate how goal-oriented search can be represented and understood as a traversal of goal graphs. Finally, we provide some ideas on how to construct large-scale goal graphs in a semi-algorithmic, collaborative way. We conclude with a description of a series of challenges that we consider to be important for future research.
Conference Paper
The identification of the user’s intention or interest through queries that they submit to a search engine can be very useful to offer them more adequate results. In this work we present a framework for the identification of user’s interest in an automatic way, based on the analysis of query logs. This identification is made from two perspectives, the objectives or goals of a user and the categories in which these aims are situated. A manual classification of the queries was made in order to have a reference point and then we applied supervised and unsupervised learning techniques. The results obtained show that for a considerable amount of cases supervised learning is a good option, however through unsupervised learning we found relationships between users and behaviors that are not easy to detect just taking the query words. Also, through unsupervised learning we established that there are categories that we are not able to determine in contrast with other classes that were not considered but naturally appear after the clustering process. This allowed us to establish that the combination of supervised and unsupervised learning is a good alternative to find user’s goals. From supervised learning we can identify the user interest given certain established goals and categories; on the other hand, with unsupervised learning we can validate the goals and categories used, refine them and select the most appropriate to the user’s needs.
Conference Paper
Social bookmarking is a recent phenomenon which has the potential to give us a great deal of data about pages on the web. One major question is whether that data can be used to augment systems like web search. To answer this question, over the past year we have gathered what we believe to be the largest dataset from a social bookmarking site yet analyzed by academic researchers. Our dataset represents about forty million bookmarks from the social bookmarking site del.icio.us. We contribute a characterization of posts to del.icio.us: how many bookmarks exist (about 115 million), how fast is it growing, and how active are the URLs being posted about (quite active). We also contribute a characterization of tags used by bookmarkers. We found that certain tags tend to gravitate towards certain domains, and vice versa. We also found that tags occur in over 50 percent of the pages that they annotate, and in only 20 percent of cases do they not occur in the page text, backlink page text, or forward link page text of the pages they annotate. We conclude that social bookmarking can provide search data not currently provided by other sources, though it may currently lack the size and distribution of tags necessary to make a significant impact.
Conference Paper
In recent years, tagging systems have become increasingly popular. These systems enable users to add keywords (i.e., "tags") to Internet resources (e.g., web pages, images, videos) without relying on a controlled vocabulary. Tagging systems have the potential to improve search, spam detection, reputation systems, and personal organization while introducing new modalities of social communication and opportunities for data mining. This potential is largely due to the social structure that underlies many of the current systems. Despite the rapid expansion of applications that support tagging of resources, tagging systems are still not well studied or understood. In this paper, we provide a short description of the academic related work to date. We offer a model of tagging systems, specifically in the context of web-based systems, to help us illustrate the possible benefits of these tools. Since many such systems already exist, we provide a taxonomy of tagging systems to help inform their analysis and design, and thus enable researchers to frame and compare evidence for the sustainability of such systems. We also provide a simple taxonomy of incentives and contribution models to inform potential evaluative frameworks. While this work does not present comprehensive empirical results, we present a preliminary study of the photo-sharing and tagging system Flickr to demonstrate our model and explore some of the issues in one sample system. This analysis helps us outline and motivate possible future directions of research in tagging systems.
Conference Paper
Social bookmarking is an emerging type of a Web service that helps users share, classify, and discover interesting resources. In this paper, we explore the concept of an enhanced search, in which data from social bookmarking systems is exploited for enhancing search in the Web. We propose combining the widely used link-based ranking metric with the one derived using social bookmarking data. First, this increases the precision of a standard link-based search by incorporating popularity estimates from aggregated data of bookmarking users. Second, it provides an opportunity for extending the search capabilities of existing search engines. Individual contributions of bookmarking users as well as the general statistics of their activities are used here for a new kind of a complex search where contextual, temporal or sentiment-related information is used. We investigate the usefulness of social bookmarking systems for the purpose of enhancing Web search through a series of experiments done on datasets obtained from social bookmarking systems. Next, we show the prototype system that implements the proposed approach and we present some preliminary results.
Conference Paper
We describe online collaborative communities by tripartite networks, the nodes being persons, items and tags. We introduce projection methods in order to uncover the structures of the networks, i.e. communities of users, genre families... To do so, we focus on the correlations between the nodes, depending on their profiles, and use percolation techniques that consist in removing less correlated links and observing the shaping of disconnected islands. The structuring of the network is visualised by using a tree representation. The notion of diversity in the system is also discussed.
Article
Classic IR (information retrieval) is inherently predicated on users searching for information, the so-called "information need". But the need behind a web search is often not informational -- it might be navigational (give me the url of the site I want to reach) or transactional (show me sites where I can perform a certain transaction, e.g. shop, download a file, or find a map). We explore this taxonomy of web searches and discuss how global search engines evolved to deal with web-specific needs.
Article
Collaborative tagging describes the process by which many users add metadata in the form of keywords to shared content. Recently, collaborative tagging has grown in popularity on the web, on sites that allow users to tag bookmarks, photographs and other content. In this paper we analyze the structure of collaborative tagging systems as well as their dynamic aspects. Specifically, we discovered regularities in user activity, tag frequencies, kinds of tags used, bursts of popularity in bookmarking and a remarkable stability in the relative proportions of tags within a given URL. We also present a dynamic model of collaborative tagging that predicts these stable patterns and relates them to imitation and shared knowledge.
Article
Geographic space still lacks the semantics allowing a unified view of spatial data. Indeed, as a unique but all encompassing domain, it presents specificities that geospatial applications are still unable to handle. Moreover, to be useful, new spatial applications need to match human cognitive abilities of spatial representation and reasoning. In this context, eMerges, an approach to geospatial data integration based on Semantic Web Services (SWS), allows the unified representation and manipulation of heterogeneous spatial data sources. eMerges provides this integration by mediating legacy spatial data sources to high-level spatial ontologies through SWS and by presenting for each object context dependent affordances. This generic approach is applied here in the context of an emergency management use case developed in collaboration with emergency planners of public agencies.
Article
This paper gives an overview of current trends in manual indexing on the Web. Along with a general rise of user generated content there are more and more tagging systems that allow users to annotate digital resources with tags (keywords) and share their annotations with other users. Tagging is frequently seen in contrast to traditional knowledge organization systems or as something completely new. This paper shows that tagging should better be seen as a popular form of manual indexing on the Web. The difference between controlled and free indexing blurs with sufficient feedback mechanisms. A revised typology of tagging systems is presented that includes different user roles and knowledge organization systems with hierarchical relationships and vocabulary control. A detailed bibliography of current research in collaborative tagging is included.
Conference Paper
A novice search engine user may find searching the web for information difficult and frustrating because she may naturally express search goals rather than the topic keywords search engines need. In this paper, we present GOOSE (goal-oriented search engine), an adaptive search engine interface that uses natural language processing to parse a user's search goal, and uses "common sense" reasoning to translate this goal into an effective query. For a source of common sense knowledge, we use Open Mind, a knowledge base of approximately 400,000 simple facts such as "If a pet is sick, take it to the veterinarian" garnered from a Web-wide network of contributors. While we cannot be assured of the robustness of the common sense inference, in a substantial number of cases, GOOSE is more likely to satisfy the user's original search goals than simple keywords or conventional query expansion.
Conference Paper
On the web, search engines represent a primary instrument through which users exercise their intent. Understanding the specific goals users express in search queries could improve our theoretical knowledge about strategies for search goal formulation and search behavior, and could equip search engine providers with better descriptions of users’ information needs. However, the degree to which goals are explicitly expressed in search queries can be suspected to exhibit considerable variety, which poses a series of challenges for researchers and search engine providers. This paper introduces a novel perspective on analyzing user goals in search query logs by proposing to study different degrees of intentional explicitness. To explore the implications of this perspective, we studied two different degrees of explicitness of user goals in the AOL search query log containing more than 20 million queries. Our results suggest that different degrees of intentional explicitness represent an orthogonal dimension to existing search query categories and that understanding these different degrees is essential for effective search. The overall contribution of this paper is the elaboration of a set of theoretical arguments and empirical evidence that makes a strong case for further studies of different degrees of intentional explicitness in search query logs.
Conference Paper
Many users are familiar with the interesting but limited functionality of Data Detector interfaces like Microsoft's Smart Tags and Google's AutoLink. In this paper we significantly expand the breadth and functionality of this type of user interface through the use of large-scale knowledge bases of semantic information. The result is a Web browser that is able to generate personalized semantic hypertext, providing a goal-oriented browsing experience. We present (1) Creo, a Programming by Example system for the Web that allows users to create a general-purpose procedure with a single example, and (2) Miro, a Data Detector that matches the content of a page to high-level user goals.
Conference Paper
Why do people tag? Users have mostly avoided annotating media such as photos - both in desktop and mobile environments - despite the many potential uses for annotations, including recall and retrieval. We investigate the incentives for annotation in Flickr, a popular web-based photo-sharing system, and ZoneTag, a cameraphone photo capture and annotation tool that uploads images to Flickr. In Flickr, annotation (as textual tags) serves both personal and social purposes, increasing incentives for tagging and resulting in a relatively high number of annotations. ZoneTag, in turn, makes it easier to tag cameraphone photos that are uploaded to Flickr by allowing annotation and suggesting relevant tags immediately after capture. A qualitative study of ZoneTag/Flickr users exposed various tagging patterns and emerging motivations for photo annotation. We offer a taxonomy of motivations for annotation in this system along two dimensions (sociality and function), and explore the various factors that people consider when tagging their photos. Our findings suggest implications for the design of digital photo organization and sharing applications, as well as other applications that incorporate user-based annotation.
Book
This is an extensively revised and expanded second edition of the successful textbook on social network analysis integrating theory, applications, and network analysis using Pajek. The main structural concepts and their applications in social research are introduced with exercises. Pajek software and data sets are available so readers can learn network analysis through application and case studies. Readers will have the knowledge, skill, and tools to apply social network analysis across the social sciences, from anthropology and sociology to business administration and history. This second edition has a new chapter on random network models, for example, scale-free and small-world networks and Monte Carlo simulation; discussion of multiple relations, islands, and matrix multiplication; new structural indices such as eigenvector centrality, degree distribution, and clustering coefficients; new visualization options that include circular layout for partitions and drawing a network geographically as a 3D surface; and using Unicode labels. This new edition also includes instructions on exporting data from Pajek to R software. It offers updated descriptions and screen shots for working with Pajek (version 2.03).
Conference Paper
Social bookmarking systems allow users to store links to internet resources on a web page. As social bookmarking systems are growing in popularity, search algorithms have been developed that transfer the idea of link-based rankings in the Web to a social bookmarking system’s data structure. These rankings differ from traditional search engine rankings in that they incorporate the rating of users. In this study, we compare search in social bookmarking systems with traditional Web search. In the first part, we compare the user activity and behaviour in both kinds of systems, as well as the overlap of the underlying sets of URLs. In the second part, we compare graph-based and vector space rankings for social bookmarking systems with commercial search engine rankings. Our experiments are performed on data of the social bookmarking system Del.icio.us and on rankings and log data from Google, MSN, and AOL. We will show that part of the difference between the systems is due to different behaviour (e.g., the concatenation of multi-word lexemes to single terms in Del.icio.us), and that real-world events may trigger similar behaviour in both kinds of systems. We will also show that a graph-based ranking approach on folksonomies yields results that are closer to the rankings of the commercial search engines than vector space retrieval, and that the correlation is high in particular for the domains that are well covered by the social bookmarking system.
Article
Also published in: Moor, Aldo de et al. (eds.): Proceedings of the First Conceptual Structures Tool Interoperability Workshop at the 14th International Conference on Conceptual Structures. Aalborg: Universitetsforlag, 2006, pp. 87-102. Social bookmark tools are rapidly emerging on the Web. In such systems users are setting up lightweight conceptual structures called folksonomies. The reason for their immediate success is the fact that no specific skills are needed for participating. In this paper we specify a formal model for folksonomies and briefly describe our own system BibSonomy, which allows for sharing both bookmarks and publication references in a kind of personal library.
Article
We describe ConceptNet, a freely available semantic network presently consisting of over 250,000 elements of commonsense knowledge. Inspired by Cyc, ConceptNet includes a wide range of commonsense concepts and relations, and inspired by WordNet, it is structured as a simple, easy-to-use semantic network. ConceptNet supports many of the same applications as WordNet, such as query expansion and determining semantic similarity, but it also allows simple temporal, spatial, affective, and several other types of inferences. This paper is structured as follows. We first discuss how ConceptNet was built and the nature and structure of its contents. We then present the ConceptNet toolkit, a reasoning system designed to support textual reasoning tasks by providing facilities for spreading activation, analogy, and path-finding between concepts. Third, we provide some quantitative and qualitative analyses of ConceptNet. We conclude by describing some ways we are currently exploring to improve ConceptNet.
Roman, D., Keller, U., Lausen, H., de Bruijn, J., Lara, R., Stollberg, M., Polleres, A., Feier, C., Bussler, C. and Fensel, D. 2005. Web Service Modeling Ontology, Applied Ontology 1(1), 77-106.