Dimitar Dimitrov's research while affiliated with Universität Koblenz-Landau and other places

Publications (10)

Preprint
As one of the richest sources of encyclopedic information on the Web, Wikipedia generates an enormous amount of traffic. In this paper, we study large-scale article access data of the English Wikipedia in order to compare articles with respect to the two main paradigms of information seeking, i.e., search by formulating a query, and navigation by f...
Chapter
Allowing users to organize content by tagging resources in web-based systems has led to the emergence of the so-called Social Web. Tags turned out to be helpful not only for giving recommendations and improving search in social tagging systems but also for enhancing information access by navigating. In this chapter, we will cover much of the pioneer...
Conference Paper
While a plethora of hypertext links exist on the Web, only a small number of them are regularly clicked. Starting from this observation, we set out to study large-scale click data from Wikipedia in order to understand what makes a link successful. We systematically analyze effects of link properties on the popularity of links. By utilizing mixed-ef...
Conference Paper
Navigation in an information space is a natural way to explore and discover its content. Information systems on the Web like digital encyclopedias (e.g., Wikipedia) are interested in providing good navigational support to their users. To that end, navigation models can be useful for estimating the general navigability of an information space and fo...
Article
While a plethora of hypertext links exist on the Web, only a small number of them are regularly clicked. Starting from this observation, we set out to study large-scale click data from Wikipedia in order to understand what makes a link successful. We systematically analyze effects of link properties on the popularity of links. By utilizing mixed-ef...
Conference Paper
Wikipedia supports its users in reaching a wide variety of goals: looking up facts, researching a topic, making an edit, or simply browsing to pass time. Some of these goals, such as the lookup of facts, can be effectively supported by search functions. However, for other use cases such as researching an unfamiliar topic, users need to rely on the link...
Conference Paper
In this work, we study the visual position of links and their clicks on Wikipedia, particularly where links are visually located, at which screen positions users click on links, and which areas on the screen exhibit more or fewer clicks per link. For that purpose, we introduce a novel dataset containing the on-screen coordinate position for all lin...
Conference Paper
HypTrails is a Bayesian approach for comparing different hypotheses about human trails on the web. While a standard implementation exists, it exhibits performance issues when working with large-scale data. In this paper, we propose a distributed implementation of HypTrails based on Apache Spark taking advantage of several structural properties inher...
Conference Paper
Today, a variety of user interfaces exists for navigating information spaces, including, for example, tag clouds, breadcrumbs, subcategories and others. However, such navigational user interfaces are only useful to the extent that they expose the underlying topology (or network structure) of the information space. Yet, little is known about which...

Citations

... Information-search behavior on Wikipedia is strongly influenced by search functionalities and navigation paths (Dimitrov, Lemmerich, Flöck, & Strohmaier, 2018; Medelyan, Milne, Legg, & Witten, 2009), and its article and hyperlink structures, in particular, explain users' navigation paths on Wikipedia quite well (Lamprecht, Lerman, Helic, & Strohmaier, 2017). Because of its structure, Wikipedia allows for both general and specific searches: articles are hyperlinked to other articles, enabling users to learn about general topics, browse from one Wikipedia site to another and explore new topics there (Medelyan et al., 2009). ...
... For example, Begelman et al. [1] employed clustering to identify semantically related tags and also intended to remove redundant tags. Dimitrov et al. [9] utilized the structure among the tags to model tag-based navigation and discover the intrinsic navigability of social tagging systems. Bischoff et al. [3] attempted to classify tags into different categories with the help of tag taxonomies. ...
... In other investigations, researchers found that Wikipedia articles relay different traffic volumes based on their topics [3] and the type of page [5]. Readers also show preferences for links that appear at the top of the page and are semantically closer to the current article [4,10]. Reading preferences were shown to fall into 4 types of behaviors described as focus, trending, exploration and passing [11]. ...
... The basic assumption is that individuals or groups might have a regular activity area, which indicates the inner similarity of social and geographic closeness (Cranshaw et al. 2010). Becker et al. (2016) introduce a Bayesian approach for comparing hypotheses about human trails on the web. Piatkowski et al. (2013) present a graphical model designed for efficient probabilistic modeling of spatiotemporal data, which can keep the accuracy as well as efficiency. ...
... into sources/sinks or bottlenecks [14,28]. Other approaches used the clickstream data to assess whether some articles should be read before others when learning about a specific topic [56], to detect structural biases in content [43], extract semantic relationships between articles [13], as a ground-truth for other tasks [11,58], or to study how the structure of the page influences the links clicked by the readers [15]. Finally, Rodi et al. [54] characterized search strategies in synthetic data generated from clickstream, though not without making strong assumptions about the underlying process of navigation. ...
... A manual inspection of the top 100 common trees shows that the most frequent pathway (90 occurrences) that does not rely on navigational links of the infobox is a path from Egg to Philosophy, following only the first link in the article. This behavior may be caused by readers curious to verify the popular Wikipedia property that following the first link of each page recursively leads to the philosophy article [8,9,23]. ...
... It contains the counts of (source-target)-pairs of articles (i.e., how often a link from one page to another page was clicked by readers) in a given month. It has not only been used to study popularity of links [16] but also for generating synthetic reading pathways to infer properties of navigation patterns [54]. ...
... They found novelty- and diversity-based strategies to be the best (in this order). This was also confirmed by the findings in [11]; the authors found that particularly useful tags are those with high popularity (i.e., occurring frequently in the dataset) but with low clustering, which means that it is important to consider not only the number of co-occurring tags but also their diversity. Even though tag clouds can be an efficient way of navigating document corpora, this ability is seriously limited if there are too many resources and pagination is used, as shown in [14]. ...