Gerd Stumme

Universität Kassel · Research Group of Knowledge & Data Engineering (KDE)

About

297 Publications
31,497 Reads
11,285 Citations

Publications (297)
Preprint
Attribute exploration is a method from Formal Concept Analysis (FCA) that helps a domain expert discover structural dependencies in knowledge domains which can be represented as formal contexts (cross tables of objects and attributes). In this paper we present an extension of attribute exploration that allows for a group of domain experts and explo...
Article
Full-text available
Social machines are a paradigm for the design of sociotechnical systems that use web and platform solutions to bring together, in a new way, the potential of digital technologies with the inherent logic of social interaction, organization, and structure formation. In the following, we discuss the paradigm of social machin...
Article
The curse of dimensionality is a phenomenon frequently observed in machine learning (ML) and knowledge discovery (KD). There is a large body of literature investigating its origin and impact, using methods from mathematics as well as from computer science. Among the mathematical insights into data dimensionality, there is an intimate link between t...
Preprint
The visualization of social networks is often hindered by their size as such networks often consist of thousands of vertices and edges. Hence, it is of major interest to derive compact structures that represent important connections of the original network. In order to do so, we transfer concepts from the realm of orometry to graphs. These concepts...
Preprint
Full-text available
Selecting the best scientific venue (i.e., conference/journal) for the submission of a research article constitutes a multifaceted challenge. Important aspects to consider are the suitability of research topics, a venue's prestige, and the probability of acceptance. The selection problem is exacerbated through the continuous emergence of additional...
Chapter
Formal Concept Analysis (FCA) allows one to analyze binary data by deriving concepts and ordering them in lattices. One of the main goals of FCA is to enable humans to comprehend the information that is encapsulated in the data; however, the large size of concept lattices is a limiting factor for the feasibility of understanding the underlying structur...
Preprint
The automatic verification of document authorships is important in various settings. Researchers are for example judged and compared by the amount and impact of their publications and public figures are confronted by their posts on social media platforms. Therefore, it is important that authorship information in frequently used web services and pla...
Chapter
It is known that a (concept) lattice contains an n-dimensional Boolean suborder if and only if the context contains an n-dimensional contra-nominal scale as subcontext. In this work, we investigate more closely the interplay between the Boolean subcontexts of a given finite context and the Boolean suborders of its concept lattice. To this end, we d...
Chapter
Order diagrams allow human analysts to understand and analyze structural properties of ordered data. While an expert can create easily readable order diagrams, the automatic generation of those remains a hard task. In this work, we adapt force-directed approaches, which are known to generate aesthetically-pleasing drawings of graphs, to the realm o...
Chapter
Formal Concept Analysis (FCA) provides a method called attribute exploration which helps a domain expert discover structural dependencies in knowledge domains that can be represented by a formal context (a cross table of objects and attributes). Triadic Concept Analysis is an extension of FCA that incorporates the notion of conditions. Many extensi...
Preprint
Formal Concept Analysis (FCA) allows one to analyze binary data by deriving concepts and ordering them in lattices. One of the main goals of FCA is to enable humans to comprehend the information that is encapsulated in the data; however, the large size of concept lattices is a limiting factor for the feasibility of understanding the underlying structur...
Preprint
Full-text available
The ubiquitous presence of WiFi access points and mobile devices capable of measuring WiFi signal strengths allows for real-world applications in indoor localization and mapping. In particular, no additional infrastructure is required. Previous approaches in this field were, however, often hindered by problems such as effortful map-building processe...
Article
Full-text available
The annual number of publications at scientific venues, for example, conferences and journals, is growing quickly. Hence, even for researchers it becomes harder and harder to keep track of research topics and their progress. In this task, researchers can be supported by automated publication analysis. Yet, many such methods result in uninterpretabl...
Preprint
It is known that a (concept) lattice contains an n-dimensional Boolean suborder if and only if the context contains an n-dimensional contra-nominal scale as subcontext. In this work, we investigate more closely the interplay between the Boolean subcontexts of a given finite context and the Boolean suborders of its concept lattice. To this end, we d...
Article
Full-text available
Creation and exchange of knowledge depends on collaboration. Recent work has suggested that the emergence of collaboration frequently relies on geographic proximity. However, being co-located tends to be associated with other dimensions of proximity, such as social ties or a shared organizational environment. To account for such factors, multiple d...
Preprint
Formal Concept Analysis (FCA) provides a method called attribute exploration which helps a domain expert discover structural dependencies in knowledge domains that can be represented by a formal context (a cross table of objects and attributes). Triadic Concept Analysis is an extension of FCA that incorporates the notion of conditions. Many extensi...
Preprint
Order diagrams allow human analysts to understand and analyze structural properties of ordered data. While an experienced expert can create easily readable order diagrams, the automatic generation of those remains a hard task. In this work, we adapt force-directed approaches, which are known to generate aesthetically-pleasing drawings of graphs, to...
Preprint
Full-text available
The annual number of publications at scientific venues, for example, conferences and journals, is growing quickly. Hence, even for researchers it becomes harder and harder to keep track of research topics and their progress. In this task, researchers can be supported by automated publication analysis. Yet, many such methods result in uninterpretable,...
Conference Paper
A large amount of data accommodated in knowledge graphs (KG) is metric. For example, the Wikidata KG contains a plenitude of metric facts about geographic entities like cities or celestial objects. In this paper, we propose a novel approach that transfers orometric (topographic) measures to bounded metric spaces. While these methods were originally...
Article
Full-text available
Null model generation for formal contexts is an important task in the realm of formal concept analysis. These random models are in particular useful for, but not limited to, comparing the performance of algorithms. Nonetheless, a thorough investigation of how to generate null models for formal contexts is absent. Thus we suggest a novel approach us...
Chapter
We give a brief introduction into Formal Concept Analysis, an approach to explaining data by means of lattice theory.
Preprint
A common representation of information about relations of objects and attributes in knowledge domains are data-tables. The structure of such information can be analysed using Formal Concept Analysis (FCA). Attribute exploration is a knowledge acquisition method from FCA that reveals dependencies in a set of attributes with help of a domain expert....
Poster
Full-text available
We propose an interactive game for exploring implicit knowledge in Wikidata, based on the theory laid out in the publication "Discovering Implicational Knowledge in Wikidata" and the Conceptual Exploration techniques from Formal Concept Analysis. This poster describes the prototype implementation of the game. The development on this tool continues.
Preprint
Full-text available
A large amount of data accommodated in knowledge graphs (KG) is actually metric. For example, the Wikidata KG contains a plenitude of metric facts about geographic entities like cities, chemical compounds or celestial objects. In this paper, we propose a novel approach that transfers orometric (topographic) measures to bounded metric spaces. While...
Conference Paper
Computing conceptual structures, like formal concept lattices, is a challenging task in the age of massive data sets. There are various approaches to deal with this, e.g., random sampling, parallelization, or attribute extraction. A so far not investigated method in the realm of formal concept analysis is attribute selection, as done in machine lea...
Preprint
Full-text available
Order diagrams are an important tool to visualize the complex structure of ordered sets. Favorable drawings of order diagrams, i.e., easily readable for humans, are hard to come by, even for small ordered sets. Many attempts were made to transfer classical graph drawing approaches to order diagrams. Although these methods produce satisfying results...
Conference Paper
Knowledge graphs have recently become the state-of-the-art tool for representing the diverse and complex knowledge of the world. Among the freely available knowledge graphs, Wikidata stands out by being collaboratively edited and curated. Among the vast numbers of facts, complex knowledge is just waiting to be discovered, but the sheer size of Wiki...
Article
Social network analysis is playing an increasingly important role in sociological studies. At the same time, new technologies such as wearable sensors make it possible to collect new types of social network data. We employed RFID tags to capture face-to-face interactions of participants of two consecutive Ph.D. retreats of a graduate school on clim...
Preprint
Full-text available
The field of collaborative interactive learning (CIL) aims at developing and investigating the technological foundations for a new generation of smart systems that support humans in their everyday life. While the concept of CIL has already been carved out in detail (including the fields of dedicated CIL and opportunistic CIL) and many research obje...
Preprint
Full-text available
Concept lattice drawings are an important tool to visualize complex relations in data in a simple manner to human readers. Many attempts were made to transfer classical graph drawing approaches to order diagrams. Although those methods are satisfying for some lattices they unfortunately perform poorly in general. In this work we present a novel too...
Preprint
Full-text available
Knowledge graphs have recently become the state-of-the-art tool for representing the diverse and complex knowledge of the world. Examples include the proprietary knowledge graphs of companies such as Google, Facebook, IBM, or Microsoft, but also freely available ones such as YAGO, DBpedia, and Wikidata. A distinguishing feature of Wikidata is that...
Preprint
Full-text available
Computing conceptual structures, like formal concept lattices, is in the age of massive data sets a challenging task. There are various approaches to deal with this, e.g., random sampling, parallelization, or attribute extraction. A so far not investigated method in the realm of formal concept analysis is attribute selection, as done in machine lea...
Conference Paper
Finding structural similarities in graph data, like social networks, is a far-ranging task in data mining and knowledge discovery. A (conceptually) simple reduction would be to compute the automorphism group of a graph. However, this approach is ineffective in data mining since real world data does not exhibit enough structural regularity. Here we...
Preprint
Full-text available
For localization and mapping of indoor environments through WiFi signals, locations are often represented as likelihoods of the received signal strength indicator. In this work we compare various measures of distance between such likelihoods in combination with different methods for estimation and representation. In particular, we show that among t...
Preprint
Full-text available
Finding structural similarities in graph data, like social networks, is a far-ranging task in data mining and knowledge discovery. A (conceptually) simple reduction would be to compute the automorphism group of a graph. However, this approach is ineffective in data mining since real world data does not exhibit enough structural regularity. Here we...
Preprint
Full-text available
The curse of dimensionality in the realm of association rules is twofold. Firstly, we have the well-known exponential increase in computational complexity with increasing item set size. Secondly, there is a related curse concerned with the distribution of (sparse) data itself in high dimension. The former problem is often coped with by projec...
Chapter
With the growth of the Social Web, a variety of new web-based services arose and changed the way users interact with the internet and consume information. One central phenomenon was and is tagging, which allows users to manage, organize, and access information in social systems. Tagging helps to manage all kinds of resources, making their access much easie...
Article
Full-text available
Geometric analysis is a very capable theory to understand the influence of the high dimensionality of the input data in machine learning (ML) and knowledge discovery (KD). With our approach we can assess how far the application of a specific KD/ML-algorithm to a concrete data set is prone to the curse of dimensionality. To this end we extend V. Pes...
Article
The k-Nearest Neighbor (kNN) classification approach is conceptually simple - yet widely applied since it often performs well in practical applications. However, using a global constant k does not always provide an optimal solution, e.g., for datasets with an irregular density distribution of data points. This paper proposes an adaptive kNN classif...
Article
Full-text available
Peatland fires and haze events are disasters with national, regional and international implications. The phenomena lead to direct damage to local assets, as well as broader economic and environmental losses. Satellite imagery is still the main and often the only available source of information for disaster management. In this article, we test the p...
Article
When evaluating the cause of one's popularity on Twitter, one thing is considered to be the main driver: many tweets. There is debate about the kind of tweet one should publish, but little beyond tweets. Of particular interest is the information provided by each Twitter user's profile page. One of these features is the given names on those profiles....
Conference Paper
The analysis of social interaction networks is essential for understanding and modeling network structures as well as the behavior of the involved actors. This paper describes an analysis at large scale using (sensor) data collected by RFID tags complemented by self-report data obtained using surveys. We focus on the social network of a students' f...
Article
In social tagging systems, like Mendeley, CiteULike, and BibSonomy, users can post, tag, visit, or export scholarly publications. In this paper, we compare citations with metrics derived from users’ activities (altmetrics) in the popular social bookmarking system BibSonomy. Our analysis, using a corpus of more than 250,000 publications published be...
Conference Paper
Much attention has been given to the task of gender inference of Twitter users. Although names are strong gender indicators, the names of Twitter users are rarely used as a feature; probably due to the high number of ill-formed names, which cannot be found in any name dictionary. Instead of relying solely on a name database, we propose a novel name...
Article
Social bookmarking systems have established themselves as an important part in today’s Web. In such systems, tag recommender systems support users during the posting of a resource by suggesting suitable tags. Tag recommender algorithms have often been evaluated in offline benchmarking experiments. Yet, the particular setup of such experiments has r...
Conference Paper
Group formation and evolution are prominent topics in social contexts. This paper focuses on the analysis of group evolution events in networks of face-to-face proximity. We first analyze statistical properties of group evolution, e.g., individual activity and typical group sizes. After that, we define a set of specific group evolution events. Thes...
Conference Paper
Full-text available
Today, many people spend a lot of time online. Their social interactions captured in online social networks are an important part of the overall personal social profile, in addition to interactions taking place offline. This paper investigates whether relations captured by online social networks can be used as a proxy for the relations in offline s...
Article
Full-text available
Today, many people spend a lot of time online. Their social interactions captured in online social networks are an important part of the overall personal social profile, in addition to interactions taking place offline. This paper investigates whether relations captured by online social networks can be used as a proxy for the relations in offline s...
Conference Paper
Scholarly success is traditionally measured in terms of citations to publications. With the advent of publication management and digital libraries on the web, scholarly usage data has become a target of investigation and new impact metrics computed on such usage data have been proposed, so-called altmetrics. In scholarly social bookmarking system...
Article
Full-text available
The issue of sustainability is at the top of the political and societal agenda, being considered of extreme importance and urgency. Human individual action impacts the environment both locally (e.g., local air/water quality, noise disturbance) and globally (e.g., climate change, resource use). Urban environments represent a crucial example, with an...
Article
Full-text available
Applications of the Social Web are ubiquitous and have become an integral part of everyday life: Users make friends, for example, with the help of online social networks, share thoughts via Twitter, or collaboratively write articles in Wikipedia. All such interactions leave digital traces; thus, users participate in the creation of heterogeneous, dist...
Conference Paper
In this paper, we analyze the stability of user interaction ties within Twitter focusing on link decay prediction: for a tweet created by one user mentioning another user we study the task of predicting the decay of the corresponding interaction link over time. For this task, we employ the history of timestamped mention interactions between both us...
Article
This paper focuses on the prediction of real-world talk attendances at academic conferences with respect to different influence factors. We study and discuss the predictability of talk attendances using real-world face-to-face contact data and user interests extracted from the users' previous publications. For our experiments, we apply RFID-tracked...
Article
The analysis of link structures and particularly their dynamics is important for enhancing our understanding of the underlying (social) processes. This paper analyzes such structures in networks of face-to-face spatial proximity: we focus on evolving contacts and triadic closure and present new insights on the dynamic and static contact behavior in...
Article
Understanding the structures why links are formed is an important and prominent research topic. In this paper, we therefore consider the link prediction problem in face-to-face contact networks, and analyze the predictability of new and recurring links. Furthermore, we study additional influence factors, and the role of stronger ties in these netwo...
Article
This paper focuses on the prediction of real-world talk attendances at academic conferences with respect to different influence factors. We study the predictability of talk attendances using real-world tracked face-to-face contacts. Furthermore, we investigate and discuss the predictive power of user interests extracted from the users' previous pub...
Conference Paper
Indoor localization of humans is still a complex problem, especially in resource-constrained environments, e.g., if only a small amount of data is available over time. We address this problem using active RFID technology and focus on room-level localization. We propose several unsupervised localization approaches and compare their accuracy t...
Conference Paper
This paper focuses on the analysis of group evolution events in networks of face-to-face proximity. First, we analyze statistical properties of group evolution, e.g., individual activity and typical group sizes. Furthermore, we define a set of specific group evolution events. We analyze these using real-world data collected at the LWA 2010 conferen...
Conference Paper
This paper focuses on the predictability of recurring links: These links are generated repeatedly in a network for different forms of social ties, e.g. by face-to-face interactions in offline social networks. In particular, we analyse the predictability of recurring links in networks of face-to-face proximity using several path-based measures, and...
Chapter
The application of ubiquitous and social computational systems shows a rapidly increasing trend in our everyday environments: Enhancing social interactions and communication in both online and real-world settings is an important issue in a broad range of application contexts. This chapter describes the development of ubiquitous and social software...
Chapter
Exploiting social links is an important issue for enhancing ubiquitous knowledge engineering because they are a substitute for a wide range of properties depending on which relation spans the link: in case of human face-to-face contacts, similar locations or potential knowledge transfer for the people in contact can be derived. This information can...
Article
The combination of ubiquitous and social computing is an emerging research area which integrates different but complementary methods, techniques and tools. In this paper, we focus on the Ubicon platform, its applications, and a large spectrum of analysis results. Ubicon provides an extensible framework for building and hosting applications targetin...
Book
Social Tagging Systems are web applications in which users upload resources (e.g., bookmarks, videos, photos, etc.) and annotate them with a list of freely chosen keywords called tags. This is a grassroots approach to organize a site and help users to find the resources they are interested in. Social tagging systems are open and inherently social; fe...
Article
Temporal dynamics of social interaction networks as well as the analysis of communities are key aspects to gain a better understanding of the involved processes, important influence factors, their effects, and their structural implications. In this article, we analyze temporal dynamics of contacts and the evolution of communities in networks of fac...
Chapter
The World Wide Web (WWW) has developed rapidly since its birth in 1989. While the focus was initially on the simple exchange of information between scientists, a wide variety of services is offered today. A major revolution on the Web took place about 10 years ago, when Web 2.0 emerged. In this chapter, the technical...
Chapter
This chapter summarizes the results of the book and gives an outlook on data protection aspects in the ubiquitous and mobile Web as well as in the field of collective intelligence.
Chapter
Data protection does not aim at protecting data in the sense of the data processor's exclusive control over the data (this concerns, at most, questions of data security), but ultimately at a free communication order of society. The question is who may have which personal data at their disposal and...
Chapter
The growing popularity of Web 2.0 systems attracts not only genuine users. Spammers, too, increasingly submit posts to social bookmarking systems that point to typical spam resources (e.g., links to web spam pages), annoying the intended users and costing the provider storage capacity. Because the posts are public, the lin...
Chapter
Owing to the vast variety of possible applications of Web 2.0, a suitable online community can by now be found for almost every area of life. Online rating portals, as one manifestation of Web 2.0, have been booming for quite some time. This development carries the risk that the personal data obtained in this way...
Chapter
Social bookmarking systems are among the Web 2.0 applications that have emerged in recent years. These services, most of which are free of charge, allow their users to store various media (e.g., bookmarks, photos, or videos) online and to describe them with concise, freely chosen keywords (tags). In this chapter, the...
Chapter
Recommender systems are a valuable component of Web 2.0. Incorporating ratings into various internet functions and services fully reflects the core idea of Web 2.0, namely to give users of internet services a means of participation. Filtered views of large data collections help users with the resources relevant to them...
Chapter
Web 2.0 changes the division of responsibility between providers and users of internet platforms. Insofar as users determine the content of web services, the question arises to what extent the statutory provisions on provider liability and the case law on Störerhaftung (liability as an interferer) are applicable.
Article
Full-text available
With social media and the corresponding social and ubiquitous applications finding their way into everyday life, there is a rapidly growing amount of user generated content yielding explicit and implicit network structures. We consider social activities and phenomena as proxies for user relatedness. Such activities are represented in so-called social i...
Conference Paper
An increasing number of platforms like Xively or ThingSpeak are available to manage ubiquitous sensor data enabling the Internet of Things. Strict data formats allow interoperability and informative visualizations, supporting the development of custom user applications. Yet, these strict data formats as well as the common feed-centric approach limi...
Conference Paper
Understanding the process of link creation is rather important for link prediction in social networks. Therefore, this paper analyzes contact structures in networks of face-to-face spatial proximity, and presents new insights on the dynamic and static contact behavior in such real world networks. We focus on face-to-face contact networks collected...