Dimitrios Gunopulos

National and Kapodistrian University of Athens | uoa · Sector of Computer Systems and Applications

283 Publications
63,314 Reads
16,612 Citations

Publications (283)
Article
News articles generated by online media are a major source of information. In this work, we present News Monitor, a framework that automatically collects news articles from a wide variety of online news portals and performs various analysis tasks. The framework initially identifies fresh news (first stories) and clusters articles about the same inc...
Article
Full-text available
The Infant Mortality Rate (IMR) is defined as the number of infants, out of every thousand, that do not survive until their first birthday. IMR is an important metric not only because it provides information about infant births in an area, but also because it measures the general societal health status. In the United States of America, the IMR is higher...
Chapter
News articles generated by online media are a major source of information. In this work, we present News Monitor, a framework that automatically collects news articles from a variety of web pages and performs various analysis tasks. The framework initially identifies fresh news and clusters articles about the same incidents. For every story, it ext...
Article
News portals, such as Yahoo News or Google News, collect large amounts of news articles from a variety of sources on a daily basis. Only a small portion of these documents can be selected and displayed on the homepage. Thus, there is a strong preference for major, recent events. In this work, we propose a scalable First Story Detection (FSD) pipeli...
Conference Paper
Travel time estimation is a critical task, useful to many urban applications at the individual citizen and the stakeholder level. This paper presents a novel hybrid algorithm for travel time estimation that leverages historical and sparse real-time trajectory data. Given a path and a departure time we estimate the travel time taking into account th...
Preprint
Full-text available
The Infant Mortality Rate (IMR) is the number of infants per 1000 that do not survive until their first birthday. It is an important metric providing information about infant health but it also measures the society's general health status. Despite the high level of prosperity in the U.S.A., the country's IMR is higher than that of many other develo...
Conference Paper
Full-text available
Social event planning has received a great deal of attention in recent years, as various entities, such as event planners and marketing companies, organizations, venues, or users in Event-based Social Networks, organize numerous social events (e.g., festivals, conferences, promotion parties). Recent studies show that "attendance" is the most com...
Article
Full-text available
Applications targeting smart cities tackle common challenges; however, solutions are seldom portable from one city to another due to the heterogeneity of smart city ecosystems. A major obstacle involves the differences in the levels of available information. In this work, we present REMI, a mining framework that handles varying degrees of i...
Conference Paper
In this demonstration we present Dione, a novel framework for automatic profiling and tuning of big data applications. Our system allows a non-expert user to submit Spark or Flink applications to his/her cluster, and Dione automatically determines the impact of different configuration parameters on the application's execution time and monetary co...
Article
Full-text available
We present low-rank methods for event detection. We assume that normal observations come from a low-rank subspace, prior to being corrupted by uniformly distributed noise. Correspondingly, we aim at recovering a representation of the subspace, and perform event detection by running point-to-subspace distance queries in $\ell^\infty$, for each incomi...
Article
Full-text available
A major challenge for social event organizers (e.g., event planning companies, venues) is attracting the maximum number of participants, since it has great impact on the success of the event, and, consequently, the expected gains (e.g., revenue, artist/brand publicity). In this paper, we introduce the Social Event Scheduling (SES) problem, which sc...
Conference Paper
Full-text available
A major challenge for social event organizers (e.g., event planning and marketing companies, venues) is attracting the maximum number of participants, since it has great impact on the success of the event, and, consequently, the expected gains (e.g., revenue, artist/brand publicity). In this paper, we introduce the Social Event Scheduling (SES) pro...
Conference Paper
The proliferation of smart technologies has produced significant changes in the way people interact in a city. Smart traffic monitoring systems allow citizens and city operators to acquire a real-time view of the city traffic state. Furthermore, alternative means of transport, such as bike sharing systems, have enjoyed tremendous success in many ma...
Conference Paper
Social networks have become the de facto online resource for people to share, comment on and be informed about events pertinent to their interests and livelihood, ranging from road traffic or an illness to concerts and earthquakes, to economics and politics. This has been the driving force behind research endeavors that analyse such data. In this p...
Conference Paper
Full-text available
This paper examines the connectivity among political networks on Twitter. We explore dynamics inside and between the far right and the far left, as well as the relation between the structure of the network and sentiment. The 2015 Greek political context offers a unique opportunity to investigate political communication in times of political intensi...
Article
Online Social Networks (OSNs) constitute one of the most important communication channels and are widely utilized as news sources. Information spreads widely and rapidly in OSNs through the word-of-mouth effect. However, it is not uncommon for misinformation to propagate in the network. Misinformation dissemination may lead to undesirable effects,...
Conference Paper
Full-text available
Applications targeting Smart Cities tackle common challenges; however, solutions are seldom portable from one city to another due to the heterogeneity of city ecosystems. A major obstacle involves the differences in the levels of available information. In this demonstration we present REMI, a reusable elements framework to handle varying degrees of...
Article
In any competitive business, success is based on the ability to make an item more appealing to customers than the competition. A number of questions arise in the context of this task: how do we formalize and quantify the competitiveness between two items? Who are the main competitors of a given item? What are the features of an item that most affec...
Conference Paper
Recommending nearby Points of Interest (POI) has received growing interest in mobile location-based networks today, where users share content embedded with location information. In this work, we propose a novel caching framework to support personalised proactive caching for mobile location-based social networks. We propose "LOCAI", which uses a pro...
Conference Paper
The flourishing of Web-based Online Social Networks (OSNs) has led to numerous applications that exploit social relationships to boost the influence of content in the network. However, existing approaches focus on the social ties and ignore how the topic of a post and its structure relate to its popularity. Our work assists in filling this gap. The co...
Conference Paper
In this demo we present INSIGHT, a system that provides traffic event detection in Dublin by exploiting Big Data and Crowdsourcing techniques. Our system is able to process and analyze input from multiple heterogeneous urban data sources.
Conference Paper
Urban data management is already an essential element of modern cities. The authorities can build on the variety of automatically generated information and develop intelligent services that improve citizens' daily life, save environmental resources or aid in coping with emergencies. From a data mining perspective, urban data introduce a lot of chall...
Article
Modern cities generate a flood of rich and varied data. New information sources like public transport and wearable devices provide opportunities for novel applications that will improve citizens' quality of life by reducing transportation time, enhancing city planning, and improving air quality, to name a few applications. From a data science perspe...
Chapter
Event detection is a research area that attracted attention during the last years due to the widespread availability of social media data. The problem of event detection has been examined in multiple social media sources like Twitter, Flickr, YouTube and Facebook. The task comprises many challenges including the processing of large volumes of data...
Conference Paper
In recent years crowdsourcing systems have been shown to provide important benefits to smart cities, where ubiquitous citizens, acting as mobile human sensors, assist in responding to signals and providing real-time information about city events, to improve the quality of life for businesses and citizens. In this paper we present REquEST, our approach to...
Article
Top-k dominating queries combine the natural idea of selecting the k best items with a comprehensive "goodness" criterion based on dominance. A point p1 dominates p2 if p1 is at least as good as p2 in all attributes and is strictly better in at least one. Existing works address the problem in settings where data objects are multidimensional points. H...
Chapter
Full-text available
Applying real-time, cost-effective Complex Event Processing (CEP) in the cloud has been an important goal in recent years. Distributed Stream Processing Systems (DSPS) have been widely adopted by major computing companies such as Facebook and Twitter for performing scalable event processing in streaming data. However, dynamically balancing the load...
Article
Micro-blogging services such as Twitter have gained enormous popularity over the last few years leading to massive volumes of user generated content. A portion of this content is shared via geo-aware mobile devices, such as smartphones. Pieces of information shared on such a device can be tagged with the user's location, conditional on the user's s...
Conference Paper
Full-text available
Supporting real-time, cost-effective execution of Complex Event Processing applications in the cloud has been an important goal for many scientists in recent years. Distributed Stream Processing Systems (DSPS) have been widely adopted by major computing companies as a powerful approach for large-scale Complex Event Processing (CEP). However, determi...
Article
We present a subsequence matching framework that allows for gaps in both query and target sequences, employs variable matching tolerance efficiently tuned for each query and target sequence, and constrains the maximum matching range. Using this framework, a dynamic programming method is proposed, called SMBGT, that, given a short query sequence Q a...
Article
Users in social networks utilize hashtags for a variety of reasons. In many cases, hashtags serve retrieval purposes by labeling the content they accompany. More often than not, hashtags are used to promote content, ideas, or conversations producing viral memes. This paper addresses a specific case of hashtag classification: meme-filtering. We argu...
Conference Paper
Full-text available
Twitter is one of the most prominent social media platforms nowadays. A primary reason that has brought the medium into the spotlight of academic attention is its real-time nature, with people constantly uploading information regarding their surroundings. This trait, coupled with the service's data access policy for researchers and developers, has al...
Article
We present a Software Keyboard for smart touchscreen devices that learns its owner's unique dictionary in order to produce personalized typing predictions. The learning process is accelerated by analysing the user's past typed communication. Moreover, personal temporal user behaviour is captured and exploited in the prediction engine. Computational and...
Conference Paper
Full-text available
Detecting traffic events using the sensor network infrastructure is an important service in urban environments that enables the authorities to handle traffic incidents. However, irregular measurements in such settings can derive either from faulty sensors or from unpredictable events. In this paper, we propose an efficient solution to resolve in re...
Conference Paper
Researchers nowadays have at their disposal valuable data from social networking applications, of which Twitter and Facebook are the most prominent examples. To retrieve this content, the Twitter service provides two distinct Application Programming Interfaces (APIs): a probe-based and a streaming one, each of which imposes different limitations on...
Conference Paper
We give an overview of an intelligent urban traffic management system. Complex events related to congestions are detected from heterogeneous sources involving fixed sensors mounted on intersections and mobile sensors mounted on public transport vehicles. To deal with data veracity, sensor disagreements are resolved by crowdsourcing. To deal with da...
Article
A large number of mainstream applications, like temporal search, event detection, and trend identification, assume knowledge of the timestamp of every document in a given textual collection. In many cases, however, the required timestamps are either unavailable or ambiguous. A characteristic instance of this problem emerges in the context of larg...
Article
Full-text available
Wireless sensor networks enable cost-effective data collection for tasks such as precision agriculture and environment monitoring. However, the resource-constrained nature of sensor nodes, which often have both limited computational capabilities and battery lifetimes, means that applications that use them must make judicious use of these resources....
Conference Paper
Full-text available
Wireless sensor networks enable cost-effective data collection for tasks such as precision agriculture and environment monitoring. However, the resource-constrained nature of sensor nodes, which often have both limited computational capabilities and battery lifetimes, means that applications that use them must make judicious use of these resources....
Article
Browsing the web is one of the most common activities that users engage in nowadays, and downloading web resources of interest, such as images, documents, music, etc., is part of this process. However, users would rather temporarily save that resource to a default path that they have easy access to (e.g. their "Desktop") than select the actual dire...
Article
Many recent sensor devices are being equipped with flash memories due to their unique advantages: non-volatile storage, small size, shock-resistance, fast read access and power efficiency. The ability to store large amounts of data in sensor devices necessitates efficient indexing structures to locate required information. The challe...
Article
Skyline queries have emerged as an expressive and informative tool that requires minimal user input, and they have thus gained widespread attention. However, previous research works tackle the problem from an efficiency standpoint, i.e., returning the skyline as fast as possible, leaving it to the user to manually inspect the entire skyline result. Clearly...
Conference Paper
Full-text available
Intelligent transport management involves the use of voluminous amounts of uncertain sensor data to identify and effectively manage issues of congestion and quality of service. In particular, urban traffic has been in the eye of the storm for many years now and gathers increasing interest as cities become bigger, crowded, and “smart”. In this work...
Article
Smartphones are nowadays equipped with a number of sensors, such as WiFi, GPS, accelerometers, etc. This capability allows smartphone users to easily engage in crowdsourced computing services, which contribute to the solution of complex problems in a distributed manner. In this work, we leverage such a computing paradigm to solve efficiently the fo...
Article
In this paper, we focus on the operator placement problem in Wireless Sensor Networks (WSN). This problem is very relevant for in-network query processing over WSN, where query routing trees are decomposed into three sub-components that must be processed at query time, namely operator tree, operator placement assignment scheme and rou...
Chapter
Microblogging platforms are at the core of what is known as the Live Web: the most dynamic, and fast changing portion of the web, where content is generated constantly by the users, in snippets of information. Therefore, the Live Web (or Now Web) is a good source of information for event detection, because it reflects what is happening in the physi...
Conference Paper
Location is prevalent in most applications nowadays, and is considered a first class citizen in social networks. Locational information is of great significance since it can be used to map information from the online back to the physical world, to contextualize information, or to provide localized recommendations through Location-Based Services (LB...
Article
Full-text available
The last edition of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) was held in Athens, Greece, during September 5-9, 2011. The paper 'Comparing Apples and Oranges: Measuring Differences between Exploratory Data Mining Results' by Tatti and Vreeken provides a means to highligh...
Conference Paper
In any competitive business, success is based on the ability to make an item more appealing to customers than the competition. A number of questions arise in the context of this task: how do we formalize and quantify the competitiveness relationship between two items? Who are the true competitors of a given item? What are the features of an item th...
Article
Full-text available
We present "Hum-a-song", a system built for music retrieval, and particularly for the Query-By-Humming (QBH) application. According to QBH, the user is able to hum a part of a song that she recalls and would like to learn what this song is, or find other songs similar to it in a large music repository. We present a simple yet efficient approach tha...
Article
We present "Hum-a-song", a system built for music retrieval, and particularly for the Query-By-Humming (QBH) application. According to QBH, the user is able to hum a part of a song that she recalls and would like to learn what this song is, or find other songs similar to it in a large music repository. We present a simple yet efficient approach tha...
Conference Paper
Full-text available
Recent years have seen a proliferation of community sensing or participatory sensing paradigms, where individuals rely on the use of smart and powerful mobile devices to collect, store and analyze data from everyday life. Due to this massive collection of data, a key challenge to all such developments is to provide a simple but efficient w...
Conference Paper
Full-text available
Link analysis ranking methods are widely used for summarizing the connectivity structure of large networks. We explore a weighted version of two common link analysis ranking algorithms, PageRank and HITS, and study their applicability to assistive environment data. Based on these methods, we propose a novel approach for identifying representative o...
Article
Thousands of documents are made available to the users via the web on a daily basis. One of the most extensively studied problems in the context of such document streams is burst identification. Given a term t, a burst is generally exhibited when an unusually high frequency is observed for t. While spatial and temporal burstiness have been studied...
Chapter
Most of today’s smartphones are geared towards a single user experience, whether it is reading a book, watching a movie, playing a game or listening to music. However, there has been a shift towards providing a more complex and social experience: applications are being developed and deployed to help users connect and share information with each ot...
Article
Molecular similarity is an important tool in protein and drug design for analyzing the quantitative relationships between physicochemical properties of two molecules. We present a family of similarity measures which exploits the ability of wavelet transformation to analyze the spectral components of physicochemical properties and suggests a sensiti...