Publications (243)
In recent years, serverless computing, especially Function as a Service (FaaS), has been rapidly growing in popularity as a cloud programming model. The serverless computing model provides an intuitive interface for developing cloud-based applications, where the development and deployment of scalable microservices has become easier and more cost-effective. An...
Serverless computing, also referred to as Function-as-a-Service (FaaS), is a cloud computing model that has attracted significant attention and has been widely adopted in recent years. The serverless computing model offers an intuitive, event-based interface that makes the development and deployment of scalable cloud-based applications easier and c...
In recent years, Serverless Computing has emerged as a compelling cloud-based model for the development of a wide range of data-intensive applications. However, rapid container provisioning introduces non-trivial challenges for FaaS cloud providers, as (i) real-world FaaS workloads may exhibit highly dynamic request patterns, (ii) applications have...
The radical advances in mobile computing, the IoT technological evolution along with cyberphysical components (e.g., sensors, actuators, control centers) have led to the development of smart city applications that generate raw or pre-processed data, enabling workflows involving the city to better sense the urban environment and support citizens' ev...
The proliferation of smartphone devices has led to the emergence of powerful user services from enabling interactions with friends and business associates to mapping, finding nearby businesses and alerting users in real-time. Moreover, users do not realize that continuously sharing their trajectory data with online systems may end up revealing a gr...
The ability to handle large volumes of event data and react to unexpected spikes, in real-time, remains an important challenge in stream processing systems, such as Apache Kafka, due to the amount of custom coding and technical expertise required to configure these systems. In this paper we investigate the use of reinforcement learning as a promisi...
Bike-sharing systems have enjoyed tremendous success in many major cities around the world today as a new means of urban public transportation offering a green and facile solution for daily commuters and tourists. One common problem featured in these systems is that the distribution of bikes among stations can be quite uneven, due to various factor...
To process big data, applications have been utilizing data processing libraries in recent years, which are, however, not optimized to work together for efficient processing. Intermediate Representations (IR) have been introduced for unifying essential functions into an abstract interface that supports cross-optimization between applications. Stil...
News portals, such as Yahoo News or Google News, collect large amounts of news articles from a variety of sources on a daily basis. Only a small portion of these documents can be selected and displayed on the homepage. Thus, there is a strong preference for major, recent events. In this work, we propose a scalable First Story Detection (FSD) pipeli...
In recent years we have experienced a wide adoption of novel distributed processing frameworks such as Apache Spark for handling batch and stream processing big data applications. An important aspect that has not yet been examined in these systems is the energy consumption during the applications' execution. Reducing the energy consumption of mode...
Online Social Networks (OSNs) are being extensively used in a variety of campaigns, with the objective to raise the awareness of the audience regarding a specific piece of information, e.g., product awareness, political positions, etc. An important characteristic is the correlations of different strength and types, i.e., either positive or negative,...
Social event planning has received a great deal of attention in recent years where various entities, such as event planners and marketing companies, organizations, venues, or users in Event-based Social Networks, organize numerous social events (e.g., festivals, conferences, promotion parties). Recent studies show that "attendance" is the most com...
Social event planning has received a great deal of attention in recent years where various entities, such as event planners and marketing companies, organizations, venues, or users in Event-based Social Networks, organize numerous social events (e.g., festivals, conferences, promotion parties). Recent studies show that "attendance" is the most comm...
In recent years we have observed the rapid growth of large-scale analytics applications in a wide range of domains, from healthcare infrastructures to traffic management. The high volume of data that need to be processed has stimulated the development of special purpose frameworks which handle the data deluge by parallelizing data processing and concurr...
Distributed topic-based publish/subscribe systems like Apache Kafka provide a scalable and decentralized approach to achieve data dissemination. However, despite their wide adoption they can suffer from performance degradation due to the uneven load distribution between the nodes that receive and forward the messages (i.e., brokers). This problem o...
In a networked world, events are transmitted from multiple distributed sources into CEP systems, where events are related to one another along multiple dimensions, e.g., temporal and spatial, to create complex events. The big data era brought with it an increase in the scale and frequency of event reporting. Internet of Things adds another layer of...
In this demonstration we present Dione, a novel framework for automatic profiling and tuning of big data applications. Our system allows a non-expert user to submit Spark or Flink applications to his/her cluster and Dione automatically determines the impact of different configuration parameters on the application's execution time and monetary co...
We present low-rank methods for event detection. We assume that normal observations come from a low-rank subspace, prior to being corrupted by a uniformly distributed noise. Correspondingly, we aim at recovering a representation of the subspace, and perform event detection by running a point-to-subspace distance query in $\ell^\infty$, for each incomi...
A major challenge for social event organizers (e.g., event planning companies, venues) is attracting the maximum number of participants, since it has great impact on the success of the event, and, consequently, the expected gains (e.g., revenue, artist/brand publicity). In this paper, we introduce the Social Event Scheduling (SES) problem, which sc...
A major challenge for social event organizers (e.g., event planning and marketing companies, venues) is attracting the maximum number of participants, since it has great impact on the success of the event, and, consequently, the expected gains (e.g., revenue, artist/brand publicity). In this paper, we introduce the Social Event Scheduling (SES) pro...
In recent years distributed processing frameworks such as Apache Spark have been utilized for running big data applications. Predicting the application’s execution time has been an important goal since it can help the end user to determine the necessary processing resources to be reserved. While there have been some previous works that examine the...
The proliferation of smart technologies has produced significant changes in the way people interact in a city. Smart traffic monitoring systems allow citizens and city operators to acquire a real-time view of the city traffic state. Furthermore, alternative means of transport, such as bike sharing systems, have enjoyed tremendous success in many ma...
Social networks have become the de facto online resource for people to share, comment on and be informed about events pertinent to their interests and livelihood, ranging from road traffic or an illness to concerts and earthquakes, to economics and politics. This has been the driving force behind research endeavors that analyse such data. In this p...
Online Social Networks (OSNs) constitute one of the most important communication channels and are widely utilized as news sources. Information spreads widely and rapidly in OSNs through the word-of-mouth effect. However, it is not uncommon for misinformation to propagate in the network. Misinformation dissemination may lead to undesirable effects,...
Nowadays we see the wide adoption of novel distributed processing frameworks such as Apache Spark for handling batch and stream processing big data applications. An important aspect that has not been examined in these systems is their energy consumption during the application execution. Reducing the power consumption of modern datacenters is a nece...
Bike-sharing systems have been deployed in many major cities around the world today. Bike-sharing systems provide great advantages as a means of urban public transportation, facilitating a green solution for daily commuters and tourists. Users tend to use this type of transportation more often for their daily needs. The key to success for such system...
The lack of parking spaces in large urban cities is responsible for a series of problems such as traffic congestion, air pollution and social anxiety. A promising approach to alleviate those effects is harnessing contributions from the human crowd equipped with mobile phones to find available and affordable parking spaces. In this work we propose a...
The problem of coping with the demands of determinism and meeting latency constraints is challenging in distributed data stream processing systems that have to process high volume data streams that arrive from different unsynchronized input sources. In order to deterministically process the streaming data, they need mechanisms that synchronize the...
Supporting high throughput in Distributed Stream Processing Systems (DSPSs) has been an important goal in recent years. Current works either focus on automatically increasing the system resources whenever the current setup is inadequate or apply load shedding techniques discarding some of the incoming data. However, both approaches have significant...
Nowadays distributed processing frameworks like Apache Spark have been successfully used for the execution of big data applications. Despite their wide adoption, little work has been done in terms of controlling the applications' energy consumption. Datacenters account for over 2% of the total US electricity usage; therefore minimizing the energy utiliza...
The wide adoption of Location Based Social Networks, along with advances in mobile technology, has brought forth as a core service the analysis of large volumes of location-based data for personalized Point of Interest (POI) recommendations. The majority of the existing recommendation systems take advantage of Collaborative Filtering, but they fail...
Recommending nearby Points of Interest (POI) has received growing interest in mobile location-based networks today, where users share content embedded with location information. In this work, we propose a novel caching framework to support personalised proactive caching for mobile location-based social networks. We propose "LOCAI", which uses a pro...
The flourishing of Web-based Online Social Networks (OSNs) has led to numerous applications that exploit social relationships to boost the influence of content in the network. However, existing approaches focus on the social ties and ignore how the topic of a post and its structure relate to its popularity. Our work assists in filling this gap. The co...
With the massive prevalence of smartphones, mobile social sensing systems, in which humans acting as social sensors respond to geo-located crowdsourcing tasks, have become extremely popular. Such systems can provide significant benefits particularly during crisis management and emergency situations. However, not only querying users can be extremely costl...
In this demo we present INSIGHT, a system that provides traffic event detection in Dublin by exploiting Big Data and Crowdsourcing techniques. Our system is able to process and analyze input from multiple heterogeneous urban data sources.
Urban data management is already an essential element of modern cities. The authorities can build on the variety of automatically generated information and develop intelligent services that improve citizens' daily life, save environmental resources or aid in coping with emergencies. From a data mining perspective, urban data introduce a lot of chall...
Modern cities generate a flood of rich and varied data. New information sources like public transport and wearable devices provide opportunities for novel applications that will improve citizens' quality of life by reducing transportation time, enhancing city planning, and improving air quality to name a few applications. From a data science perspe...
In this demo we present CrowdAlert, a mobile application that we have developed that enables users to report and receive traffic information and unusual events in SmartCities. CrowdAlert provides great benefits to both citizens and authorities as it allows the former to be alerted about ongoing local events of interest, and the latter to identify,...
This paper contributes to mobile crowdsourcing applications by developing a privacy preserving framework that enables users to contribute content to the community while controlling their privacy exposure. One fundamental challenge in such applications is how to preserve user privacy, as participants may end up revealing a great deal of user-identif...
In recent years many organizations have adopted the use of multiple concurrent MapReduce frameworks running on different clusters in order to support data, failure, version and performance isolation for their Big Data applications. However, efficiently scheduling MapReduce workloads in such environments can be particularly challenging due to the observe...
With the advent of mobile networking and the widespread adoption of smartphone devices, a number of location-based services have emerged, where users actively participate by sharing and receiving mobility data. However, the collection and analysis of user mobility data, such as user location information and trajectory data, especially when exploite...
The usage of smartphones and mobile devices has increased tremendously in recent years and nowadays the most popular OS for smartphones is the Android OS. However, a significant percentage of users do not realize that there are applications that can threaten their privacy. Any user can freely download applications from the Google Play Store, w...
Applying real-time, cost-effective Complex Event Processing (CEP) in the cloud has been an important goal in recent years. Distributed Stream Processing Systems (DSPS) have been widely adopted by major computing companies such as Facebook and Twitter for performing scalable event processing on streaming data. However, dynamically balancing the load...
In recent years we have observed an increased demand for processing large amounts of data. The MapReduce programming model has been utilized by major computing companies in order to perform large-scale data processing. However, the problem of efficiently scheduling MapReduce workloads in cluster environments, like Amazon’s EC2, can be challenging d...
Modern cities are flooded with data. New information sources like public transport and wearable devices provide opportunities for novel applications that will improve citizens' quality of life. From a data science perspective, data emerging from smart cities give rise to a lot of challenges that constitute a new interdisciplinary field of research....
In recent years crowdsourcing systems have been shown to provide important benefits to Smartcities, where ubiquitous citizens, acting as mobile human sensors, assist in responding to signals and providing real-time information about city events, to improve the quality of life for businesses and citizens. In this paper we present REquEST, our approach to...
Online Social Networks (OSNs) have become increasingly popular means of information sharing among users. The spread of news regarding emergency events is common in OSNs and so is the spread of misinformation related to the event. We define as misinformation any false or inaccurate information that is spread either intentionally or unintentionally....
In recent years, we observe an increasing demand for systems that are capable of efficiently managing and processing huge amounts of data. Apache's Hadoop, an open-source implementation of Google's MapReduce programming model, has emerged as one of the most popular systems for Big Data processing and is supported by major companies like Facebook, Y...
Supporting real-time, cost-effective execution of Complex Event Processing applications in the cloud has been an important goal for many scientists in recent years. Distributed Stream Processing Systems (DSPS) have been widely adopted by major computing companies as a powerful approach for large-scale Complex Event Processing (CEP). However, determi...
Real-time, cost-effective execution of "Big Data" applications on MapReduce clusters has been an important goal for many scientists in recent years. The MapReduce paradigm has been widely adopted by major computing companies as a powerful approach for large-scale data analytics. However, running MapReduce workloads in cluster environments has been...
In recent years we have observed a significant increase in the popularity of location-based social networks for exchanging news and experiences, sharing location information, or publishing real world events. One important challenge in such networks is to understand human crowd mobility behavior based on user social activities and interactions. In t...
Detecting traffic events using the sensor network infrastructure is an important service in urban environments that enables the authorities to handle traffic incidents. However, irregular measurements in such settings can derive either from faulty sensors or from unpredictable events. In this paper, we propose an efficient solution to resolve in re...
We give an overview of an intelligent urban traffic management system. Complex events related to congestion are detected from heterogeneous sources involving fixed sensors mounted on intersections and mobile sensors mounted on public transport vehicles. To deal with data veracity, sensor disagreements are resolved by crowdsourcing. To deal with da...
Event recognition refers to the detection of events that are considered relevant for processing, thereby providing the opportunity to implement reactive measures. There are five research challenges of event recognition, namely, multiscale temporal aggregation, uncertainty, distribution, pattern learning, and event forecasting. The July 2014 Special...
Location-based social networks have evolved into powerful tools in recent years. The ability to embed location information in Social Networks such as Facebook, Foursquare and Twitter creates exciting opportunities for users to disseminate and exchange geolocation information in a variety of domains. The problem of exploiting the social ties between...
Supporting real-time jobs on MapReduce systems is particularly challenging due to the heterogeneity of the environment, the load imbalance caused by skewed data blocks, as well as real-time response demands imposed by the applications. In this paper we describe our approach for scheduling real-time, skewed MapReduce jobs in heterogeneous systems. O...
With the rapid growth of mobile smartphone users, several commercial mobile companies have exploited crowdsourcing as an effective approach to collect and analyze data to improve their services. In a crowdsourcing system, "human workers" are enlisted to perform small tasks that are difficult to automate, in return for some monetary compensa...
Crowdsourcing has emerged as an attractive paradigm in recent years for information collection for disaster response, which utilizes data received from the human crowd, to provide critical information collection and dissemination during emergency situations and visualize this data to generate emergency maps for the human crowd. In this paper we inv...
Understanding human crowd mobility has found important applications in several commercial domains such as marketing, recommendation systems and resource planning. In this paper we investigate users' social activities and interactions developed in "human-centered participatory sensing" groups and perform an analysis to understand human crowd behavio...
Many recent sensor devices are being equipped with flash memories due to their unique advantages: non-volatile storage, small size, shock-resistance, fast read access and power efficiency. The ability to store large amounts of data in sensor devices necessitates efficient indexing structures to locate required information. The challe...
In recent years social networks have undergone explosive growth. They have been used as major tools for the spread of information, ideas and notifications among the members of the network. In this paper we aim at exploiting this new communication channel for emergency notification, to deliver emergency information to all appropriate recipients....
Intelligent transport management involves the use of voluminous amounts of uncertain sensor data to identify and effectively manage issues of congestion and quality of service. In particular, urban traffic has been in the eye of the storm for many years now and gathers increasing interest as cities become bigger, more crowded, and “smart”. In this work...