Nicolas Kourtellis

Telefónica I+D | tid · Research Department

Ph.D. Comp. Science & Engin.

About

Publications: 136
Reads: 56,201
Citations: 2,818
Introduction
Dr. Nicolas Kourtellis is a Researcher in the Telefónica R&D team in Barcelona. Previously, he was a Postdoctoral Researcher in the Web Mining Research Group at Yahoo Labs in Barcelona. He holds a Ph.D. in Computer Science and Engineering from the University of South Florida (2012), an MSc in Computer Science from the University of South Florida (2008), and a BSc in Electrical and Computer Engineering from the National Technical University of Athens, Greece (2006). His current research covers abusive user behavior; fake news detection, propagation, and attacks on online platforms; user data privacy leakage via online advertising; load balancing of distributed stream processing engines; and streaming graph analysis.
Additional affiliations
January 2007 - present
University of South Florida
August 2006 - December 2012
University of South Florida
Position
  • Research Assistant


Publications (136)
Preprint
Over the past years, advertisement companies have used a variety of tracking methods to persistently track users across the web. Such tracking methods usually include (first-party) cookies, third-party cookies, cookie synchronisation, as well as a variety of fingerprinting mechanisms. To complement these tracking approaches, Facebook recently intro...
Preprint
Federated Learning (FL) has recently emerged as a popular solution to train a model in a distributed manner on user devices, improving user privacy and system scalability. Major Internet companies have deployed FL in their applications for specific use cases (e.g., keyboard prediction or acoustic keyword trigger), and the research community has devoted signif...
Article
The Real Time Bidding (RTB) protocol is by now more than a decade old. During this time, a handful of measurement papers have looked at bidding strategies, personal information flow, and cost of display advertising through RTB. In this paper, we present YourAdvalue, a privacy-preserving tool for displaying to end-users in a simple and intuitive man...
Preprint
Federated learning (FL), where data remains at the federated clients, and where only gradient updates are shared with a central aggregator, was assumed to be private. Recent work demonstrates that adversaries with gradient-level access can mount successful inference and reconstruction attacks. In such settings, differentially private (DP) learning...
Preprint
Full-text available
In recent years, hundreds of new YouTube channels have been creating and sharing videos targeting children, with themes related to animation, superhero movies, comics, etc. Unfortunately, many of these videos are inappropriate for consumption by their target audience, due to disturbing, violent, or sexual scenes. In this paper, we study YouTube c...
Article
Mobile networks and devices provide users with ubiquitous connectivity, while many of their functionalities and business models rely on data analysis and processing. In this context, Machine Learning (ML) plays a key role and has been successfully leveraged by the different actors in the mobile ecosystem (e.g., application and Operating System de...
Preprint
Full-text available
Mobile wireless telecommunication networks have made massive progress in previous decades, with privacy enhancements in each generation. At present, mobile users are getting familiar with the latest 5G networks, and the discussion for the next generation of Beyond 5G (B5G)/6G networks has already been initiated. It is expected that B5G/6G will push t...
Article
Aggression in online social networks has been studied mostly from the perspective of machine learning, which detects such behavior in a static context. However, the way aggression diffuses in the network has received little attention as it embeds modeling challenges. In fact, modeling how aggression propagates from one user to another is an importa...
Preprint
Full-text available
Misbehavior in online social networks (OSN) is an ever-growing phenomenon. The research to date tends to focus on the deployment of machine learning to identify and classify types of misbehavior such as bullying, aggression, and racism to name a few. The main goal of identification is to curb natural and mechanical misconduct and make OSNs a safer...
Preprint
Full-text available
Digital advertising is the most popular way for content monetization on the Internet. Publishers spawn new websites, and older ones change hands with the sole purpose of monetizing user traffic. In this ever-evolving ecosystem, it is challenging to effectively answer questions such as: Which entities monetize what websites? What categories of websi...
Preprint
Full-text available
Fake news is an age-old phenomenon, widely assumed to be associated with political propaganda published to sway public opinion. Yet, with the growth of social media, it has become a lucrative business for web publishers. Despite many studies performed and countermeasures deployed by researchers and stakeholders, unreliable news sites have increase...
Preprint
Rich offline experience, periodic background sync, push notification functionality, network request control, and improved performance via request caching are only a few of the functionalities provided by the Service Workers API. This new technology, supported by all major browsers, can significantly improve users' experience by providing the publishe...
Article
Cyberaggression has been studied in various contexts and online social platforms, and modeled on different data using state-of-the-art machine and deep learning algorithms to enable automatic detection and blocking of this behavior. Users can be influenced to act aggressively or even bully others because of elevated toxicity and aggression in their...
Preprint
Full-text available
We propose and implement a Privacy-preserving Federated Learning (PPFL) framework for mobile systems to limit privacy leakages in federated learning. Leveraging the widespread presence of Trusted Execution Environments (TEEs) in high-end and mobile devices, we utilize TEEs on clients for local training, and on servers for secure aggregation, so tha...
Preprint
Full-text available
Blockchain (BC) systems are highly distributed peer-to-peer networks that offer an alternative to centralized services and promise robustness to coordinated attacks. However, the resilience and overall security of a BC system rests heavily on the structural properties of its underlying peer-to-peer overlay. Despite their success, BC overlay network...
Preprint
Over the past decade, we have witnessed the rise of misinformation on the Internet, with online users constantly falling victim to fake news. A multitude of past studies have analyzed fake news diffusion mechanics and detection and mitigation techniques. However, there are still open questions about their operational behavior such as: How old are...
Preprint
Full-text available
Online user privacy and tracking have been extensively studied in recent years, especially due to privacy and personal data-related legislation in the EU and the USA, such as the General Data Protection Regulation, ePrivacy Regulation, and California Consumer Privacy Act. Research has revealed novel tracking and personally identifiable information l...
Preprint
Full-text available
During the past few years, mostly as a result of the GDPR and the CCPA, websites have started to present users with cookie consent banners. These banners are web forms where users can state their preference and declare which cookies they would like to accept, if such an option exists. Although requesting consent before storing any identifiable inf...
Preprint
Full-text available
India is experiencing intense political partisanship and sectarian divisions. The paper performs, to the best of our knowledge, the first comprehensive analysis on the Indian online news media with respect to tracking and partisanship. We build a dataset of 103 online, mostly mainstream news websites. With the help of two experts, alongside data fr...
Preprint
Full-text available
Federated Learning (FL) is emerging as a promising technology to build machine learning models in a decentralized, privacy-preserving fashion. Indeed, FL enables local training on user devices, avoiding the transfer of user data to centralized servers, and can be enhanced with differential privacy mechanisms. Although FL has been recently deploye...
Preprint
Data holders are increasingly seeking to protect their users' privacy, whilst still maximizing their ability to produce machine models with high quality predictions. In this work, we empirically evaluate various implementations of differential privacy (DP), and measure their ability to fend off real-world privacy attacks, in addition to measuring t...
Preprint
Full-text available
The explosive increase in volume, velocity, variety, and veracity of data generated by distributed and heterogeneous nodes such as IoT and other devices continuously challenges the state of the art in big data processing platforms and mining techniques. Consequently, it reveals an urgent need to address the ever-growing gap between this expected exasca...
Preprint
Full-text available
Donations to charity-based crowdfunding environments have been on the rise in the last few years. Unsurprisingly, deception and fraud in such platforms have also increased, but have not been thoroughly studied to understand what characteristics can expose such behavior and allow its automatic detection and blocking. Indeed, crowdfunding platforms a...
Preprint
Full-text available
The rise of online aggression on social media is evolving into a major point of concern. Several machine and deep learning approaches have been proposed recently for detecting various types of aggressive behavior. However, social media are fast paced, generating an increasing amount of content, while aggressive behavior evolves over time. In this w...
Preprint
Full-text available
Aggression in online social networks has so far been studied mostly with several machine learning methods that detect such behavior in a static context. However, the way aggression diffuses in the network has received little attention as it embeds modeling challenges. In fact, modeling how aggression propagates from one user to another is an...
Preprint
Full-text available
Cyberaggression has been found in various contexts and online social platforms, and modeled on different data using state-of-the-art machine and deep learning algorithms to enable automatic detection and blocking of this behavior. Users can be influenced to act aggressively or even bully others because of elevated toxicity and aggression in their o...
Preprint
Full-text available
Websites with hyper-partisan, left or right-leaning focus offer content that is typically biased towards the expectations of their target audience. Such content often polarizes users, who are repeatedly primed to specific (extreme) content, usually reflecting hard party lines on political and socio-economic topics. Though this polarization has been...
Conference Paper
Being able to check whether an online advertisement has been targeted is essential for resolving privacy controversies and implementing in practice data protection regulations like GDPR, CCPA, and COPPA. In this paper we describe the design, implementation, and deployment of an advertisement auditing system called eyeWnder that uses crowdsourcing t...
Article
Full-text available
Video sharing platforms like YouTube are increasingly targeted by aggression and hate attacks. Prior work has shown how these attacks often take place as a result of "raids," i.e., organized efforts by ad-hoc mobs coordinating from third-party communities. Despite the increasing relevance of this phenomenon, however, online services often lack effe...
Chapter
Apache SAMOA (Scalable Advanced Massive Online Analysis) is an open-source platform for mining big data streams. Big data is defined as datasets whose size is beyond the ability of typical software tools to capture, store, manage and analyze, due to the time and memory complexity. Velocity is one of the main properties of big data. Apache Apa...
Conference Paper
In recent years, Header Bidding (HB) has gained popularity among web publishers, challenging the status quo in the ad ecosystem. Contrary to the traditional waterfall standard, HB aims to give back to publishers control of their ad inventory, increase transparency, fairness and competition among advertisers, resulting in higher ad-slot prices. Alth...
Article
Full-text available
Cyberbullying and cyberaggression are increasingly worrisome phenomena affecting people across all demographics. More than half of young social media users worldwide have been exposed to such prolonged and/or coordinated digital harassment. Victims can experience a wide range of emotions, with negative consequences such as embarrassment, depression...
Preprint
Full-text available
Websites constantly adapt both the methods they use to track online visitors and the intensity of that tracking. However, the wide-ranging enforcement of GDPR, which began one year ago (May 2018), forced websites serving EU-based online visitors to eliminate or at least reduce such tracking activity, given they receive proper user consent. Therefore, it is important...
Preprint
Recent work has demonstrated that by monitoring the Real Time Bidding (RTB) protocol, one can estimate the monetary worth of different users for the programmatic advertising ecosystem, even when the so-called winning bids are encrypted. In this paper we describe how to implement the above techniques in a practical and privacy-preserving manner. Spe...
Preprint
In the last few years, Header Bidding (HB) has gained popularity among web publishers and is challenging the status quo in the ad ecosystem. Contrary to the traditional waterfall standard, HB aims to give back control of the ad inventory to publishers and to increase transparency, fairness and competition among advertisers, thus resulting in higher ad-s...
Preprint
Full-text available
Cyberbullying and cyberaggression are increasingly worrisome phenomena affecting people across all demographics. More than half of young social media users worldwide have been exposed to such prolonged and/or coordinated digital harassment. Victims can experience a wide range of emotions, with negative consequences such as embarrassment, depression...
Preprint
Full-text available
Being able to check whether an online advertisement has been targeted is essential for resolving privacy controversies and implementing in practice data protection regulations like GDPR, CCPA, and COPPA. In this paper we describe the design, implementation, and deployment of an advertisement auditing system called iWnder that uses crowdsourcing to...
Conference Paper
Hate speech, offensive language, sexism, racism, and other types of abusive behavior have become a common phenomenon in many online social media platforms. In recent years, such diverse abusive behaviors have been manifesting with increased frequency and levels of intensity. Despite social media's efforts to combat online abusive behaviors this pro...
Conference Paper
User data is the primary input of digital advertising, fueling the free Internet as we know it. As a result, web companies invest a lot in elaborate tracking mechanisms to acquire user data that they can sell to data markets and advertisers. However, with same-origin policy and cookies as a primary identification mechanism on the web, each tracker knows...
Preprint
Full-text available
Modern deep learning approaches have achieved groundbreaking performance in modeling and classifying sequential data. Specifically, attention networks constitute the state-of-the-art paradigm for capturing long temporal dynamics. This paper examines the efficacy of this paradigm in the challenging task of emotion recognition in dyadic conversations...
Preprint
Full-text available
A large number of the most-subscribed YouTube channels target children of very young age. Hundreds of toddler-oriented channels on YouTube feature inoffensive, well produced, and educational videos. Unfortunately, inappropriate content that targets this demographic is also common. YouTube’s algorithmic recommendation system regrettably suggests ina...
Chapter
Full-text available
Apache SAMOA (Scalable Advanced Massive Online Analysis) is an open-source platform for mining big data streams. Big data is defined as datasets whose size is beyond the ability of typical software tools to capture, store, manage, and analyze, due to the time and memory complexity. Apache SAMOA provides a collection of distributed streaming algorit...
Chapter
Although digital advertising fuels much of today’s free Web, it typically does so at the cost of online users’ privacy, due to continuous tracking and leakage of users’ personal data. In search of new ways to optimize effectiveness of ads, advertisers have introduced new paradigms such as cross-device tracking (CDT), to monitor users’ browsing on mu...