Nicolas KourtellisTelefónica I+D | tid · Research Department
Nicolas Kourtellis
Ph.D. Comp. Science & Engin.
About
176
Publications
83,640
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
5,092
Citations
Introduction
Dr. Nicolas Kourtellis is a Researcher in the Telefonica R&D team, in Barcelona. Previously he was a Postdoctoral Researcher in the Web Mining Research Group at Yahoo Labs, in Barcelona. He holds a Ph.D. in Computer Science and Engineering from the University of South Florida (2012), a MSc in Computer Science from the University of South Florida (2008), and a BSc in Electrical and Computer Engineering from the National Technical University of Athens, Greece (2006). He is currently analyzing the abusive behavior of users and fake news detection, propagation and attacks in online platforms, user data privacy leakage via online advertising, load balancing of distributed streaming processing engines and streaming graph analysis.
Additional affiliations
January 2007 - present
University of South Florida
August 2006 - December 2012
Publications
Publications (176)
Training and deploying Machine Learning models that simultaneously adhere to principles of fairness and privacy while ensuring good utility poses a significant challenge. The interplay between these three factors of trustworthiness is frequently underestimated and remains insufficiently explored. Consequently, many efforts focus on ensuring only tw...
Over the past few years, the European Commission has made significant steps to reduce disinformation in cyberspace. One of those steps has been the introduction of the 2022 "Strengthened Code of Practice on Disinformation". Signed by leading online platforms, this Strengthened Code of Practice on Disinformation is an attempt to combat disinformatio...
Ensuring user privacy remains critical in mobile networks, particularly with the rise of connected devices and denser 5G infrastructure. Privacy concerns have persisted across 2G, 3G, and 4G/LTE networks. Recognizing these concerns, the 3rd Generation Partnership Project (3GPP) has made privacy enhancements in 5G Release 15. However, the extent of...
Ensuring user privacy remains a critical concern within mobile cellular networks, particularly given the proliferation of interconnected devices and services. In fact, a lot of user privacy issues have been raised in 2G, 3G, 4G/LTE networks. Recognizing this general concern, 3GPP has prioritized addressing these issues in the development of 5G, imp...
Despite its enormous economical and societal impact , lack of human-perceived control and safety is redefining the design and development of emerging AI-based technologies. New regulatory requirements mandate increased human control and oversight of AI, transforming the development practices and responsibilities of individuals interacting with AI....
Training and deploying Machine Learning models that simultaneously adhere to principles of fairness and privacy while ensuring good utility poses a significant challenge. The interplay between these three factors of trustworthiness is frequently underestimated and remains insufficiently explored. Consequently, many efforts focus on ensuring only tw...
Toxic sentiment analysis on Twitter (X) often focuses on specific topics and events such as politics and elections. Datasets of toxic users in such research are typically gathered through lexicon-based techniques, providing only a cross-sectional view. his approach has a tight confine for studying toxic user behavior and effective platform moderati...
The upcoming Beyond 5G (B5G) and 6G networks are expected to provide enhanced capabilities such as ultra-high data rates, dense connectivity, and high scalability. It opens many possibilities for a new generation of services driven by Artificial Intelligence (AI) and billions of interconnected smart devices. However, with this expected massive upgr...
Blockchain (BC) systems are highly distributed peer-to-peer networks that offer an alternative to centralized services and promise robustness to coordinated attacks. However, the resilience and overall security of a BC system rests heavily on the structural properties of its underlying peer-to-peer overlay. Despite their success, critical design as...
In blockchain systems, similar to any distributed system, the underlying network plays a crucial role and provides the infrastructure for communication and coordination among the participating peers. As a result, the properties of the network define the level of security, availability, and fault tolerance within a blockchain system. This study aims...
The online advertising market has recently reached the 500 billion dollar mark, and to accommodate the need to match a user with the highest bidder at a fraction of a second, it has moved towards a complex automated model involving numerous agents and middle men. Stimulated by potential revenue and the lack of transparency, bad actors have found wa...
Machine Learning (ML)-powered apps are used in pervasive devices such as phones, tablets, smartwatches and IoT devices. Recent advances in collaborative, distributed ML such as Federated Learning (FL) attempt to solve privacy concerns of users and data owners, and thus used by tech industry leaders such as Google, Facebook and Apple. However, FL sy...
Automatic fake news detection is a challenging problem in misinformation spreading, and it has tremendous real-world political and social impacts. Past studies have proposed machine learning-based methods for detecting such fake news, focusing on different properties of the published news articles, such as linguistic characteristics of the actual c...
The upcoming Beyond 5G (B5G) and 6G networks are expected to provide enhanced capabilities such as ultra-high data rates, dense connectivity, and high scalability. It opens many possibilities for a new generation of services driven by Artificial Intelligence (AI) and billions of interconnected smart devices. However, with this expected massive upgr...
Massive developments in mobile wireless telecommunication networks have been made during the last few decades. At present, mobile users are getting familiar with the latest 5G networks, and the discussion for the next generation of Beyond 5G (B5G)/6G networks has already been initiated. It is expected that B5G/6G will push the existing network capa...
Users are daily exposed to a large volume of harmful content on various social network platforms. One solution is developing online moderation tools using Machine Learning techniques. However, the processing of user data by online platforms requires compliance with privacy policies. Federated Learning (FL) is an ML paradigm where the training is pe...
Over the past years, advertisement companies have used a variety of tracking methods to persistently track users across the web. Such tracking methods usually include (first-party) cookies, third-party cookies, cookie synchronisation, as well as a variety of fingerprinting mechanisms. To complement these tracking approaches, Facebook recently intro...
Federated Learning (FL) has recently emerged as a popular solution to distributedly train a model on user devices improving user privacy and system scalability. Major Internet companies have deployed FL in their applications for specific use cases (e.g., keyboard prediction or acoustic keyword trigger), and the research community has devoted signif...
The Real Time Bidding (RTB) protocol is by now more than a decade old. During this time, a handful of measurement papers have looked at bidding strategies, personal information flow, and cost of display advertising through RTB. In this paper, we present YourAdvalue, a privacy-preserving tool for displaying to end-users in a simple and intuitive man...
Federated learning (FL), where data remains at the federated clients, and where only gradient updates are shared with a central aggregator, was assumed to be private. Recent work demonstrates that adversaries with gradient-level access can mount successful inference and reconstruction attacks. In such settings, differentially private (DP) learning...
In the last years, hundreds of new Youtube channels have been creating and sharing videos targeting children, with themes related to animation, superhero movies, comics, etc. Unfortunately, many of these videos are inappropriate for consumption by their target audience, due to disturbing, violent, or sexual scenes. In this paper, we study YouTube c...
The web ecosystem is based on a market where stakeholders collect and sell personal data, but nowadays users expect stronger guarantees of transparency and privacy. With the PIMCity personal information management system (PIMS) development kit, we provide an open-source development kit for building PIMSs to foster the development of open and user-c...
Mobile networks and devices provide the users with ubiquitous connectivity, while many of their functionality and business models rely on data analysis and processing. In this context, Machine Learning (ML) plays a key role and has been successfully leveraged by the different actors in the mobile ecosystem (e.g., application and Operating System de...
Massive progress of mobile wireless telecommunication networks was achieved in the previous decades, with privacy enhancement in each. At present, mobile users are getting familiar with the latest 5G networks, and the discussion for the next generation of Beyond 5G (B5G)/6G networks has already been initiated. It is expected that B5G/6G will push t...
Aggression in online social networks has been studied mostly from the perspective of machine learning, which detects such behavior in a static context. However, the way aggression diffuses in the network has received little attention as it embeds modeling challenges. In fact, modeling how aggression propagates from one user to another is an importa...
Misbehavior in online social networks (OSN) is an ever-growing phenomenon. The research to date tends to focus on the deployment of machine learning to identify and classify types of misbehavior such as bullying, aggression, and racism to name a few. The main goal of identification is to curb natural and mechanical misconduct and make OSNs a safer...
Digital advertising is the most popular way for content monetization on the Internet. Publishers spawn new websites, and older ones change hands with the sole purpose of monetizing user traffic. In this ever-evolving ecosystem, it is challenging to effectively answer questions such as: Which entities monetize what websites? What categories of websi...
Fake news is an age-old phenomenon, widely assumed to be associated with political propaganda published to sway public opinion. Yet, with the growth of social media it has become a lucrative business for web publishers. Despite many studies performed and countermeasures deployed from researchers and stakeholders, unreliable news sites have increase...
Rich offline experience, periodic background sync, push notification functionality, network requests control, improved performance via requests caching are only a few of the functionalities provided by the Service Worker (SW) API. This new technology, supported by all major browsers, can significantly improve users’ experience by providing the publ...
The Real Time Bidding (RTB) protocol is by now more than a decade old. During this time, a handful of measurement papers have looked at bidding strategies, personal information flow, and cost of display advertising through RTB. In this paper, we present YourAdvalue, a privacy-preserving tool for displaying to end-users in a simple and intuitive man...
Rich offline experience, periodic background sync, push notification functionality, network requests control, improved performance via requests caching are only a few of the functionalities provided by the Service Workers API. This new technology, supported by all major browsers, can significantly improve users' experience by providing the publishe...
This paper seeks to answer the question of whether social ties are important on interest-driven social networks, by analysing 4-years of activities of 50,000 randomly sampled users on Pinterest, a social image discovery website. We find that a non-trivial number of users’ images are copied or repinned from strangers instead of friends, suggesting t...
Cyberaggression has been studied in various contexts and online social platforms, and modeled on different data using state-of-the-art machine and deep learning algorithms to enable automatic detection and blocking of this behavior. Users can be influenced to act aggressively or even bully others because of elevated toxicity and aggression in their...
India is experiencing intense political partisanship and sectarian divisions. The paper performs, to the best of our knowledge, the first comprehensive analysis on the Indian online news media with respect to tracking and partisanship. We build a dataset of 103 online, mostly mainstream news websites. With the help of two experts, alongside data fr...
We propose and implement a Privacy-preserving Federated Learning (PPFL) framework for mobile systems to limit privacy leakages in federated learning. Leveraging the widespread presence of Trusted Execution Environments (TEEs) in high-end and mobile devices, we utilize TEEs on clients for local training, and on servers for secure aggregation, so tha...
Blockchain (BC) systems are highly distributed peer-to-peer networks that offer an alternative to centralized services and promise robustness to coordinated attacks. However, the resilience and overall security of a BC system rests heavily on the structural properties of its underlying peer-to-peer overlay. Despite their success, BC overlay network...
Compliance with public health measures, such as restrictions on movement and socialization, is paramount in limiting the spread of diseases such as the severe acute respiratory syndrome coronavirus 2 (also referred to as COVID-19). Although large population datasets, such as phone-based mobility data, may provide some glimpse into such compliance,...
Over the past decade, we have witnessed the rise of misinformation on the Internet, with online users constantly falling victims of fake news. A multitude of past studies have analyzed fake news diffusion mechanics and detection and mitigation techniques. However, there are still open questions about their operational behavior such as: How old are...
Online user privacy and tracking have been extensively studied in recent years, especially due to privacy and personal data-related legislations in the EU and the USA, such as the General Data Protection Regulation, ePrivacy Regulation, and California Consumer Privacy Act. Research has revealed novel tracking and personal identifiable information l...
During the past few years, mostly as a result of the GDPR and the CCPA, websites have started to present users with cookie consent banners. These banners are web forms where the users can state their preference and declare which cookies they would like to accept, if such option exists. Although requesting consent before storing any identifiable inf...
India is experiencing intense political partisanship and sectarian divisions. The paper performs, to the best of our knowledge, the first comprehensive analysis on the Indian online news media with respect to tracking and partisanship. We build a dataset of 103 online, mostly mainstream news websites. With the help of two experts, alongside data fr...
The multi-volume set LNAI 12975 until 12979 constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2021, which was held during September 13-17, 2021. The conference was originally planned to take place in Bilbao, Spain, but changed to an online event due to the COVID-19 pa...
The multi-volume set LNAI 12975 until 12979 constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2021, which was held during September 13-17, 2021. The conference was originally planned to take place in Bilbao, Spain, but changed to an online event due to the COVID-19 pa...
Federated Learning (FL) is emerging as a promising technology to build machine learning models in a decentralized, privacy-preserving fashion. Indeed, FL enables local training on user devices, avoiding user data to be transferred to centralized servers, and can be enhanced with differential privacy mechanisms. Although FL has been recently deploye...
Data holders are increasingly seeking to protect their user's privacy, whilst still maximizing their ability to produce machine models with high quality predictions. In this work, we empirically evaluate various implementations of differential privacy (DP), and measure their ability to fend off real-world privacy attacks, in addition to measuring t...
The explosive increase in volume, velocity, variety, and veracity of data generated by distributed and heterogeneous nodes such as IoT and other devices, continuously challenge the state of art in big data processing platforms and mining techniques. Consequently, it reveals an urgent need to address the ever-growing gap between this expected exasca...
Donations to charity-based crowdfunding environments have been on the rise in the last few years. Unsurprisingly, deception and fraud in such platforms have also increased, but have not been thoroughly studied to understand what characteristics can expose such behavior and allow its automatic detection and blocking. Indeed, crowdfunding platforms a...
The rise of online aggression on social media is evolving into a major point of concern. Several machine and deep learning approaches have been proposed recently for detecting various types of aggressive behavior. However, social media are fast paced, generating an increasing amount of content, while aggressive behavior evolves over time. In this w...
A large number of the most-subscribed YouTube channels target children of very young age. Hundreds of toddler-oriented channels on YouTube feature inoffensive, well produced, and educational videos. Unfortunately, inappropriate content that targets this demographic is also common. YouTube's algorithmic recommendation system regrettably suggests ina...