Dali Kaafar
Macquarie University

Professor

About

211
Publications
48,943
Reads
3,880
Citations
Citations since 2016
116 Research Items
3,071 Citations
Introduction
Executive Director of the Optus Macquarie University Cyber Security Hub and Full Professor at the Faculty of Science and Engineering at Macquarie University.
  • CSIRO profile: https://research.csiro.au/ng/about-us/people/dali-kaafar/
  • Google Scholar profile: https://goo.gl/qtlp1p
  • DBLP profile: http://dblp2.uni-trier.de/pers/hd/k/K=acirc=afar:Mohamed_Ali
Additional affiliations
July 2016 - December 2018
The Commonwealth Scientific and Industrial Research Organisation
Position
  • Group Leader

Publications (211)
Conference Paper
Full-text available
Millions of users worldwide resort to mobile VPN clients to either circumvent censorship or to access geo-blocked content, and more generally for privacy and security purposes. In practice, however, users have little if any guarantees about the corresponding security and privacy settings, and perhaps no practical knowledge about the entities access...
Conference Paper
Full-text available
Networks are vulnerable to disruptions caused by malicious forwarding devices. The situation is likely to worsen in Software Defined Networks (SDNs) with the incompatibility of existing solutions, use of programmable soft switches and the potential of bringing down an entire network through compromised forwarding devices. In this paper, we present...
Article
The increased popularity of smartphones has attracted a large number of developers to offer various applications for the different smartphone platforms via the respective app markets. One consequence of this popularity is that the app markets are also becoming populated with spam apps. These spam apps reduce the users' quality of experience and inc...
Preprint
Full-text available
The short message service (SMS) was introduced to mobile phone users a generation ago. It makes up the world's oldest large-scale network, with billions of users, and therefore attracts a lot of fraud. Due to the convergence of mobile networks with the internet, SMS-based scams can potentially compromise the security of internet services as well. In...
Preprint
Full-text available
Privacy-preserving estimation of counts of items in streaming data finds applications in several real-world scenarios including word auto-correction and traffic management applications. Recent works of RAPPOR and Apple's count-mean sketch (CMS) algorithm propose privacy preserving mechanisms for count estimation in large volumes of data using proba...
Preprint
Full-text available
Record linkage algorithms match and link records from different databases that refer to the same real-world entity based on direct and/or quasi-identifiers, such as name, address, age, and gender, available in the records. Since these identifiers generally contain personal identifiable information (PII) about the entities, record linkage algorithms...
Preprint
Sensors embedded in mobile smart devices can monitor users' activity with high accuracy to provide a variety of services to end-users, ranging from precise geolocation and health monitoring to handwritten word recognition. However, this involves the risk of accessing and potentially disclosing sensitive information of individuals to the apps that may...
Preprint
Full-text available
Differential privacy (DP) has been applied in deep learning for preserving privacy of the underlying training sets. Existing DP practice falls into three categories - objective perturbation, gradient perturbation and output perturbation. They suffer from three main problems. First, conditions on objective functions limit objective perturbation in g...
Preprint
Full-text available
Cooperative Intelligent Transportation Systems (C-ITS) enable communications between vehicles, road-side infrastructures, and road-users to improve users' safety and to efficiently manage traffic. Most, if not all, intelligent vehicle-to-everything (V2X) applications often rely on continuous collection and sharing of sensitive information...
Preprint
Full-text available
This paper performs a large-scale study of dependency chains in the web, to find that around 50% of first-party websites render content that they did not directly load. Although the majority (84.91%) of websites have short dependency chains (below 3 levels), we find websites with dependency chains exceeding 30. Using VirusTotal, we show that 1.2% o...
Preprint
Full-text available
Misbehavior in online social networks (OSN) is an ever-growing phenomenon. The research to date tends to focus on the deployment of machine learning to identify and classify types of misbehavior such as bullying, aggression, and racism to name a few. The main goal of identification is to curb natural and mechanical misconduct and make OSNs a safer...
Chapter
Full-text available
This chapter studies the relationship between two important, often conflicting paradigms of online services: personalization and tracking. The chapter initially focuses on the categories and levels of online personalization, briefly overviewing algorithmic methods applied to achieve these. Then, the chapter turns to online tracking specific to mobi...
Article
Differential privacy (DP) has been applied in deep learning for preserving privacy of the underlying training sets. Existing DP practice falls into three categories—objective perturbation (injecting DP noise into the objective function), gradient perturbation (injecting DP noise into the process of gradient descent) and output perturbation (injecti...
Article
Record linkage algorithms match and link records from different databases that refer to the same real-world entity based on direct and/or quasi-identifiers, such as name, address, age, and gender, available in the records. Since these identifiers generally contain personal identifiable information (PII) about the entities, record linkage algorithms...
Article
Privacy-preserving estimation of counts of items in streaming data finds applications in several real-world scenarios including word auto-correction and traffic management applications. Recent works of RAPPOR [1] and Apple's count-mean sketch (CMS) algorithm [2] propose privacy preserving mechanisms for count estimation in large volumes of data usi...
Preprint
Full-text available
Micro-segmentation is an emerging security technique that separates physical networks into isolated logical micro-segments (workloads). By tying fine-grained security policies to individual workloads, it limits the attacker's ability to move laterally through the network, even after infiltrating the perimeter defences. While micro-segmentation is p...
Preprint
Full-text available
Statistical models are often used to forecast and predict time-series data based on past observations [15, 12], with wide-ranging applications including predicting stock prices, seismic events, and electricity demand. In this paper, we investigate the extent to which statistical models such as Auto-regressive (AR), Auto-regressive exogenous (ARX) a...
Experiment Findings
Statistical models are often used to forecast and predict time-series data based on past observations [12, 15], with wide-ranging applications including predicting stock prices, seismic events, and electricity demand. In this paper, we investigate the extent to which statistical models such as Auto-regressive (AR), Auto-regressive exogenous (ARX) a...
Preprint
Full-text available
We investigate the extent to which statistical predictive models leak information about their training data. More specifically, based on the use case of household (electrical) energy consumption, we evaluate whether white-box access to auto-regressive (AR) models trained on such data together with background information, such as household energy da...
Preprint
Full-text available
Statistical models are often used to forecast and predict time-series data based on past observations [15, 12], with wide-ranging applications including predicting stock prices, seismic events, and electricity demand. In this paper, we investigate the extent to which statistical models such as Auto-regressive (AR), Auto-regressive exogenous (ARX) a...
Article
Objective: We conduct a first large-scale analysis of mobile health (mHealth) apps available on Google Play with the goal of providing a comprehensive view of mHealth apps' security features and gauging the associated risks for mHealth users and their data. Materials and methods: We designed an app collection platform that discovered and downloa...
Preprint
Full-text available
Smartphone technology has drastically improved over the past decade. These improvements have seen the creation of specialized health applications, which offer consumers a range of health-related activities such as tracking and checking symptoms of health conditions or diseases through their smartphones. We term these applications as Symptom Checkin...
Article
Full-text available
We investigate the extent to which statistical predictive models leak information about their training data. More specifically, based on the use case of household (electrical) energy consumption, we evaluate whether white-box access to auto-regressive (AR) models trained on such data together with background information, such as household energy dat...
Preprint
Full-text available
Smart meter data is collected and shared with different stakeholders involved in a smart grid ecosystem. The fine-grained energy data is extremely useful for grid operations and maintenance, monitoring and for market segmentation purposes. However, sharing and releasing fine-grained energy data induces explicit violations of private information of...
Preprint
Full-text available
We propose BlockJack, a system based on a distributed and tamper-proof consortium blockchain that aims at blocking IP prefix hijacking in the Border Gateway Protocol (BGP). In essence, BlockJack provides synchronization between the blockchain and the BGP network through interfaces ensuring operational independence, and this approach preserving the legacy syst...
Article
Full-text available
Objectives: To investigate whether and what user data are collected by health related mobile applications (mHealth apps), to characterise the privacy conduct of all the available mHealth apps on Google Play, and to gauge the associated risks to privacy. Design: Cross sectional study. Setting: Health related apps developed for the Android mobile platf...
Preprint
With an increase in low-cost machine learning APIs, advanced machine learning models may be trained on private datasets and monetized by providing them as a service. However, privacy researchers have demonstrated that these models may leak information about records in the training dataset via membership inference attacks. In this paper, we take a c...
Preprint
Full-text available
Many techniques have been proposed for quickly detecting and containing malware-generated network attacks such as large-scale denial of service attacks; unfortunately, much damage is already done within the first few minutes of an attack, before it is identified and contained. There is a need for an early warning system that can predict attacks bef...
Article
We consider training machine learning models using data located on multiple private and geographically-scattered servers with different privacy settings. Due to the distributed nature of the data, communicating with all collaborating private data owners simultaneously may prove challenging or altogether impossible. We consider differentially-privat...
Article
Observation Resilient Authentication Schemes (ORAS) are a class of shared secret challenge–response identification schemes where a user mentally computes the response via a cognitive function to authenticate herself such that eavesdroppers cannot readily extract the secret. Security evaluation of ORAS generally involves quantifying information leak...
Preprint
Data holders are increasingly seeking to protect their users' privacy, whilst still maximizing their ability to produce machine learning models with high quality predictions. In this work, we empirically evaluate various implementations of differential privacy (DP), and measure their ability to fend off real-world privacy attacks, in addition to measuring t...
Preprint
Full-text available
Observation Resilient Authentication Schemes (ORAS) are a class of shared secret challenge-response identification schemes where a user mentally computes the response via a cognitive function to authenticate herself such that eavesdroppers cannot readily extract the secret. Security evaluation of ORAS generally involves quantifying information leak...
Article
The large-scale collection of individuals' mobility data poses serious privacy concerns. Instead of perturbing data by adding noise to the raw location data to preserve privacy of individuals, we propose an approach that achieves privacy-preservation at the statistics level of aggregating mobility datasets with the probabilistic data structure Coun...
Article
Full-text available
The web is a tangled mass of interconnected services, whereby websites import a range of external resources from various third-party domains. The latter can also load further resources hosted on other domains. For each website, this creates a dependency chain underpinned by a form of implicit trust between the first-party and transitively connected...
Preprint
Micro-segmentation is a network security technique that requires delivering services for each unique segment. To do so, the first stage is defining these unique segments (a.k.a. security groups) and then initializing policy-driven security controls. In this paper, we propose an unsupervised learning technique that covers both the security grouping a...
Conference Paper
Full-text available
Micro-segmentation is a network security technique that requires delivering services for each unique segment. To do so, the first stage is defining these unique segments (a.k.a. security groups) and then initializing policy-driven security controls. In this paper, we propose an unsupervised learning technique that covers both the security grouping and po...
Preprint
Full-text available
We consider training machine learning models using training data located on multiple private and geographically-scattered servers with different privacy settings. Due to the distributed nature of the data, communicating with all collaborating private data owners simultaneously may prove challenging or altogether impossible. In this paper, we develo...
Preprint
Full-text available
Machine learning models have been shown to be vulnerable to membership inference attacks, i.e., inferring whether individuals' data have been used for training models. The lack of understanding about factors contributing to the success of these attacks motivates the need for modelling membership information leakage using information theory and for investi...
Preprint
We assess the security of machine learning based biometric authentication systems against an attacker who submits uniform random inputs, either as feature vectors or raw inputs, in order to find an accepting sample of a target user. The average false positive rate (FPR) of the system, i.e., the rate at which an impostor is incorrectly accepted as t...
Article
Full-text available
Differential privacy provides strong privacy guarantees while simultaneously enabling useful insights from sensitive datasets. However, it provides the same level of protection for all elements (individuals and attributes) in the data. There are practical scenarios where some data attributes need more/less protection than others. In this paper, we consid...
Preprint
A number of recent works have demonstrated that API access to machine learning models leaks information about the dataset records used to train the models. Further, the work of \cite{somesh-overfit} shows that such membership inference attacks (MIAs) may be sufficient to construct a stronger breed of attribute inference attacks (AIAs), which given...
Article
In this paper, we present the design and implementation of SplitBox, a system for privacy-preserving processing of network functions outsourced to cloud middleboxes—i.e., without revealing the policies governing these functions. SplitBox is built to provide privacy for a generic network function that abstracts the functionality of a variety of netw...
Conference Paper
Full-text available
This paper focuses on reporting of Internet malicious activity (or mal-activity in short) by public blacklists with the objective of providing a systematic characterization of what has been reported over the years, and more importantly, the evolution of reported activities. Using an initial seed of 22 blacklists, covering the period from January 20...
Preprint
In this paper, we apply machine learning to distributed private data owned by multiple data owners, entities with access to non-overlapping training datasets. We use noisy, differentially-private gradients to minimize the fitness cost of the machine learning model using stochastic gradient descent. We quantify the quality of the trained model, usin...
Preprint
Full-text available
Websites employ third-party ads and tracking services leveraging cookies and JavaScript code, to deliver ads and track users' behavior, causing privacy concerns. To limit online tracking and block advertisements, several ad-blocking (black) lists have been curated consisting of URLs and domains of well-known ads and tracking services. Using Interne...
Conference Paper
Full-text available
With the number of new mobile malware instances increasing by over 50% annually since 2012 [24], malware embedding in mobile apps is arguably one of the most serious security issues mobile platforms are exposed to. While obfuscation techniques are successfully used to protect the intellectual property of apps' developers, they are unfortunately als...
Preprint
Full-text available
With the number of new mobile malware instances increasing by over 50% annually since 2012 [24], malware embedding in mobile apps is arguably one of the most serious security issues mobile platforms are exposed to. While obfuscation techniques are successfully used to protect the intellectual property of apps' developers, they are unfortunately al...
Conference Paper
Full-text available
The Web is a tangled mass of interconnected services, where websites import a range of external resources from various third-party domains. The latter can also load resources hosted on other domains. For each website, this creates a dependency chain underpinned by a form of implicit trust between the first-party and transitively connected third-par...
Preprint
Full-text available
This paper focuses on reporting of Internet malicious activity (or mal-activity in short) by public blacklists with the objective of providing a systematic characterization of what has been reported over the years, and more importantly, the evolution of reported activities. Using an initial seed of 22 blacklists, covering the period from January 20...
Article
Full-text available
Recently, the development of autonomous vehicles and intelligent driver assistance systems has drawn a significant amount of attention from the general public. One of the most critical issues in the development of autonomous vehicles and driver assistance systems is their poor performance under adverse weather conditions, such as rain, snow, fog, a...