Konrad Rieck

Konrad Rieck
Technische Universität Berlin | TUB · Department of Software Engineering and Theoretical Computer Science

Professor

About

146
Publications
90,182
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
10,815
Citations
Introduction
My research group conducts fundamental research at the intersection of computer security and machine learning. On the one end, we are interested in developing intelligent systems that can learn to protect computers from attacks and identify security problems automatically. On the other end, we explore the security and privacy of machine learning by developing novel attacks and defenses.
Additional affiliations
April 2016 - December 2022
Technische Universität Braunschweig
Position
  • Full Professor
March 2015 - March 2016
University of Göttingen
Position
  • Professor (Associate)
October 2011 - March 2015
University of Göttingen
Position
  • Junior Professor

Publications

Publications (146)
Article
With the growing processing power of computing systems and the increasing availability of massive datasets, machine learning algorithms have led to major breakthroughs in many different areas. This development has influenced computer security, spawning a series of work on learning-based security systems, such as for malware detection, vulnerability...
Article
Full-text available
The source code of a program not only defines its semantics but also contains subtle clues that can identify its author. Several studies have shown that these clues can be automatically extracted using machine learning and allow for determining a program's author among hundreds of programmers. This attribution poses a significant threat to develope...
Article
We identify 10 generic pitfalls that can affect the experimental outcome of AI driven solutions in computer security. We find that they are prevalent in the literature and provide recommendations for overcoming them in the future.
Chapter
The automated discovery of vulnerabilities at scale is a crucial area of research in software security. While numerous machine learning models for detecting vulnerabilities are known, recent studies show that their generalizability and transferability heavily depend on the quality of the training data. Due to the scarcity of real vulnerabilities, a...
Preprint
Full-text available
Backdoors pose a serious threat to machine learning, as they can compromise the integrity of security-critical systems, such as self-driving cars. While different defenses have been proposed to address this threat, they all rely on the assumption that the hardware on which the learning models are executed during inference is trusted. In this paper,...
Preprint
The number of papers submitted to academic conferences is steadily rising in many scientific disciplines. To handle this growth, systems for automatic paper-reviewer assignments are increasingly used during the reviewing process. These systems use statistical topic models to characterize the content of submissions and automate the assignment to rev...
Preprint
Full-text available
The source code of a program not only defines its semantics but also contains subtle clues that can identify its author. Several studies have shown that these clues can be automatically extracted using machine learning and allow for determining a program's author among hundreds of programmers. This attribution poses a significant threat to develope...
Preprint
Full-text available
Generative adversarial networks (GANs) have made remarkable progress in synthesizing realistic-looking images that effectively outsmart even humans. Although several detection methods can recognize these deep fakes by checking for image artifacts from the generation process, multiple counterattacks have demonstrated their limitations. These attacks...
Conference Paper
Full-text available
With the growing processing power of computing systems and the increasing availability of massive datasets, machine learning algorithms have led to major breakthroughs in many different areas. This development has influenced computer security, spawning a series of work on learning-based security systems, such as for malware detection, vulnerability...
Preprint
Full-text available
Imagine interconnected objects with embedded artificial intelligence (AI), empowered to sense the environment, see it, hear it, touch it, interact with it, and move. As future networks of intelligent objects come to life, tremendous new challenges arise for security, but also new opportunities, allowing to address current, as well as future, pressi...
Article
Full-text available
HTTPS is a cornerstone of privacy in the modern Web. The public key infrastructure underlying HTTPS, however, is a frequent target of attacks. In several cases, forged certificates have been issued by compromised Certificate Authorities (CA) and used to spy on users at large scale. While the concept of Certificate Transparency (CT) provides a means...
Preprint
Full-text available
Removing information from a machine learning model is a non-trivial task that requires to partially revert the training process. This task is unavoidable when sensitive data, such as credit card numbers or passwords, accidentally enter the model and need to be removed afterwards. Recently, different concepts for machine unlearning have been propose...
Conference Paper
Full-text available
HTTPS is a cornerstone of privacy in the modern Web. The public key infrastructure underlying HTTPS, however, is a frequent target of attacks. In several cases, forged certificates have been issued by compromised Certificate Authorities (CA) and used to spy on users at large scale. While the concept of Certificate Transparency (CT) provides a means...
Preprint
Full-text available
Physical isolation, so called air-gapping, is an effective method for protecting security-critical computers and networks. While it might be possible to introduce malicious code through the supply chain, insider attacks, or social engineering, communicating with the outside world is prevented. Different approaches to breach this essential line of d...
Chapter
Machine learning is currently successfully used for addressing several cybersecurity detection and classification tasks. Typically, such detectors are modeled through complex learning algorithms employing a wide variety of features. Although these settings allow achieving considerable performances, gaining insights on the learned knowledge turns ou...
Preprint
Machine learning-based systems for malware detection operate in a hostile environment. Consequently, adversaries will also target the learning system and use evasion attacks to bypass the detection of malware. In this paper, we outline our learning-based system PEberus that got the first place in the defender challenge of the Microsoft Evasion Comp...
Preprint
With the growing processing power of computing systems and the increasing availability of massive datasets, machine learning algorithms have led to major breakthroughs in many different areas. This development has influenced computer security, spawning a series of work on learning-based security systems, such as for malware detection, vulnerability...
Preprint
Backdoors and poisoning attacks are a major threat to the security of machine-learning and vision systems. Often, however, these attacks leave visible artifacts in the images that can be visually detected and weaken the efficacy of the attacks. In this paper, we propose a novel strategy for hiding backdoor and poisoning attacks. Our approach builds...
Preprint
For many, social networks have become the primary source of news, although the correctness of the provided information and its trustworthiness are often unclear. The investigations of the 2016 US presidential elections have brought the existence of external campaigns to light aiming at affecting the general political public opinion. In this paper,...
Chapter
Camera sensor noise is one of the most reliable device characteristics in digital image forensics, enabling the unique linkage of images to digital cameras. This so-called camera fingerprint gives rise to different applications, such as image forensics and authentication. However, if images are publicly available, an adversary can estimate the fing...
Conference Paper
With the introduction of memory-bound cryptocurrencies, such as Monero, the implementation of mining code in browser-based JavaScript has become a worthwhile alternative to dedicated mining rigs. Based on this technology, a new form of parasitic computing, widely called cryptojacking or drive-by mining, has gained momentum in the web. A cryptojacki...
Preprint
Camera sensor noise is one of the most reliable device characteristics in digital image forensics, enabling the unique linkage of images to digital cameras. This so-called camera fingerprint gives rise to different applications, such as image forensics and authentication. However, if images are publicly available, an adversary can estimate the fing...
Preprint
Deep learning is increasingly used as a basic building block of security systems. Unfortunately, deep neural networks are hard to interpret, and their decision process is opaque to the practitioner. Recent work has started to address this problem by considering black-box explanations for deep learning in computer security (CCS'18). The underlying e...
Chapter
WebAssembly, or Wasm for short, is a new, low-level language that allows for near-native execution performance and is supported by all major browsers as of today. In comparison to JavaScript it offers faster transmission, parsing, and execution times. Up until now it has, however, been largely unclear what WebAssembly is used for in the wild. In th...
Chapter
Closed-source software is a major hurdle for assessing the security of computer systems. In absence of source code, it is particularly difficult to locate vulnerabilities and malicious functionality, as crucial information is removed by the compilation process. Most notably, binary programs usually lack type information, which complicates spotting...
Preprint
In this paper, we present a novel attack against authorship attribution of source code. We exploit that recent attribution methods rest on machine learning and thus can be deceived by adversarial examples of source code. Our attack performs a series of semantics-preserving code transformations that mislead learning-based attribution but appear plau...
Preprint
Clustering algorithms have become a popular tool in computer security to analyze the behavior of malware variants, identify novel malware families, and generate signatures for antivirus systems. However, the suitability of clustering algorithms for security-sensitive settings has been recently questioned by showing that they can be significantly co...
Conference Paper
Artificial intelligence is increasingly employed in security-critical systems, such as autonomous cars and drones. Unfortunately, many machine learning techniques suffer from vulnerabilities that enable an adversary to thwart their successful application, either during the training or prediction phase. In this talk, we investigate this threat and d...
Chapter
Spear-phishing is an effective attack vector for infiltrating companies and organisations. Based on the multitude of personal information available online, an attacker can craft seemingly legit emails and trick his victims into opening malicious attachments and links. Although anti-spoofing techniques exist, their adoption is still limited and alte...
Preprint
With the introduction of memory-bound cryptocurrencies, such as Monero, the implementation of mining code in browser-based JavaScript has become a worthwhile alternative to dedicated mining rigs. Based on this technology, a new form of parasitic computing, widely called cryptojacking or drive-by mining, has gained momentum in the web. A cryptojacki...
Chapter
The online shopping sector is continuously growing, generating a turnover of billions of dollars each year. Unfortunately, this growth in popularity is not limited to regular customers: Organized crime targeting online shops has considerably evolved in the past years, causing significant financial losses to the merchants. As criminals often use sim...
Conference Paper
Fuzz testing is an effective and scalable technique to perform software security assessments. Yet, contemporary fuzzers fall short of thoroughly testing applications with a high degree of control-flow diversity, such as firewalls and network packet analyzers. In this paper, we demonstrate how static program analysis can guide fuzzing by augmenting...
Article
Full-text available
Taint-style vulnerabilities comprise a majority of fuzzer discovered program faults. These vulnerabilities usually manifest as memory access violations caused by tainted program input. Although fuzzers have helped uncover a majority of taint-style vulnerabilities in software to date, they are limited by (i) extent of test coverage; and (ii) the ava...
Article
Full-text available
To cope with the increasing variability and sophistication of modern attacks, machine learning has been widely adopted as a statistically-sound tool for malware detection. However, its security against well-crafted attacks has not only been recently questioned, but it has been shown that machine learning exhibits inherent vulnerabilities that can b...
Preprint
To cope with the increasing variability and sophistication of modern attacks, machine learning has been widely adopted as a statistically-sound tool for malware detection. However, its security against well-crafted attacks has not only been recently questioned, but it has been shown that machine learning exhibits inherent vulnerabilities that can b...
Conference Paper
Adobe Flash is about to be replaced by alternative technologies, yet Flash-based malware appears to be more common then ever. In this paper we inspect the properties and temporal distribution of this class of malware over a period of three consecutive years and 2.3 million unique Flash animations. In particular, we focus on initially undetected mal...
Conference Paper
Client-side JavaScript has become ubiquitous in web applications to improve user experience and reduce server load. However, since clients are untrusted, servers cannot rely on the confidentiality or integrity of client-side JavaScript code and the data that it operates on. For example, client-side input validation must be repeated at server side,...
Article
Full-text available
The Web is replete with tutorial-style content on how to accomplish programming tasks. Unfortunately, even top-ranked tutorials suffer from severe security vulnerabilities, such as cross-site scripting (XSS), and SQL injection (SQLi). Assuming that these tutorials influence real-world software development, we hypothesize that code snippets from pop...
Conference Paper
Although anti-virus software has significantly evolved over the last decade, classic signature matching based on byte patterns is still a prevalent concept for identifying security threats. Anti-virus signatures are a simple and fast detection mechanism that can complement more sophisticated analysis strategies. However, if signatures are not desig...
Conference Paper
Understanding and fending off attack campaigns against organizations, companies and individuals, has become a global struggle. As today's threat actors become more determined and organized, isolated efforts to detect and reveal threats are no longer effective. Although challenging, this situation can be significantly changed if information about se...
Article
Full-text available
Machine learning is increasingly used in security-critical applications, such as autonomous driving, face recognition and malware detection. Most learning methods, however, have not been designed with security in mind and thus are vulnerable to different types of attacks. This problem has motivated the research field of adversarial machine learning...
Article
The subtleties of correctly processing integers confronts developers with a multitude of pitfalls that frequently result in severe software vulnerabilities. Unfortunately, even code shown to be secure on one platform can be vulnerable on another, such that also the migration of code itself is a notable security challenge. In this paper, we provide...
Article
Full-text available
Zusammenfassung Die systematische Beseitigung von Schwachstellen in Software ist für den sicheren Betrieb von Computersystemen zentral. Nach wie vor werden Schwachstellen größtenteils in aufwendiger manueller Arbeit durch erfahrene Analysten aufgedeckt, ein Vorgehen, das aufgrund der wachsenden Menge sicherheitskritischen Programmcodes zunehmend an...
Conference Paper
Subtle flaws in integer computations are a prime source for exploitable vulnerabilities in system code. Unfortunately, even code shown to be secure on one platform can be vulnerable on another, making the migration of code a notable security challenge. In this paper, we provide the first study on how code that works as expected on 32-bit platforms...
Article
Full-text available
Although anti-virus software has significantly evolved over the last decade, classic signature matching based on byte patterns is still a prevalent concept for identifying security threats. Anti-virus signatures are a simple and fast detection mechanism that can complement more sophisticated analysis strategies. However, if signatures are not desig...
Conference Paper
Adobe Flash is a popular platform for providing dynamic and multimedia content on web pages. Despite being declared dead for years, Flash is still deployed on millions of devices. Unfortunately, the Adobe Flash Player increasingly suffers from vulnerabilities, and attacks using Flash-based malware regularly put users at risk of being remotely attac...
Conference Paper
Eliminating vulnerabilities from low-level code is vital for securing software. Static analysis is a promising approach for discovering vulnerabilities since it can provide developers early feedback on the code they write. But, it presents multiple challenges not the least of which is understanding what makes a bug exploitable and conveying this in...
Article
Full-text available
Comparing strings and assessing their similarity is a basic operation in many application domains of machine learning, such as in information retrieval, natural language processing and bioinformatics. The practitioner can choose from a large variety of available similarity measures for this task, each emphasizing different aspects of the string dat...
Article
Full-text available
Recently, Apple removed access to various device hardware identifiers that were frequently misused by iOS third-party apps to track users. We are, therefore, now studying the extent to which users of smartphones can still be uniquely identified simply through their personalized device configurations. Using Apple’s iOS as an example, we show how a d...
Article
The ability to identify authors of computer programs based on their coding style is a direct threat to the privacy and anonymity of programmers. Previous work has examined attribution of authors from both source code and compiled binaries, and found that while source code can be attributed with very high accuracy, the attribution of executable bina...
Conference Paper
Full-text available
The security of network services and their protocols critically depends on minimizing their attack surface. A single flaw in an implementation can suffice to compromise a service and expose sensitive data to an attacker. The discovery of vulnerabilities in protocol implementations, however, is a challenging task: While for standard protocols this p...
Conference Paper
Full-text available
Despite the security community's best effort, the number of serious vulnerabilities discovered in software is increasing rapidly. In theory, security audits should find and remove the vulnerabilities before the code ever gets deployed. However, due to the enormous amount of code being produced, as well as a the lack of manpower and expertise, not a...
Conference Paper
Full-text available
Taint-style vulnerabilities are a persistent problem in software development, as the recently discovered " Heartbleed " vulnerability strikingly illustrates. In this class of vulnerabil-ities, attacker-controlled data is passed unsanitized from an input source to a sensitive sink. While simple instances of this vulnerability class can be detected a...
Conference Paper
Full-text available
The Tor network has established itself as de-facto standard for anonymous communication on the Internet, providing an increased level of privacy to over a million users worldwide. As a result, interest in the security of Tor is steadily growing, attracting researchers from academia as well as industry and even nation-state actors. While various att...
Conference Paper
Full-text available
Clustering algorithms have become a popular tool in com-puter security to analyze the behavior of malware variants, identify novel malware families, and generate signatures for antivirus systems. However, the suitability of clustering algorithms for security-sensitive settings has been recently questioned by showing that they can be significantly c...
Conference Paper
Full-text available
The vast majority of security breaches encountered today are a direct result of insecure code. Consequently, the protection of computer systems critically depends on the rigorous identification of vulnerabilities in software, a tedious and error-prone process requiring significant expertise. Unfortunately, a single flaw suffices to undermine the se...
Conference Paper
Full-text available
Smartphones have become the standard personal device to store private or sensitive information. Widely used as every day gadget, however, they are susceptible to get lost or stolen. To protect information on a smartphone from being physically accessed by attackers, a lot of authentication methods have been proposed in recent years. Each one of them...
Conference Paper
Full-text available
Malicious applications pose a threat to the security of the Android platform. The growing amount and diversity of these applications render conventional defenses largely ineffective and thus Android smartphones often remain un-protected from novel malware. In this paper, we propose DREBIN, a lightweight method for detection of Android malware that...
Conference Paper
Machine learning has been widely used for defensive security. Numerous approaches have been devised that make use of learning techniques for detecting attacks and malicious software. By contrast, only very few research has studied how machine learning can be applied for offensive security. In this talk, we will explore this research direction and s...
Conference Paper
Full-text available
Detection methods based on n-gram models have been widely studied for the identification of attacks and malicious software. These methods usually build on one of two learning schemes: anomaly detection, where a model of normality is constructed from n-grams, or classification, where a discrimination between benign and malicious n-grams is learned....