Article

Dynamic Program Analysis Tools in GCC and CLANG Compilers

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

... As a result, it can be declared that programs are currently mostly unreliable, which shows up as various freezes, program crashes, and excessive consumption of system resources. One of the main sources of this behavior is errors in memory operations, including [1]: access beyond the boundaries of arrays (buffers); use of dynamic memory after it is freed, including: ...
... When displaying a reachability graph, the marking of the graph nodes begins with an ellipsis, which replaces the same beginning of the markings of absolutely all nodes of the graph: g 1 3 + g 2 4 . This beginning of marking corresponds to: g 1 Movement along arcs only down to the right corresponds to the execution of the Slave function while the Master function is stopped, and vice versa, movement only down to the left corresponds to the execution of the Master function while the Slave function is stopped. Thus, starting the movement from the uppermost node down to the right, the program "stops" at the place s 2 , corresponding to the spinlock waiting until the value of the variable DataPrepared becomes non-zero. ...
... For example, Valgrind, one of the most popular dynamic program analysis tools, executes the program in a virtual machine for this purpose, which gives it maximum control over the operations the processor performs while running the code [10]. An alternative approach is demonstrated by clang and gcc [1]: instead of the real memory allocation, release, and access operations, they insert modified operations into the code that perform a number of checks, thus providing different levels of control over the correctness of operations. When erroneous behavior is detected, dynamic analysis of a program provides maximum information about the necessary initial data, the conditions for obtaining the error, and the execution paths of the computational processes that led to the error. ...
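The idea of replacing allocation, release, and access operations with checked versions can be illustrated with a toy model. The sketch below is written in Python purely for illustration; the class `CheckedHeap`, its methods, and `MemoryCheckError` are hypothetical names, and real compilers insert such checks into machine code rather than calling a class like this.

```python
class MemoryCheckError(Exception):
    """Raised when a checked memory operation is invalid."""


class CheckedHeap:
    """Toy heap whose alloc/free/load/store operations validate every request.

    Hypothetical model: block ids stand in for pointers.
    """

    def __init__(self):
        self._next_id = 1
        self._live = {}      # block id -> bytearray holding the block's data
        self._freed = set()  # ids of blocks that have been released

    def alloc(self, size):
        block_id = self._next_id
        self._next_id += 1
        self._live[block_id] = bytearray(size)
        return block_id

    def free(self, block_id):
        if block_id in self._freed:
            raise MemoryCheckError(f"double free of block {block_id}")
        if block_id not in self._live:
            raise MemoryCheckError(f"free of unknown block {block_id}")
        del self._live[block_id]
        self._freed.add(block_id)

    def _check(self, block_id, offset):
        # The check that runs before every load and store.
        if block_id in self._freed:
            raise MemoryCheckError(f"use after free of block {block_id}")
        if block_id not in self._live:
            raise MemoryCheckError(f"access to unknown block {block_id}")
        if not 0 <= offset < len(self._live[block_id]):
            raise MemoryCheckError(f"offset {offset} outside block {block_id}")

    def store(self, block_id, offset, value):
        self._check(block_id, offset)
        self._live[block_id][offset] = value

    def load(self, block_id, offset):
        self._check(block_id, offset)
        return self._live[block_id][offset]
```

In this model, a load at offset 4 of a 4-byte block, a load after `free`, and a second `free` of the same block each raise `MemoryCheckError` at the point of the faulty operation.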
Article
The article discusses an approach to the automatic creation of models of imperative programs, designed to find errors in memory operations. The approach is based on dividing the model into compositional parts, including models of control flow, variables, data types, and pointers. Models of data types and pointers are developed, for the purposes of modeling, to a level of detail sufficient to detect errors. To analyze the correctness of a program, a reachability graph of the model Petri net is constructed, which is then searched for deadlocks, loops, and explicitly marked erroneous events. The article provides an example of a program with an error in accessing a previously freed memory block in a race condition between parallel program threads. A model constructed for the example program in terms of Petri nets makes it possible to track the moment an error occurs in the reachability graph.
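A minimal sketch of the reachability analysis described in this abstract, in Python. The net below is a hypothetical toy (one thread frees a block, another reads it) and is not the model from the article; all place and transition names are assumptions made for illustration.

```python
from collections import deque


def reachability_graph(places, transitions, initial, limit=10_000):
    """Breadth-first construction of a Petri net reachability graph.

    places:      list of place names (fixes the order inside a marking tuple)
    transitions: {name: (inputs, outputs)}, each a dict place -> arc weight
    initial:     dict place -> initial token count
    Returns (start marking, {marking: [(transition, next marking), ...]}).
    """
    index = {p: i for i, p in enumerate(places)}
    start = tuple(initial.get(p, 0) for p in places)
    edges = {}
    queue = deque([start])
    while queue:
        marking = queue.popleft()
        if marking in edges:
            continue  # already expanded
        if len(edges) >= limit:
            raise RuntimeError("state limit exceeded (net may be unbounded)")
        successors = []
        for name, (inputs, outputs) in transitions.items():
            # A transition is enabled if every input place holds enough tokens.
            if all(marking[index[p]] >= w for p, w in inputs.items()):
                nxt = list(marking)
                for p, w in inputs.items():
                    nxt[index[p]] -= w
                for p, w in outputs.items():
                    nxt[index[p]] += w
                nxt = tuple(nxt)
                successors.append((name, nxt))
                queue.append(nxt)
        edges[marking] = successors
    return start, edges


def dead_markings(edges):
    """Markings with no enabled transition (deadlocks or normal termination)."""
    return [m for m, succ in edges.items() if not succ]


# Hypothetical toy net: thread 1 frees a block, thread 2 reads it.
PLACES = ["t1_ready", "t1_done", "t2_ready", "t2_done", "live", "freed", "error"]
TRANSITIONS = {
    "free": ({"t1_ready": 1, "live": 1}, {"t1_done": 1, "freed": 1}),
    "use_ok": ({"t2_ready": 1, "live": 1}, {"t2_done": 1, "live": 1}),
    "use_after_free": ({"t2_ready": 1, "freed": 1},
                       {"t2_done": 1, "freed": 1, "error": 1}),
}
INITIAL = {"t1_ready": 1, "t2_ready": 1, "live": 1}

start, graph = reachability_graph(PLACES, TRANSITIONS, INITIAL)
error_index = PLACES.index("error")
error_markings = [m for m in graph if m[error_index] > 0]
```

For this toy net, `error_markings` is non-empty exactly because one interleaving runs `free` before the read, which is the kind of explicitly marked erroneous event the abstract refers to.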
... The study of indirect coupling (IC) includes the definition of concepts [17,18], theoretical foundations [8], dependency-graph methods [19][20][21], metrics [22][23][24][25][26][27], methods for extracting and detecting implicit dependency graphs (indirect coupling) [12,28], and the creation of analytical tools [29][30][31]. However, many research approaches focus exclusively on identifying indirect-coupling graphs and computing metrics based on counting incoming and outgoing pairwise links, parameter passing, and references to data elements. These approaches do not take into account differences in the degree to which components contribute to the element under study. ...
Article
Software development can be a time-consuming and costly process that requires a significant amount of effort. Developers are often tasked with completing programming tasks or making modifications to existing code without increasing overall complexity. It is essential for them to understand the dependencies between the program components before implementing any changes. However, as code evolves, it becomes increasingly challenging for project managers to detect indirect coupling links between components. These hidden links can complicate the system, cause inaccurate effort estimates, and compromise the quality of the code. To address these challenges, this study aims to provide a set of measures that leverage measurement theory and hidden links between software components to expand the scope, effectiveness, and utility of accepted software metrics. The research focuses on two primary topics: (1) how indirect coupling measurements can aid developers with maintenance tasks and (2) how indirect coupling metrics can quantify software complexity and size, leveraging weighted differences across techniques. The study presents a comprehensive set of measures designed to assist developers and project managers with project management and maintenance activities. Using the power of indirect coupling measurements, these measures can enhance the quality and efficiency of software development and maintenance processes.
Article
Integer overflow bugs in C and C++ programs are difficult to track down and may lead to fatal errors or exploitable vulnerabilities. Although a number of tools for finding these bugs exist, the situation is complicated because not all overflows are bugs. Better tools need to be constructed, but a thorough understanding of the issues behind these errors does not yet exist. We developed IOC, a dynamic checking tool for integer overflows, and used it to conduct the first detailed empirical study of the prevalence and patterns of occurrence of integer overflows in C and C++ code. Our results show that intentional uses of wraparound behaviors are more common than is widely believed; for example, there are over 200 distinct locations in the SPEC CINT2000 benchmarks where overflow occurs. Although many overflows are intentional, a large number of accidental overflows also occur. Orthogonal to programmers' intent, overflows are found in both well-defined and undefined flavors. Applications executing undefined operations can be, and have been, broken by improvements in compiler optimizations. Looking beyond SPEC, we found and reported undefined integer overflows in SQLite, PostgreSQL, SafeInt, GNU MPC and GMP, Firefox, GCC, LLVM, Python, BIND, and OpenSSL; many of these have since been fixed. Our results show that integer overflow issues in C and C++ are subtle and complex, that they are common even in mature, widely used programs, and that they are widely misunderstood by developers.
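The distinction between well-defined and undefined overflow can be sketched with a small Python model of 32-bit arithmetic. In C, unsigned arithmetic wraps modulo 2^N, while signed overflow is undefined behavior; the function names below are hypothetical, and the signed wrapped value is only what two's-complement hardware typically produces, not something the C standard guarantees.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_u32(x):
    """Unsigned 32-bit wraparound: well-defined in C (arithmetic modulo 2**32)."""
    return x % 2**32


def wrap_i32(x):
    """Two's-complement wraparound to 32 bits (typical hardware result only)."""
    return (x + 2**31) % 2**32 - 2**31


def checked_add_i32(a, b):
    """Model of a dynamic overflow check on a signed 32-bit addition.

    Returns (wrapped_result, overflowed). A dynamic checker would report
    the operation when `overflowed` is True.
    """
    exact = a + b  # Python integers are unbounded, so this is the true sum
    overflowed = not INT32_MIN <= exact <= INT32_MAX
    return wrap_i32(exact), overflowed


result, overflowed = checked_add_i32(INT32_MAX, 1)
```

Here `overflowed` is `True` and `result` equals `INT32_MIN`, whereas `wrap_u32(2**32 - 1 + 1)` is simply `0` with no error to report.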
Article
Among the many software testing techniques available today, fuzzing has remained highly popular due to its conceptual simplicity, its low barrier to deployment, and its vast amount of empirical evidence in discovering real-world software vulnerabilities. At a high level, fuzzing refers to a process of repeatedly running a program with generated inputs that may be syntactically or semantically malformed. While researchers and practitioners alike have invested a large and diverse effort towards improving fuzzing in recent years, this surge of work has also made it difficult to gain a comprehensive and coherent view of fuzzing. To help preserve and bring coherence to the vast literature of fuzzing, this paper presents a unified, general-purpose model of fuzzing together with a taxonomy of the current fuzzing literature. We methodically explore the design decisions at every stage of our model fuzzer by surveying the related literature and innovations in the art, science, and engineering that make modern-day fuzzers effective.
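The process of repeatedly running a program with generated inputs can be sketched as a tiny mutation-based fuzzer in Python. Everything here is an illustrative assumption: `parse_record` is a hypothetical target with a deliberate bug, and `mutate` and `fuzz` are not taken from any real fuzzing tool.

```python
import random


def mutate(data, rng):
    """Flip one random bit in one random byte; the length is preserved."""
    if not data:
        return data
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def fuzz(target, seed_input, iterations=1000, seed=0):
    """Run `target` on mutated copies of `seed_input`; collect crashing inputs."""
    rng = random.Random(seed)  # fixed seed makes a run reproducible
    crashes = []
    for _ in range(iterations):
        candidate = seed_input
        for _ in range(rng.randint(1, 4)):  # apply one to four bit flips
            candidate = mutate(candidate, rng)
        try:
            target(candidate)
        except Exception:
            crashes.append(candidate)
    return crashes


def parse_record(data):
    """Hypothetical buggy target: trusts the length stored in the first byte."""
    length = data[0]
    payload = data[1:]
    # Raises IndexError when the declared length exceeds the payload size.
    return [payload[i] for i in range(length)]


crashing_inputs = fuzz(parse_record, b"\x03abc", iterations=200, seed=1)
```

Each element of `crashing_inputs` is a malformed record whose first byte claims more payload than is present. In a C program the same mistake would be an out-of-bounds read, which is why fuzzers are commonly paired with run-time checkers.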
Article
Memory safety in C and C++ remains largely unresolved. A technique usually called "memory tagging" may dramatically improve the situation if implemented in hardware with reasonable overhead. This paper describes two existing implementations of memory tagging: one is the full hardware implementation in SPARC; the other is a partially hardware-assisted compiler-based tool for AArch64. We describe the basic idea, evaluate the two implementations, and explain how they improve memory safety. This paper is intended to initiate a wider discussion of memory tagging and to motivate the CPU and OS vendors to add support for it in the near future.
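The basic idea of memory tagging can be sketched as a toy Python model: every fixed-size granule of memory carries a small tag, every pointer carries a tag, and an access is allowed only when the two match. The 16-byte granule, the 4-bit tag, and the class `TaggedMemory` are assumptions chosen for illustration. Tags are assigned by a counter here so the behavior is deterministic; an implementation that picks tags randomly detects errors only probabilistically.

```python
GRANULE = 16   # bytes covered by one tag (assumed for illustration)
TAG_BITS = 4   # tag width in bits (assumed for illustration)


class TaggedMemory:
    """Toy model: a pointer is an (address, tag) pair; tag 0 means 'untagged'."""

    def __init__(self, size):
        self.tags = [0] * (size // GRANULE)
        self.next_granule = 0
        self.counter = 0

    def _new_tag(self):
        # Cycle through 1..15 so consecutive tags always differ.
        self.counter = self.counter % (2**TAG_BITS - 1) + 1
        return self.counter

    def _retag(self, address, size):
        first = address // GRANULE
        count = -(-size // GRANULE)  # round up to whole granules
        tag = self._new_tag()
        for g in range(first, first + count):
            self.tags[g] = tag
        return tag, count

    def alloc(self, size):
        count = -(-size // GRANULE)
        if self.next_granule + count > len(self.tags):
            raise MemoryError("toy memory exhausted")
        address = self.next_granule * GRANULE
        tag, _ = self._retag(address, size)
        self.next_granule += count
        return (address, tag)

    def free(self, pointer, size):
        # Retagging on free makes stale pointers mismatch.
        self._retag(pointer[0], size)

    def check(self, pointer, offset):
        """True if an access through `pointer` at `offset` passes the tag check."""
        address, tag = pointer
        granule = (address + offset) // GRANULE
        return 0 <= granule < len(self.tags) and self.tags[granule] == tag


mem = TaggedMemory(256)
p = mem.alloc(20)   # occupies two granules
q = mem.alloc(16)   # neighboring block with a different tag
```

With these two blocks, `mem.check(p, 32)` fails because it lands in the granule belonging to `q`, while `mem.check(p, 31)` still passes: an overflow that stays inside the last granule of a block is invisible at this granularity.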
Article
Buffer overrun remains one of the main sources of errors and vulnerabilities in C/C++ source code. Static analysis is widely used to detect this kind of defect. In this paper, we propose a path-sensitive static analysis based on symbolic execution with state merging. For buffers with compile-time-known sizes, we present an interprocedural path- and context-sensitive overrun detection algorithm that finds program points satisfying a proposed error definition. The described approach was implemented in the Svace static analyzer without significant loss of performance. On Android 5.0.2, these detectors generated 351 warnings, 64% of which were true positives. In addition, we describe a prototype of an intraprocedural heap buffer overflow detector and present an example of a defect found by this detector.
Article
Software vulnerabilities are a serious threat to the security of information systems. Any software written in C/C++ contains a considerable number of vulnerabilities. Some of them can be used by attackers to seize control of the system. In this paper, to counteract such vulnerabilities, we propose to use compiler transformations: function reordering by permutation within a module, insertion of additional local variables into the function’s stack, and hashing of local variables on the stack. By means of these transformations, it is suggested to generate a diversified population of executable files of the application being compiled. Such an approach, for example, complicates the planning of ROP attacks on the entire population. Having obtained a single executable file, the attacker can create an ROP exploit that works only for this version of the application. The other executable files of the population will remain insensitive to this attack.
Conference Paper
Memory access bugs, including buffer overflows and uses of freed heap memory, remain a serious problem for programming languages like C and C++. Many memory error detectors exist, but most of them are either slow or detect a limited set of bugs, or both. This paper presents AddressSanitizer, a new memory error detector. Our tool finds out-of-bounds accesses to heap, stack, and global objects, as well as use-after-free bugs. It employs a specialized memory allocator and code instrumentation that is simple enough to be implemented in any compiler, binary translation system, or even in hardware. AddressSanitizer achieves efficiency without sacrificing comprehensiveness. Its average slowdown is just 73% yet it accurately detects bugs at the point of occurrence. It has found over 300 previously unknown bugs in the Chromium browser and many bugs in other software.
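A toy Python model of the allocator-plus-instrumentation idea: each 8 bytes of application memory map to one shadow value, blocks are surrounded by poisoned redzones, and every access consults the shadow first. The class `ShadowHeap`, the constants `REDZONE` and `FREED`, and the 16-byte redzone are illustrative assumptions, not the tool's actual encoding or API.

```python
SHADOW_SCALE = 8  # application bytes described by one shadow value
REDZONE = -1      # toy marker: poisoned padding around blocks
FREED = -2        # toy marker: memory of a released block


class ShadowHeap:
    """Shadow values: 0 = all 8 bytes valid, k in 1..7 = first k bytes valid,
    negative = poisoned."""

    def __init__(self, size, redzone=16):
        self.shadow = [REDZONE] * (size // SHADOW_SCALE)  # start fully poisoned
        self.redzone = redzone
        self.top = 0
        self.sizes = {}

    def malloc(self, n):
        start = self.top + self.redzone          # leave a redzone before the block
        end = start + -(-n // SHADOW_SCALE) * SHADOW_SCALE
        if end + self.redzone > len(self.shadow) * SHADOW_SCALE:
            raise MemoryError("toy heap exhausted")
        first = start // SHADOW_SCALE
        for i in range(n // SHADOW_SCALE):
            self.shadow[first + i] = 0           # fully addressable groups
        if n % SHADOW_SCALE:
            self.shadow[first + n // SHADOW_SCALE] = n % SHADOW_SCALE
        self.top = end
        self.sizes[start] = n
        return start

    def free(self, address):
        n = self.sizes.pop(address)              # KeyError models an invalid free
        first = address // SHADOW_SCALE
        for i in range(-(-n // SHADOW_SCALE)):
            self.shadow[first + i] = FREED       # poison instead of reusing

    def is_valid(self, address, size=1):
        """The check inserted before an access of `size` bytes at `address`."""
        for a in range(address, address + size):
            if not 0 <= a < len(self.shadow) * SHADOW_SCALE:
                return False
            s = self.shadow[a // SHADOW_SCALE]
            if s != 0 and not (s > 0 and a % SHADOW_SCALE < s):
                return False
        return True


heap = ShadowHeap(128)
p = heap.malloc(10)
```

In this model `heap.is_valid(p + 9)` passes, while `heap.is_valid(p + 10)` and `heap.is_valid(p - 1)` fail at the faulty access itself, and after `heap.free(p)` every access through `p` fails because the block stays poisoned.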
Article
FUZZING

Master One of Today's Most Powerful Techniques for Revealing Security Flaws!

Fuzzing has evolved into one of today's most effective approaches to test software security. To “fuzz,” you attach a program's inputs to a source of random data, and then systematically identify the failures that arise. Hackers have relied on fuzzing for years: Now, it's your turn. In this book, renowned fuzzing experts show you how to use fuzzing to reveal weaknesses in your software before someone else does.

Fuzzing is the first and only book to cover fuzzing from start to finish, bringing disciplined best practices to a technique that has traditionally been implemented informally. The authors begin by reviewing how fuzzing works and outlining its crucial advantages over other security testing methods. Next, they introduce state-of-the-art fuzzing techniques for finding vulnerabilities in network protocols, file formats, and web applications; demonstrate the use of automated fuzzing tools; and present several insightful case histories showing fuzzing at work. Coverage includes:

  • Why fuzzing simplifies test design and catches flaws other methods miss
  • The fuzzing process: from identifying inputs to assessing “exploitability”
  • Understanding the requirements for effective fuzzing
  • Comparing mutation-based and generation-based fuzzers
  • Using and automating environment variable and argument fuzzing
  • Mastering in-memory fuzzing techniques
  • Constructing custom fuzzing frameworks and tools
  • Implementing intelligent fault detection

Attackers are already using fuzzing. You should, too. Whether you're a developer, security engineer, tester, or QA specialist, this book teaches you how to build secure software.

Foreword
Preface
Acknowledgments
About the Author

Part I: Background
Chapter 1: Vulnerability Discovery Methodologies
Chapter 2: What Is Fuzzing?
Chapter 3: Fuzzing Methods and Fuzzer Types
Chapter 4: Data Representation and Analysis
Chapter 5: Requirements for Effective Fuzzing

Part II: Targets and Automation
Chapter 6: Automation and Data Generation
Chapter 7: Environment Variable and Argument Fuzzing
Chapter 8: Environment Variable and Argument Fuzzing: Automation
Chapter 9: Web Application and Server Fuzzing
Chapter 10: Web Application and Server Fuzzing: Automation
Chapter 11: File Format Fuzzing
Chapter 12: File Format Fuzzing: Automation on UNIX
Chapter 13: File Format Fuzzing: Automation on Windows
Chapter 14: Network Protocol Fuzzing
Chapter 15: Network Protocol Fuzzing: Automation on UNIX
Chapter 16: Network Protocol Fuzzing: Automation on Windows
Chapter 17: Web Browser Fuzzing
Chapter 18: Web Browser Fuzzing: Automation
Chapter 19: In-Memory Fuzzing
Chapter 20: In-Memory Fuzzing: Automation

Part III: Advanced Fuzzing Technologies
Chapter 21: Fuzzing Frameworks
Chapter 22: Automated Protocol Dissection
Chapter 23: Fuzzer Tracking
Chapter 24: Intelligent Fault Detection

Part IV: Looking Forward
Chapter 25: Lessons Learned
Chapter 26: Looking Forward

Index
Conference Paper
Current software attacks often build on exploits that subvert machine-code execution. The enforcement of a basic safety property, Control-Flow Integrity (CFI), can prevent such attacks from arbitrarily controlling program behavior. CFI enforcement is simple, and its guarantees can be established formally even with respect to powerful adversaries. Moreover, CFI enforcement is practical: it is compatible with existing software and can be done efficiently using software rewriting in commodity systems. Finally, CFI provides a useful foundation for enforcing further security policies, as we demonstrate with efficient software implementations of a protected shadow call stack and of access control for memory regions.
Article
Debugging application programs can be very time-consuming. However, having good software tools can greatly decrease this time. While some program errors can be found at compile time, there are other program errors that cannot be detected until run-time. We call these errors run-time errors. Observe that we assume the language syntax of the program is correct and an executable was created. For example, suppose the value of an integer variable n is not known at compile time. If n is outside the declared bounds of an array A, then an out-of-bounds memory access error will occur during run-time when the program reads A[n]. The value of n may have been read from an input file or the value of n may have been the result of a calculation not performed at compile time.