Goran Piskachev’s research while affiliated with Fraunhofer Institute for Mechatronic Systems Design and other places

Publications (18)


Compositional Taint Analysis for Enforcing Security Policies at Scale
  • Conference Paper

November 2023 · 12 Reads · 2 Citations

Subarno Banerjee · [...] · Jingbo Wang

IDE view of the workspace in which participants worked on the given task. The findings that the tool reports in its default configuration are shown below
Configuration page used in the tool for the user study
Screenshot of Q25 of the questionnaire
Box plots of the percentage of participants who marked each option as understandable and useful for each tool, Fortify (blue and yellow) and CheckMarx (green and red)
Tools that participants use or have used in the past, and the number of participants who named each tool

Can the configuration of static analyses make resolving security vulnerabilities more effective? - A user study
  • Article
  • Full-text available

September 2023 · 120 Reads · 4 Citations

Empirical Software Engineering

The use of static analysis security testing (SAST) tools has been increasing in recent years. However, previous studies have shown that, when shipped to end users such as development or security teams, the findings of these tools are often unsatisfying. Users report high numbers of false positives or long analysis times, making the tools unusable in the daily workflow. To address this, SAST tool creators provide a wide range of configuration options, such as customization of rules through domain-specific languages or specification of the application-specific analysis scope. In this paper, we study the configuration space of selected existing SAST tools when used within the integrated development environment (IDE). We focus on the configuration options that impact three dimensions, for which a trade-off is unavoidable, i.e., precision, recall, and analysis runtime. We perform a between-subjects user study with 40 users from multiple development and security teams - to our knowledge, the largest population for this kind of user study in the software engineering community. The results show that users who configure SAST tools are more effective in resolving security vulnerabilities detected by the tools than those using the default configuration. Based on post-study interviews, we identify common strategies that users have while configuring the SAST tools to provide further insights for tool creators. Finally, an evaluation of the configuration options of two commercial SAST tools, Fortify and CheckMarx, reveals that a quarter of the users do not understand the configuration options provided. The configuration options that are found most useful relate to the analysis scope.


Shifting Left for Early Detection of Machine-Learning Bugs

March 2023 · 11 Reads · 4 Citations

Lecture Notes in Computer Science

Computational notebooks are widely used for machine learning (ML). However, notebooks raise new correctness concerns beyond those found in traditional programming environments. ML library APIs are easy to misuse, and the notebook execution model raises entirely new problems concerning reproducibility. It is common to use static analyses to detect bugs and enforce best practices in software applications. However, when configured with new types of rules tailored to notebooks, these analyses can also detect notebook-specific problems. We present our initial efforts in understanding how static analysis for notebooks differs from analysis of traditional application software. We created six new rules for the CodeGuru Reviewer based on discussions with ML practitioners. We ran the tool on close to 10,000 experimentation notebooks, resulting in an average of approximately 1 finding per 7 notebooks. Approximately 60% of the findings that we reviewed are real notebook defects. (Due to confidentiality limitations, we cannot disclose the exact number of notebook files and findings.)

Keywords: Static analysis · Computational notebooks · Jupyter notebook · Machine-learning bugs · Bug finding · Machine learning · PyTorch · CodeGuru reviewer




Fluently specifying taint-flow queries with fluentTQL

September 2022 · 175 Reads · 10 Citations

Empirical Software Engineering

Previous work has shown that taint analyses are only useful if correctly customized to the context in which they are used. Existing domain-specific languages (DSLs) allow such customization through the definition of deny-listing data-flow rules that describe potentially vulnerable or malicious taint-flows. These languages, however, are designed primarily for security experts who are expected to be knowledgeable in taint analysis. Software developers, in turn, consider these languages to be complex. This paper thus presents fluentTQL, a query specification language designed specifically for taint-flows. fluentTQL is an internal Java DSL and uses a fluent-interface design. fluentTQL queries can express various taint-style vulnerability types, e.g. injections, cross-site scripting or path traversal. This paper describes fluentTQL’s abstract and concrete syntax and defines its runtime semantics. The semantics are independent of any underlying analysis and allow evaluation of fluentTQL queries by a variety of taint analyses. Instantiations of fluentTQL on top of two taint analysis solvers, Boomerang and FlowDroid, demonstrate and validate fluentTQL’s expressiveness. Based on existing examples from the literature, we have used fluentTQL to implement queries for 11 popular security vulnerability types in Java. Using our SQL injection specification, the Boomerang-based taint analysis found all 17 known taint-flows in the OWASP WebGoat application, whereas with FlowDroid 13 taint-flows were found. Similarly, in a vulnerable version of the Java Spring PetClinic application, the Boomerang-based taint analysis found all seven expected taint-flows. In seven real-world Android apps with 25 expected malicious taint-flows, 18 taint-flows were detected. In a user study with 26 software developers, fluentTQL reached a high usability score. In comparison to CodeQL, the state-of-the-art DSL by Semmle/GitHub, participants found fluentTQL more usable and were able to specify taint analysis queries with it in less time.


How far are German companies in improving security through static program analysis tools?

August 2022 · 47 Reads

As security becomes more relevant for many companies, the popularity of static program analysis (SPA) tools is increasing. In this paper, we target the use of SPA tools among companies in Germany with a focus on security. We give insights on the current issues and the developers' willingness to configure the tools to overcome these issues. Compared to previous studies, our study considers the companies' culture and processes for using SPA tools. We conducted an online survey with 256 responses and semi-structured interviews with 17 product owners and executives from multiple companies. Our results show a diversity in the usage of tools. Only half of our survey participants use SPA tools. The free tools tend to be more popular among software developers. In most companies, software developers are encouraged to use free tools, whereas commercial tools can be requested. However, the product owners and executives in our interviews reported that their developers do not request new tools. We also find out that automatic security checks with tools are rarely performed on each release.


To what extent can we analyze Kotlin programs using existing Java taint analysis tools?

July 2022 · 47 Reads

As an alternative to Java, Kotlin has gained rapid popularity since its introduction and has become the default choice for developing Android apps. However, due to its interoperability with Java, Kotlin programs may contain almost the same security vulnerabilities as their Java counterparts. Hence, we question: to what extent can one use an existing Java static taint analysis on Kotlin code? In this paper, we investigate the challenges in implementing a taint analysis for Kotlin compared to Java. To answer this question, we performed an exploratory study where each Kotlin construct was examined and compared to its Java equivalent. We identified 18 engineering challenges that static-analysis writers need to handle differently due to Kotlin's unique constructs or the differences in the generated bytecode between the Kotlin and Java compilers. For eight of them, we provide a conceptual solution, while six of those we implemented as part of SecuCheck-Kotlin, an extension to the existing Java taint analysis SecuCheck.


Fluently specifying taint-flow queries with fluentTQL

April 2022 · 103 Reads

Previous work has shown that taint analyses are only useful if correctly customized to the context in which they are used. Existing domain-specific languages (DSLs) allow such customization through the definition of deny-listing data-flow rules that describe potentially vulnerable taint-flows. These languages, however, are designed primarily for security experts who are knowledgeable in taint analysis. Software developers consider these languages to be complex. This paper presents fluentTQL, a query language designed specifically for taint-flows. fluentTQL is an internal Java DSL and uses a fluent-interface design. fluentTQL queries can express various taint-style vulnerability types, e.g. injections, cross-site scripting or path traversal. This paper describes fluentTQL's abstract and concrete syntax and defines its runtime semantics. The semantics are independent of any underlying analysis and allow evaluation of fluentTQL queries by a variety of taint analyses. Instantiations of fluentTQL on top of two taint analysis solvers, Boomerang and FlowDroid, demonstrate and validate fluentTQL's expressiveness. Based on existing examples from the literature, we implemented queries for 11 popular security vulnerability types in Java. Using our SQL injection specification, the Boomerang-based taint analysis found all 17 known taint-flows in the OWASP WebGoat application, whereas with FlowDroid 13 taint-flows were found. Similarly, in a vulnerable version of the Java PetClinic application, the Boomerang-based taint analysis found all seven expected taint-flows. In seven real-world Android apps with 25 expected taint-flows, 18 were detected. In a user study with 26 software developers, fluentTQL reached a high usability score. In comparison to CodeQL, the state-of-the-art DSL by Semmle/GitHub, participants found fluentTQL more usable and were able to specify taint analysis queries with it in less time.


Citations (8)


... Croft et al. (2021) concluded that although learning-based approaches had better precision, both learning-based and SAST tools approaches should be used independently. Piskachev et al. (2023) did a user study of SAST tools in resolving security vulnerabilities and provided a list of recommendations for software security professionals and practitioners. Scandariato et al. (2013) studied users' experiences of nine participants using a SAST tool and an automated tool for penetration testing on two blogging applications. ...

Reference:

Comparing effectiveness and efficiency of Interactive Application Security Testing (IAST) and Runtime Application Self-Protection (RASP) tools in a large java-based system
Can the configuration of static analyses make resolving security vulnerabilities more effective? - A user study

Empirical Software Engineering

... Bugs in the code can lead to program failures after prolonged execution, resulting in reduced productivity and wasted resources [25]. Silent bugs that do not cause the program to crash at run-time are also common but difficult to detect [10,12,22], causing misleading predictions. Effective static analysis techniques are primary means for detecting such ML bugs prior to execution. ...

Shifting Left for Early Detection of Machine-Learning Bugs
  • Citing Chapter
  • March 2023

Lecture Notes in Computer Science

... Taint analysis is to determine how taint data can influence the program through both data flow and control flow. In order to fluently customize various taint analysis problems, a query language fluentTQL is presented [144]. Firstly, it enables a developer to establish the relationship between functions and the three fundamental elements of taint analysis: source, sanitizer, and sink. ...

Fluently specifying taint-flow queries with fluentTQL

Empirical Software Engineering

... While recent work has shown that clever program abstractions and algorithms can improve analysis precision and speed at the same time [172], scalability remains a challenge when it comes to analyzing large code bases. Another major shortcoming of static analysis, shared with other techniques, is that analyzers need to be configured: they only report what they are configured to report, which is why they must be told, e.g., which particular API calls in which combination can lead to which kinds of vulnerabilities [157,158]. In other words: static analysis also does not completely forego a specification, yet here one specifies vulnerability types, not program functionality. ...

SecuCheck: Engineering configurable taint analysis for software developers
  • Citing Conference Paper
  • September 2021

... Instead of high-effort random sampling, Alice (cf. Section II) can now use SliceViz to generate program slices originating from all privacy-relevant data sources, and visualize [48] and explain these slices to Bob. ...

TaintBench: Automatic real-world malware benchmarking of Android taint analyses

Empirical Software Engineering

... An exploration into existing methodologies revealed an absence of a technique that could statically identify all function calls in a notebook with acceptable precision, recall, and execution time. This can be attributed to Python's inherent complexities such as duck typing, dynamic code execution, reflection, among others that are challenging to static analysis (Salis et al. 2021;Kummita et al. 2021). Moreover, in contrast to other programming languages like Java, Python lacks a lot of tool-support for state-of-the-art static analysis (SA) techniques (Yang et al. 2022b). ...

Qualitative and Quantitative Analysis of Callgraph Algorithms for Python
  • Citing Conference Paper
  • March 2021

... Many previous studies on taint analysis [25,31,32] have used predefined sets of methods which are potential sources of system API privacy-related data. Follow-up work then proposed machine-learning approaches for automatically classifying and categorizing methods into sources [24,[33][34][35]. In 2023, Kober et al. [36] provided a sound definition of sensitive data derived from the definition of personal data of several legal frameworks (including GDPR). ...

SWAN_ASSIST: Semi-Automated Detection of Code-Specific, Security-Relevant Methods
  • Citing Conference Paper
  • November 2019

... Many previous studies on taint analysis [25,31,32] have used predefined sets of methods which are potential sources of system API privacy-related data. Follow-up work then proposed machine-learning approaches for automatically classifying and categorizing methods into sources [24,[33][34][35]. In 2023, Kober et al. [36] provided a sound definition of sensitive data derived from the definition of personal data of several legal frameworks (including GDPR). ...

Codebase-adaptive detection of security-relevant methods
  • Citing Conference Paper
  • July 2019