Tevfik Bultan

University of California, Santa Barbara, Santa Barbara, California, United States

Are you Tevfik Bultan?

Claim your profile

Publications (133)38.82 Total impact

  • Jaideep Nijjar · Ivan Bocić · Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: Most software systems nowadays are Web-based applications that are deployed over compute clouds using a three-tier architecture, where the persistent data for the application is stored in a backend datastore and is accessed and modified by the server-side code based on the user interactions at the client-side. The data model forms the foundation of these three tiers, and identifies the sets of objects (object classes) and the relations among them (associations among object classes) stored by the application. In this article, we present a set of property patterns to specify properties of a data model, as well as several heuristics for automatically inferring them. We show that the specified or inferred data model properties can be automatically verified using bounded and unbounded verification techniques. For the properties that fail, we present techniques that generate fixes to the data model that establish the failing properties. We implemented this approach for Web applications built using the Ruby on Rails framework and applied it to ten open source applications. Our experimental results demonstrate that our approach is effective in automatically identifying and fixing errors in data models of real-world web applications.
    No preview · Article · Sep 2015 · ACM Transactions on Software Engineering and Methodology
  • Muath Alkhalaf · Abdulbaki Aydin · Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: Correct validation and sanitization of user input is crucial in web applications for avoiding security vulnerabilities and erroneous application behavior. We present an automated differential repair technique for input validation and sanitization functions. Differential repair can be used within an application to repair client and server-side code with respect to each other, or across applications in order to strengthen the validation and sanitization checks. Given a reference and a target function, our differential repair technique strengthens the validation and sanitization operations in the target function based on the reference function. It does this by synthesizing three patches: a validation, a length, and a sanitization patch. Our automated patch synthesis algorithms are based on forward and backward symbolic string analyses that use automata as a symbolic representation. Composition of the three automatically synthesized patches with the original target function results in the repaired function, which provides stronger validation and sanitization than both the target and the reference functions.
    No preview · Article · Jul 2014
  • Ivan Bocić · Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: Modern software applications store their data in remote cloud servers. Users interact with these applications using web browsers or thin clients running on mobile devices. A key issue in dependability of these applications is the correctness of the actions that update the data store, which are triggered by user requests. In this paper, we present techniques for au- tomatically checking if the actions of an application preserve the data model invariants. Our approach first automatically extracts a data model specification, which we call an abstract data store, from a given application using instrumented exe- cution. The abstract data store identifies the sets of objects and relations (associations) used by the application, and the actions that update the data store by deleting or creating objects or by changing the relations among the objects. We show that checking invariants of an abstract data store corre- sponds to inductive invariant verification, and can be done using a mapping to First Order Logic (FOL) and using a FOL theorem prover. We implemented this approach for the Rails framework and applied it to three open source applications. We found four previously unknown bugs and reported them to the developers, who confirmed and imme- diately fixed two of them.
    No preview · Article · May 2014
  • Abdulbaki Aydin · Muath Alkhalaf · Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: Web applications need to validate and sanitize user inputs in order to avoid attacks such as Cross Site Scripting (XSS) and SQL Injection. Writing string manipulation code for input validation and sanitization is an error-prone process leading to many vulnerabilities in real-world web applications. Automata-based static string analysis techniques can be used to automatically compute vulnerability signatures (represented as automata) that characterize all the inputs that can exploit a vulnerability. However, there are several factors that limit the applicability of static string analysis techniques in general: 1) undesirability of static string analysis requires the use of approximations leading to false positives, 2) static string analysis tools do not handle all string operations, 3) dynamic nature of the scripting languages makes static analysis difficult. In this paper, we show that vulnerability signatures computed for deliberately insecure web applications (developed for demonstrating different types of vulnerabilities) can be used to generate test cases for other applications. Given a vulnerability signature represented as an automaton, we present algorithms for test case generation based on state, transition, and path coverage. These automatically generated test cases can be used to test applications that are not analyzable statically, and to discover attack strings that demonstrate how the vulnerabilities can be exploited.
    No preview · Conference Paper · Mar 2014
  • [Show abstract] [Hide abstract]
    ABSTRACT: As scalable information technology evolves to a more cloud-like model, digital assets (code, data and software environments) increasingly require curation as web-accessible services. "Service-izing" digital assets consists of encapsulating assets in software that exposes them to web and mobile applications via well-defined yet flexible, network accessible, application programming interfaces (APIs). In this paper, we postulate that recent advances in cloud computing make cloud platforms as-a-service (PaaS) ideal for deployment, lifecycle management, and policy-based control i.e. API governance - for extant and future digital assets. Toward this end, we overview API governance as a PaaS technology and outline some early results generated by our investigation of a prototype we are developing, called EAGER, for implementing API governance at scale.
    No preview · Conference Paper · Mar 2014
  • Fang Yu · Muath Alkhalaf · Tevfik Bultan · Oscar H. Ibarra
    [Show abstract] [Hide abstract]
    ABSTRACT: Verifying string manipulating programs is a crucial problem in computer security. String operations are used extensively within web applications to manipulate user input, and their erroneous use is the most common cause of security vulnerabilities in web applications. We present an automata-based approach for symbolic analysis of string manipulating programs. We use deterministic finite automata (DFAs) to represent possible values of string variables. Using forward reachability analysis we compute an over-approximation of all possible values that string variables can take at each program point. Intersecting these with a given attack pattern yields the potential attack strings if the program is vulnerable. Based on the presented techniques, we have implemented Stranger, an automata-based string analysis tool for detecting string-related security vulnerabilities in PHP applications. We evaluated Stranger on several open-source Web applications including one with 350,000+ lines of code. Stranger is able to detect known/unknown vulnerabilities, and, after inserting proper sanitization routines, prove the absence of vulnerabilities with respect to given attack patterns.
    No preview · Article · Feb 2014 · Formal Methods in System Design
  • Source
    Meriem Ouederni · Gwen Salaün · Tevfik Bultan

    Preview · Article · Oct 2013
  • Jaideep Nijjar · Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: Nowadays many software applications are deployed over compute clouds using the three-tier architecture, where the persistent data for the application is stored in a backend datastore and is accessed and modified by the server-side code based on the user interactions at the client-side. The data model forms the foundation of these three tiers, and identifies the set of objects stored by the application and the relations (associations) among them. In this paper, we present techniques for automatically inferring properties about the data model by analyzing the relations among the object classes. We then check the inferred properties with respect to the semantics of the data model using automated verification techniques. For the properties that fail, we present techniques that generate fixes to the data model that establish the inferred properties. We implemented this approach for web applications built using the Ruby on Rails framework and applied it to five open source applications. Our experimental results demonstrate that our approach is effective in automatically identifying and fixing errors in data models of real-world web applications.
    No preview · Conference Paper · Jul 2013
  • Jaideep Nijjar · Ivan Bocic · Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: Most modern web applications are built using development frameworks based on the Model-View-Controller (MVC) pattern. In MVC-based web applications the data model specifies the types of objects used by the application and the relations among them. Since the data model forms the foundation of such applications, its correctness is crucial. In this paper we present a tool, IDAVER, that 1) automatically extracts a formal data model specification from applications implemented using the Ruby on Rails framework, 2) provides templates for specifying data model properties, 3) automatically translates the verification of properties specified using these templates to satisfiability queries in three different logics, and 4) uses automated decision procedures and theorem provers to identify which properties are satisfied by the data model, and 5) reports counterexample instances for the properties that fail. Our tool achieves scalable automated verification by exploiting the modularity in the MVC pattern. IDAVER does not require formal specifications to be written manually; thus, our tool enables automated verification and increases the usability by combining automated data model extraction with template-based property specification.
    No preview · Conference Paper · May 2013
  • Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: Since software systems are becoming increasingly more concurrent and distributed, modeling and analysis of interactions among their components is a crucial problem. In several application domains, message-based communication is used as the interaction mechanism, and the communication contract among the components of the system is specified semantically as a state machine. In the service-oriented computing domain this type of message-based communication contracts are called “choreography” specifications. A choreography specification identifies allowable ordering of message exchanges in a distributed system. A fundamental question about a choreography specification is determining its realizability, i.e., given a choreography specification, is it possible to build a distributed system that communicates exactly as the choreography specifies? In this short paper we give an overview of this problem, summarize some of the recent results and discuss its application to web service choreographies, Singularity OS channel contracts, and UML collaboration (communication) diagrams.
    No preview · Chapter · Jan 2013
  • Source
    Muath Alkhalaf · Tevfik Bultan · Jose L. Gallegos
    [Show abstract] [Hide abstract]
    ABSTRACT: Client-side computation in web applications is becoming increasingly common due to the popularity of powerful client-side programming languages such as JavaScript. Clientside computation is commonly used to improve an application's responsiveness by validating user inputs before they are sent to the server. In this paper, we present an analysis technique for checking if a client-side input validation function conforms to a given policy. In our approach, input validation policies are expressed using two regular expressions, one specifying the maximum policy (the upper bound for the set of inputs that should be allowed) and the other specifying the minimum policy (the lower bound for the set of inputs that should be allowed). Using our analysis we can identify two types of errors 1) the input validation function accepts an input that is not permitted by the maximum policy, or 2) the input validation function rejects an input that is permitted by the minimum policy. We implemented our analysis using dynamic slicing to automatically extract the input validation functions from web applications and using automata-based string analysis to analyze the extracted functions. Our experiments demonstrate that our approach is effective in finding errors in input validation functions that we collected from real-world applications and from tutorials and books for teaching JavaScript.
    Preview · Article · Jun 2012 · Proceedings - International Conference on Software Engineering
  • Source
    Samik Basu · Tevfik Bultan · Meriem Ouederni
    [Show abstract] [Hide abstract]
    ABSTRACT: Message-based communication is an increasingly common interaction mechanism used in concurrent and distributed systems where components interact with each other by sending and receiving messages. It is well-known that verification of systems that use asynchronous message-based communication with unbounded FIFO queues is undecidable even when the component behaviors are expressed using finite state machines. In this paper we show that there is a sub-class of such systems, called synchronizable systems, for which certain reachability properties (over send actions and over states with no pending receives) remain unchanged when asynchronous communication is replaced with synchronous communication. Hence, if a system is synchronizable, then the verification of these reachability properties can be done on the synchronous version of the system and the results hold for the asynchronous case. We present a technique for deciding if a given system is synchronizable. Our results are applicable to a variety of domains including verification and analysis of interactions among processes at the OS level, coordination in service-oriented computing and interactions among distributed programs. In this paper we focus on analysis of channel contracts in the Singularity OS. Our experimental results show that almost all channel contracts in the Singularity OS are synchronizable, and, hence, their properties can be analyzed using synchronous communication semantics.
    Full-text · Conference Paper · Jan 2012
  • Source
    Samik Basu · Tevfik Bultan · Meriem Ouederni
    [Show abstract] [Hide abstract]
    ABSTRACT: Since software systems are becoming increasingly more concurrent and distributed, modeling and analysis of interactions among their components is a crucial problem. In several application domains, message-based communication is used as the interaction mechanism, and the communication contract among the components of the system is specified semantically as a state machine. In the service-oriented computing domain such communication contracts are called "choreography" specifications. A choreography specification identifies allowable ordering of message exchanges in a distributed system. A fundamental question about a choreography specification is determining its realizability, i.e., given a choreography specification, is it possible to build a distributed system that communicates exactly as the choreography specifies? Checking realizability of choreography specifications has been an open problem for several years and it was not known if this was a decidable problem. In this paper we give necessary and sufficient conditions for realizability of choreographies. We implemented the proposed realizability check and our experiments show that it can efficiently determine the realizability of 1) web service choreographies, 2) Singularity OS channel contracts, and 3) UML collaboration (communication) diagrams.
    Full-text · Conference Paper · Jan 2012
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Since web applications are easily accessible, and often store a large amount of sensitive user information, they are a common target for attackers. In particular, attacks that focus on input validation vulnerabilities are extremely effective and dangerous. To address this problem, we developed ViewPoints--a technique that can identify erroneous or insufficient validation and sanitization of the user inputs by automatically discovering inconsistencies between client- and server-side input validation functions. Developers typically perform redundant input validation in both the front-end (client) and the back-end (server) components of a web application. Client- side validation is used to improve the responsiveness of the application, as it allows for responding without communicating with the server, whereas server-side validation is necessary for security reasons, as malicious users can easily circumvent client-side checks. ViewPoints (1) automatically extracts client- and server-side input validation functions, (2) models them as deterministic finite automata (DFAs), and (3) compares client- and server-side DFAs to identify and report the inconsistencies between the two sets of checks. Our initial evaluation of the technique is promising: when applied to a set of real-world web applications, ViewPoints was able to automatically identify a large number of inconsistencies in their input validation functions.
    Full-text · Article · Jan 2012
  • Source
    Jaideep Nijjar · Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: The growing influence of web applications in every aspect of society makes their dependability an immense concern. A fundamental building block of web applications that use the Model-View-Controller (MVC) pattern is the data model, which specifies the object classes and the relations among them. We present an approach for unbounded, automated verification of data models that 1) extracts a formal data model from an Object Relational Mapping, 2) converts verification queries about the data model to queries about the satisfiability of formulas in the theory of uninterpreted functions, and 3) uses a Satisfiability Modulo Theories (SMT) solver to check the satisfiability of the resulting formulas. We implemented this approach and applied it to five open-source Rails applications. Our results demonstrate that the proposed approach is feasible, and is more efficient than SAT-based bounded verification.
    Preview · Article · Jan 2012
  • Source
    Fang Yu · Tevfik Bultan · Ben Hardekopf
    [Show abstract] [Hide abstract]
    ABSTRACT: Verifying string manipulating programs is a crucial problem in computer security. String operations are used extensively within web applications to manipulate user input, and their erroneous use is the most common cause of security vulnerabilities in web applications. Unfortunately, verifying string manipulating programs is an undecidable problem in general and any approximate string analysis technique has an inherent tension between efficiency and precision. In this paper we present a set of sound abstractions for strings and string operations that allow for both efficient and precise verification of string manipulating programs. Particularly, we are able to verify properties that involve implicit relations among string variables. We first describe an abstraction called regular abstraction which enables us to perform string analysis using multi-track automata as a symbolic representation. We then introduce two other abstractions—alphabet abstraction and relation abstraction—that can be used in combination to tune the analysis precision and efficiency. We show that these abstractions form an abstraction lattice that generalizes the string analysis techniques studied previously in isolation, such as size analysis or non-relational string analysis. Finally, we empirically evaluate the effectiveness of these abstraction techniques with respect to several benchmarks and an open source application, demonstrating that our techniques can improve the performance without loss of accuracy of the analysis when a suitable abstraction class is selected.
    Preview · Conference Paper · Jul 2011
  • Source
    Samik Basu · Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: Choreography analysis has been a crucial problem in service oriented computing. Interactions among services involve message exchanges across organizational boundaries in a distributed computing environment, and in order to build such systems in a reliable manner, it is necessary to develop techniques for analyzing such interactions. Choreography conformance involves verifying that a set of services behave according to a given choreography specification that characterizes their interactions. Unfortunately this is an undecidable problem when services interact with asynchronous communication. In this paper we present techniques that identify if the interaction behavior for a set of services remain the same when asynchronous communication is replaced with synchronous communication. This is called the synchronizability problem and determining the synchronizability of a set of services has been an open problem for several years. We solve this problem in this paper. Our results can be used to identify synchronizable services for which choreography conformance can be checked efficiently. Our results on synchronizability are applicable to any software infrastructure that supports message-based interactions.
    Full-text · Conference Paper · Jan 2011
  • Source
    Fang Yu · Muath Alkhalaf · Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: We present automata-based static string analysis techniques that automatically generate sanitization statements for patch- ing vulnerable web applications. Our approach consists of three phases: Given an attack pattern we first conduct a vul- nerability analysis to identify if strings that match the attack pattern can reach the security-sensitive functions. Next, we compute vulnerability signatures that characterize all input strings that can exploit the discovered vulnerability. Given the vulnerability signatures, we then construct sanitization statements that 1) check if a given input matches the vul- nerability signature and 2) modify the input in a minimal way so that the modified input does not match the vul- nerability signature. Our approach is capable of generating relational vulnerability signatures (and corresponding sani- tization statements) for vulnerabilities that are due to more than one input.
    Preview · Conference Paper · Jan 2011
  • Source
    Jaideep Nijjar · Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: The use of scripting languages to build web applications has increased programmer productivity, but at the cost of degrading dependability. In this paper we focus on a class of bugs that appear in web applications that are built based on the Model-View-Controller architecture. Our goal is to automatically discover data model errors in Ruby on Rails applications. To this end, we created an automatic translator that converts data model expressions in Ruby on Rails applications to formal specifications. In particular, our translator takes Active Records specifications (which are used to specify data models in Ruby on Rails applications) as input and generates a data model in Alloy language as output. We then use bounded verification techniques implemented in the Alloy Analyzer to look for errors in these formal data model specifications. We applied our approach to two open source web applications to demonstrate its feasibility.
    Preview · Conference Paper · Jan 2011
  • Source
    Aysu Betin-Can · Sylvain Hallé · Tevfik Bultan
    [Show abstract] [Hide abstract]
    ABSTRACT: A crucial problem in service oriented computing is the specification and analysis of interactions among multiple peers that communicate via messages. We propose a design pattern that enables the specification of behavioral interfaces acting as communication contracts between peers. This "peer controller pattern" provides a modular, assume-guarantee style verification strategy that consists of three phases. 1) Each individual peer is statically verified for conformance to its part of the contract, using software model checking. 2) Alternately, a runtime enforcement mechanism blocks the communication events that violate the interface specification at runtime. 3) Using either of these two mechanisms, it can be assumed that the participating peers behave according to their interfaces and safety and liveness properties about the global behavior of the composite web service can then be verified directly on the communication contract. The interface verification of each peer and the behavior verification are hence handled in separate steps. A Java implementation of this pattern is developed and tested on a series of examples; we show that by working in such a modular fashion, it is possible to automatically and efficiently verify properties about service interactions that would otherwise be impossible to verify.
    Full-text · Article · Jan 2011 · IEEE Transactions on Services Computing

Publication Stats

3k Citations
38.82 Total Impact Points

Institutions

  • 1970-2015
    • University of California, Santa Barbara
      • Department of Computer Science
      Santa Barbara, California, United States
  • 1998-1999
    • University of Maryland, College Park
      • Department of Computer Science
      Maryland, United States
  • 1992
    • Bilkent University
      • Department of Computer Engineering
      Engüri, Ankara, Turkey