Wolfram Schulte

Meta · Data Infrastructure

PhD (+ Habilitation)

About

Publications: 192
Reads: 27,609
Citations: 6,895
Publications (192)
Conference Paper
Tens of thousands of Microsoft engineers build and test hundreds of software products several times a day. It is essential that this continuous integration scales, guarantees short feedback cycles, and functions reliably with minimal human intervention. During the past three years TSE's charter has been to shorten this cycle time. We went after thi...
Conference Paper
Thousands of Microsoft engineers build and test hundreds of software products several times a day. It is essential that this continuous integration scales, guarantees short feedback cycles, and functions reliably with minimal human intervention. This paper describes CloudBuild, the build service infrastructure developed within Microsoft over the la...
Conference Paper
Full-text available
Testing distributed systems is challenging due to multiple sources of nondeterminism. Conventional testing techniques, such as unit, integration and stress testing, are ineffective in preventing serious but subtle bugs from reaching production. Formal techniques, such as TLA+, can only verify high-level specifications of systems at the level of log...
Patent
A tracing just-in-time (TJIT) compiler system is described for performing parallelization of code in a runtime phase in the execution of code. Upon detecting a hot loop during the execution of the code, the compiler system extracts trace information from sequentially recorded traces. In a first phase, the compiler system uses the trace information...
Book
Fields of Logic and Computation II. This Festschrift has been published in honor of Yuri Gurevich, on the occasion of his 75th birthday. Yuri Gurevich has made a number of fundamental contributions to the broad spectrum of logic and computer science, including decision procedures, the monadic theory of order, abstract state machines, formal method...
Article
Full-text available
The efficiency of a build system is an important factor for developer productivity. As a result, developer teams have been increasingly adopting new build systems that allow higher build parallelization. However, migrating the existing legacy build scripts to new build systems is a tedious and error-prone process. Unfortunately, there is insufficie...
Article
Full-text available
Waiting for Godot is a famous play by Samuel Beckett, in which two men occupy their time while waiting, indefinitely, for the arrival of their friend, Godot. As the play progresses we learn that while both men claim Godot as an acquaintance, they really do not know him and further, would not recognize him even if they were to see him. As a discipli...
Conference Paper
Full-text available
CloudMake is a software utility that automatically builds executable programs and libraries from source code—a modern Make utility. Its design gives rise to a number of possible optimizations, like cached builds, and the executables to be built are described using a functional programming language. This paper formally and mechanically verifies the...
Article
A finite-state machine (FSM) is an important abstraction for solving several problems, including regular-expression matching, tokenizing text, and Huffman decoding. FSM computations typically involve data-dependent iterations with unpredictable memory-access patterns making them difficult to parallelize. This paper describes a parallel algorithm fo...
Article
A finite-state machine (FSM) is an important abstraction for solving several problems, including regular-expression matching, tokenizing text, and Huffman decoding. FSM computations typically involve data-dependent iterations with unpredictable memory-access patterns making them difficult to parallelize. This paper describes a parallel algorithm fo...
Conference Paper
A finite-state machine (FSM) is an important abstraction for solving several problems, including regular-expression matching, tokenizing text, and Huffman decoding. FSM computations typically involve data-dependent iterations with unpredictable memory-access patterns making them difficult to parallelize. This paper describes a parallel algorithm fo...
Book
This book constitutes the refereed proceedings of the 17th International Conference on Model Driven Engineering Languages and Systems, MODELS 2014, held in Valencia, Spain, in September/October 2014. The 41 full papers presented in this volume were carefully reviewed and selected from a total of 126 submissions. The scope of the conference series i...
Article
Traditionally software development tools, such as compilers, linkers and build engines, were designed for use by individual engineers on dedicated desktop machines. As these tools evolved, they scaled-up by exploiting more powerful development machines. The same tools were adapted for team development to run on small, project-specific clusters by d...
Patent
Full-text available
A finite domain approximation for symbolic terms of a symbolic state is derived, given some finite domains for basic terms of the symbolic state. A method is executed recursively for symbolic sub-terms of a symbolic term, providing a domain over-approximation that can then be provided to a solver for determining a more accurate domain. The method c...
Article
The scale and speed of today's software development efforts impose unprecedented constraints on the pace and quality of decisions made during planning, implementation, and postrelease maintenance and support for software. Decisions during the planning process include level of staffing and choosing a development model given the scope of a project an...
Patent
Full-text available
A design space exploration (DSE) system automatically discovers viable solutions within a design space. The DSE system operates by creating or receiving a design specification that is described using a design language. The design specification contains a collection of constraints that an acceptable architecture is expected to satisfy. The DSE syste...
Patent
Full-text available
A state component saves a present state of a program or model. This state component can be invoked by the program or model itself, thereby making state a first-class citizen. As the state of the program evolves from the saved state, the saved state remains for reflection and recall, for example, for testing, verification, transaction processing, et...
Patent
Full-text available
An extension of symbolic execution for programs involving contracts with quantifiers over large and potentially unbounded domains is described. Symbolic execution is used to generate, from a program, concrete test cases that exhibit mismatches between the program code and its contracts with quantifiers. Quantifiers are instantiated using symbolic v...
Conference Paper
Full-text available
Fine-grained data parallelism is increasingly common in mainstream processors in the form of longer vectors and on-chip GPUs. This paper develops support for exploiting such data parallelism for a class of non-numeric, non-graphic applications, which perform computations while traversing many independent, irregular data structures. While the traver...
Chapter
FORMULA 2.0 is a novel formal specification language based on open-world logic programs and behavioral types. Its goals are (1) succinct specifications of domain-specific abstractions and compilers, (2) efficient reasoning and compilation of input programs, (3) diverse synthesis and fast verification. We take a unique approach towards achieving the...
Conference Paper
Declarative specification languages with constraints are used in model-driven engineering to specify formal semantics, define model transformations, and describe domain constraints. While these languages support concise specifications, they are nevertheless prone to difficult semantic errors. In this paper we present a type-theoretic approach to th...
Conference Paper
Modern program analysis and model-based tools are increasingly complex and multi-faceted software systems. They analyze models and programs using advanced type systems, model checking or model finding, abstract interpretation, symbolic verification or a combination thereof. In this talk I will discuss and compare 10 program analysis tools, which MS...
Conference Paper
Full-text available
Fine-grain data parallelism is increasingly common in mainstream processors in the form of long vectors and on-chip GPUs. This paper develops compiler and runtime support to exploit such data parallelism for non-numeric, non-graphic, irregular parallel tasks that perform simple computations while traversing many independent, irregular data structur...
Article
Automated code analysis is technology aimed at locating, describing and repairing areas of weakness in code. Code weaknesses range from security vulnerabilities, logic errors, concurrency violations, to improper resource usage, violations of architectures or coding guidelines. Common to all code analysis techniques is that they build abstractions o...
Article
This paper describes the ongoing development of ATTENTION, a new kind of clinical decision support system for synthesizing and managing longitudinal treatment plans, such as cancer treatment plans. ATTENTION combines state-of-the-art formal modeling and constraint solving with clinical information systems to synthesize complex cancer treatment plan...
Conference Paper
This paper studies the design of specification languages through their model theory. We show how language constructs and specification idioms are deeply rooted in the underlying model theory. We also show that some problems are fundamentally difficult to specify due to the underlying foundation of the language. The languages we study are Alloy, Mau...
Conference Paper
Full-text available
The practices of industrial and academic data mining are very different. These differences have significant implications for (a) how we manage industrial data mining projects; (b) the direction of academic studies in data mining; and (c) training programs for engineers who seek to use data miners in an industrial setting.
Conference Paper
Software is a designed artifact. In other design disciplines, such as building architecture, there is a well-established tradition of design studies which inform not only the discipline itself but also tool design, processes, and collaborative work. ...
Article
Full-text available
Spec# is a programming system that facilitates the development of correct software. The Spec# language extends C# with contracts that allow programmers to express their design intent in the code. The Spec# tool suite consists of a compiler that emits run-time checks for contracts, a static program verifier that attempts to mathematically prove...
Article
The computer industry is experiencing a major shift: improved single processor performance via higher clock rates has reached its technical limits due to overheating. Fortunately, Moore's law still holds, so chip makers use transistors to boost performance through parallelism in multicore and manycore processors. However, exploiting the full potent...
Conference Paper
Full-text available
Regular types represent sets of structured data, and have been used in logic programming (LP) for verification. However, first-class regular type systems are uncommon in LP languages. In this paper we present a new approach to regular types, based on type canonization, aimed at providing a practical first-class regular type system.
Article
Full-text available
The common conception of a (client-side) web application is some collection of HTML, CSS and JavaScript (JS) that is hosted within a web browser and that interacts with the user in some non-trivial ways. The common conception of a web browser is a monolithic program that can render HTML, execute JS, and gives the user a portal to navigate the web....
Article
Full-text available
The adoption of data parallel primitives to increase multicore utilization presents an opportunity for aggressive compiler optimization. We examine computations over the tree abstract datatype (ADT) in particular. For better utilization than approaches like flattening, we argue that transformations should specialize for common data and computation...
Conference Paper
Full-text available
Developer testing is a type of testing where developers test their code as they write it, as opposed to testing done by a separate quality assurance organization. Developer testing has been widely recognized as an important and valuable means of improving software reliability, as it exposes faults early in the software development life cycle. Effec...
Article
Tracing just-in-time compilers (TJITs) determine frequently executed traces (hot paths and loops) in running programs and focus their optimization effort by emitting optimized machine code specialized to these traces. Prior work has established this strategy to be especially beneficial for dynamic languages such as JavaScript, where the TJIT interf...
Conference Paper
Full-text available
Although much progress has been made in software verification, software testing remains by far the most widely used technique for improving software reliability. Among various types of testing, developer testing is a type of testing where developers test their code as they write it, as opposed to testing done by a separate quality assurance organiz...
Conference Paper
Full-text available
Tracing just-in-time compilers (TJITs) determine frequently executed traces (hot paths and loops) in running programs and focus their optimization effort by emitting optimized machine code specialized to these traces. Prior work has established this strategy to be especially beneficial for dynamic languages such as JavaScript, where the TJIT inte...
Article
Full-text available
This tutorial provides basic information about developing specifications and annotations for concurrent C programs, so that they can be verified with VCC. [TODO: add more]
Conference Paper
Full-text available
We describe a practical method for reasoning about realistic concurrent programs. Our method allows global two-state invariants that restrict update of shared state. We provide simple, sufficient conditions for checking those global invariants modularly. The method has been implemented in VCC 3, an automatic, sound, modular verifier for concurrent...
Conference Paper
Full-text available
This paper introduces matching logic, a novel framework for defining axiomatic semantics for programming languages, inspired from operational semantics. Matching logic specifications are particular first-order formulae with constrained algebraic structure, called patterns. Program configurations satisfy patterns iff they match their algebraic struc...
Article
Full-text available
Framing in the presence of data abstraction is a challenging and important problem in the verification of object-oriented programs (Leavens et al., Formal Aspects Comput (FACS) 19:159–189, 2007). The dynamic frames approach is a promising solution to this problem. However, the approach is formalized in the context of an idealized logical framework....
Conference Paper
Full-text available
Design space exploration (DSE) refers to the activity of exploring design alternatives prior to implementation. The power to operate on the space of potential design candidates renders DSE useful for many engineering tasks, including rapid prototyping, optimization, and system integration. The main challenge in DSE arises from the sheer size of the...
Conference Paper
Full-text available
Model transformations are indispensable to model-based development (MBD) where they act as translators between domain-specific languages (DSLs). As a result, transformations must be verified to ensure they behave as desired. Simultaneously, transformations may be reused as requirements evolve. In this paper we present novel algorithms to determine...
Article
Full-text available
Automatic parallelization of modern object-oriented languages, like Java, C#, Python or JavaScript, is considered to be a grand challenge. But what is the challenge exactly? Let us simplify the discussion by focusing on loop parallelization only. As usual, loop parallelization requires answering two questions: (1) is it worthwhile to parallelize a l...
Conference Paper
Full-text available
Test coverage such as branch coverage is commonly measured to assess the sufficiency of test inputs. To reduce tedious manual efforts in generating high-covering test inputs, various automated techniques have been proposed. Some recent effective techniques include Dynamic Symbolic Execution (DSE) based on path exploration. However, these existing...
Conference Paper
Full-text available
The Task Parallel Library (TPL) is a library for .NET that makes it easy to take advantage of potential parallelism in a program. The library relies heavily on generics and delegate expressions to provide custom control structures expressing structured parallelism such as map-reduce in user programs. The library implementation is built around the n...
Article
Full-text available
We present a methodology for automated modular verification of C programs against specifications written in separation logic. Main features of our approach are a faithful representation of the C memory model and use of a SMT solver behind the separation logic prover. The methodology has been implemented in a prototype tool and used to automatically...
Article
Full-text available
Verification for OO programs typically starts from a strongly typed object model in which distinct objects/fields are guaranteed not to overlap. This model simplifies verification by eliminating all “uninteresting” aliasing and allowing the use of more efficient frame axioms. Unfortunately, this model is unsound and incomplete for languages like C,...
Conference Paper
Full-text available
C is the most widely used imperative systems implementation language. While C provides types and high-level abstractions, its design goal has been to provide highest performance which often requires low-level access to memory. As a consequence C supports arbitrary pointer arithmetic, casting, and explicit allocation and deallocation. These operati...
Conference Paper
Full-text available
VCC is an industrial-strength verification environment for low-level concurrent system code written in C. VCC takes a program (annotated with function contracts, state assertions, and type invariants) and attempts to prove the correctness of these annotations. It includes tools for monitoring proof attempts and constructing partial counterexample...
Conference Paper
Full-text available
Non-functional requirements encompass important design concerns such as schedulability, security, and communication constraints. In model-based development they non-locally impact admissible platform-mappings and design spaces. In this paper we present a novel and formal approach for specifying non-functional requirements as constraint-systems ov...
Conference Paper
Full-text available
The quest for modular concurrency reasoning has led to recent proposals that extend program assertions to include not just knowledge about the state, but rights to access the state. We argue that these rights are really just sugar for knowledge that certain updates preserve certain invariants.
Conference Paper
Full-text available
Dynamic symbolic execution is a structural testing technique that systematically explores feasible paths of the program under test by running the program with different test inputs to improve code coverage. To address the space-explosion issue in path exploration, we propose a novel approach called Fitnex, a search strategy that uses state-dependen...
Conference Paper
Full-text available
Unit testing is a technique of testing a single unit of a program in isolation. The testability of the unit under test can be reduced when the unit interacts with its environment. The construction of high-covering unit tests and their execution require appropriate interactions with the environment such as a file system or database. To help set up t...
Article
Full-text available
Boogie is a verification condition generator for an imperative core language. It has front-ends for the programming languages C# and C enriched by annotations in first-order logic, i.e. pre- and postconditions, assertions, and loop invariants. Moreover, concepts like ghost fields, ghost variables, ghost code and specification functions have been in...
Article
Full-text available
Abstract State Machines (ASMs) allow modeling system behaviors at any desired level of abstraction, including a level with rich data types, such as sets, sequences, maps, and user-defined data types. The availability of high-level data types allows state elements to be represented both abstractly and faithfully at the same time. In this paper we loo...
Conference Paper
Full-text available
Most system level software is written in C and executed concurrently. Because such software is often critical for system reliability, it is an ideal target for formal verification. Annotated C and the Verified C Compiler (VCC) form the first modular sound verification methodology for concurrent C that scales to real-world production code. VCC is in...
Conference Paper
Full-text available
Regression test generation aims at generating a test suite that can detect behavioral differences between the original and the modified versions of a program. Regression test generation can be automated by using dynamic symbolic execution (DSE), a state-of-the-art test generation technique, to generate a test suite achieving high structural coverag...
Conference Paper
Full-text available
An objective of unit testing is to achieve high structural coverage of the code under test. Achieving high structural coverage of object-oriented code requires desirable method-call sequences that create and mutate objects. These sequences help generate target object states such as argument or receiver object states (in short as target states...
Article
Full-text available
Recently parameterized unit testing has emerged as a promising and effective methodology to allow the separation of (1) specifying external, black-box behavior (e.g., assumptions and assertions) by developers and (2) generating and selecting internal, white-box test inputs (i.e., high-code-covering test inputs) by tools. A parameterized unit tes...
Article
Full-text available
In this talk I will report on two operating system (OS) efforts at Microsoft Research: Singularity [1, 2] and Service OS [3, 4]. Singularity focuses on the construction of dependable multi-user operating systems through innovation in the areas of systems, languages, and tools. One of Singularity's major innovations is for example that Singularity u...
Article
Full-text available
We are interested in object-oriented programming methodologies that enable static verification of object-invariants. Reasoning soundly and effectively about the consistency of objects is still one of the main stumbling blocks to pushing object-oriented program verification into the mainstream. In this paper we explore a simple model of invariants...
Article
Full-text available
Reasoning about multithreaded object-oriented programs is difficult, due to the nonlocal nature of object aliasing and data races. We propose a programming regime (or programming model) that rules out data races, and enables local reasoning in the presence of object aliasing and concurrency. Our programming model builds on the multithreading and sy...
Article
Full-text available
During the last 10 years, code inspection for standard programming errors has largely been automated with static code analysis. During the next 10 years, we expect to see similar progress in automating testing, and specifically test generation, thanks to advances in program analysis, efficient constraint solvers, and powerful computers. Three new t...
Conference Paper
Full-text available
One of the most challenging problems in deductive program verification is to find inductive program invariants typically expressed using quantifiers. With strong-enough invariants, existing provers can often prove that a program satisfies its specification. However, provers by themselves do not find such invariants. We propose to automatically gene...
Conference Paper
Full-text available
Designing and interoperability testing of distributed, application-level network protocols is complex. Windows, for example, currently supports more than 200 protocols, ranging from simple protocols for email exchange to complex ones for distributed file replication or real time communication. To fight this increasing complexity problem, we int...
Conference Paper
Full-text available
Model generation is an important formal technique for finding interesting instances of computationally hard problems. In this paper we study model generation over Horn logic under the closed world assumption extended with stratified negation. We provide a novel three-stage algorithm that solves this problem: First, we reduce the relevant Horn clau...
Conference Paper
Full-text available
Data-centric business applications comprise an important class of distributed systems that includes on-line stores, document management systems, and patient portals. However, their complexity makes it difficult to design and implement them. We address these issues from a model-driven perspective by developing a formal, compositional, and domain-spec...
Conference Paper
Full-text available
Data abstraction is crucial in the construction of modular programs, since it ensures that internal changes in one module do not propagate to other modules. In object-oriented programs, classes typically enforce data abstraction by providing access to their internal state only through methods. By using method calls in method contracts, data abstrac...
Conference Paper
SSEAT 2008 is a workshop that focuses on the recent research approaches to automated testing using state-space exploration techniques. The goal of the workshop is to bring together researchers from both industry and academia to informally discuss the latest successes and remaining challenges in this domain. One important aspect of the workshop is t...
Conference Paper
Full-text available
Testing is one of the costliest aspects of commercial software development. Model-based testing is a promising approach addressing these deficits. At Microsoft, model-based testing technology developed by the Foundations of Software Engineering group in Microsoft Research has been used since 2003. The second generation of this tool set, Spec Expl...
Article
Full-text available
Creating ultra-large-scale systems requires technological advances across the board [13]. The challenge is so grand that emerging technologies address small subproblems, such as: providing a service orchestration layer, guaranteeing quality of service (QoS), and facilitating decentralized discovery. Engineers wishing to implement a complete syst...
Article
Full-text available
A unit test for an object-oriented program involves a sequence of method calls, which create, mutate, or observe objects. Some method arguments may have primitive types. Recent advances in symbolic execution enable the effective generation of relevant primitive values, given a fixed sequence of method calls. However, there exists a challenge in te...
Article
Full-text available
Cloud applications are web-based distributed systems deployed over a fluctuating set of computing nodes and services. The design of cloud applications is particularly challenging because few assumptions can be made about the connectivity of nodes, the availability of services, as well as how the computing fabric will evolve in the long term. In t...
Article
Full-text available
In a traditional approach to program verification, the correctness of each procedure of a given program is encoded as a logical formula called the verification condition. It is then up to a theorem prover, like an automatic SMT solver, to analyze the verification condition in the attempt to either establish the validity of the formula (thus pro...