Mark Marron
Microsoft · Research in Software Engineering (RiSE)

About

38 Publications
2,492 Reads
771 Citations

Publications (38)
Chapter
The financial technology sector is undergoing a transformation in moving to open-source and collaborative approaches as it works to address increasing compliance and assurance needs in its software stacks. Programming languages and validation technologies are a foundational part of this change. Based on this viewpoint, a consortium of leaders from...
Preprint
Strings are ubiquitous in code. Not all strings are created equal; some contain structure that makes them incompatible with other strings. CSS units are an obvious example. Worse, type checkers cannot see this structure: this is the latent structure problem. We introduce SafeStrings to solve this problem and expose latent structure in strings. Once...
Conference Paper
Logging is a fundamental part of the software development and deployment lifecycle but logging support is often provided as an afterthought via limited library APIs or third-party modules. Given the critical nature of logging in modern cloud, mobile, and IoT development workflows, the unique needs of the APIs involved, and the opportunities for opt...
Article
To write code, developers stitch together patterns, like API protocols or data structure traversals. Discovering these patterns can identify inconsistencies in code or opportunities to replace these patterns with an API or a language construct. We present coiling, a technique for automatically mining code for semantic idioms: surprisingly probable,...
Conference Paper
Time-traveling in the execution history of a program during debugging enables a developer to precisely track and understand the sequence of statements and program values leading to an error. To provide this functionality to real-world developers, we embarked on a two-year journey to create a production-quality time-traveling debugger in Microsoft's...
Conference Paper
Interacting with computers is a ubiquitous activity for millions of people. Repetitive or specialized tasks often require creation of small, often one-off, programs. End-users struggle with learning and using the myriad of domain-specific languages (DSLs) to effectively accomplish these tasks. We present a general framework for constructing program...
Article
Programming by Examples (PBE) has the potential to revolutionize end-user programming by enabling end users, most of whom are non-programmers, to create small scripts for automating repetitive tasks. However, examples, though often easy to provide, are an ambiguous specification of the user's intent. Because of that, a key impedance in adoption of...
Article
Full-text available
Interacting with computers is a ubiquitous activity for millions of people. Repetitive or specialized tasks often require creation of small, often one-off, programs. End-users struggle with learning and using the myriad of domain-specific languages (DSLs) to effectively accomplish these tasks. We present a general framework for constructing program...
Article
Developers who set a breakpoint a few statements too late or who are trying to diagnose a subtle bug from a single core dump often wish for a time-traveling debugger. The ability to rewind time to see the exact sequence of statements and program values leading to an error has great intuitive appeal but, due to large time and space overheads, time-t...
Article
Developers who set a breakpoint a few statements too late or who are trying to diagnose a subtle bug from a single core dump often wish for a time-traveling debugger. The ability to rewind time to see the exact sequence of statements and program values leading to an error has great intuitive appeal but, due to large time and space overheads, time t...
Article
Millions of computer end users need to perform tasks over tabular spreadsheet data, yet lack the programming knowledge to do such tasks automatically. This paper describes the design and implementation of a robust natural language based interface to spreadsheet programming. Our methodology involves designing a typed domain-specific language (DSL) t...
Conference Paper
Existing pattern-based compiler technology is unable to effectively exploit the full potential of SIMD architectures. We present a new program synthesis based technique for auto-vectorizing performance critical innermost loops. Our synthesis technique is applicable to a wide range of loops, consistently produces performant SIMD code, and generates...
Conference Paper
Full-text available
A large gap exists between the wide range of admissible heap structures and those that programmers actually build. To understand this gap, we empirically study heap structures and their sharing relations in real-world programs. Our goal is to characterize these heaps. Our study rests on a heap abstraction that uses structural indistinguishability p...
Article
Use-after-free vulnerabilities are rapidly growing in popularity, especially for exploiting web browsers. Use-after-free (and double-free) vulnerabilities are caused by a program operating on a dangling pointer. In this work we propose early detection, a novel runtime approach for finding and diagnosing use-after-free and double-free vulnerabilitie...
Article
Full-text available
The identification, isolation, and correction of program defects require the understanding of both the algorithmic structure of the code as well as the data structures that are being manipulated. While modern development environments provide substantial support for examining the program source code (the algorithmic aspect of the program), they pr...
Conference Paper
The computational cost and precision of a shape style heap analysis is highly dependent on the way method calls are handled. This paper introduces a new approach to analyzing method calls that leverages the fundamental object-oriented programming concepts of encapsulation and invariants. The analysis consists of a novel partial context-sensitivity...
Article
This paper introduces a new hybrid memory analysis, Structural Analysis, which combines an expressive shape analysis style abstract domain with efficient and simple points-to style transfer functions. Using data from empirical studies on the runtime heap structures and the programmatic idioms used in modern object-oriented languages we construct a...
Article
Full-text available
Modern programming environments provide extensive support for inspecting, analyzing, and testing programs based on the algorithmic structure of a program. Unfortunately, support for inspecting and understanding runtime data structures during execution is typically much more limited. This paper provides a general purpose technique for abstracting an...
Article
Software rarely uses all the potential performance available in a modern microprocessor. For example, on an Intel Core 2 class workstation—a microprocessor capable of executing 4 instructions per cycle—the average instructions per cycle for the single-threaded DaCapo benchmark suite is 0.98. In other words, even in the multi-core era, there is stil...
Conference Paper
Tracking subset relations between the contents of containers on the heap is fundamental to modeling the semantics of many common programming idioms such as applying a function to a subset of objects and maintaining multiple views of the same set of objects. We introduce a relation, must reference sets, which subsumes the concept of must-aliasing and en...
Article
Full-text available
This paper introduces a general purpose method, write invariant properties, for improving the precision of heap analysis techniques at a minimal computational cost. This method is specifically focused on eliminating the imprecision introduced when program states from multiple call paths are merged at call sites when using partially call-context...
Article
Full-text available
Modeling the evolution of the state of program memory during program execution is critical to many parallelization techniques. Current memory analysis techniques either provide very accurate information but run prohibitively slowly or produce very conservative results. An approach based on abstract interpretation is presented for analyzing programs...
Article
Precise modeling of the structure of the heap and how objects are shared between various arrays or data structures is fundamental to understanding the behavior of a program. This paper introduces a novel higher order relation, reference set dominance, which subsumes the concept of aliasing and enables existing shape analysis techniques to, effici...
Conference Paper
Full-text available
This paper introduces a novel set of heuristics for identifying logically related sections of the heap such as recursive data structures, objects that are part of the same multi-component structure, and related groups of objects stored in the same collection/array. When combined with lifetime properties of these structures, this informatio...
Conference Paper
Dependence information between program values is extensively used in many program optimization techniques. The ability to identify statements, calls and loop iterations that do not depend on each other enables many transformations which increase the instruction and thread-level parallelism in a program. When program variables contain complex data s...
Conference Paper
The performance of heap analysis techniques has a significant impact on their utility in an optimizing compiler. Most shape analysis techniques perform interprocedural dataflow analysis in a context-sensitive manner, which can result in analyzing each procedure body many times (causing significant increases in runtime even if the analysis results ar...
Conference Paper
Full-text available
Precise modeling of the program heap is fundamental for understanding the behavior of a program, and is thus of significant interest for many optimization applications. One of the fundamental properties of the heap that can be used in a range of optimization techniques is the sharing relationships between the elements in an array or collection. If...
Article
A number of papers have used predicate languages over sets of abstract locations to model the heap (decorating a heap graph with the predicates, or in conjunction with an access path abstraction). In this work we introduce a new predicate, dominance, which is a generalization of aliasing and is used to model how objects are shared in the heap (e.g....
Conference Paper
Full-text available
Memory analysis techniques have become sophisticated enough to model, with a high degree of accuracy, the manipulation of simple memory structures (finite structures, single/double linked lists and trees). However, modern programming languages provide extensive library support including a wide range of generic collection objects that make use of...
Conference Paper
Full-text available
Modeling the evolution of the state of program memory during program execution is critical to many parallelization techniques. Current memory analysis techniques either provide very accurate information but run prohibitively slowly or produce very conservative results. An approach based on abstract interpretation is presented for analyzing programs...
Article
Full-text available
The ability to accurately model the state of program memory and how it evolves during program execution is critical to many optimization and verification techniques. Current memory analysis techniques either provide very accurate information but run prohibitively slowly or run in an acceptable time but produce very conservative results. This paper...
Conference Paper
Full-text available
As more and more genomes are sequenced, evolutionary biologists are becoming increasingly interested in evolution at the level of whole genomes, in scenarios in which the genome evolves through insertions, duplications, deletions, and movements of genes along its chromosomes. In the mathematical model pioneered by Sankoff and others, a unichrom...
Article
As more and more genomes are sequenced, evolutionary biologists are becoming increasingly interested in evolution at the level of whole genomes, in scenarios in which the genome evolves through insertions, deletions, and movements of genes along its chromosomes. In the mathematical model pioneered by Sankoff and others, a unichromosomal genome is r...
Article
This whitepaper describes the basics of using and interpreting the results of the MTSA analysis tool. It also contains the results of several detailed case studies describing the results produced by the analysis for a number of interesting benchmarks, some of the features of the benchmarks that make them particularly interesting problems...
Article
Memory analysis techniques have become sophisticated enough to model, with a high degree of accuracy, the manipulation of simple memory structures (finite structures, single/double linked lists and trees). However, modern programming languages provide extensive library support including a wide range of generic collection objects that make use...
Article
When analyzing a program via an abstract interpretation (dataflow analysis) framework we would like to examine the program in a context-sensitive interprocedural manner. Analyzing the entire program in a manner that precisely considers interprocedural flow can lead to much more accurate results than local or context insensitive analyses (particul...