
Susan Sim
- PhD
- Lecturer at University of Toronto
About
103
Publications
79,551
Reads
2,621
Citations
Introduction
Current institution
Additional affiliations
September 2012 - present
September 2003 - February 2012
Publications (103)
Background:
Millions of traumatized refugees worldwide have resettled in the United States. For one of the largest, the Cambodian community, having their mental health needs met has been a continuing challenge. A multicomponent health information technology screening tool was designed to aid provider recognition and treatment of major depressive d...
Asian Americans are understudied in health research and often aggregated into one homogenous group, thereby disguising disparities across subgroups. Cambodian Americans, one of the largest refugee communities in the United States, may be at high risk for adverse health outcomes. This study compares the health status and healthcare experiences of Ca...
The prevalence rate of depression in primary care is high. Primary care providers serve as the initial point of contact for the majority of patients with depression, yet approximately 50% of cases remain unrecognized. The under-diagnosis of depression may be further exacerbated in limited English-language proficient (LEP) populations. Language barrie...
In this introductory chapter, we map out code retrieval on the web as a research area and the organization of this book. Code retrieval on the web is concerned with the algorithms, systems, and tools to allow programmers to search for source code on the web and the empirical studies of these inventions and practices. It is a label that we apply t...
Developers use the Web as a tool to find information to help them solve their software development problems. However, little was known about what kinds of problems motivate developers to do searches on the Web. We observed 24 developers at 3 software companies. In our analysis, we found that there are four main kinds of problems. When Remembering,...
It has become common practice for developers to search the Web for source code. In this paper, we report on our analysis of a laboratory experiment with 24 subjects. They were given a programming scenario and asked to find source code using five different search engines. The scenarios varied in terms of size of search target (block or subsystem) an...
In recent years, searching for source code on the web has become increasingly common among professional software developers and is emerging as an area of academic research. This volume surveys past research and presents the state of the art in the area of "code retrieval on the web." This work is concerned with the algorithms, systems, and tools to...
Collaborative brainstorming does not always result in more ideas or higher quality ideas than working individually. We designed a system with game elements to incent participation in a collaborative creative idea generation process of brainstorming followed by a convergence activity. We compared teams using the system with and without game elemen...
Intellectual property law affects anyone who is engaged in source reuse and remix. In this chapter, we use four thought experiments to explain and discuss aspects of patents, licensing, and current issues in IP law. These thought experiments are largely based on scenarios taken from our own empirical research and from contemporary events. Along wit...
Programmers often look for a snippet, that is, a small piece of example code, to remind themselves of how to solve a problem or to quickly learn about a new resource. However, existing tools such as general-purpose search engines and code-specific search engines do not deal well with searches for snippets. In this chapter, we present a prototype se...
We present the results of our field study that describe how requirements knowledge was shared at an industrial software company using agile software practices. As is common in agile processes, the team did not capture requirements knowledge in a comprehensive specification document. Instead, requirements knowledge was captured in user stories, auto...
This special section contains three papers that are substantially revised and extended from the versions that appeared in the conference proceedings of the 19th IEEE International Conference on Program Comprehension (ICPC 2011). In this short introduction, we sketch what program comprehension is, summarize the aims of ICPC, report briefly on the co...
Every method for developing software is a prescriptive model. Applying a deconstructionist analysis to methods reveals that there are two texts, or sets of assumptions and ideals: a set that is privileged by the method and a second set that is left out, or marginalized by the method. We apply this analytical lens to software reuse, a technique in s...
Software developers search the Web for various kinds of source code for diverse reasons. In a previous study, we found that searches varied along two dimensions: the size of the search target (e.g., block, subsystem, or system) and the motivation for the search (e.g., reference example or as-is reuse). Would each of these kinds of searches require...
When analyzing data elicited using the “war stories” technique, previously introduced by Lutters and Seaman (Inf Softw Technol 49(6):576–587, 2007), we encountered unexpected challenges in applying standard qualitative analysis techniques. After reviewing the literature on stories and storytelling, we realized that a richer analysis would be possib...
Software developers frequently search for source code on the Web to solve problems. Their ability to correctly evaluate the matches returned by a source code search engine is key to the success of the search, and in turn the project. We conducted a laboratory experiment to gain understanding on the kinds of information used and their usefulness dur...
Developers use the Web as a tool to find information to help them solve their software development problems. However, little is known about what kinds of problems motivate developers to do searches on the Web. We asked twenty-five developers to record their Web searches at a medium-sized software company. We also observed twelve developers. In our...
Research on feature location that applies information retrieval techniques has experimented with the kinds of inputs to the corpus and the algorithms that could be used. At first, only source code was used. Later, extraction techniques were improved, and data from other software tools and analyses were used to expand or augment the repository. But, does...
In this paper, we present a view of design methods as discourse on practice. We consider how the deployment of a particular set of design methods enables and constrains not only practical action but also discursive action within the design practice. A case study of agile software development methods illustrates the ways that methods establish condi...
Looking for source code on the Web is a common practice among software developers. Previous research has shown that developers use social cues over technical cues to evaluate source code candidates. However, current source code search engines do not take full advantage of social information. We present a prototype, an extension of Sourcerer, that s...
In this paper, we present a new construct, called Transitive Changeset, that can be used for feature location. Transitive Changesets are created by extending changesets from revision control systems with additional information. A changeset temporally associates changes and conceptual descriptions provided in a commit transaction. By following transi...
In software development, there is an interplay between Software Process models and Software Process enactments. The former tends to be abstract descriptions or plans. The latter tends to be specific instantiations of some ideal procedure. In this paper, we examine the role of work artifacts and conversations in negotiating between prescriptions fro...
We present the results of our field study that describe how requirements validation was performed at an industrial software company using agile software practices. As is common in agile processes, the team did not capture requirements knowledge in a comprehensive specification document. Instead, requirements knowledge was captured in user stories,...
Agile software development involves continuously making iterative and incremental changes to source code. When making changes, developers quickly focus on parts of code that they consider to be important, and sometimes miss other relevant parts. Therefore, tool support is needed to help developers locate conceptually related sections of code. In th...
Much research in traceability has focused on following requirements and features over the early phases of the software lifecycle. There has been comparatively little work on traceability into later phases and artifacts. In this paper, we tackle the problem of traceability across artifacts, including documents and source code, and maintaining tracea...
This paper presents an idea of using a structure traversal graph (STG) to characterize whether program comprehension is progressing smoothly. Inspired by electrocardiograms that are used to measure heart rhythms, STGs are an attempt to depict the rhythm of program navigation. STGs are created by abstracting navigation between files to the level of...
On any project, it is not possible to have complete and accurate concern maps for all possible tasks. We present an approach to create concern maps from available secondary software work artifacts produced by common software tools, such as revision control. We mine and index concern fragments from repositories of those tools. Developers can search...
Internet-scale code search is the problem of finding source on the Internet. Developers are typically searching for code to reuse as-is on a project or as a reference example. This phenomenon has emerged due to the increasing availability and quality of open source and resources on the Web. Solutions to this problem will involve more than the simpl...
Science and technology studies (STS) is a discipline concerned with examining how social and technological worlds shape each other. In this paper, we argue that STS can be used to study the work of software development as a complex, interacting system of people, organizations, culture, practices, and technology, or in STS terms, an assemblage. We i...
Reusing software through copying and pasting is a continuous plague in software development despite the fact that it creates serious maintenance problems. Various techniques have been proposed to find duplicated redundant code (also known as software ...
Software developers often need to understand a large body of unfamiliar code with little or no documentation, no experts to consult, and little time to do it. A post appeared in January 2008 on Slashdot, a technology news Web site, asking for tools and techniques that could help. This article analyzes 301 often passionate and sometimes articulate r...
Programmers often search for Open Source code to use in their projects. To understand how and why programmers search for source code, we conducted a web-based survey and collected data from 69 respondents, including 58 specific examples of searches. Analyzing these anecdotes, we found that they could be categorized along two orthogonal dimensions:...
Requirements engineers with many years of experience have a distinct perspective on the field. To sample this knowledge, we interviewed 34 requirements researchers and practitioners, each with up to 42 years of experience. We used open-ended, structured interviews in which we asked them to reflect on their experiences and professional development a...
Information technology (IT) is often promoted as a socially and culturally agnostic tool that will allow emerging economies to leap into the digital age and reap the wealth that accompanies it. But in addition to the programming language, software tools, and books, know-how is needed to turn bright ideas into innovative, marketable solutions. This...
This paper argues for the application of authorship analysis to technology design. It extends techniques used in Science Studies to investigate scientific authorship in order to define a concept of technical authorship. To illustrate the potential of this approach, authorship analysis is applied to particular prescriptive software design methodolog...
The CHASE 2008 workshop is concerned with exploring the cooperative and human aspects of software engineering, and providing a forum for discussing high-quality research. Accepted papers reflect the diversity of the field of software engineering, ranging from requirements to testing, and from ethnographic research to experiments. Moreover, the backgro...
Software engineering is an intensely people-oriented activity, yet little is known about how software engineers perform their work. In order to improve software engineering tools and practice, it is therefore essential to conduct field studies, i.e., to study real practitioners as they solve real problems. To aid this goal, we describe a series of...
It has become common practice among developers to construct new software systems by reusing existing source code, components, and libraries. However, examining only the artifacts is insufficient to understand the practices of sharing that have emerged around software mashups. Combining information studies with techniques from the human, and social...
This paper presents the results of an empirical study aimed at examining the extent to which software engineers follow a software process and the extent to which they improvise during the process. Our subjects tended to classify processes into two groups. In the first group are the processes that are formal, strict, and well-documented. In the seco...
We studied the clarity of three requirements forms, operationalized as ease of problem detection, freedom from obstructions to understanding, and understandability by a variety of stakeholders. A set of use cases for an industrial system was translated into ScenarioML scenarios and into sequence diagrams; problems identified during each translation...
Agile teams commonly use User Stories, conversations with On-Site Customers, and Test Cases to gather requirements. Some Agile teams like to add other artifacts, such as Use Cases to provide more detail to the Agile Requirements. This paper presents the results of a controlled experiment aimed to learn whether Use Cases could help Agile Requirement...
A number of approaches for spanning the requirements-architecture gap have been published in recent years, and we sought to rigorously characterize the gap and to conduct a comparative evaluation of approaches to span the gap using a case study method on a realistic problem. However, our intentions were impeded by the problem of finding appropria...
The design of empirical experiments involves making design decisions to trade off what is ideal against what is achievable. Researchers must weigh limitations on resources, metrics, and the current state of knowledge, against the validity of the results. In this paper, we report on the design decisions we made in a small controlled experiment and t...
Large distributed networked software systems are built to provide competing qualities such as reliability, availability, security, performance and scalability to their clients. In certain situations these qualities must be traded off, sacrificing some qualities to some extent to improve others. This paper presents TOMCAD (a Tradeoff a Model with Ca...
The success of a task relies on the expertise of its owner to a great degree. But what is the anticipated expertise? In what areas? Our research intends to investigate the skill sets anticipated of requirements engineers. This paper presents a model of the varying levels of RE expertise within the context of the main RE domains. Identifying the lay...
Case studies are an empirical method with established design principles for conducting scientific investigations. The topic of this half-day tutorial was to give an introduction to case studies as an empirical research method. Our goal is to bring attention to this method as an option in the pantheon of empirical methods.
The field of requirements engineering (RE) is rich with a myriad of technologies that aim to support activities within the requirements engineering process. These range from elicitation to management of RE work products. As such, the range and nature of these technologies are broad; this makes an evaluation of their effectiveness challenging. This...
GXL (Graph eXchange Language) is an XML-based standard exchange format for sharing data between tools. Formally, GXL represents typed, attributed, directed, ordered graphs which are extended to represent hypergraphs and hierarchical graphs. This flexible data model can be used for object-relational data and a wide variety of graphs. An advantage of...
Agile processes enable software development projects to react to rapid changes in the development environment. However, they are often criticized for not creating and maintaining standard documentation such as requirements and design documentation. The lack of documentation can be detrimental for maintaining knowledge, especially in the long run, b...
As software systems and networks continue to evolve, so do threats to their security. Unfortunately, most security issues come to light only after completion of the system because security is often managed in an ad hoc fashion late in the software lifecycle. There are many advantages to incorporating security specification into the requirements pha...
Expertise is the consistently superior performance on a set of tasks in some area of human activity. Software engineering expertise is difficult to define, and characterize empirically. In this paper, we present and evaluate three candidate criteria for assessing software engineering expertise. These three criteria are: experience; characteristics...
Clarity is underappreciated as a requirements specification quality attribute. We studied the clarity of requirements forms, operationalized as ease of problem detection, least obstructive to understanding, and understandability by stakeholders. A set of use cases for an industrial system was translated into sequence diagrams and ScenarioML; pr...
Software engineering is an intensely people-oriented activity, yet too little is known about how designers, maintainers, requirements analysts and all other types of software engineers perform their work. In order to improve software engineering tools and practice, it is therefore essential to conduct field studies, i.e., to study real practitioner...
This paper presents an analysis of software architecture as social artifact, that is, something that software developers talk about and use in their work. This analysis is historical in nature, relying on interviews with software developers with experience spanning four decades and the software engineering literature. We found that 1) only large te...
The topic of this paper was the correct use and interpretation of case studies as an empirical research method. Using an equal blend of lecture and discussion, it gave a foundation for conducting, reviewing, and reading case studies. There were lessons for software engineers as researchers who conduct and report case studies, reviewers who evaluate...
Requirements Engineering (RE) research is believed to be mature enough for the community to be able to make comparative evaluations of alternative tools, techniques, approaches and methods. Commonly used exemplars in RE that have emerged over the years all suffer from a lack of well-defined and widely accepted evaluation criteria, which makes comparison of th...
This paper is triggered by a concern for the methodological soundness of research papers in RE. We propose a number of criteria for methodological soundness, and apply these to a random sample of 37 submissions to the RE'03 conference. From this application, we draw a number of conclusions that we claim are valid for a larger sample than just these...
Benchmarks have been used in computer science to compare the performance of computer systems, information retrieval algorithms, databases, and many other technologies. The creation and widespread use of a benchmark within a research area is frequently accompanied by rapid technical progress and community building. These observations have led us to...
Developing a standard schema at the abstract syntax tree level for C/C++ to be used by reverse engineering and reengineering tools is a complex and difficult problem. In this paper, we present a catalogue of issues that need to be considered in order to design a solution. Three categories of issues are discussed. Lexical structure is the first cate...
The purpose of this paper is to propose a benchmark for comparing fact extractors for Web sites and to invite interested researchers and practitioners to participate in its development. Fact extraction is a fundamental and difficult problem in both traditional software reverse engineering and Web site reverse engineering. In both domains, there are...
In this position paper I argue that RE practice is the problem analysis part of a design problem, and that this problem analysis part is a knowledge problem in which the requirements engineer tries to build a theory of a problem domain. RE research is a knowledge problem too, in which the researcher tries to build partial theories of a class of dom...
This paper describes a collaborative structured demonstration of reverse engineering tools that was presented at a working session at WCRE 2001 in Stuttgart, Germany. A structured demonstration is a hybrid tool evaluation technique that combines elements from experiments, case studies, technology demonstrations, and benchmarking. The essence of the...
There are many things to be learned about software engineering by working directly with software practitioners in companies. In this paper, we present an overview of some of the techniques for performing such field studies, focusing on such issues as: What kind of problems can be addressed by field studies? What techniques are available for gatheri...
In this paper, we take the concept of benchmarking, as used extensively in computing, and apply it to the evaluation of C++ fact extractors. We demonstrate the efficacy of this approach by developing a prototype benchmark, CppETS 1.0 (C++ Extractor Test Suite, pronounced 'see-pets') and collecting feedback in a workshop setting. The CppETS benchmar...
were extensive discussions on the nature of evidence necessary to construct convincing arguments and the methods required to produce this evidence. Keywords: Software engineering, empirical research. 1. BACKGROUND In recent years there has been an increase in interest in studying software engineering empirically. There are now sever...
Developing a standard schema at the abstract syntax tree (AST) level for C/C++ to be used by reverse engineering and reengineering tools is a complex and difficult problem. In this paper we present a catalogue of issues that need to be considered in order to design a solution. Three categories of issues are discussed. Lexical structure is the first...
A workshop was held at ICSE 2000 in Limerick, Ireland to further efforts in the development of a standard exchange format (SEF) for data extracted from and about source code. WoSEF (Workshop on Standard Exchange Format) brought together people with expertise in a variety of formats, such as RSF, TA, GraX, FAMIX, XML, and XMI, from across the softwa...
Data interchange in the form of a standard exchange format (SEF) is only a first step towards tool interoperability. Inter-tool communication using files is slow and cumbersome; a better approach would be an application program interface, or API, that allowed tools to communicate with each other directly. This paper argues such an API is a logical ne...
At the turn of the millennium, the revolution in information technology has ushered in a new economy. This economy, originated in the United States, and more specifically in the American West Coast, is spreading throughout the world, in an uneven, yet ...
The purpose of the article is to report on a structured demonstration for comparing program comprehension tools. Five teams of program comprehension tool designers applied their tools to a set of maintenance tasks on a common subject system. By applying a variety of reverse engineering techniques to a predefined set of tasks, the tools can be compa...
Data interchange in the form of a standard exchange format (SEF) is only a first step towards tool interoperability. Inter-tool communication using files is slow and cumbersome; a better approach would be an application program interface, or API, that allowed tools to communicate with each other directly. The paper argues that such an API is a logi...
This paper describes a structured tool demonstration, a hybrid evaluation technique that combines elements from experiments, case studies and technology demonstrations. Developers of program understanding tools were invited to bring their tools to a common location to participate in a scenario with a common subject system. Working simultaneously th...
Software repositories are often based on object-oriented or relational databases, usually with extensions to accommodate the special requirements of software. Here, we discuss a software repository based on a structured text retrieval system, which avoids some of the limitations of previous approaches, including language dependence and poor scalabi...
The goal of this workshop is to provide an interactive forum for software engineers and empirical researchers to investigate the feasibility of applying proven methods from other research disciplines to software engineering research. Participants submitted position papers describing problems that might benefit from a multidisciplinary approach. Exp...
Software repositories, used to support program development and maintenance, invariably require an abstract model of the source code. This requirement restricts the repository user to the analyses and queries supported by the data model of the repository. In this work, we present a software repository system based on an existing information retrieva...
Joining a software development team is like moving to a new country to start employment; the immigrant has a lot to learn about the job, the local customs, and sometimes a new language. In an exploratory case study, we interviewed four software immigrants, in order to characterize their naturalization process. Seven patterns in four major categorie...
There are many things to be learned about software engineering by working directly with software practitioners in companies. In this paper, we present an overview of some of the techniques for performing such field studies, focusing on such issues as: What kind of problems can be addressed by field studies? What techniques are available for gathe...
There are many disciplines that have well-defined theoretical foundations and techniques for studying human behaviour. One such technique is ethnography which originates from anthropology and sociology. Ethnography is an inductive, qualitative technique suitable for investigating complex human phenomena in an open-ended manner. Results from these s...
This paper reports on some experiments undertaken using two learners, Naive Bayes and Nearest Neighbour, on three software systems: the C488 compiler used in the CSC 488 undergraduate course, the Linux Operating System Kernel, and an optimizing back-end from IBM. The best results were found with the Nearest Neighbour learner on the Linux kernel and IBM...
this paper: the spring model algorithm, simulated annealing method and the Sugiyama algorithm. Applications of Graph Drawing Algorithms
This paper will focus on these in particular because they are well documented in the literature. In January 1991, Curtis began running LambdaMOO on his personal Sun workstation at Xerox PARC, a MOO which was based on the metaphor of a large rambling house with enough rooms for all the guests who would ever want to visit.
Six programmers with experience spanning four decades were interviewed about their use of architecture, or lack thereof, in developing software systems. The data collected included few surprises and a history of hardware and software. The two biggest factors in the use of design are management endorsement and technological sophistication of problem...
Software architecture visualization tools tend to support browsing, that is, exploration by following concepts. If architectural diagrams are to be used during daily software maintenance tasks, these tools also need to support specific fact-finding through searching. Searching is essential to program comprehension and hypothesis testing. Furthermor...
We have conducted a survey to generate archetypes of source code searching by programmers across maintenance tasks. Using a questionnaire on a web page, we obtained 69 responses from readers of 7 newsgroups. Respondents were asked about their source code searching habits: what tools they used, why they searched, and what they searched for. The four...
Joining a software development team is like moving to a new country to start employment; the immigrant has a lot to learn about the job, the local customs, and sometimes a new language. In an exploratory case study, we interviewed four software immigrants, in order to characterize their naturalization process. Seven patterns in four major categorie...
This report summarizes the presentations and discussion of a half-day workshop held at CASCON98. The theme of the workshop, "Building Software Tools for People," indicates the importance of considering users, their habits, their work environments, and their organizations when constructing software engineering tools. There were five presenters from...
Joining a software development team is like moving to a new country to start employment; the immigrant has a lot to learn about the job, the local customs, and sometimes a new language. In an exploratory case study, we interviewed four software immigrants, in order to characterize their naturalization process. Seven patterns in four major categor...