focus: the state of the practice
How Software Engineers Use Documentation: The State of the Practice
Timothy C. Lethbridge, University of Ottawa
Janice Singer, National Research Council
Andrew Forward, Deloitte Consulting
IEEE Software, November/December 2003. Published by the IEEE Computer Society. 0740-7459/03/$17.00 © 2003 IEEE
Most software documentation is not updated consistently, but out-of-date documentation might remain useful. We must find powerful yet simple documentation strategies and formats that software engineers will likely maintain.
Software engineering is a human task, and as such we must study what software engineers do and think. Understanding the normative practice of software engineering is the first step toward developing realistic solutions to better facilitate the engineering process. We conducted three studies using several data-gathering approaches to elucidate the patterns by which software engineers (SEs) use and update documentation. Our objective is to more accurately comprehend and model documentation use, usefulness, and maintenance, thus enabling better decision making and tool design by developers and project managers. Our studies’ results confirm the widely held belief that SEs typically do not update documentation as timely or completely as software process personnel and managers advocate. However, the results also reveal that out-of-date software documentation remains useful in many circumstances.
The studies
The first study consisted of interviews at 12 corporate sites and one government site [1]. We interviewed participants in pairs to make the situation more comfortable and natural for them. The interview questionnaire (see wwwsel.iit.nrc.ca/~singer/main.html) had three parts: background information, task analysis, and tools wish list. The sidebar lists some of the interviewees’ assessments of documentation.
We conducted the second study at a large telecommunications company [2,3]. It involved software engineers maintaining and enhancing a large, profitable telecommunications system written in a proprietary high-level language. This study had four components. First, using a Web questionnaire, we asked the SEs what they do. Second, we followed a single SE for 14 weeks as he worked. Third, we shadowed nine SEs individually for one hour as they worked. Finally, we obtained companywide tool use statistics.
The final study comprised a 50-question survey that focused on many documentation aspects [4,5]. We solicited survey participants directly from several high-tech companies and by using software engineering email lists.
Of the three studies, only the final one focused exclusively on documentation issues. The first two focused on gaining an overall understanding of the software maintenance process. Even so, they uncovered important information that relates directly to documentation use and usefulness.
Documentation maintenance and usefulness
First, and probably foremost, our studies confirm that SEs often do not maintain documentation. Figure 1 shows survey respondents’ answers to the question, “In your experience, when changes are made to a software system, how long does it take for the supporting documentation to be updated to reflect the changes?” (Note that only 25 of the 48 survey respondents answered this question.) With the exception of testing and quality documentation (such as test cases and plans), SEs rarely update the documents. When SEs do update documents, they usually do so several weeks after changes are made to the code. Forty-four percent of the 45 respondents somewhat agreed and 24 percent strongly agreed with the statement, “Documentation is always outdated relative to the current state of a software system.”
The interview responses mirrored the survey results. SEs were likely to update documentation continually only when it was attached to the process for completing a change request. In other contexts, SEs sometimes would and sometimes wouldn’t update documentation. The documentation’s accuracy seemed to depend largely on its recentness and the amount of change that had occurred in the relevant code sections (with greater source code change corresponding to greater discrepancy between the code and the documentation).
Figure 1. The time between system changes and documentation updates for different documentation types. (Bar chart: percentage of survey respondents, 0 to 60, by document type: requirements, specifications, detailed design, low-level design, architectural, testing or quality. Responses: never, rarely, few months, few weeks, few days.)
Documentation: The Good, the Bad, and the Ugly (sidebar)
During our interviews, software engineers expressed the following general attitudes about documentation.
The good
Architecture and other abstract documentation information is often valid or at least provides historical guidance that can be useful for maintainers.
Inline comments are often good enough to greatly assist detailed maintenance work.
The bad
Documentation of all types is frequently out of date.
Systems often have too much documentation.
Documentation is often poorly written.
Finding useful content in documentation can be so challenging that people might not try to do so.
Much mandated documentation is so time consuming to create that its cost can outweigh its benefits.
The ugly
A considerable fraction of documentation is untrustworthy.
In important contrast to the lack of documentation maintenance is the usefulness of even outdated or inaccurate documentation. Fifty-three percent of the 45 survey respondents somewhat agreed and 28 percent strongly agreed that “Software documentation can be useful, even though it might not always be the most up-to-date.” The survey provided data on the SEs’ perceptions of different document types’ accuracy and the frequency with which they consulted the documentation. Table 1 shows the correlation between the document type’s perceived accuracy and the frequency with which SEs consulted the documentation.
The relationship between accuracy and consultation frequency is strongest for testing and quality documents (0.67) and second strongest for low-level design documents (0.58). To a lesser degree, the accuracy of requirements, architectural design documents, and detailed design documents also correlates with consultation frequency. Specifications have almost no such correlation. The relationships seem to indicate that the closer you get to the real code, the more accurate the documentation must be for SEs to use it.
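To make the kind of figure reported in Table 1 concrete, here is a minimal sketch of how a correlation between perceived accuracy and consultation frequency could be computed for one document type. It assumes a Pearson coefficient, which the article does not specify, and the ratings are invented purely for illustration; they are not the survey data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical 1-5 ratings from six respondents for a single document type.
perceived_accuracy = [2, 3, 3, 4, 5, 5]
consultation_frequency = [1, 3, 2, 4, 4, 5]

r = pearson(perceived_accuracy, consultation_frequency)
print(f"r = {r:.2f}")
```

A value near 1 would mean that respondents who rate a document type as accurate also tend to consult it often; a value near 0, as reported for specifications, would mean the two ratings are essentially unrelated.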
The interviews support the survey findings. SEs were more likely to use and trust documentation that describes a particular feature’s design or the system architecture. The more abstract a piece of documentation, the more likely SEs were to consider it accurate and useful. One software engineer stated, “The documentation is good [for giving] you a high-level understanding of how the feature is really intended to work.”
Documentation applicability
Table 2 shows the tasks for which the survey respondents rated the available software documentation effective or extremely effective. More than one-half of the respondents found the available software documentation effective when learning a new software system, testing a system, or working with a new system. Fifty percent found it effective when other developers are unavailable, and 46 percent when looking for big-picture information. Only approximately one-third of the respondents found the documentation effective for maintaining a system, answering questions about the system, looking for in-depth information, or working with an established system. The results indicate that documentation satisfies particular roles for particular tasks.
Table 1. The correlation between a document type’s perceived accuracy and its consultation frequency
Document type        Correlation
Testing or quality   0.67 (p < .005)
Low-level design     0.58 (p < .005)
Requirements         0.43 (p < .05)
Architectural        0.41 (p < .05)
Detailed design      0.39 (p < .05)
Specifications       0.03
Table 2. Percentage of survey respondents who rated documentation effective or extremely effective for particular tasks
Percent  Task
61       Learning a software system
58       Testing a software system
54       Working with a new software system
50       Solving problems when other developers are unavailable to answer questions
46       Looking for big-picture information about a software system
35       Maintaining a software system
33       Answering questions about a system for management or customers
32       Looking for in-depth information about a software system
32       Working with an established software system
Words versus actions
We must recognize that our survey results reflect how SEs perceive documentation, not necessarily how they actually use it. For example, at the telecommunications company, 6 percent of the respondents said they spent considerable time reading documentation, and 50 percent reported spending considerable time consulting source code. In our observational studies at the same company, however, SEs consulted the documentation only 3 percent of the time: 12 times over 357 logged events. This suggests that documentation use might be even less than reported. The higher perceived use might nonetheless suggest that SEs place considerable value on the documentation.
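The observed share quoted in the "Words versus actions" discussion follows directly from the two counts in the text. The short sketch below reproduces that arithmetic; the helper name `share` is ours. Note that the 6 percent survey figure counts respondents while the observed figure counts logged events, so the two are not measured in the same unit and the comparison is only indicative.

```python
def share(count, total):
    """Fraction of `total` events accounted for by `count`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


# Counts reported in the article for the observational study.
documentation_events = 12
logged_events = 357

observed_share = share(documentation_events, logged_events)
print(f"Observed documentation share: {observed_share:.1%}")
```

The result is roughly 3.4 percent, which the article rounds to 3 percent.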
Discussion
Our studies raise two main issues.
Timeliness
The first is, should we force software engineers to keep documentation meticulously up-to-date? Formal-process theorists would certainly argue that we should. In fact, most published methodologies prescribe the documentation types that SEs should write and use. But where’s the real evidence that the prescribed processes work? Most of it is based on opinion or conjecture. Many software projects fail or run over budget, but evidence hints that the fault lies mostly with poor management and failure to gather requirements, not with out-of-date or incomplete documentation. As we mentioned before, our studies suggest that out-of-date documentation has value, particularly if the high-level abstractions remain valid.
Judging value: The simple-and-powerful rule
The second issue is why SEs adopt particular practices and tools. Our results indicate that software engineers create and use simple yet powerful documentation, and tend to ignore complex and time-consuming documentation.
Consider bug-tracking systems. The interviews revealed that SEs perceive them as important repositories for historical information. Documentation in bug-tracking systems stays up-to-date because SEs recognize its value: adding a simple comment as you fix a bug requires little effort, and maintenance is semiautomatic. Similarly, code-level comments stay current because they are short and “right there,” resulting in relatively little maintenance work. Test cases also stay up-to-date because each one has a simple structure and obvious operational value for verifying the system. Specifications and requirements, however, are big, complex, and of varied structure. So, SEs consider updating these documents less worthwhile, especially because the high-level abstractions tend to remain useful when the details become outdated.
The need for simple, powerful tools is evident in other software engineering areas. For example, ultrasimple yet powerful tools such as grep are still among the most widely used. They’re far more popular and enduring than many CASE (computer-aided software engineering) tools that are more powerful but much more complex. SEs also widely use processes such as code inspections, which have obvious power and which they can describe in a page or two.
All this suggests that to achieve greater documentation relevance, we need to find ways to increase its power, simplicity, or preferably both. We must find ways to express the most useful information in less space and to make documentation easier to update, perhaps semiautomatically.
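As one illustration of what a simple, semiautomatic aid might look like, the sketch below flags documents that were last modified before any of the source files they describe. This is our own example, not a tool from the studies. It assumes someone maintains a mapping from each document to its source files, and it uses file modification time as a rough proxy for staleness; all file names are hypothetical.

```python
import os
import tempfile


def stale_documents(doc_to_sources):
    """Return the documents whose last modification predates a source they describe.

    doc_to_sources maps a documentation path to the list of source paths it covers.
    """
    stale = []
    for doc_path, source_paths in doc_to_sources.items():
        doc_mtime = os.path.getmtime(doc_path)
        # A document with no listed sources is never reported as stale.
        newest_source = max(
            (os.path.getmtime(path) for path in source_paths),
            default=doc_mtime,
        )
        if newest_source > doc_mtime:
            stale.append(doc_path)
    return sorted(stale)


if __name__ == "__main__":
    # Demonstration on throwaway files; all names are hypothetical.
    with tempfile.TemporaryDirectory() as workdir:
        doc = os.path.join(workdir, "billing-design.txt")
        src = os.path.join(workdir, "billing.py")
        for path in (doc, src):
            with open(path, "w") as handle:
                handle.write("placeholder\n")
        os.utime(doc, (1_000, 1_000))  # document last touched earlier
        os.utime(src, (2_000, 2_000))  # source changed afterwards
        print(stale_documents({doc: [src]}))
```

A check like this only says that the code changed after the document did. Whether the document is actually wrong still depends on how much the relevant code changed, which matches the studies' observation that accuracy depends on recentness and the amount of change.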
Some people will argue that SEs fail to update documentation because they’re lazy. Many managers have responded to this assertion by trying to impose more discipline on software engineers, forcing them to update documents. We suggest that most SEs aren’t lazy; they have discipline of a different sort. They consciously or subconsciously make value judgments and conclude that it’s worthwhile to update only certain types of documentation.
So, rather than forcing SEs to perform cost-ineffective work, we should strive for simple yet powerful documentation formats and tools, as we just mentioned. Also, we need to better understand the various roles of software documentation and more closely match our prescribed processes to fit those roles.
An additional lesson from our research is that you can learn a lot from studying SEs in the real world: both what they do and how they think. Further studies like ours could provide rich data that can serve to help formulate research questions. Those research questions, in turn, could aid in other types of empirical studies, such as testing specific hypotheses in more constrained (artificial) settings.
References
1. J. Singer, “Practices of Software Maintenance,” Proc. Int’l Conf. Software Maintenance (ICSM 98), IEEE CS Press, 1998, pp. 139–145.
2. J. Singer et al., “An Examination of Software Engineering Work Practices,” Proc. Centers for Advanced Studies Conf. (CASCON 97), IBM, pp. 209–223.
3. J. Singer and T.C. Lethbridge, “Studying Work Practices to Assist Tool Design,” Proc. Int’l Workshop Program Comprehension (IWPC 98), IEEE CS Press, pp. 173–179.
4. A. Forward, Software Documentation: Building and Maintaining Artefacts of Communication, master’s thesis, School of Information Technology and Eng., Univ. Ottawa, 2002; www.site.uottawa.ca/~tcl/gradtheses/aforward.
5. A. Forward and T.C. Lethbridge, “The Relevance of Software Documentation, Tools and Technologies: A Survey,” Proc. ACM Symp. Documentation Eng. (DocEng 2002), ACM Press, pp. 26–33.
About the Authors
Timothy C. Lethbridge is an associate professor at the University of Ottawa. He investigates ways that people can more easily understand and manipulate complex information, including software. He’s on the steering committee of Computing Curriculum - Software Engineering, sponsored by the IEEE Computer Society and the ACM. He also coauthored Object-Oriented Software Engineering: Practical Software Development Using UML and Java (McGraw Hill, 2001). He received his PhD in Computer Science from the University of Ottawa. He’s a senior member of the IEEE. Contact him at SITE, 800 King Edward Ave., Ottawa, ON K1N 6N5, Canada; tcl@site.uottawa.ca.
Janice Singer is a cognitive psychologist working in the National Research Council of Canada’s Software Engineering Group. She also heads the NRC’s Human-Computer Interaction program. Her interests lie in collaboration, cognition, and improving software processes and tools by understanding the cognitive and social demands of work. She received her PhD in cognitive psychology from the University of Pittsburgh. Contact her at the NRC Inst. for Information Technology, M50, 1200 Montreal Rd., Ottawa, ON K1A 0R6, Canada; janice.singer@nrc-cnrc.gc.ca.
Andrew Forward is a systems analyst with Deloitte Consulting, working in their technology practice. His academic interests are software engineering, automated testing, and documentation. He has an MS in computer science from the University of Ottawa. He’s a member of the IEEE and ACM. Contact him at 106 Melrose Ave., Apt 1, Ottawa, ON K1Y 1V1, Canada; aforward@dc.com.
... However, documentation and code examples are usually added only as an afterthought to comply with regulations, often rendering them out of sync or incomplete [28,30]. Even when they exist, the documentation content and code examples are not updated in a timely manner [12]. Therefore, insufficient quantity and variation [30] in examples and incorrect examples [1,2] remain to be the major obstacles for developers learning to use an API. ...
Preprint
Full-text available
Continuous evolution in modern software often causes documentation, tutorials, and examples to be out of sync with changing interfaces and frameworks. Relying on outdated documentation and examples can lead programs to fail or be less efficient or even less secure. In response, programmers need to regularly turn to other resources on the web such as StackOverflow for examples to guide them in writing software. We recognize that this inconvenient, error-prone, and expensive process can be improved by using machine learning applied to software usage data. In this paper, we present our practical system which uses machine learning on large-scale telemetry data and documentation corpora, generating appropriate and complex examples that can be used to improve documentation. We discuss both feature-based and transformer-based machine learning approaches and demonstrate that our system achieves 100% coverage for the used functionalities in the product, providing up-to-date examples upon every release and reduces the numbers of PRs submitted by software owners writing and editing documentation by >68%. We also share valuable lessons learnt during the 3 years that our production quality system has been deployed for Azure Cloud Command Line Interface (Azure CLI).
... Yngve et al. [80] described some of the pioneering research in programming languages documentation, recognizing the importance and discussing the challenges of maintaining this type of documentation. Interestingly, more recent research recognizes (general) documentation maintenance as an often overlooked software engineering practice [52,56], pointing to the cost of maintaining documentation as one of the drivers of this scenario. ...
Preprint
Full-text available
Programming language documentation refers to the set of technical documents that provide application developers with a description of the high-level concepts of a language. Such documentation is essential to support application developers in the effective use of a programming language. One of the challenges faced by documenters (i.e., personnel that produce documentation) is to ensure that documentation has relevant information that aligns with the concrete needs of developers. In this paper, we present an automated approach to support documenters in evaluating the differences and similarities between the concrete information need of developers and the current state of documentation (a problem that we refer to as the topical alignment of a programming language documentation). Our approach leverages semi-supervised topic modelling to assess the similarities and differences between the topics of Q&A posts and the official documentation. To demonstrate the application of our approach, we perform a case study on the documentation of Rust. Our results show that there is a relatively high level of topical alignment in Rust documentation. Still, information about specific topics is scarce in both the Q&A websites and the documentation, particularly related topics with programming niches such as network, game, and database development. For other topics (e.g., related topics with language features such as structs, patterns and matchings, and foreign function interface), information is only available on Q&A websites while lacking in the official documentation. Finally, we discuss implications for programming language documenters, particularly how to leverage our approach to prioritize topics that should be added to the documentation.
... Another challenge reported in prior work is that developers' needs for collecting and organizing information are often not discovered until part of the way through an investigation process [16,81]. This could be due to several major reasons, including but not limited to: 1) additional external requirements, constraints, or user feedback are discovered or introduced in the middle of a project which significantly complicates the original decision making problem [23,30,31]; 2) developers discover many more options, criteria, and their trade-offs than they anticipated at the beginning [81]; and/or 3) developers are required to explain or document their decisions and design rationale after the fact for the long-term maintainability and success of a software project [25,39,75,76,79,104,112]. In these situations, it is hard and involves duplicate work for developers to recall and retrace their steps for reaching their current state of sensemaking (the linear history visualization in almost all current browsers is known to be not particularly effective [16,67,124]) and recollect all the relevant evidence again. ...
Preprint
Full-text available
Developers perform online sensemaking on a daily basis, such as researching and choosing libraries and APIs. Prior research has introduced tools that help developers capture information from various sources and organize it into structures useful for subsequent decision-making. However, it remains a laborious process for developers to manually identify and clip content, maintaining its provenance and synthesizing it with other content. In this work, we introduce a new system called Crystalline that attempts to automatically collect and organize information into tabular structures as the user searches and browses the web. It leverages natural language processing to automatically group similar criteria together to reduce clutter as well as passive behavioral signals such as mouse movement and dwell time to infer what information to collect and how to visualize and prioritize it. Our user study suggests that developers are able to create comparison tables about 20% faster with a 60% reduction in operational cost without sacrificing the quality of the tables.
... Interactive Documentation. A grammar that is automatically inferred will always be up-to-date-a significant advantage over manually written documentation, which tends to quickly drift from the object it documents [35]. Furthermore, an inferred grammar could be closely linked directly to the underlying source code, making productions traceable to their origins. ...
Preprint
Full-text available
Ad hoc parsers are everywhere: they appear any time a string is split, looped over, interpreted, transformed, or otherwise processed. Every ad hoc parser gives rise to a language: the possibly infinite set of input strings that the program accepts without going wrong. Any language can be described by a formal grammar: a finite set of rules that can generate all strings of that language. But programmers do not write grammars for ad hoc parsers -- even though they would be eminently useful. Grammars can serve as documentation, aid program comprehension, generate test inputs, and allow reasoning about language-theoretic security. We propose an automatic grammar inference system for ad hoc parsers that would enable all of these use cases, in addition to opening up new possibilities in mining software repositories and bi-directional parser synthesis.
... According to Lethbridge [35], two frequently mentioned reasons for outdated or poorly written documentation are the time and effort required to maintain the documentation on the one hand, and the experience that excessive documentation is not exploited to its full extent anyway on the other. ...
Book
Full-text available
Our future power grid is probably one of the most complex and most sophisticated critical infrastructures. Given the complexity of the smart grid and its safety and security challenges, requirements elicitation for smart grid solutions requires a systematic process to address these crucial non-functional system properties. The research described in this report was motivated by work carried out in the context of the "enera" project, a public research project funded by the German Federal Ministry for Economic Affairs and Energy (BMWi) within the funding program "Smart Energy Showcase - Digital Agenda for the Energy Transition" (SINTEG). Among the many project partners, the two research institutes Fraunhofer Institute for Experimental Software Engineering IESE and OFFIS - Institute for Information Technology conducted research on methods for systems and software engineering for complex systems and systems-of-systems. The outcome is a process for eliciting system-of-systems requirements, as well as an extension of the IEC 62559-2 use case template to address non-functional requirements. The activities in the use case elicitation process are accompanied and supported by an assurance case.
Article
Automatic code documentation generation has been a crucial task in the field of software engineering. It not only relieves developers from writing code documentation but also helps them to understand programs better. Specifically, deep-learning-based techniques that leverage large-scale source code corpora have been widely used in code documentation generation. These works tend to use automatic metrics (such as BLEU, METEOR, ROUGE, CIDEr, and SPICE) to evaluate different models. These metrics compare generated documentation to reference texts by measuring the overlapping words. Unfortunately, there is no evidence demonstrating the correlation between these metrics and human judgment. We conduct experiments on two popular code documentation generation tasks, code comment generation and commit message generation, to investigate presence or absence of correlations between these metrics and human judgements. For each task, we replicate three state-of-the-art approaches and the generated documentation is evaluated automatically in terms of BLEU, METEOR, ROUGE-L, CIDEr, and SPICE. We also ask 24 participants to rate the generated documentation considering three aspects (i.e., language, content, and effectiveness). Each participant is given Java methods or commit diffs along with the target documentation to be rated. The results show that the ranking of generated documentation from automatic metrics is different from that evaluated by human annotators. Thus, these automatic metrics are not reliable enough to replace human evaluation for code documentation generation tasks. In addition, METEOR shows the strongest correlation (with moderate Pearson correlation r about 0.7) to human evaluation metrics. However, it is still much lower than the correlation observed between different annotators (with high Pearson correlation r about 0.8) and correlations that are reported in the literature for other tasks (e.g., Neural Machine Translation [39]
Article
Modern software systems are commonly built on top of frameworks. To accelerate the learning process of features provided by frameworks, code samples are made available to assist developers. However, we know little about how code samples are developed and consumed. In this paper, we aim to fill this gap by assessing the characteristics of framework code samples. We provide insights into how code samples are maintained and used by developers. We analyze over 230 code samples provided by Android and Spring Boot, and assess aspects related to their code, evolution, popularity, and client usage. We find that most code samples are small and simple, provide a working environment for the clients, and rely on automated build tools. They frequently change, for example, to adapt to new framework versions. We also detect that clients commonly fork the code samples, however, they rarely modify them. To further understand the problems faced by developers, we analyze 614 Stack Overflow questions about the code samples and 269 issues from code sample repositories. We find that developers face problems when trying to modify the code samples and the most common issue is related to improvement. Finally, we propose implications to creators and clients of code samples to improve maintenance and usage activities.
Conference Paper
This paper highlights the results of a survey of software professionals. One of the goals of this survey was to uncover the perceived relevance (or lack thereof) of software documentation, and the tools and technologies used to maintain, verify and validate such documents. The survey results highlight the preferences for and aversions against software documentation tools. Participants agree that documentation tools should seek to better extract knowledge from core resources. These resources include the system's source code, test code and changes to both. Resulting technologies could then help reduce the effort required for documentation maintenance, something that is shown to rarely occur. Our data provide compelling evidence that software professionals value technologies that improve automation of the documentation process and facilitate its maintenance.
Conference Paper
This paper reports our experiences studying the work practices of professional software engineers (SEs). We provide our reasons for following this approach, and describe details such as the discovery of work patterns, and the use of synchronized shadowing. We outline several studies we are currently conducting in a large telecommunications company and explain how these studies influenced the design of a software engineering exploration environment.
Article
This paper describes the results of an interview study conducted at ten industrial sites. The interview focused on the work practices of software engineers engaged in maintaining large-scale systems. Five 'truths' emerged from this study. First, software maintenance engineers are experts in the systems they are maintaining. Second, source code is the primary source of information about systems. Third, the documentation is used, but not necessarily trusted. Fourth, maintenance control systems are important repositories of information about systems. Finally, reproduction of problems and/or problem scenarios is essential to problem solutions. These truths confirm much of the conventional wisdom in the field. However, in fleshing them out, details were elaborated, and additionally new knowledge was acquired. These results are discussed with respect to tool design.
Article
This paper presents work practice data of the daily activities of software engineers. Four separate studies are presented: one looking longitudinally at an individual SE, two looking at a software engineering group, and one looking at company-wide tool usage statistics. We also discuss the advantages in considering work practices in designing tools for software engineers, and include some requirements for a tool we have developed as a result of our studies. 1. Introduction The Knowledge Based Reverse Engineering Project's goal is to provide software engineers (SEs) in an industrial telecommunications group with a toolset to help them maintain their system more effectively. To achieve this goal, we have adopted a user-centered design approach to tool development [6, 7, 8]. However, unlike traditional user-centered approaches, we have focused on the SEs' work practices. This represents a new approach [15] to tool design. This approach borrows from several different fields in an effort...