SUITE 2009: First international workshop on search-driven development - users, infrastructure, tools and evaluation.
ABSTRACT SUITE is a new workshop series that focuses
specifically on exploring the notion of search as a
fundamental activity during software development.
The goal of the workshop is to bring together researchers
and practitioners with a special interest in search
technology for software developers.
Participants will have a broad range of expertise in
topics ranging from building software tools and
infrastructure to information retrieval, user studies
and human-computer interaction, and benchmarking and
evaluation. The first edition of SUITE is held in
conjunction with the 31st International Conference
on Software Engineering (May 16th, 2009, Vancouver,
Canada).
SUITE 2009: First International Workshop on Search-Driven Development
– Users, Infrastructure, Tools and Evaluation
University of California, Irvine
University of Bern
Software Research Associates, Inc.
SUITE is a new workshop series that focuses specifically
on exploring the notion of search as a fundamental activity
during software development. The goal of the workshop is
to bring together researchers and practitioners with a special
interest in search technology for software developers.
Participants will have a broad range of expertise in topics
ranging from building software tools and infrastructure to
information retrieval, user studies and human-computer
interaction, and benchmarking and evaluation.
The first edition of SUITE is held in conjunction with
the 31st International Conference on Software Engineering
(May 16-24, 2009, Vancouver, Canada).
The workshop is motivated by the observation that software
developers spend most of their time searching for the
pertinent information they need to solve the task at hand.
Past research has shown that code search is the
most frequent activity software developers engage in.
They spend most of their time in the navigation and search
tools of their IDE. More recently, there have been
significant efforts from both academia and industry
in building specialized search engines for developers
[2, 3, 1, 5, 4, 9, 6, 10, 13, 7]. Most of these leverage
the huge amount of source code available in open source
repositories. However, these tools are still only exploring
the tip of the iceberg. We know that source code is not the
only artifact that developers need to search, and that
traditional search engine interfaces are limited as tools
for finding the information pertinent to developers.
Furthermore, along with the tools, we still need a solid
understanding of how developers are really using these systems.
As software development is a process of both information
creation and information gathering, software developers
are constantly searching for the right information and the
right person to solve the problem at hand. This workshop
focuses specifically on exploring the notion of search as a
fundamental activity during software development. The goal of
the workshop is to bring together researchers and practitioners
with a special interest in search technology for software
developers. Participants will have a broad range of expertise
in topics ranging from building software tools and
infrastructure to information retrieval, user studies and HCI,
and benchmarking and evaluation.
The workshop will help interested researchers
share their ideas and experience in understanding the search
needs and behavior of developers, building tools that
address these various needs, and finding scientific ways to
evaluate them.
The workshop addresses the problem of search as it occurs
during software development. Search is related to software
mining, but differs in its problems and challenges. For
example, two of the important topics the workshop focuses
on are: a) search engines for public software repositories on
the Internet, and b) specialized search engines for IDEs.
Areas of interest include, but are not limited to:
• Application of natural language processing to source
code and related artifacts.
• Approaches, applications, and tools for software
• Case studies on setting up and running large software
• Empirical studies of search and navigation in IDEs.
• How can industry and researchers collaborate?
• Information retrieval and machine learning techniques
to search source code.
• Integration of specialized search engines into IDEs.
• Methods of integrating indexed data from various
sources and histories.
• Query languages to search software and repositories.
• Search techniques to assist developers in finding
suitable components and code fragments for reuse.
• Techniques for indexing large software repositories
(and their history) efficiently.
• Static analysis and parsing of Internet-scale code
• Crawling source code on the Internet and in code
repositories.
• The use of visualizations to support software search.
• Validation of tools and software searching benchmarks
• Ranking strategies and heuristics for code search.
• Slicing and generative techniques for code extraction
ICSE’09, May 16-24, 2009, Vancouver, Canada
978-1-4244-3494-7/09/$25.00 © 2009 IEEE Companion Volume 445
This year’s submissions to the workshop touch on various
themes across the topics presented above.
They range from tools and infrastructure to user studies
and experiments. All are, in one way or another, motivated
by the goal of enhancing the search experience
of developers during software development. The list of
accepted papers is available from the workshop’s website,
http://smallwiki.unibe.ch/suite2009/. Final
versions of the papers appear in the ICSE proceedings.
Sushil Bajracharya is a PhD candidate in the Department
of Informatics, Donald Bren School of Information
and Computer Sciences, University of California, Irvine.
Adrian Kuhn is a PhD candidate at the Software
Composition Group, University of Bern, Switzerland.
Yunwen Ye is a manager in the Technology Strategy
Division at Software Research Associates, Inc., Japan.
Koders web site. http://www.koders.com.
Krugle web site. http://www.krugle.com.
S. Bajracharya, T. Ngo, E. Linstead, Y. Dou, P. Rigor,
P. Baldi, and C. Lopes. Sourcerer: a search engine for open
source code supporting structure-based search. In OOPSLA ’06:
Companion to the 21st ACM SIGPLAN symposium
on Object-oriented programming systems, languages, and
applications, pages 681–682, New York, NY, USA, 2006.
R. Hoffmann, J. Fogarty, and D. S. Weld. Assieme: finding
and leveraging implicit references in a web search interface
for programmers. In UIST ’07: Proceedings of the 20th
annual ACM symposium on User interface software and
technology, pages 13–22, New York, NY, USA, 2007. ACM.
R. Holmes and G. C. Murphy. Using structural context to
recommend source code examples. In ICSE ’05: Proceedings
of the 27th international conference on Software
engineering, pages 117–125, New York, NY, USA, 2005. ACM.
O. Hummel, W. Janjic, and C. Atkinson. Code conjurer:
Pulling reusable software out of thin air.
A. J. Ko, R. DeLine, and G. Venolia. Information needs in
collocated software development teams. In ICSE ’07:
Proceedings of the 29th international conference on Software
Engineering, pages 344–353, Washington, DC, USA, 2007.
IEEE Computer Society.
O. A. L. Lemos, S. K. Bajracharya, J. Ossher, R. S. Morla,
P. C. Masiero, P. Baldi, and C. V. Lopes. Codegenie: using
test-cases to search and reuse source code. In ASE ’07:
Proceedings of the twenty-second IEEE/ACM international
conference on Automated software engineering, pages 525–526,
New York, NY, USA, 2007. ACM.
D. Mandelin, L. Xu, R. Bodík, and D. Kimelman. Jungloid
mining: helping to navigate the API jungle. In PLDI ’05:
Proceedings of the 2005 ACM SIGPLAN conference on
Programming language design and implementation, pages 48–61,
New York, NY, USA, 2005. ACM.
G. C. Murphy, M. Kersten, and L. Findlater. How are Java
software developers using the Eclipse IDE?
J. Singer, T. Lethbridge, N. Vinson, and N. Anquetil. An
examination of software engineering work practices. In
CASCON ’97: Proceedings of the 1997 conference of the Centre
for Advanced Studies on Collaborative research, page 21.
IBM Press, 1997.
S. Thummalapenta and T. Xie. Parseweb: a programmer
assistant for reusing open source code on the web. In ASE ’07:
Proceedings of the twenty-second IEEE/ACM international
conference on Automated software engineering, pages 204–213,
New York, NY, USA, 2007. ACM.