Qing Wang

Chinese Academy of Sciences | CAS · Institute of Software

Ph.D


229
Publications
20,295
Reads
2,608
Citations


Publications (229)
Preprint
Full-text available
Manual testing, as a complement to automated GUI testing, is the last line of defense for app quality, especially in spotting usability and accessibility issues. However, repeated actions and easily missed functionalities make manual testing time-consuming, labor-intensive, and inefficient. Inspired by the game Candy Crush with flashy cand...
Preprint
Full-text available
Graphical User Interface (GUI) provides a visual bridge between a software application and end users, through which they can interact with each other. With the upgrading of mobile devices and the development of aesthetics, the visual effects of the GUI are increasingly attractive, and users pay more attention to the accessibility and usability of...
Preprint
In community-based software development, developers frequently rely on live-chatting to discuss emergent bugs/errors they encounter in daily development tasks. However, it remains a challenging task to accurately record such knowledge due to the noisy nature of interleaved dialogs in live chat data. In this paper, we first formulate the task of ide...
Chapter
One challenging problem with crowdsourced testing is optimizing crowd workers’ participation. Crowd resources, while cheap, are not free. Hence, when scaling up crowdsourced testing, it is necessary to maximize the information gain from every member of the crowds. Also, not all crowd workers are equally skilled at finding bugs. Inappropriate workers...
Chapter
Crowdsourced testing often calls for a large number of end-users, who may have no knowledge of programming, to conduct black-box testing tasks. It can also gather a group of professional testers to write test code for white-box testing. To efficiently process the crowdsourced testing reports that consist of textual descriptions and s...
Chapter
The benefit of crowdsourced testing must be carefully assessed with respect to the cost of the technique. In the first place, crowdsourced testing is a scalable testing method under which large software systems can be tested with appropriate results. This is particularly true when the testing is related to the feedback on GUI systems, or subjective opi...
Chapter
The growing ubiquity of data-driven learning models in algorithmic decision-making has recently boosted concerns about the issues of fairness and bias. Friedman defined that a computer system is biased “if it systematically and unfairly discriminates against certain individuals or groups of individuals in favor of others”. For example, job recommen...
Chapter
Figure 2.1 presents the overall procedure of crowdsourced testing. The project manager provides a test task for crowdsourced testing, including the software under test and test requirements. The crowdsourced testing task is usually in the format of open call, so a large number of crowd workers can sign in to perform the task based on its test requi...
Chapter
In order to attract workers, testing tasks are often financially compensated, especially for failed reports. In this context, workers may submit thousands of test reports. However, these test reports often have many false positives, i.e., a test report marked as failed that actually involves correct behavior or behavior that was considered...
Chapter
This chapter presents how we characterize the crowd workers so as to conduct the task recommendation and worker recommendation to facilitate the crowdsourced testing practice. We propose a data-driven crowd worker characterization which characterizes the crowd workers by automatically mining the historical crowdsourced testing repositories. We fi...
Chapter
In Chaps. 8 and 9, we have mentioned that it is often time-consuming and tedious to manually inspect all received test reports. Besides automatic classification of test reports introduced in Chap. 8 and duplicate detection of test reports introduced in Chap. 9, this chapter seeks other alternatives for managing these test reports, i.e., to pri...
Chapter
Trade-offs such as “how much testing is enough” are critical yet challenging project decisions in software engineering. Insufficient testing can lead to unsatisfying software quality, while excessive testing can result in potential schedule delays and low cost-effectiveness. This is especially true for crowdsourced testing given the complexity of m...
Chapter
However, while all of these techniques are built on the assumption that duplicate reports are harmful to software maintenance and aim at filtering out this information, Zimmermann et al. and Bettenburg et al. empirically found that duplicate reports are helpful for report comprehension and debugging.
Chapter
A wealth of previous literature has shown the inequality between the decision support provided to task requesters and to workers in crowdsourcing platforms. On the one hand, many platforms allow requesters to assess worker performance data and support the gauging of qualification criteria in order to control the quality of crowd submissions. On the...
Article
Full-text available
Requirements are usually written in natural language and evolve continuously during the process of software development, which involves a large number of stakeholders. Stakeholders with diverse backgrounds and skills might refer to the same real-world entity with different linguistic expressions in the natural-language requirements, resulting in re...
Preprint
Full-text available
Mobile apps are indispensable for people's daily life. Complementing automated GUI testing, manual testing is the last line of defence for app quality. However, repeated actions and easily missed functionalities make manual testing time-consuming and inefficient. Inspired by the game Candy Crush with flashy candies as hint moves for pl...
Preprint
In open source software (OSS) communities, existing leadership indicators are dominantly measured by code contribution or community influence. Recent studies on emergent leadership shed light on additional dimensions such as intellectual stimulation in collaborative communications. To that end, this paper proposes an automated approach, named iLead...
Article
Full-text available
Discourse parsing of scholarly documents is the premise and basis for standardizing the writing of scholarly documents, understanding their content, and quickly locating and extracting specific information from them. With the continuous emergence of a large number of scholarly documents, how to automatically analyze scholarly documents quickly and...
Article
Context: Function Point Analysis (FPA) provides an objective, comparative measure for size estimation in the early stage of software development. When practicing FPA, analysts typically abide by the following steps: data function (DF) extraction, transactional function extraction, function type classification and adjustment factor determination. How...
Chapter
Full-text available
Documents often contain complex physical structures, which make the Document Layout Analysis (DLA) task challenging. As a pre-processing step for content extraction, DLA has the potential to capture rich information in historical or scientific documents on a large scale. Although many deep-learning-based methods from computer vision have already ac...
Preprint
Full-text available
Collaborative live chats are gaining popularity as a development communication tool. In community live chatting, developers are likely to post issues they encountered (e.g., setup issues and compile issues), and other developers respond with possible solutions. Therefore, community live chats contain rich sets of information for reported issues and...
Preprint
Full-text available
Static bug finders have been widely adopted by developers to find bugs in real-world software projects. They leverage predefined heuristic static analysis rules to scan source code or binary code of a software project, and report violations of these rules as warnings to be verified. However, the advantages of static bug finders are overshadowed by...
Preprint
Full-text available
Documents often contain complex physical structures, which make the Document Layout Analysis (DLA) task challenging. As a pre-processing step for content extraction, DLA has the potential to capture rich information in historical or scientific documents on a large scale. Although many deep-learning-based methods from computer vision have already ac...
Conference Paper
Full-text available
Despite the valuable information contained in software chat messages, disentangling them into distinct conversations is an essential prerequisite for any in-depth analyses that utilize this information. To provide a better understanding of the current state-of-the-art, we evaluate five popular dialog disentanglement approaches on software-related c...
Preprint
Modern communication platforms such as Gitter and Slack play an increasingly critical role in supporting software teamwork, especially in open source development. Conversations on such platforms often contain intensive, valuable information that may be used for better understanding OSS developer communication and collaboration. However, little work...
Preprint
Full-text available
Graphical User Interface (GUI) provides visual bridges between software apps and end users. However, due to software or hardware compatibility problems, UI display issues such as text overlap, blurred screens, and missing images always occur during GUI rendering on different devices. Because these UI display issues can be found directly by human eyes, in t...
Article
Full-text available
Mailing lists are widely used as an important channel for communications between developers and stakeholders. They consist of emails that are posted for various purposes, such as reporting problems, seeking help in usage, managing projects, and discussing new features. Due to the large volume of new incoming emails every day, some valuable emails...
Article
Crowdsourced software testing (short for crowdtesting) is a special type of crowdsourcing. It requires that crowdworkers master appropriate skill-sets and commit significant effort for completing a task. Abundant uncertainty may arise during a crowdtesting process due to imperfect information between the task requester and crowdworkers. For example...
Preprint
Despite the valuable information contained in software chat messages, disentangling them into distinct conversations is an essential prerequisite for any in-depth analyses that utilize this information. To provide a better understanding of the current state-of-the-art, we evaluate five popular dialog disentanglement approaches on software-related c...
Conference Paper
Pull requests (PRs) prioritization is one of the main challenges faced by integrators in pull-based development. This is especially true for large open-source projects where hundreds of pull requests are submitted daily. Indeed, managing these pull requests manually consumes time and resources and may lead to delays in the reaction (i.e., acceptanc...
Conference Paper
Pull requests (PRs) selection is a challenging task faced by integrators in pull-based development (PbD), with hundreds of PRs submitted on a daily basis to large open-source projects. Managing these PRs manually consumes integrators’ time and resources and may lead to delays in the acceptance, response, or rejection of PRs that can propose bug fix...
Article
Software engineers get questions of “how much testing is enough” on a regular basis. Existing approaches in software testing management employ experience-, risk-, or value-based analysis to prioritize and manage testing processes. However, very few are applicable to the emerging crowdtesting paradigm to cope with extremely limited information and co...
Article
Continuous Integration (CI) is an important practice in agile development. As integration systems grow, running all tests to verify the quality of submitted code is clearly uneconomical. This paper aims at selecting a proper test subset for continuous integration so as to reduce test cost as much as possible without sacrificing quality....
Preprint
Full-text available
Pre-trained models such as BERT are widely used in NLP tasks and are fine-tuned to improve the performance of various NLP tasks consistently. Nevertheless, the fine-tuned BERT model trained on our protocol corpus still has a weak performance on the Entity Linking (EL) task. In this paper, we propose a model that joins a fine-tuned language model w...
Article
Crowdsourced testing is an emerging trend, in which test tasks are entrusted to the online crowd workers. Typically, a crowdsourced test task aims to detect as many bugs as possible within a limited budget. However, not all crowd workers are equally skilled at finding bugs; inappropriate workers may miss bugs, or report duplicate bugs, while hiring...
Article
Context: Crowdtesting is effective especially when it comes to the feedback on GUI systems, or subjective opinions about features. Despite this, we find crowdtesting reports are highly duplicated, i.e., 82% of them are duplicates of others. Most of the existing approaches mainly adopted textual information for duplicate detection, and suffered f...
Article
Full-text available
This paper proposes an approach called feature weighted confidence with support vector machine (FWC–SVM) to incorporate prior knowledge into SVM with sample confidence. First, we use prior features to express prior knowledge. Second, FWC–SVM is biased to assign larger weights for prior weights in the slope vector \(\omega \) than weights correspond...
Conference Paper
Background: The most important challenge regarding the use of static analysis tools (e.g., FindBugs) is that there are a large number of warnings that are not acted on by developers. Many features have been proposed to build classification models for the automatic identification of actionable warnings. Through analyzing these features and related s...
Preprint
Full-text available
Trade-offs such as "how much testing is enough" are critical yet challenging project decisions in software engineering. Most existing approaches adopt risk-driven or value-based analysis to prioritize test cases and minimize test runs. However, none of these is applicable to the emerging crowd testing paradigm where task requesters typically have n...
Preprint
Crowdtesting is effective especially when it comes to the feedback on GUI systems, or subjective opinions about features. Despite this, we find crowdtesting reports are highly replicated, i.e., 82% of them are replicates of others. Hence automatically detecting replicate reports could help reduce triaging efforts. Most of the existing approaches...
Preprint
Crowdtesting has grown to be an effective alternative to traditional testing, especially in mobile apps. However, crowdtesting is inherently hard to manage. Given the complexity of mobile applications and unpredictability of distributed, parallel crowdtesting process, it is difficult to estimate (a) the remaining number of bugs as yet undetected or...
Article
Monitoring and predicting the trend of bug number time series of a software system is crucial for both software project managers and software end-users. For software managers, accurate prediction of bug number of a software system will assist them in making timely decisions, such as effort investment and resource allocation. For software end-users,...
Article
Context: Though linking issues and commits plays an important role in software verification and maintenance, such link information is not always explicitly provided during software development or maintenance activities. Current practices in recovering such links highly depend on tedious manual examination. To automatically recover missing links, se...
Article
We propose an approach called Bug report assignment with topic modeling and heterogeneous network analysis (BAHA) to automatically assign bug reports to developers. Existing studies adopt social network analysis to characterize the collaboration of developers. The networks used in these studies are all homogeneous. In real practice of bug resolution...
Article
Full-text available
[Background]: Systematic Literature Review (SLR) has become an important software engineering research method but costs tremendous effort. [Aim]: This paper proposes an approach to leverage an empirically evolved ontology to support automating key SLR activities. [Method]: First, we propose an ontology, SLRONT, built on SLR experiences and best pr...
Conference Paper
Background: Links between issue reports and their fixing commits play an important role in software maintenance. Such link data are often missing in practice and many approaches have been proposed in order to recover them automatically. Most existing approaches focus on comparing log messages and source code files in commits with issue reports....