Article

Abstract

A significant amount of research project funding is spent creating customized annotation systems, re-inventing the wheel over and over by developing the same common features. In this paper, we present WACline, a Software Product Line to facilitate the customization of browser-extension Web annotation clients. WACline reduces the development effort by reusing common features (e.g., highlighting and commenting) while putting the main focus on customization. To this end, WACline provides 111 already-implemented features that can be extended with new ones. In this way, researchers can reduce the development and maintenance costs of annotation clients.


Article
Full-text available
Motivation: Annotation tools are applied to build training and test corpora, which are essential for the development and evaluation of new natural language processing algorithms. Further, annotation tools are also used to extract new information for a particular use case. However, owing to the high number of existing annotation tools, finding the one that best fits particular needs is a demanding task that requires searching the scientific literature followed by installing and trying various tools. Methods: We searched for annotation tools and selected a subset of them according to five requirements with which they should comply, such as being Web-based or supporting the definition of a schema. We installed the selected tools (when necessary), carried out hands-on experiments and evaluated them using 26 criteria that covered functional and technical aspects. We defined each criterion on three levels of matches and a score for the final evaluation of the tools. Results: We evaluated 78 tools and selected the following 15 for a detailed evaluation: BioQRator, brat, Catma, Djangology, ezTag, FLAT, LightTag, MAT, MyMiner, PDFAnno, prodigy, tagtog, TextAE, WAT-SL and WebAnno. Full compliance with our 26 criteria ranged from only 9 up to 20 criteria, which demonstrated that some tools are comprehensive and mature enough to be used on most annotation projects. The highest score of 0.81 was obtained by WebAnno (of a maximum value of 1.0).
Chapter
Full-text available
In previous studies on user behavior with Digital Scholarly Editions (DSE), we found that annotating the text is a key technique for working with the text. In this follow-up study, we invited volunteers to perform open research tasks on a DSE of Lope de Vega, providing the annotation tool hypothes.is to support their tasks. We found that none of the participants used the tool extensively; yet it was clear that annotation of the text was a major part of their workflow. During a focus discussion after the experiment, the participants gave examples of how they would compensate for the lack of appropriate tools, often at the cost of considerable extra work and overhead. More research is needed to discover why the users did not adopt the tools provided. We therefore propose a human-centered, structured, longitudinal approach to designing an annotation tool that would actually be usable.
Conference Paper
Full-text available
Systematic Literature Reviews (SLRs) are increasingly popular for categorizing research and identifying research gaps. Their reliability largely depends on the rigour of the attempt to identify, appraise and aggregate evidence through coding, i.e. the process of examining and organizing the data contained in primary studies in order to answer the research questions. Current Qualitative Data Analysis Software (QDAS) lacks a common format. This jeopardizes reuse (i.e. it is difficult to share coding data among different tools), evolution (i.e. difficult to turn coding data into living documents that evolve as new research is published), and replicability (i.e. difficult for third parties to access and query coding data). Yet, the results of a recent survey indicate that 71.4% of participants (expert SLR reviewers) are ready to share SLR artifacts in a common repository. On the road towards open coding-data repositories, this work looks into W3C's Open Annotation as the way to RDFize those coding data. Benefits include: portability (i.e. W3C's prestige endorses the adoption of this standard among tool vendors); webization (i.e. coding data becomes URL-addressable, hence openly reachable); and data linkage (i.e. RDFized coding data benefits from Web technologies to query, draw inferences and easily link the data with external vocabularies). This paper rephrases coding practices as annotation practices where data is captured as W3C Open Annotations. Using an open annotation repository (Hypothes.is), the paper illustrates how this repository can be populated with coding data. Deployability is proven by describing two clients on top of this repository: (1) a write client that populates the repository through a color-coding highlighter, and (2) a read client that obtains traditional SLR spreadsheets by querying the populated repositories.
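As an illustration of the read-client idea described above, the following sketch builds a search query against the public Hypothes.is API. The tag name and target URI are made-up examples, and the actual HTTP request and spreadsheet generation are omitted; this is a minimal sketch, not the paper's implementation.

```python
from urllib.parse import urlencode

# Public Hypothes.is search endpoint (see the Hypothes.is API documentation).
HYPOTHESIS_SEARCH = "https://api.hypothes.is/api/search"

def build_coding_query(tag, uri=None, limit=50):
    """Build a search URL for annotations carrying a given coding tag.

    `tag` would be an SLR code (e.g. "rq1:tool-support"); `uri` optionally
    restricts results to annotations made on one primary study.
    """
    params = {"tag": tag, "limit": limit}
    if uri:
        params["uri"] = uri
    return HYPOTHESIS_SEARCH + "?" + urlencode(params)

url = build_coding_query("rq1:tool-support", uri="https://example.org/paper.pdf")
print(url)
```

A read client would issue a GET request to this URL and map the returned annotations (one row per coded quote) into spreadsheet cells.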
Conference Paper
Full-text available
Web 2.0 technologies, with their affordances of interconnection, content creation and delivery, promote the design of collaborative activities that engage students in the learning process. In this direction, a social annotation system in e-learning environments enables knowledge sharing and the understanding and memorization of learning objects by adding notes, commenting on specific parts of materials, highlighting, etc. This paper conducts a comparative analysis of social annotation tools in web-based educational systems in order to highlight the importance and advantages of this technology and to outline the requirements for an integrated annotation system. There is scope for improvement in the field of personalized annotations and recommendations regarding students' needs and learning styles.
Article
Full-text available
This review discusses the evidence on using SA tools in higher education settings. Using a detailed inclusion/exclusion procedure, 71 studies were included. A large number of studies were centred on system design issues and the evaluation of designed tools within education and computer technology classes, with a blended learning modality among undergraduates. Findings suggested there was a gradual increase in the frequency of SA-based publications, with Science Direct, Taylor & Francis, and IEEE as the three databases with the most SA publications. Findings were mostly derived from quasi-experiments. Of the four major topics recognised, 'system design and implementation issues' was categorised as the first topic, followed by 'the effectiveness of SA tools on process-oriented measures', 'the effectiveness of SA tools on outcome-oriented measures', and 'the improvement of SA tools and learning design'. The process-oriented and outcome-oriented measures dominating the studies were the quantity and quality of annotations and reading performance, respectively.
Conference Paper
Full-text available
Objective: Even though a number of tools are reported to be used by researchers undertaking systematic reviews, important shortcomings are still reported, revealing that such solutions are unable to satisfy current needs. Method: Two researchers independently provided competing designs for a tool supporting systematic reviews. The resulting tools were assessed against the feature lists provided by prior research. Results: After presenting an overview of the tools and the core design decisions taken, we provide a feature analysis and a discussion of selected challenges deemed crucial for providing proper tool support. Conclusions: Although the designed solutions do not yet support the entire systematic review process, their architecture has been designed to be flexible and extensible. After highlighting the difficulties of developing appropriate tools, we issue a call for action: developing tools to support systematic reviews is a community project.
Article
Full-text available
The breadth and depth of biomedical literature are increasing year upon year. To keep abreast of these increases, FlyBase, a database for Drosophila genomic and genetic information, is constantly exploring new ways to mine the published literature to increase the efficiency and accuracy of manual curation and to automate some aspects, such as triaging and entity extraction. Toward this end, we present the 'tagtog' system, a web-based annotation framework that can be used to mark up biological entities (such as genes) and concepts (such as Gene Ontology terms) in full-text articles. tagtog leverages manual user annotation in combination with automatic machine-learned annotation to provide accurate identification of gene symbols and gene names. As part of the BioCreative IV Interactive Annotation Task, FlyBase has used tagtog to identify and extract mentions of Drosophila melanogaster gene symbols and names in full-text biomedical articles from the PLOS stable of journals. We show here the results of three experiments with different-sized corpora and assess gene recognition performance and curation speed. We conclude that tagtog's named-entity recognition improves with a larger corpus and that tagtog-assisted curation is quicker than manual curation. Database URL: www.tagtog.net, www.flybase.org
Article
Full-text available
Scholars have made handwritten notes and comments in books and manuscripts for centuries. Today's blogs and news sites typically invite users to express their opinions on the published content; URLs allow web resources to be shared with accompanying annotations and comments using third-party services like Twitter or Facebook. These contributions have until recently been constrained within specific services, making them second-class citizens of the Web. Web Annotations are now emerging as fully independent Linked Data in their own right, no longer restricted to plain textual comments in application silos. Annotations can now range from bookmarks and comments, to fine-grained annotations of a selection of, for example, a section of a frame within a video stream. Technologies and standards now exist to create, publish, syndicate, mash-up and consume, finely targeted, semantically rich digital annotations on practically any content, as first-class Web citizens. This development is being driven by the need for collaboration and annotation reuse amongst domain researchers, computer scientists, scientific publishers, and scholarly content databases.
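A minimal sketch of the kind of finely targeted annotation described above, expressed in Python following the W3C Web Annotation Data Model. The body text, target URL, and quoted strings are made-up examples:

```python
import json

# A Web Annotation targeting an exact text quote within a page, structured
# per the W3C Web Annotation Data Model (https://www.w3.org/TR/annotation-model/).
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "motivation": "commenting",
    "body": {
        "type": "TextualBody",
        "value": "This claim needs a citation.",  # the comment itself
        "format": "text/plain",
    },
    "target": {
        "source": "https://example.org/article.html",  # made-up target page
        "selector": {
            # Anchors the annotation to a quote rather than a position,
            # so it survives small edits to the page.
            "type": "TextQuoteSelector",
            "exact": "annotations as first-class Web citizens",
            "prefix": "standards now exist to publish ",
            "suffix": ", independent of any application silo",
        },
    },
}

serialized = json.dumps(annotation, indent=2)
print(serialized)
```

Serialized as JSON-LD like this, the annotation is self-describing Linked Data that any conforming client or repository can consume.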
Article
Full-text available
This paper examines the current way of keeping the data produced during an evaluation campaign of Information Retrieval Systems (IRSs) and highlights some of its shortcomings. In particular, the Cranfield methodology has been designed for creating comparable experiments and evaluating the performance of IRSs rather than for modeling and managing the scientific data produced during an evaluation campaign. The data produced during an evaluation campaign of IRSs are valuable scientific data, and as a consequence, their lineage should be tracked, since it allows us to judge the quality and applicability of information for a given use; those data should be enriched progressively by adding further analyses and interpretations; and it should be possible to cite them and their further elaborations, since this is an effective way of explicitly mentioning and making reference to useful information, of improving cooperation among researchers, and of facilitating the transfer of scientific and innovative results from research groups to the industrial sector.
Conference Paper
Full-text available
Product line software engineering (PLSE) is an emerging software engineering paradigm, which guides organizations toward the development of products from core assets rather than the development of products one by one from scratch. In order to develop highly reusable core assets, PLSE must have the ability to exploit commonality and manage variability among products from a domain perspective. Feature modeling is one of the most popular domain analysis techniques, which analyzes commonality and variability in a domain to develop highly reusable core assets for a product line. Various attempts have been made to extend and apply it to the development of software product lines. However, feature modeling can be difficult and time-consuming without a precise understanding of the goals of feature modeling and the aid of practical guidelines. In this paper, we clarify the concept of features and the goals of feature modeling, and provide practical guidelines for successful product line software engineering. The authors have extensively used feature modeling in several industrial product line projects and the guidelines described in this paper are based on these experiences.
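To make the idea of commonality and variability concrete, here is a toy feature model for an annotation client, with mandatory and optional features and a simple configuration-validity check. This is an illustrative sketch, not from the paper, and all feature names are invented:

```python
# A toy feature model: each feature maps to (parent, kind), where kind is
# "mandatory" (must be selected whenever its parent is) or "optional".
FEATURES = {
    "AnnotationClient": (None, "mandatory"),   # root feature
    "Highlighting":     ("AnnotationClient", "mandatory"),
    "Commenting":       ("AnnotationClient", "mandatory"),
    "Sharing":          ("AnnotationClient", "optional"),
    "AutoComplete":     ("Commenting", "optional"),
}

def is_valid(selection):
    """Check that a product configuration respects the feature model."""
    for feature in selection:
        if feature not in FEATURES:
            return False  # unknown feature
        parent, _ = FEATURES[feature]
        if parent is not None and parent not in selection:
            return False  # a child feature requires its parent
    for feature, (parent, kind) in FEATURES.items():
        if kind == "mandatory" and (parent is None or parent in selection):
            if feature not in selection:
                return False  # a mandatory feature is missing
    return True

print(is_valid({"AnnotationClient", "Highlighting", "Commenting"}))  # True
print(is_valid({"AnnotationClient", "Highlighting"}))                # False: Commenting is mandatory
```

Each valid selection corresponds to one product of the line; a real feature model would also support alternative (XOR) groups and cross-tree constraints.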
Article
Full-text available
Science projects are data publishers. The scale and complexity of current and future science data changes the nature of the publication process. Publication is becoming a major project component. At a minimum, a project must preserve the ephemeral data it gathers. Derived data can be reconstructed from metadata, but metadata is ephemeral. Longer term, a project should expect some archive to preserve the data. We observe that published scientific data needs to be available forever; this gives rise to the data pyramid of versions and to data inflation, where the derived data volumes explode. As an example, this article describes the Sloan Digital Sky Survey (SDSS) strategies for data publication, data access, curation, and preservation.
Article
Annotation is an almost daily activity used by healthcare professionals (HCPs) to analyze patients' records, collaborate, share knowledge, and communicate. These annotations are generated within a healthcare cycle, which likewise represents the life cycle of annotations in the patient record. The exponential increase in the number of medical annotation systems has made it difficult for an HCP to choose a system for a well-defined context (biology, radiology) that matches his or her needs against the functionalities offered by these tools. Therefore, the authors propose two taxonomies to distinguish annotation tools developed by industry and academia over the last two decades. The first classification provides an external vision based on five generic criteria. The second classification is an internal vision that gives an idea of the functionalities offered by these systems. Finally, these unified and integrated classification criteria are used to organize and observe the limitations of 50 medical annotation tools.
Article
Context Clone and Own (CaO) is a widespread approach to generate new software products from existing software products by adding small changes. The Software Product Line (SPL) approach addresses the development of families of products with similar features, moving away from the production of isolated products. Despite the popularity of both approaches, no experiment has yet compared them directly. Objective The goal of this paper is to compare the performance of software engineers in the software product development process when using two different approaches (SPL and CaO). Method We conducted an experiment in the induction hobs software environment with software engineers. This is a single-factor experiment where the factor is the approach used to develop software products, with two treatments (SPL and CaO). We compared the results obtained by the software engineers when developing software products in terms of effectiveness, efficiency, and satisfaction. Results The findings show that: (1) the SPL approach is more efficient even though the number of checking actions required by this approach is greater than the number required by the CaO approach; (2) the SPL approach offers more possibilities than software engineers need to perform their daily tasks; and (3) software engineers require better search capabilities in the CaO approach. The possible explanations for these results are presented in the paper. Conclusions The results show that there are significant differences in effectiveness, efficiency, and satisfaction, with the SPL approach yielding the best results.
Conference Paper
The paper describes a demonstration of pure::variants, a commercial tool for variant and variability management for product lines. The demonstration shows how flexible product line (PL) architectures can be built, tested and maintained by using the modeling and integration capabilities provided by pure::variants. With pure::variants being available for a long time, the demonstration (and the paper) combines both basics of pure::variants, known to parts of the audience, and new capabilities, introduced within the last year.
Conference Paper
Document annotation tools have been widely used in technology-enhanced learning. Through these tools, students can associate annotations with fragments of documents, which enhances the thorough analysis of content and develops meta-reflective thinking. Likewise, annotation tools can facilitate collaborative annotation among students, as well as implement innovative interaction mechanisms. In addition, most tools provide mechanisms for classifying annotations. These mechanisms facilitate the subsequent retrieval of relevant annotations. Therefore, classification mechanisms are essential to facilitating the evaluation of work done by students, promoting collaborative work in annotation communities, etc. In addition, classification mechanisms help students better understand what they should annotate, implicitly guiding them during annotation activities. However, while there are multiple studies focusing on other aspects of annotation tools, the classification aspect has deserved only marginal attention in the literature. This work makes an exploratory study focused on this essential, although somewhat ignored, requirement. As a result, five main approaches to annotation classification are identified: absence of classification mechanisms, classification based on annotation modes, classification by predefined semantic categories, classification based on folksonomies, and classification based on ontologies.
Article
With the rapid growth in the body of scientific data, scientific research depends more and more on finding theories and knowledge from the data, and thus data-intensive scientific discovery has become the fourth paradigm of scientific research. Therefore, it is urgent to develop and adopt methods to support the collection, collation, preservation and utilization of scientific data. This paper provides an overview of scientific data curation research and practices in mainland China. Firstly, it reviews Chinese research articles on data curation and outlines the research status and progress in this area. Secondly, it surveys existing scientific data repositories or platforms in mainland China, and analyzes the gaps between China's and other countries' data curation practices.
Article
In the educational context, and with the emergence of web information technology, the need for annotation tools is increasingly felt, because annotation is a common and ever-present practice. While reading, learners typically comment on, highlight and circle sections to tag digital documents. Therefore, many systems have been developed in learning environments to help learners annotate the different electronic resources they consult. This great variety of annotation systems, usually considered an enrichment of the educational community, reveals the lack of a clear strategy for comparing the annotation systems in the literature according to their features and services, so as to help learners choose an annotation system that enhances learning. As a result, few works have attempted to present comparative studies of these tools. The aim of this article is to provide a study of annotation tools used by learners in educational practice. We therefore present a comparison of the services provided by 40 annotation systems developed by industry and academia during the last decade. As a second contribution, we classify these annotation tools based on transversal features. Finally, the study reveals gaps in the systems and opportunities for further research.
Article
On autopsy, a patient is found to have hypertrophic cardiomyopathy. The patient's family pursues genetic testing that shows a "likely pathogenic" variant for the condition on the basis of a study in an original research publication. Given the dominant inheritance of the condition and the risk of sudden cardiac death, other family members are tested for the genetic variant to determine their risk. Several family members test negative and are told that they are not at risk for hypertrophic cardiomyopathy and sudden cardiac death, and those who test positive are told that they need to be regularly monitored for cardiomyopathy . . .
Conference Paper
Knowledge visualization is a fascinating field of study that has received increasing attention. In the e-Science environment, most attention goes to scientific visualization and data visualization; visualization of information exchange and knowledge sharing is rarely mentioned. If concept maps, a visualization tool, are used to help members of an e-Science environment describe their knowledge structures during communication, they can facilitate thinking together and collective wisdom, and can also promote knowledge sharing and the creation of new knowledge. In this paper, we discuss the application of concept maps and propose an application model for the e-Science environment.
Conference Paper
Collecting and aggregating multimedia knowledge is of fundamental importance for every organisation in order to gain competitiveness and reduce costs. It is possible that knowledge contained in just one medium, e.g. text documents, does not carry the full evidence looked for; therefore, connecting information stored in more than one medium is often required. It is clear that current knowledge management technologies and practices cannot cope with such situations, as they mainly provide simple mechanisms (e.g. keyword searching). Currently, knowledge workers manually piece together the information from different sources. In this report we focus on and envisage research methodologies that will enable the semantic enrichment of multimedia documents, both within multiple media and across media, through annotation.