Valentin Tablan
Ieso Digital Health · Digital futures lab
Doctor of Philosophy
78
Publications
22,744
Reads
5,009
Citations
Publications (78)
Escalating global mental health demand exceeds existing clinical capacity. Scalable digital solutions will be essential to expand access to high-quality mental healthcare. This study evaluated the effectiveness of a digital intervention to alleviate mild, moderate and severe symptoms of generalized anxiety. This structured, evidence-based program c...
Introduction: The immediate impact of COVID-19 on morbidity and mortality has raised the need for accurate and real time data monitoring and communication. The aim of this study is to document initial observations from multiple digital services providers during the COVID-19 crisis, especially those related to mental health and wellbeing. Materials...
Objective: Understanding patient responses to psychotherapy is important in developing effective interventions. However, coding patient language is a resource-intensive exercise and difficult to perform at scale. Our aim was to develop a deep learning model to automatically identify patient utterances during text-based internet-enabled Cognitive Be...
Background
It is increasingly recognized that existing diagnostic approaches do not capture the underlying heterogeneity and complexity of psychiatric disorders such as depression. This study uses a data-driven approach to define fluid depressive states and explore how patients transition between these states in response to cognitive behavioural th...
Importance
Compared with the treatment of physical conditions, the quality of care of mental health disorders remains poor and the rate of improvement in treatment is slow, a primary reason being the lack of objective and systematic methods for measuring the delivery of psychotherapy.
Objective
To use a deep learning model applied to a large-scale...
We introduce and demonstrate the usefulness of a tool that automatically annotates therapist utterances in real-time according to the therapeutic role that they perform in an evidence-based psychological dialogue. This is implemented within the context of an on-line service that supports the delivery of one-to-one therapy. When combined with patien...
Background
Common mental health problems affect a quarter of the population. Online cognitive–behavioural therapy (CBT) is increasingly used, but the factors modulating response to this treatment modality remain unclear.
Aims
This study aims to explore the demographic and clinical predictors of response to one-to-one CBT delivered via the internet...
Semantic search is gradually establishing itself as the next generation search paradigm, which meets a wider range of information needs better than traditional full-text search. At the same time, however, expanding search towards document structure and external, formal knowledge sources (e.g. LOD resources) remains challenging, especiall...
Semantic search over documents is about finding information that is not based just on the presence of words, but also on their meaning [1, 2]. This task is a modification of classical Information Retrieval (IR), but documents are retrieved on the basis of relevance to ontology concepts, as well as words. Nevertheless the basic assumption is quite s...
This paper presents GATE Teamware—an open-source, web-based, collaborative text annotation framework. It enables users to carry out complex corpus annotation projects, involving distributed annotator teams. Different user roles are provided (annotator, manager, administrator) with customisable user interface functionalities, in order to support the...
GWAS AdAPT software.
Dataset S2 contains the GWAS Adjusting Association Priors with Text (AdAPT) software.
(TGZ)
This software article describes the GATE family of open source text analysis tools and processes. GATE is one of the most widely used systems of its type with yearly download rates of tens of thousands and many active users in both academic and industrial contexts. In this paper we report three examples of GATE-based systems operating in the life s...
Cloud computing is increasingly being regarded as a key enabler of the ‘democratization of science’, because on-demand, highly scalable cloud computing facilities enable researchers anywhere to carry out data-intensive experiments. In the context of natural language processing (NLP), algorithms tend to be complex, which makes their parallelization...
Work on GATE has been partly supported by EPSRC grants GR/K25267 (Large-Scale
In this paper we present Teamware, a novel web-based collaborative annotation environment which enables users to carry out complex corpus annotation projects, involving less skilled, cheaper annotators working remotely. It has been evaluated by us through the creation of several gold standard corpora, as well as through external evaluation in comme...
This chapter describes the development of GATE Mímir, a new tool for indexing documents according to multiple paradigms: full text, conceptual model, and annotation structures. We also present a usage example for patent searchers covering measurements and high-level structural information which was automatically extracted from a large patent corpus.
Matrixware, the Information Retrieval Facility and several EU-funded projects (SEKT
When researching new product ideas or filing new patents, inventors need to retrieve all relevant pre-existing know-how and/or to exploit and enforce patents in their technological domain. However, this process is hindered by lack of richer metadata, which if present, would allow more powerful concept-based search to complement the current keyword-...
Controlled Language (CL) for Ontology Editing tools offer an attractive alternative for naive users wishing to create ontologies, but they are still required to spend time learning the correct syntactic structures and vocabulary in order to use the Controlled Language properly. This paper extends previous work (CLOnE) which uses standard NLP tools...
Accessing structured data such as that encoded in ontologies and knowledge bases can be done using either syntactically complex formal query languages like SPARQL or complicated form interfaces that require expensive customisation to each particular application domain. This paper presents the QuestIO system – a natural language interface for access...
Accessing structured data in the form of ontologies requires training and learning formal query languages (e.g., SeRQL or SPARQL) which poses significant difficulties for non-expert users. One of the ways to lower the learning overhead and make ontology queries more straightforward is through a Natural Language Interface (NLI). While there are exis...
The need for efficient corpus indexing and querying arises frequently both in machine learning-based and human-engineered natural language processing systems. This paper presents the ANNIC system, which can index documents not only by content, but also by their linguistic annotations and features. It also enables users to formulate versatile que...
This paper will present the contribution of the European PrestoSpace project to the study and development of a Metadata Access and Delivery (MAD) platform for multimedia and television broadcast archives. The MAD system aims at generating, validating and delivering to archive users metadata created by automatic and semi-automatic information extra...
Introduction; Information Extraction: A Brief Introduction; Semantic Annotation; Applying ‘Traditional’ IE in Semantic Web Applications; Ontology-based IE; Deterministic Ontology Authoring using Controlled Language IE; Conclusion; References
Sumerian is a long-extinct language documented throughout the ancient Middle East, arguably the first language for which we have written evidence, and is a language isolate (i.e. no related languages have so far been identified). The Electronic Text Corpus of Sumerian Literature (ETCSL), based at the University of Oxford, aims to make accessible on...
The Rich News system, which can automatically annotate radio and television news with the aid of resources retrieved from the World Wide Web, is described. Automatic speech recognition gives a temporally precise but conceptually inaccurate annotation model. Information extraction from related web news sites gives the opposite: conceptual accuracy bu...
In recent years, following the rapid development in the Semantic Web and Knowledge Management research, ontologies have become more in demand in Natural Language Processing. An increasing number of systems use ontologies either internally, for modelling the domain of the application, or as data structures that hold the output resulting from the wor...
PrestoSpace is a European-funded research project that aims at addressing the problem of decaying audio-visual archives throughout Europe by means of digitisation for preservation and access. One of the work areas within the project is Metadata Access and Delivery (MAD) which employs innovative methods of generating metadata for the digitised media...
In this paper we present recent work on GATE, a widely-used framework and graphical development environment for creating and deploying Language Engineering components and resources in a robust fashion. The GATE architecture has facilitated the development of a number of successful applications for various language processing tasks (such as...
Digital libraries strive to add value to the collections they create and maintain. One way is through selectivity: a carefully chosen set of authoritative documents in a particular topic area is far more useful to those working in the area than a huge, unfocused collection (like the Web). Another is by augmenting the collection with high-quality met...
Legacy data in many mature descriptive sciences is distributed across multiple text descriptions. The challenge is both to extract this data, and to correlate it once extracted. The MultiFlora system does this using an established Information Extraction system tuned to the domain of botany and integrated with a formal ontology to structure and sto...
This paper describes a robust and easily adaptable system for named entity recognition from a variety of different text types. Most information extraction systems need to be customised according to the domain, either by collecting a large set of training data or by rewriting grammar rules, gazetteer lists etc., both of which methods can be costly a...
This technical report discusses a number of areas in Software Architecture for Language Engineering (SALE, [CFB94, Cun94, Cun99, Cun00]) and specifically the General Architecture for Text Engineering [CGW95, CHGW97, CGHW99, MCB00, BBR00, Cun02]:
This paper describes the rapid adaptation for surprise languages of a flexible and robust Information Extraction system based on GATE, a portable Natural Language Processing infrastructure. Our experiences show that even without a native speaker and in the absence of training data, we can quickly customize the system to a new language. We adapted t...
In this paper we describe an experiment to adapt a named entity recognition system from English to Cebuano as part of the TIDES surprise language program. With 4 person-days of effort, and with no previous knowledge of which language would be involved, no knowledge of the language in question once it was announced, and no training data available, w...
This paper reports work aimed at developing an open, distributed learning environment, OLLIE, where researchers can experiment with different Machine Learning (ML) methods for Information Extraction. Once the required level of performance is reached, the ML algorithms can be used to speed up the manual annotation process. OLLIE uses a browser cl...
We compare the potential of two classes of linear and hierarchical models of discourse to determine co-reference links and resolve anaphors. The comparison uses a corpus of thirty texts, which were manually annotated for co-reference and discourse structure.
Our experiments show that applying known IE techniques to independent parallel texts describing the same information and merging the results brings significant improvements in performance. Recall summed over six botanical descriptions of several plant species is more than triple the average for each text individually. We analyse these results,...
In this paper we present GATE, an architecture and a graphical development environment which enables users to develop and deploy HLT applications in a robust fashion. GATE also provides reusable, extendable, and customisable language processing modules (e.g., part of speech tagger, named entity recognition grammars), which combined with the extens...
We discuss robustness in LE systems from the perspective of engineering, and the predictability of both outputs and construction process that this entails. We present an architectural system that contributes to engineering robustness and low-overhead systems development (GATE, a General Architecture for Text Engineering). To verify our ideas we pre...
In this paper we present GATE, a framework and graphical development environment which enables users to develop and deploy language engineering components and resources in a robust fashion. The GATE architecture has enabled us not only to develop a number of successful applications for various language processing tasks (such as Information Extracti...
Current research in Information Extraction tends to be focused on application-specific systems tailored to a particular domain. The Muse system is a multi-purpose Named Entity recognition system which aims to reduce the need for costly and time-consuming adaptation of systems to new applications, with its capability for processing texts from widely...
In this paper we argue that the GATE architecture and visual development environment can be used as an effective tool for teaching language engineering and computational linguistics.
Contents: 1 Introduction; 1.1 How to Use This Text; 1.2 Context; 1.3 Overview; 1.4 Structure of the Book; ...
Keywords: named entities, coreference chains, anaphora resolution. Abstract: In this article we focus on the shallow methods for anaphora resolution and coreference chain construction that we have developed as modules of the system for ext...
Contents: 1 Tokeniser; 2 Gazetteer; 3 Grammar; 3.1 Use of Context; 3.2 Use of Priority; 4 Walkthrough example; 4.1 Step 1 - Tokenisation; 4.2 Step 2 - List Lookup...
Contents: 1 Grammar of JAPE; 2 Relation to CPSL; 3 Algorithms for JAPE Rule Application; 3.1 The first algorithm; 3.2 Algorithm 2; 4 Label Binding Scheme; 5 Classes; 6 Implementation; 6.1 A Walk-Through; ...
GATE, a General Architecture for Text Engineering, aims to provide a software infrastructure for researchers and developers working in NLP. GATE has now been widely available for four years. In this paper we review the objectives which motivated the creation of GATE and the functionality and design of the current system. We discuss the strengths an...
This paper presents a taxonomy of previous work on infrastructures, architectures and development environments for representing and processing Language Resources (LRs), corpora, and annotations. This classification is then used to derive a set of requirements for a Software Architecture for Language Engineering (SALE). The analysis shows that a SAL...
GATE, a General Architecture for Text Engineering, aims to provide a software infrastructure for researchers and developers working in NLP. GATE has now been widely available for four years. In this paper, we review the objectives which motivated the creation of GATE and the functionality and design of the current system. We describe some of the wa...
We compare the potential of two classes of linear and hierarchical models of discourse to determine co-reference links and resolve anaphors. The comparison uses a corpus of thirty texts, which were manually annotated for co-reference and discourse structure. 1 Introduction Most current anaphora resolution systems implement a pipeline architecture w...
This paper presents a controlled language for ontology editing and a software implementation, based partly on standard NLP tools, for processing that language and manipulating an ontology. The input sentences are analysed deterministically and compositionally with respect to a given ontology, which the software consults in order to interpret the in...
Knowledge Acquisition through Semantic Annotation is vital to the evolution, growth and success of the Semantic Web. Both Semi-automatic and Manual Annotation are constricted by a knowledge acquisition bottleneck. Manual Semantic Annotation is a complex and arduous task both time-consuming and costly, often requiring specialist annotators. There-...
Rich News, a system that augments news broadcasts with textual content, is described. The system identifies individual stories in news broadcasts, and annotates them with related content from the World Wide Web. The web content is subsequently semantically analysed, and used to produce summary information for each news story. This content can then...
In this paper we will present the new collaborative corpus annotation facilities, recently developed as part of the GATE language engineering tools and infrastructure. These facilities have been used to build OLLIE -- a client-server application that allows users to use the collaborative corpus annotation facilities in their own Web browser.
A web services based architecture for Language Resources utilizing existing technology such as XML, SOAP, WSDL and UDDI is presented. The web services architecture creates a pervasive information infrastructure that enables straightforward access to two kinds of Language Resources: traditional information sources and language processing resources....
The Rich News system for semantically annotating television news broadcasts and augmenting them with additional web content is described. Online news sources were mined for material reporting the same stories as those found in television broadcasts, and the text of these pages was semantically annotated using the KIM knowledge management platform....
NLP infrastructures with comprehensive multilingual support can substantially decrease the overhead of developing Information Extraction (IE) systems in new languages by offering support for different character encodings, language-independent components, and clean separation between linguistic data and the algorithms that use it. This paper will...
EU-IST Strategic Targeted Research Project (STREP) IST-2004-026460 TAO Deliverable D6.2 (WP6). This deliverable addresses ontology learning and content augmentation applied to software code, documentation, and other artefacts. First, it focuses on elicitation of a domain ontology to represent the concepts treated by GATE components (application of...
This paper presents a Romanian Named Entity recognition system which was developed by reusing and extending IE components developed for English, as part of the MUSE IE system. The system was evaluated on a corpus of diverse text types – religion, news, and fiction. Both the system and the corpus are freely available and were developed using GATE...