Galia Angelova

  • PhD, Doctor of Sciences
  • Head of Department at Bulgarian Academy of Sciences

About

  • Publications: 108
  • Reads: 72,312
  • Citations: 2,888
Current institution
Bulgarian Academy of Sciences
Current position
  • Head of Department

Publications (108)
Article
Full-text available
Electrical energy management of an organization is related to the appropriate planning of required payments and allocation of resources accordingly. It is important for good management to account for past costs and to forecast future costs. An intelligent solution for the mathematical formalization of this process is presented in the study. The res...
Preprint
Full-text available
We present bgGLUE (Bulgarian General Language Understanding Evaluation), a benchmark for evaluating language models on Natural Language Understanding (NLU) tasks in Bulgarian. Our benchmark includes NLU tasks targeting a variety of NLP problems (e.g., natural language inference, fact-checking, named entity recognition, sentiment analysis, question...
Preprint
Full-text available
We present the system we built for participating in SemEval-2016 Task 3 on Community Question Answering. We achieved the best results on subtask C, and strong results on subtasks A and B, by combining a rich set of various types of features: semantic, lexical, metadata, and user-related. The most important group turned out to be the metadata for th...
Article
Full-text available
Aims: To evaluate the expected life expectancy in patients with diabetes in Bulgaria and to compare it to the expected life expectancy of the non-diabetic population in the country. Methods: It is a retrospective observational population study on individuals diagnosed with diabetes, compared to the non-diabetic population in Bulgaria for the period...
Chapter
Persons with visual impairments have difficulty working with graphical computer interfaces, such as Windows icons. They also lack access to objects of cultural and historical heritage such as paintings, tapestries, icons, etc. The chapter presents an approach for providing visual Braille services by 3D digitization of planar or spatial obje...
Preprint
We present the construction of an annotated corpus of PubMed abstracts reporting about positive, negative or neutral effects of treatments or substances. Our ultimate goal is to annotate one sentence (rationale) for each abstract and to use this resource as a training set for text classification of effects discussed in PubMed abstracts. Currently,...
Conference Paper
This paper presents experiments in risk factors analysis based on clinical texts and related Linked Open Data. Enhancements with additional data sources can enrich patient data and allow for a deeper investigation of correlations. In order to explore the potential of this approach several experiments were run on data collections, extracted from a l...
Article
Full-text available
The study investigates the quality of diabetes control and its economic implications in Bulgaria for the years 2012–2016. It is a retrospective study of the national diabetes register. Patients were categorized according to type of diabetes, gender, newly diagnosed cases per year, body-mass index (BMI) and achieved disease control. The relative ris...
Article
Full-text available
Aim: Incretins [dipeptidyl peptidase-4 inhibitors (DPP-4i) and glucagon-like peptide 1 RA (GLP-1 RA)] and sodium-glucose cotransporter-2 inhibitors (SGLT-2i) groups are now routinely used for type 2 diabetes therapy and comprise a large number of medicinal products. The long term therapeutic and economic effect of the incretins’ and SGLT-2i in real l...
Chapter
Automatic identification of intended tag meanings is a challenge in large image collections where human authors assign tags inspired by emotional or professional motivations. Algorithms for automatic tag disambiguation need “golden” collections of manually created tags to establish baselines for accuracy assessment. Here we show how to use the MIRF...
Chapter
Full-text available
We present the construction of an annotated corpus of PubMed abstracts reporting about positive, negative or neutral effects of treatments or substances. Our ultimate goal is to annotate one sentence (rationale) for each abstract and to use this resource as a training set for text classification of effects discussed in PubMed abstracts. Currently,...
Chapter
This paper discusses the need of building diabetic registers in order to monitor the disease development and assess the prevention and treatment plans. The automatic generation of a nation-wide Diabetes Register in Bulgaria is presented, using outpatient records submitted to the National Health Insurance Fund in 2010–2014 and updated with data from...
Article
Full-text available
Background: Studying comorbidities of disorders is important for detection and prevention. For discovering frequent patterns of diseases we can use retrospective analysis of population data, by filtering events with common properties and similar significance. Most frequent pattern mining methods do not consider contextual information about extract...
Conference Paper
Full-text available
We describe a method which extracts Association Rules from texts in order to recognise verbalisations of risk factors. Usually some basic vocabulary about risk factors is known but medical conditions are expressed in clinical narratives with much higher variety. We propose an approach for data-driven learning of specialised medical vocabulary which...
Conference Paper
Full-text available
In this paper we describe annotation process of clinical texts with morphosyntactic and semantic information. The corpus contains 1,300 discharge letters in Bulgarian language for patients with Endocrinology and Metabolic disorders. The annotated corpus will be used as a Gold standard for information extraction evaluation of test corpus of 6,200 di...
Conference Paper
The main goal of this research is to identify and extract risk factors for Diabetes Mellitus. The data source for our experiments are 8 mln outpatient records from the Bulgarian Diabetes Registry submitted to the Bulgarian Health Insurance Fund by general practitioners and all kinds of professionals during 2014. In this paper we report our work on...
Conference Paper
This paper presents an approach for translation of tags in professional and social image databases, using an original lexical resource extracted from Wikipedia. The translation integrates a tag sense disambiguation algorithm based on WordNet and Wikipedia (as external resources defining word senses). Our disambiguation technique uses the Lesk algor...
Conference Paper
The paper presents a statistical exploration of the use of resources in Bulgarian educational site UCHA.SE based on the user logs and information on students’ interactions stored directly in the site database. This research aims at revealing gaps between the demand and supply that suggest possible improvement of the content and help identifying gro...
Chapter
This paper presents results of ongoing project for discovering complex temporal relations between disorders and their treatment. We propose a cascade data mining approach for frequent pattern and sequence mining. The main difference from the classical methods is that instead of applying separately each method we reuse and extend the result prefix t...
Chapter
This paper presents an approach for word sense disambiguation (WSD) of image tags from professional and social image databases without categorial labels, using WordNet as an external resource defining word senses. We focus on the resolution of lexical ambiguity that arises when a given keyword has several different meanings. Our approach combines s...
Chapter
As part of the EC FP7 project “AComIn: Advanced Computing for Innovation”, which focuses on transferring innovative technologies to Bulgaria, we have applied educational data mining to the most popular Bulgarian K-12 educational web portal, UCHA.SE. UCHA.SE offers interactive instructional materials—videos and practice exercises—for all K-12 subjec...
Book
This volume is a selected collection of papers presented and discussed at the International Conference “Advanced Computing for Innovation (AComIn 2015)”. The Conference was held on 10-11 November 2015 in Sofia, Bulgaria and was aimed at providing a forum for international scientific exchange between Central/Eastern Europe and the rest of t...
Article
Full-text available
Digital Libraries (DL) are offering access to a vast amount of digital content, relevant to practically all domains of human knowledge, which makes it suitable to enhance teaching and learning. Based on a systematic literature review, this article provides an overview and a gap analysis of educational use of DLs.
Article
Full-text available
This paper presents the results of an on-going research project for knowledge extraction from large corpora of clinical narratives in Bulgarian language, approximately 100 million of outpatient care notes. Entities with numerical values are mined in the free text and the extracted information is stored in a structured format. The Algorithms for ret...
Conference Paper
Full-text available
In this paper we present an approach for analysis of sentiments and emotions in image tagging using SentiWordNet as an external linguistic resource of emotional words. Our aim is to design and implement algorithms that assess the emotions and polarity given a set of image tags. The approach is not limited to object analysis only (considering inform...
Conference Paper
This paper describes research aiming to refine the tags that are assigned automatically to images by an industrial auto-tagging system (Imagga). The present annotation contains English keyword tags proposed by the original auto-tagging algorithms which assign to each image a set of keywords corresponding to shapes and colors that are recognized in...
Article
Full-text available
The world is embracing an open education model. The success of this process implies an adequate awareness, an assumption that is inconsistent with recent reports and statistical facts. Despite major advances in recent years, Open Educational Resources (OER) are still not in the mainstream of Computer Science course development. Motivated by the nee...
Article
Full-text available
While gamification is gaining ground in business, marketing, corporate management, and wellness initiatives, its application in education is still an emerging trend. This article presents a study of the published empirical research on the application of gamification to education. The study is limited to papers that discuss explicitly the effects of...
Article
Full-text available
Learning is a goal driven social activity determined by motivational factors. To be able to efficiently gamify learning for improved student motivation and engagement, the educators have to understand the related aspects studied in games, motivational psychology and pedagogy. This will help them to identify the factors that drive and explain desire...
Conference Paper
This paper presents a research project integrating language technologies and a business intelligence tool that help to discover new knowledge in a very large repository of patient records in Bulgarian language. The ultimate project objective is to accelerate the construction of the Register of diabetic patients in Bulgaria. All the information need...
Conference Paper
Full-text available
Sublanguages are varieties of language that form "subsets" of the general language, typically exhibiting particular types of lexical, semantic, and other restrictions and deviance. SubCAT, the Sublanguage Corpus Analysis Toolkit, assesses the representativeness and closure properties of corpora to analyze the extent to which they are either sublang...
Article
Full-text available
Sublanguages are varieties of language that form "subsets" of the general language, typically exhibiting particular types of lexical, semantic, and other restrictions and deviance. SubCAT, the Sublanguage Corpus Analysis Toolkit, assesses the representativeness and closure properties of corpora to analyze the extent to which they are either sublang...
Conference Paper
Full-text available
Patent search is an important information retrieval problem in scientific and business research. Semantic search would be a large improvement to current technologies, but requires some insight into the language of patents. In this article we test the fit of the language of patents to the sublanguage model, focussing on closure properties. The resea...
Conference Paper
Full-text available
Sublanguages are specialized genres of language associated with specific domains and document types. When sublanguages can be recognized and adequately characterized, they are useful for a variety of types of natural language processing applications. Although there are sublanguage studies related to languages other than English, all previous wor...
Article
Full-text available
Online learning is one of the fastest growing trends in Technology-Enhanced Learning (TEL). Technology in combination with an instruction that addresses the cognitive and social processes of knowledge construction could offer more diverse and effective online learning opportunities than their face-to-face counterparts. In this review we attempt to...
Conference Paper
Natural Language Processing (NLP) has been viewed as a promising technology in medical informatics for decades. Despite the gradually improving quality of automatic text analysis, however, clinical NLP systems are still rarely used outside research labs for the following reasons: (i) their development is very expensive so most of them are...
Conference Paper
This paper discusses the notion of medical archetype and the manner how the archetype elements are documented in hospital patient records. This is done by interpreting the archetypes as information extraction templates in automatic text analysis of clinical narratives. The extensive extraction experiments performed over thousands of anonymous disch...
Article
Experiments in automatic analysis of free texts in Bulgarian hospital discharge letters are presented. Natural Language Processing (NLP) has been applied to medical texts for decades, but high-quality results have been demonstrated only recently. The progress in automatic text analysis opens new directions for secondary use of Electronic Health Re...
Article
This article presents a feasibility study for retrieving Wikipedia articles matching patents' topics. The long term motivation behind it is to facilitate patent search by enriching patent indexing with relevant keywords found in external (terminological) resources, with their monolingual synonyms and multilingual translations. The similarity betwee...
Conference Paper
This paper presents a research prototype for temporal event information extraction from hospital discharge letters in Bulgarian. An algorithm for extraction of primitive events automatically sets markers for patients' complaints, drug treatment and diagnoses with precision about 90%. Specific domain knowledge is further used to generate compound ev...
Conference Paper
This demo presents Information Extraction from discharge letters in Bulgarian language. The Patient history section is automatically split into episodes (clauses between two temporal markers); then drugs, diagnoses and conditions are recognised within the episodes with accuracy higher than 90%. The temporal markers, which refer to absolute or relat...
Conference Paper
Full-text available
The article presents research in secondary use of information about medical entities that are automatically extracted from the free text of hospital patient records. To capture patient diagnoses, drugs, lab data and status, four extractors that analyse Bulgarian medical texts have been developed. An integrated repository, which comprises the extrac...
Conference Paper
This article discusses current results in automatic Information Extraction (IE) of temporal markers from hospital Patient Records (PRs) texts. The aim is to construct a temporal sequence of important facts about phases in disease development, by recognising the main events that are described in the anamnesis (case history). We consider the conceptu...
Conference Paper
Full-text available
To automatically analyse medical narratives, one needs linguistic and conceptual resources which support capturing of important information from texts and its representation in a structured way. Thus the conceptual structures encoding domain concepts and relations are crucial for the development of reliable and high-performance information extracti...
Chapter
This paper presents experiments in automatic Information Extraction of medication events, diagnoses, and laboratory tests from hospital patient records, in order to increase the completeness of the description of the episode of care. Each patient record in our hospital information system contains structured data and text descriptions, including ful...
Article
This paper presents experiments in automatic Information Extraction of medication events, diagnoses, and laboratory tests from hospital patient records, in order to increase the completeness of the description of the episode of care. Each patient record in our hospital information system contains structured data and text descriptions, including ful...
Article
Full-text available
Information Extraction (IE) from medical texts aims at the automatic recognition of entities and relations of interests. IE is based on shallow analysis and considers only sentences containing important words. Thus IE of drugs from discharge letters can identify as 'current' some past or future medication events. This article presents heuristic obs...
Article
Full-text available
This article describes the automatic processing of medical texts in order to extract important patient characteristics, thus turning the free text description into a structured internal representation. Shallow text analysis is implemented due to the medical language complexity. The paper sketches the information extraction process and discusses the...
Conference Paper
In this article we present a text analysis system designed to extract key information from clinical text in Bulgarian language. Using shallow analysis within an Information Extraction (IE) approach, the system builds structured descriptions of patient status, disease duration, complications and treatments. We discuss some particularities of the med...
Conference Paper
Domain knowledge is essential resource in Information Extraction (IE) from free text since it supports the decisions about structuring the extracted text objects into domain statements. Thus manually-created conceptual structures enable the semantic representation of textual information. This paper discusses the role of domain knowledge in informat...
Article
Full-text available
This article describes the automatic processing of medical texts in order to extract important patient characteristics, thus turning the free text description into a structured internal representation. Shallow text analysis is implemented due to the medical language complexity. The paper sketches the information extraction process and discusses the...
Article
Full-text available
The paper discusses an Information Extraction approach, which is applied for the automatic processing of hospital Patient Records (PRs) in Bulgarian language. The main task reported here is retrieval of status descriptions related to anatomical organs. Due to the specific telegraphic PR style, the approach is focused on shallow analysis. Missing te...
Conference Paper
This paper presents the general framework and the current results of a project that aims to develop a system for knowledge discovery and extraction from the texts of Electronic Health Records in Bulgarian language. The proposed hybrid approach integrates language technologies and conceptual processing. The system generates conceptual graphs encodin...
Article
This paper considers the conceptual primitives in the constructions of domain ontologies which are designed to support foreign language terminology learning. Conceptual hierarchies with integrated natural and role types are formally defined. The proposal is to distinguish explicitly the role types and to provide user interfaces for browsing the und...
Conference Paper
Full-text available
We propose a representation of simple conceptual graphs with binary conceptual relations, which is based on finite-state automata. The representation enables the calculation of injective projection as a two-stage process: off-line calculation of the computationally-intensive subsumption checks and encoding of the results as a minimal finite-state a...
Article
Full-text available
This paper introduces an encoding of knowledge representation statements as regular languages and proposes a two-phase approach to processing of explicitly declared conceptual information. The idea is presented for the simple conceptual graphs where conceptual pattern search is implemented by the so-called projection operation. Projection calcu...
Conference Paper
Full-text available
Smart applications behave intelligently because they understand at least partially the context where they operate. To do this, they need not only a formal domain model but also formal descriptions of the data they process and their own operational behaviour. Interoperability of smart applications is based on formalised definitions of all their data...
Conference Paper
Full-text available
There are many tools supporting the visualisation of semantic content. This is due to the fact that end users are involved in complex tasks of annotation and/or search of data using semantic features, so they need guidance and friendly interfaces to navigate through complex hyperspaces and to maintain semantic annotations. A variety of approaches v...
Chapter
To summarise, in CGLex (i) the system controls the user input to ensure consistent syntax and can provide context-sensitive help; (ii) all encoded facts can be tracked by the NL comments associated with each CG. Therefore, the user can think in terms of encoded facts as well as conceptual structures, i.e., can create a library of facts. Such an app...
Chapter
The paper discusses a prototype module for on-line checking of term consistency in a workbench for knowledge-based Machine Aided Human Translation (MAHT). We present the linguistic resources and the knowledge base (KB) of the system as well as their place in the processes. To discover missing or misleading translations, the checker relies on the le...
Conference Paper
Full-text available
This paper overviews and analyses the on-going research attempts to apply language technologies to automatic ontology acquisition. At first glance there are many successful approaches in this very hot field. However, most of them aim at the extraction of named entities as well as draft taxonomies and partonomies. Only few attempts exist for enrichi...
Article
Full-text available
Research report of the ProLearn Network of Excellence (IST 507310), Deliverable 1.1
Conference Paper
This paper presents a knowledge-based approach to eLearning, where the domain ontology plays a central role as a resource structuring the learning content and supporting flexible adaptive strategies for navigation through it. The content is oriented to computer aided language learning of English financial terminology. Domain knowledge is acquired...
Conference Paper
Full-text available
Internet content today is about 80% text-based. No matter static or dynamic, the information is encoded and presented as multilingual, unstructured natural language text pages. As the Semantic Web aims at turning Internet into a machine-understandable resource, it becomes important to consider the natural language content and to assess the feasibil...
Article
Full-text available
We consider in depth the semantic analysis in learning systems as well as some information retrieval techniques applied for measuring the document similarity in eLearning. These results are obtained in a CALL project, which ended by extensive user evaluation. After several years spent in the development of CALL modules and prototypes, we think that...
Conference Paper
Full-text available
The paper presents on-going work towards deeper understanding of the factors influencing the performance of the Latent Semantic Analysis (LSA). Unlike previous attempts that concentrate on problems such as matrix elements weighting, space dimensionality selection, similarity measure etc., we primarily study the impact of another, often neglected, b...
Conference Paper
Full-text available
A system for recognition and morphological classification of unknown German words is described. Given raw texts it outputs a list of the unknown nouns together with hypotheses about their possible stems and morphological class(es). The system exploits both global and local information as well as morphological properties and external linguistic kn...
Conference Paper
This paper deals with Natural Language (NL) question-answering to knowledge bases (KB). It considers the usual conceptual graphs (CG) approach for NL semantic interpretation by joins of canonical graphs and compares it to the computational linguistics approach for NL question-answering based on logical forms. After these theoretical considerations,...
Conference Paper
Full-text available
This paper presents the design, implementation and some original features of a Web-based learning environment - STyLE (Scientific Terminology Learning Environment). STyLE supports adaptive learning of English terminology with a target user group of non-native speakers. It attempts to improve Computer-Aided Language Learning (CALL) by intelligent...
Conference Paper
Full-text available
Automatic extraction of formal knowledge specifications from Natural Language (NL) text is a challenging research area. Currently the task is considered feasible for restricted NL input only. A number of CG researchers approached the problem, applying Sowa's algorithm for analysis of NL input by joins of canonical graphs. This paper summarizes the s...
Article
Full-text available
Building advanced CALL systems is a challenge; no universal solutions are attained so far regarding the most desired features of intelligent CALL like learner-system communication in Natural Language (NL), adequate processing of information about the learner's semantic errors, and adaptive strategies for choice of relevant tutoring materials. This...
Conference Paper
This paper presents the design and currently elaborated components in the knowledge-based learning environment called STyLE. It supports learning of English terminology in the domain of finances with a target user group of non-native English speakers. The components elaborated so far allow for the discussion of the Web-based learning environm...
Conference Paper
Many AI systems define, store, and manipulate a user model (UM) by knowledge representation means different or separate from the system’s knowledge base (KB). This paper describes a UM strategy in a system for generation of NL explanations. The idea is to track the user’s requests and to modify the declarative patterns for information retrieval by...
Article
The paper considers the problem of classifying countable-uncountable entities during the process of Knowledge Acquisition (KA) from texts. Since one of the main goals of KA is to identify types, means to distinguish new types, instances and individuals become particularly important. We review briefly related studies to show that the distinction co...
Article
This paper presents some research results and a demo implementation of a knowledge-based Machine Aided Translation (MAT) system supporting the translation process with the necessary linguistic and conceptual knowledge. Conceptual Graphs (CGs) were chosen as a knowledge representation formalism since they provide formal structures and operations s...
Conference Paper
This paper summarises principles of manual acquisition of conceptual graphs which evolved within the framework of a natural language processing system and are now enriched and elaborated to facilitate the construction of a larger knowledge base. Our conventions provide the mapping between language structures (at syntactic and semantic levels) and c...
Article
Full-text available
Successful user-friendly interfaces will enable the application of knowledge based techniques in systems oriented towards end users who are not specialised in computer science. This paper discusses an approach to knowledge based Machine Aided Translation (MAT) which provides a user-friendly interface to Knowledge Bases (KB) of conceptual graphs. I...
Conference Paper
Full-text available
This paper discusses an innovative approach to knowledge based Machine Aided Translation (MAT) where the translator is supported by a user-friendly environment providing linguistic and domain knowledge explanations. Our project aims at integration of a Knowledge Base (KB) in a MAT system and studies the integration principles as well as the intern...
Conference Paper
Full-text available
Lexicalized Tree Adjoining Grammar (LTAG) is an attractive formalism for linguistic description mainly because of its extended domain of locality and its factoring recursion out from the domain of local dependencies (Joshi, 1985, Kroch and Joshi, 1985, ...
Conference Paper
a system for man-computer dialogue in natural language. The system is being elaborated at the Laboratory of Mathematical Linguistics at the Institute of Mathematics with Computer Center of the Bulgarian Academy of Sciences. The described system requires: I. A formal description of the syntax of basic nuclear structures of the natural language sente...
Chapter
The organization of modern society calls for an efficient operating with a large amount of law information. The foundations of the system SPRINT, designed at the Laboratory of Math. Linguistics, reflect the theoretical concepts about the structure of the socialist legal norm system. The formalized legal information is processed by the system SPRINT...
Article
Full-text available
This paper deals with the extraction of medical information from hospital patient records. It proposes a cascade approach for the extraction of multi-layer knowledge statements because the subject is too complex. We sketch the Information Extraction view to text analysis, where patient-related facts are recognised using predefined regular expressi...
Article
Full-text available
Most translations are needed for technical documents in specific domains, and often the domain knowledge available to the translator is crucial for the efficiency and quality of the translation task. Our project aims at the investigation of a MAT-paradigm where the human user is supported by linguistic as well as by subject informa...
Article
This paper presents a computationally efficient approach to multilingual NL generation (NLG) from a knowledge base (KB) of Conceptual Graphs (CG). The NLG module is integrated in a Machine-Aided Translation prototype providing interactive explanations of domain knowledge for end users and was developed in the DB-MAT and DBR-MAT projects.
