About
Publications: 21
Reads: 1,367
Citations: 98
Publications (21)
Objective:
Systematic reviews form the basis of evidence-based medicine but are expensive and time-consuming to produce. To address this burden, we have developed a literature identification system (Pythia) that combines the query formulation and citation screening steps.
Study design:
Pythia incorporates a set of natural-language questions with...
We study the effect of seven data augmentation (DA) methods in factoid question answering, focusing on the biomedical domain, where obtaining training instances is particularly difficult. We experiment with data from the BioASQ challenge, which we augment with training instances obtained from an artificial biomedical machine reading comprehension d...
Science, technology and innovation (STI) policies have evolved in the past decade. We are now progressing towards policies that are more aligned with sustainable development through integrating social, economic and environmental dimensions. In this new policy environment, the need to keep track of innovation from its conception in Science and Rese...
Question answering (QA) systems for large document collections typically use pipelines that (i) retrieve possibly relevant documents, (ii) re-rank them, (iii) rank paragraphs or other snippets of the top-ranked documents, and (iv) select spans of the top-ranked snippets as exact answers. Pipelines are conceptually simple, but errors propagate from...
Background: The typical approach to literature identification involves two discrete and successive steps: (i) formulating a search strategy (i.e., a set of Boolean queries) and (ii) manually identifying the relevant citations in the corpus returned by the query. We have developed a literature identification system (Pythia) that combines the query f...
We introduce BIOMRC, a large-scale cloze-style biomedical MRC dataset. Care was taken to reduce noise, compared to the previous BIOREAD dataset of Pappas et al. (2018). Experiments show that simple heuristics do not perform well on the new dataset, and that two neural MRC models that had been tested on BIOREAD perform much better on BIOMRC, indicat...
We present the submissions of AUEB to the BioASQ 7 document and snippet retrieval tasks (parts of Task 7b, Phase A). Our systems build upon the methods we used in BioASQ 6. This year we also experimented with models that jointly learn to retrieve documents and snippets, as opposed to using separate pipelined models for document and snippet retrieva...
In this paper we describe our participation in Task 1 of CL-SciSumm 2019. The task is on automatic paper summarization in the research area of Computational Linguistics. Our approach is a two-step binary sentence pair classification between the so-called citances and candidate sentences. Firstly, we classify sentences in the abstracts to pre...
Network Embedding (NE) methods, which map network nodes to low-dimensional feature vectors, have wide applications in network analysis and bioinformatics. Many existing NE methods rely only on network structure, overlooking other information associated with the nodes, e.g., text describing the nodes. Recent attempts to combine the two sources of in...
We present AUEB's submissions to the BioASQ 6 document and snippet retrieval tasks (parts of Task 6b, Phase A). Our models use novel extensions to deep learning architectures that operate solely over the text of the query and candidate document/snippets. Our systems scored at the top or near the top for all batches of the challenge, highlighting th...
In this paper, we describe a hierarchical bi-directional attention-based Recurrent Neural Network (RNN) as a reusable sequence encoder architecture, which is used as a sentence and document encoder for document classification. The sequence encoder is composed of two bi-directional RNNs equipped with an attention mechanism that identifies and captures...
We present a personalized ingredient-based Deep Learning recommender for the food domain that exploits ingredients and nutrition information to create recipe representations and propose a more personalized and healthier meal to every user. The recommender will be a critical component in our Meal Prediction Tool (MPT) designed with a focus on the pe...
Citizens are shaping their food preferences and expressing their food experiences on a daily basis, reflecting their way of living, culture and well-being. In this paper, we focus on food perceptions and experiences in the context of smart citizen and tourist sensing. We analyze Foursquare user reviews about food-related points of interest in ten E...
In this paper, we describe our submission to the "Document Triage Task" of the BioCreative VI Precision Medicine Track, in which we ranked first among ten teams. The submitted system is a Hierarchical Bidirectional Attention-Based Recurrent Neural Network (RNN). Our approach utilizes the hierarchical nature of documents, which are composed of sequ...
We present a method to classify fixed-duration windows of speech as expressing anger or not, which does not require speech recognition, utterance segmentation, or separating the utterances of different speakers and can, thus, be easily applied to real-world recordings. We also introduce the task of ranking a set of spoken dialogues by decreasing pe...