About
13
Publications
780
Reads
111
Citations
Publications (13)
Medical data processing has found a new dimension with the extensive use of machine-learning techniques to classify and extract features. Machine learning strongly benefits from computing accelerators. However, such accelerators are not easily available at hospital premises, although they can be easily found on public cloud infrastructures or resea...
Word recognition is a challenging task faced by many applications, especially in very noisy scenarios. This problem is usually seen as the transmission of a word through a noisy channel, such that it is necessary to determine which known word of a lexicon is the received string. To be feasible, only a reduced set of candidate words is selected. The...
In this paper, we present Waves, a novel document-at-a-time algorithm for fast computing of top-k query results in search systems. The Waves algorithm uses multi-tier indexes for processing queries. It performs successive tentative evaluations of results which we call waves. Each wave traverses the index, starting from a specific tier level i. Each...
This work presents a quantitative analysis of several Learning Object Repositories (LORs) that hold Learning Objects (LOs) in English and Portuguese. The focus of this exercise is to understand how the contributors organize their metadata, the update frequency, and measurements of LOR items such as: (i) the size distribution; (ii) grow...
In this paper we propose and evaluate the Block Max WAND with Candidate Selection and Preserving Top-K Results algorithm, or BMW-CSP. It is an extension of BMW-CS, a method previously proposed by us. Although very efficient, BMW-CS does not guarantee preserving the top-k results for a given query. Algorithms that do not preserve the top results m...
In this paper we present two new algorithms designed to reduce the overall time required to process top-k queries. These algorithms are based on the document-at-a-time approach and modify the best baseline we found in the literature, Blockmax WAND (BMW), to take advantage of a two-tiered index, in which the first tier is a small index containing on...
Search engines are essential tools for web users today. They rely on a large number of features to compute the rank of search results for each given query. The estimated reputation of pages is among the effective features available for search engine designers, probably being adopted by most current commercial search engines. Page reputation is esti...
State-of-the-art search engine ranking methods combine several distinct sources of relevance evidence to produce a high-quality ranking of results for each query. The fusion of information is currently done at query-processing time, which has a direct effect on the response time of search systems. Previous research also shows that an alternative to...
Identifying replicated sites is an important task for search engines. It can reduce data storage costs, improve query processing time, and remove noise that might affect the quality of the final answer given to the user. This paper introduces a new approach to detecting replicated sites in search engine databases, using as replication evidence the web...