Tome Eftimov
Jožef Stefan Institute | IJS · Department of Computer Systems

PhD

About

195
Publications
39,395
Reads
1,521
Citations
Citations since 2017
185 Research Items
1509 Citations
[Chart: citations per year, 2017 to 2023; vertical axis from 0 to 400]
Introduction
Tome Eftimov currently works as a Researcher at the Computer Systems Department at the Jožef Stefan Institute. He did his postdoc at the Department of Biomedical Data Science and the Center for Population Health Sciences, Stanford University. Tome does research in Statistics, Stochastic Optimization Algorithms, Natural Language Processing and Machine Learning.
Additional affiliations
January 2019 - January 2020
Stanford University
Position
  • PostDoc Position
October 2014 - present
Jožef Stefan Institute
Position
  • Researcher
Description
  • Research project: ISO-FOOD
September 2012 - September 2014
Macedonian Academy of Sciences and Arts
Position
  • Research Assistant
Description
  • Research project: "Non-coherent communication for future wireless networks"
Education
October 2014 - January 2018
Jožef Stefan International Postgraduate School
Field of study
  • Information and Communication Technologies
October 2011 - June 2013
Faculty of Computer Science and Engineering
Field of study
  • Computer Science and Computer Engineering (Computer networks and E-technologies)
October 2007 - June 2011
Faculty of Electrical Engineering and Information Technologies
Field of study
  • Electrical engineering in the field of informatics and computer science

Publications (195)
Article
In this paper a novel approach for making a statistical comparison of meta-heuristic stochastic optimization algorithms over multiple single-objective problems is introduced, where a new ranking scheme is proposed to obtain data for multiple problems. The main contribution of this approach is that the ranking scheme is based on the whole distributi...
Conference Paper
The accelerating growth of big data in the biomedical domain, with an endless amount of electronic health records and more than 30 million citations and abstracts in PubMed, introduces the need for automatic structuring of textual biomedical data. In this paper, we develop a method for detecting relations between food and disease entities from raw...
Article
Full-text available
Background: Recently, food science has been garnering a lot of attention. There are many open research questions on food interactions, as one of the main environmental factors, with other health-related entities such as diseases, treatments, and drugs. In the last 2 decades, a large amount of work has been done in natural language processing and mac...
Preprint
Performance complementarity of solvers available to tackle black-box optimization problems gives rise to the important task of algorithm selection (AS). Automated AS approaches can help replace tedious and labor-intensive manual selection, and have already shown promising performance in various optimization domains. Automated AS relies on machine l...
Preprint
The application of machine learning (ML) models to the analysis of optimization algorithms requires the representation of optimization problems using numerical features. These features can be used as input for ML models that are trained to select or to configure a suitable algorithm for the problem at hand. Since in pure black-box optimization info...
Preprint
In black-box optimization, it is essential to understand why an algorithm instance works on a set of problem instances while failing on others and provide explanations of its behavior. We propose a methodology for formulating an algorithm instance footprint that consists of a set of problem instances that are easy to solve and a set of problem...
Preprint
A key component of automated algorithm selection and configuration, which in most cases are performed using supervised machine learning (ML) methods, is a good-performing predictive model. The predictive model uses the feature representation of a set of problem instances as input data and predicts the algorithm performance achieved on them. Common m...
Preprint
Leave-one-problem-out (LOPO) performance prediction requires machine learning (ML) models to extrapolate algorithms' performance from a set of training problems to a previously unseen problem. LOPO is a very challenging task even for state-of-the-art approaches. Models that work well in the easier leave-one-instance-out scenario often fail to gener...
Article
Full-text available
Knowledge about the interactions between dietary and biomedical factors is scattered throughout countless research articles in an unstructured form (e.g., text, images, etc.) and requires automatic structuring so that it can be provided to medical professionals in a suitable format. Various biomedical knowledge graphs exist; however, they require...
Article
Nowadays, it is crucial to follow the new biomedical knowledge that is presented in the scientific literature. To this end, Information Extraction pipelines can help to automatically extract meaningful relations from textual data that further require additional checks by domain experts. In the last two decades, a lot of work has be...
Preprint
Although recipe data are very easy to come by nowadays, it is really hard to find a complete recipe dataset - with a list of ingredients, nutrient values per ingredient, and per recipe, allergens, etc. Recipe datasets are usually collected from social media websites where users post and publish recipes. Usually written with little to no structure,...
Preprint
Full-text available
Empirical data plays an important role in evolutionary computation research. To make better use of the available data, ontologies have been proposed in the literature to organize their storage in a structured way. However, the full potential of these formal methods to capture our domain knowledge has yet to be demonstrated. In this work, we evaluat...
Preprint
Per-instance automated algorithm configuration and selection have been gaining significant momentum in evolutionary computation in recent years. Two crucial, sometimes implicit, ingredients for these automated machine learning (AutoML) methods are 1) feature-based representations of the problem instances and 2) performance prediction methods that take the...
Article
Full-text available
In the last decades, a great amount of work has been done in predictive modeling of issues related to human and environmental health. Resolution of issues related to healthcare is made possible by the existence of several biomedical vocabularies and standards, which play a crucial role in understanding the health information, together with a large...
Preprint
Multi-label classification (MLC) is an ML task of predictive modeling in which a data instance can simultaneously belong to multiple classes. MLC is increasingly gaining interest in different application domains such as text mining, computer vision, and bioinformatics. Several MLC algorithms have been proposed in the literature, resulting in a meta...
Preprint
Many optimization algorithm benchmarking platforms allow users to share their experimental data to promote reproducible and reusable research. However, different platforms use different data models and formats, which drastically complicates the identification of relevant datasets, their interpretation, and their interoperability. Therefore, a seman...
Chapter
Providing comprehensive details on how and why a stochastic optimization algorithm behaves in a particular way, on a single problem instance or a set of problem instances, is a challenging task. For this purpose, we propose a methodology based on problem landscape features and explainable machine learning models, for automated algorithm performance...
Article
Accurate and reliable forecasting is a crucial task in many different domains. The selection of a forecasting algorithm that is suitable for a specific time series can be a challenging task, since the algorithms’ performance depends on the time-series properties, as well as the properties of the forecasting algorithms. The methodology and analysis...
Preprint
Algorithm selection wizards are effective and versatile tools that automatically select an optimization algorithm given high-level information about the problem and available computational resources, such as number and type of decision variables, maximal number of evaluations, possibility to parallelize evaluations, etc. State-of-the-art algorithm...
Article
Full-text available
Despite the numerous studies in the last decade involving food and nutrition data, this domain remains low resourced. Annotated corpora are very useful tools for researchers and experts of the domain in question, as well as for data scientists for analysis. In this paper, we present the annotation process of food consumption data (recipes) with se...
Article
Full-text available
Non-communicable diseases are on the rise and are often related to food choices; nutrition affects infectious diseases too. Therefore, there is growing interest in research on public and personal health, as related to food, nutrition behaviour and well-being of consumers throughout the life cycle. These concepts and their relations are complex and...
Chapter
Per-instance algorithm selection seeks to recommend, for a given problem instance and a given performance criterion, one or several suitable algorithms that are expected to perform well for the particular setting. The selection is classically done offline, using openly available information about the problem instance or features that are extracted...
Article
Full-text available
Many factors significantly influence the outcomes of infectious diseases such as COVID-19. A significant focus needs to be put on dietary habits as environmental factors since it has been deemed that imbalanced diets contribute to chronic diseases. However, not enough effort has been made in order to assess these relations. So far, studies in the f...
Chapter
Algorithm selection wizards are effective and versatile tools that automatically select an optimization algorithm given high-level information about the problem and available computational resources, such as number and type of decision variables, maximal number of evaluations, possibility to parallelize evaluations, etc. State-of-the-art algorithm...
Chapter
In this chapter, we introduce the statistical background required to understand the deep statistical comparison methods that are presented in this book. We give an overview of the basic terms used in statistics, starting with descriptive statistics and a special focus on hypothesis testing. At the end, we provide guidelines for which statistical te...
Chapter
This chapter presents an application of the Deep Statistical Comparison approach to single-objective optimization. It provides examples of how the Deep Statistical Comparison ranking scheme can be used for a performance assessment of single-objective stochastic optimization algorithms. Next, a practical Deep Statistical Comparison ranking scheme...
Chapter
This chapter presents an application of the Deep Statistical Comparison approach in multi-objective optimization. It provides examples of how the Deep Statistical Comparison ranking scheme can be used for performance assessment of multi-objective stochastic optimization algorithms using single-quality-indicator data. Next, different ensembles of...
Chapter
This chapter provides a short introduction to meta-heuristic stochastic optimization, so that the reader is acquainted with the statistical analysis of the optimization results. First, the optimization and its two main families in the form of combinatorial and numerical optimization are introduced. Next, the two classifications of optimization prob...
Chapter
This chapter introduces statistical approaches that can be utilized for statistical comparisons of meta-heuristic stochastic optimization algorithms. First, the most commonly used approach for a statistical comparison is presented, followed by a recently published approach, known as the Deep Statistical Comparison. Both approaches are discussed usi...
Chapter
This chapter provides an introduction to the theory of benchmarking, which is a crucial step when performing a comparison of optimization algorithms. The four main steps of benchmarking will be explained in more detail, starting from identifying the reasons for benchmarking, defining the optimization domain (problem and algorithm selection), defini...
Chapter
In this chapter, DSCTool, a web-service-based e-learning tool for making a statistical comparison of meta-heuristic optimization algorithms, is introduced; it lets users make such comparisons without having to worry about drawing incorrect conclusions. First, the general pipeline of the tool is presented, followed by a detailed explanation of all the web services. Next, two types...
Article
Missing data is a common problem in a wide range of fields that can arise for different reasons: lack of analysis, mishandling of samples, measurement error, etc. The area of nutrition and food composition is no exception to the problem of missing values. Missing data in food composition databases (FCDB) significantly limits their usage. Co...
Preprint
Fair algorithm evaluation is conditioned on the existence of high-quality benchmark datasets that are non-redundant and are representative of typical optimization scenarios. In this paper, we evaluate three heuristics for selecting diverse problem instances which should be involved in the comparison of optimization algorithms in order to ensure rob...
Article
Full-text available
Alzheimer’s disease is still a field of research with lots of open questions. The complexity of the disease prevents early diagnosis before visible symptoms regarding the individual’s cognitive capabilities occur. This research presents an in-depth analysis of a huge data set encompassing medical, cognitive and lifestyle measurements from mor...
Preprint
Full-text available
Per-instance algorithm selection seeks to recommend, for a given problem instance and a given performance criterion, one or several suitable algorithms that are expected to perform well for the particular setting. The selection is classically done offline, using openly available information about the problem instance or features that are extracted...
Preprint
Full-text available
Selecting the most suitable algorithm and determining its hyperparameters for a given optimization problem is a challenging task. Accurately predicting how well a certain algorithm could solve the problem is hence desirable. Recent studies in single-objective numerical optimization show that supervised machine learning methods can predict algorithm...
Preprint
Full-text available
Landscape-aware algorithm selection approaches have so far mostly been relying on landscape feature extraction as a preprocessing step, independent of the execution of optimization algorithms in the portfolio. This introduces a significant overhead in computational cost for many practical applications, as features are extracted and computed via sam...
Preprint
Full-text available
Predicting the performance of an optimization algorithm on a new problem instance is crucial in order to select the most appropriate algorithm for solving that problem instance. For this purpose, recent studies learn a supervised machine learning (ML) model using a set of problem landscape features linked to the performance achieved by the optimiza...
Article
In this paper, we have proposed a new pipeline for landscape analysis of time-series machine learning datasets that enables us to better understand a benchmarking problem landscape, allows us to select a diverse portfolio of benchmark datasets, and reduces the presence of performance assessment bias via bootstrapping evaluation. Combining a large mult...
Article
Full-text available
The focus of the current paper is on the design of responsible governance of food consumer science e-infrastructure, using the case study of the Determinants and Intake Data Platform (DI Data Platform). One of the key challenges for implementation of the DI Data Platform is how to develop responsible governance that observes the ethical and legal frameworks...
Article
Full-text available
In optimization, algorithm selection, which is the selection of the most suitable algorithm for a specific problem, is of great importance, as algorithm performance is heavily dependent on the problem being solved. However, when using machine learning for algorithm selection, the performance of the algorithm selection model depends on the data used...
Chapter
Predicting the performance of an optimization algorithm on a new problem instance is crucial in order to select the most appropriate algorithm for solving that problem instance. For this purpose, recent studies learn a supervised machine learning (ML) model using a set of problem landscape features linked to the performance achieved by the optimiza...
Article
Full-text available
Many optimization algorithm benchmarking platforms allow users to share their experimental data to promote reproducible and reusable research. However, different platforms use different data models and formats, which drastically complicates the identification of relevant datasets, their interpretation, and their interoperability. Therefore, a seman...
Article
Full-text available
Choosing the optimal Deep Learning (DL) architecture and hyperparameters for a particular problem is still not a trivial task for researchers. The most common approach relies on popular architectures proven to work on specific problem domains under the same experimental environment and setup. However, this limits the opportunity to choose or invent no...
Preprint
Full-text available
Efficiently solving an unseen optimization problem depends on the appropriate selection of an optimization algorithm and its hyper-parameters. For this purpose, automated algorithm performance prediction should be performed, which in the most commonly applied practices involves training a supervised ML algorithm using a set of problem landscape features....
Preprint
Full-text available
In this paper, we present FoodChem, a new Relation Extraction (RE) model for identifying chemicals present in the composition of food entities, based on textual information provided in biomedical peer-reviewed scientific literature. The RE task is treated as a binary classification problem, aimed at identifying whether the "contains" relation exists...
Article
Full-text available
Background: A better understanding of food-related behaviour and its determinants can be achieved through harmonisation and linking of the various data-sources and knowledge platforms. Scope: We describe the key decision-making in the development of a prototype of the Determinants and Intake Platform (DI Platform), a data platform that aims to harmo...
Article
Full-text available
Being both a poison and a cure for many lifestyle and non-communicable diseases, food is inscribing itself into the prime focus of precision medicine. The monitoring of a few groups of nutrients is crucial for some patients, and methods for easing their calculations are emerging. Our proposed machine learning pipeline deals with nutrient prediction bas...
Preprint
Background: Being both a poison and a cure for many lifestyle and non-communicable diseases, food is inscribing itself into the prime focus of precision medicine; therefore, knowing what is in our food has become of utmost importance. The monitoring of a few groups of nutrients has become crucial for some patients, and with that, methods for easing their calc...
Preprint
When designing a benchmark problem set, it is important to create a set of benchmark problems that are a good generalization of the set of all possible problems. One possible way of easing this difficult task is by using artificially generated problems. In this paper, one such single-objective continuous problem generation approach is analyzed and...
Preprint
Full-text available
Many platforms for benchmarking optimization algorithms offer users the possibility of sharing their experimental data with the purpose of promoting reproducible and reusable research. However, different platforms use different data models and formats, which drastically inhibits identification of relevant data sets, their interpretation, and their...
Preprint
Full-text available
Accurately predicting the performance of different optimization algorithms for previously unseen problem instances is crucial for high-performing algorithm selection and configuration techniques. In the context of numerical optimization, supervised regression approaches built on top of exploratory landscape analysis are becoming very popular. From...