Veselka Boeva

Blekinge Institute of Technology | BTH

Professor of Computer Science

About

Publications: 85
Reads: 4,873
Citations: 424
Additional affiliations
September 1994 - present
Technical University of Sofia - Branch Plovdiv
Position
  • Professor (Full)

Publications (85)
Article
Full-text available
Data has become an integral part of our society in the past years, arriving faster and in larger quantities than before. Traditional clustering algorithms rely on the availability of entire datasets to model them correctly and efficiently. Such requirements cannot be met in the data stream clustering scenario, where data arrives and needs to be...
Article
Full-text available
Recent advances in sensor technology are expected to lead to a greater use of wireless sensor networks (WSNs) in industry, logistics, healthcare, etc. On the other hand, advances in artificial intelligence (AI), machine learning (ML), and deep learning (DL) are becoming dominant solutions for processing large amounts of data from edge-synthesized h...
Chapter
In this paper, we propose a Global Navigation Satellite System (GNSS) component activation model for mobile tracking devices that automatically detects indoor/outdoor environments using the radio signals received from Long-Term Evolution (LTE) base stations. We use an Inductive System Monitoring (ISM) technique to model environmental scenarios capt...
Conference Paper
Full-text available
A reliable detection and characterisation of the different operating modes of a wind turbine is essential for the correct understanding of its behaviour and production performance. Wind turbines are usually installed in fleets, which offers richer datasets to exploit. However, blindly applying machine learning approaches to such datasets may mask t...
Article
Full-text available
In smart buildings, many different systems work in coordination to accomplish their tasks. In this process, the sensors associated with these systems collect large amounts of data generated in a streaming fashion, which is prone to concept drift. Such data are heterogeneous due to the wide range of sensors collecting information about different cha...
Article
Full-text available
Wind turbines are typically organised as a fleet in a wind park, subject to similar, but varying, environmental conditions. This makes it possible to assess and benchmark a turbine’s output performance by comparing it to the other assets in the fleet. However, such a comparison cannot be performed straightforwardly on time series production data si...
Chapter
Data available today in smart monitoring applications such as smart buildings, machine health monitoring, smart healthcare, etc., is not centralized and is usually supplied by a number of different devices (sensors, mobile devices and edge nodes). As a result, the data has a heterogeneous nature and provides different perspectives (views) about the st...
Article
Full-text available
Machine learning researchers have recently been designing algorithms that can run on embedded and mobile devices, which introduces additional constraints compared to traditional algorithm design approaches. One of these constraints is energy consumption, which directly translates to battery capacity for these devices. Streaming algorithms, such as the Ve...
Chapter
In this study, we propose a novel data analysis approach that can be used for multi-view analysis and integration of heterogeneous temporal data originating from multiple sources. The proposed approach consists of several distinctive layers: (i) select a suitable set (view) of parameters in order to identify characteristic behaviour within each ind...
Article
In this study, we propose a multi-view data analysis approach that can be used for modelling and monitoring smart control valve system behaviour. The proposed approach consists of four distinctive steps: (i) multi-view interpretation of the available data attributes by separating them into several representations (views), e.g., operational paramete...
Article
Full-text available
In this study, we propose a higher order mining approach that can be used for the analysis of real-world datasets. The approach can be used to monitor and identify the deviating operational behaviour of the studied phenomenon in the absence of prior knowledge about the data. The proposed approach consists of several different data analysis techniqu...
Conference Paper
The communication of sustainability values shared between product developers and customers is an important catalyst for effective collaboration that inspires sustainable consumption. Despite the many tools developed for assessing and communicating the product’s sustainability performance, customers are facing difficulties in understanding product s...
Chapter
In this paper we address the problem of modeling the evolution of clusters over time by applying sequential clustering. We propose a sequential partitioning algorithm that can be applied for grouping distinct snapshots of streaming data so that a clustering model is built on each data snapshot. The algorithm is initialized by a clustering solution...
Chapter
In this ongoing study, we propose a higher order data mining approach for modelling district heating (DH) substations’ behaviour and linking operational behaviour representative profiles with different performance indicators. We initially create substation’s operational behaviour models by extracting weekly patterns and clustering them into groups...
Chapter
Cluster validation measures are designed to find the partitioning that best fits the underlying data. In this study, we show that these measures can be used for identifying mislabeled instances or class outliers prior to training in supervised learning problems. We introduce an ensemble technique, entitled CVI-based Outlier Filtering, which identif...
Article
Full-text available
In this study, we propose a new multi-view stream clustering approach, called MV Split-Merge Clustering. The proposed approach is an extension of an existing split-merge evolutionary clustering algorithm (entitled Split-Merge Clustering) to multi-view data applications. The extended version can be used to integrate data from multiple views in a str...
Chapter
We propose a split-merge framework for evolutionary clustering. The proposed clustering technique, entitled Split-Merge Evolutionary Clustering, is supposed to be more robust to concept drift scenarios by providing the flexibility to consider at each step a portion of the data and derive clusters from it to be used subsequently to update the existin...
Chapter
In this study we apply clustering techniques for analyzing and understanding households’ electricity consumption data. The knowledge extracted by this analysis is used to create a model of normal electricity consumption behavior for each particular household. Initially, the household’s electricity consumption data are partitioned into a number of c...
Conference Paper
Full-text available
We present a new method, called hyperplane folding, that increases the margin in Support Vector Machines (SVMs). Based on the location of the support vectors, the method splits the dataset into two parts, rotates one part of the dataset and then merges the two parts again. This procedure increases the margin as long as the margin is smaller than ha...
Chapter
Machine learning algorithms are responsible for a significant amount of computations. These computations are increasing with the advancements in different machine learning fields. For example, fields such as deep learning require algorithms to run for weeks, consuming vast amounts of energy. While there is a trend in optimizing machine learning a...
Chapter
Finding experts in academics is an important practical problem, e.g., recruiting reviewers for reviewing conference, journal or project submissions, partner matching for research proposals, finding relevant MSc or PhD supervisors, etc. In this work, we discuss an expertise recommender system that is built on data extracted from the Blekinge Institu...
Article
Full-text available
In this work, we apply cluster validation measures for analyzing email communications at an organizational level of a company. This analysis can be used to evaluate the company structure and to produce further recommendations for structural improvements. Our evaluations, based on data in the forms of email logs and organizational structure for a la...
Article
Full-text available
The growth of Internet video and over-the-top transmission techniques has enabled online video service providers to deliver high quality video content to viewers. To maintain and improve the quality of experience, video providers need to detect unexpected issues that can highly affect the viewers' experience. This requires analyzing massive amounts...
Preprint
Machine learning software accounts for a significant amount of energy consumed in data centers. These algorithms are usually optimized towards predictive performance, i.e. accuracy, and scalability. This is the case of data stream mining algorithms. Although these algorithms are adaptive to the incoming data, they have fixed parameters from the beg...
Conference Paper
Full-text available
We propose a cluster analysis approach for organizing, visualizing and understanding households' electricity consumption data. We initially partition the consumption data into a number of clusters with similar daily electricity consumption profiles. The centroids of each cluster can be seen as representative signatures of a household's electricity...
Poster
Full-text available
https://schlieplab.org/Static/Downloads/SweDS2017-Abstracts.pdf
Conference Paper
Expertise retrieval has already gained significant interest in the area of information retrieval due to the multitude of concrete application contexts where a search for specific experts is required. In this paper, we introduce a formal concept analysis approach for clustering of a group of experts with respect to given subject areas. Initially, the doma...
Article
Gene regulatory network (GRN) inference is an important problem in bioinformatics. Many machine learning methods have been applied to increase the inference accuracy. Ensemble learning methods are shown in DREAM3 and DREAM5 challenges to yield a higher inference accuracy than individual algorithms. However, no ensemble method has been proposed to t...
Conference Paper
In this paper, we present a novel semantic-aware clustering approach for partitioning of experts represented by lists of keywords. A common set of all different keywords is initially formed by pooling all the keywords of all the expert profiles. The semantic distance between each pair of keywords is then calculated and the keywords are partitioned...
Article
Full-text available
According to the Swedish National Council for Crime Prevention, law enforcement agencies solved approximately three to five percent of the reported residential burglaries in 2012. Internationally, studies suggest that a large proportion of crimes are committed by a minority of offenders. Law enforcement agencies, consequently, are required to detec...
Article
Full-text available
Background: Presently, with the increasing number and complexity of available gene expression datasets, the combination of data from multiple microarray studies addressing a similar biological question is gaining importance. The analysis and integration of multiple datasets are expected to yield more reliable and robust results since they are based...
Article
This paper centres on clustering approaches that deal with multiple DNA microarray datasets. Four clustering algorithms for deriving a clustering solution from multiple gene expression matrices studying the same biological phenomenon are considered: two unsupervised cluster techniques based on information integration, a hybrid consensus clustering...
Chapter
In contrast to conventional clustering algorithms, where a single dataset is used to produce a clustering solution, we introduce herein a MapReduce approach for clustering of datasets generated in multiple-experiment settings. It is inspired by the map-reduce functions commonly used in functional programming and consists of two distinctive phases...
Article
Today, it is common to include machine learning components in software products. These components offer specific functionalities such as image recognition, time series analysis, and forecasting but may not satisfy the non-functional constraints of the software products. It is difficult to identify suitable learning algorithms for a particular task...
Conference Paper
Clustering algorithms have been used to divide genes into groups according to the degree of their expression similarity. Such a grouping may suggest that the respective genes are correlated and/or co-regulated, and subsequently indicates that the genes could possibly share a common biological role. In this paper, four clustering algorithms are inve...
Conference Paper
In this article we propose an integrative clustering approach for analysis of gene expression data across multiple experiments, based on Particle Swarm Optimization (PSO) and Formal Concept Analysis (FCA). In the proposed algorithm, the available microarray experiments are initially divided into groups of related datasets with respect to a predefin...
Conference Paper
Gene expression microarrays are the most commonly available source of high-throughput biological data. They are widely employed for studying many different aspects of gene regulation and function, ranging from understanding the global cell-cycle control of microorganisms to cancer in humans. Gene expression microarray experiments often generate dat...
Conference Paper
In this paper, we present initial work on a method for comparing expert profiles within the context of expert networks by measuring expertise similarity between experts. We introduce the concept of expertise spheres and describe how an expert's expertise profile can be compared with a certain subject and how it can be determined how well an expert's...
Article
Full-text available
In this article we propose a hybrid approach for clustering of gene expression data across multiple experiments, based on Particle Swarm Optimization and k-means clustering. In the proposed algorithm, each experiment identifies a particle initialized with the result of the k-means algorithm applied over the experiment. The final clustering solution...
Conference Paper
Full-text available
In this article, we study two microarray data integration techniques and describe how they can be applied and validated on a set of independent, but biologically related, microarray data sets in order to derive consistent and relevant clustering results. First, we present a cluster integration approach, which combines the information contained in...
Article
In contrast to conventional clustering algorithms, where a single data set is used to produce a clustering solution, we introduce herein a MapReduce approach for clustering of data sets generated in multiple-experiment settings. It is inspired by the map-reduce functions commonly used in functional programming and consists of two distinctive phases....
Chapter
This work proposes a novel multi-purpose data standardization method inspired by gene-centric clustering approaches. The clustering is performed via template matching of expression profiles employing Dynamic Time Warping (DTW) alignment algorithm to measure the similarity between the profiles. In this way, for each gene profile a cluster consisting...
Conference Paper
In recent years, microarray gene expression profiles have become a common technique for inferring the relationship or regulation among different genes. While most of the previous work on microarray analysis focused on individual datasets, some global studies exploiting large numbers of microarrays have been presented recently. In this paper, we inv...
Conference Paper
Full-text available
Gene expression microarrays are the most commonly available source of high-throughput biological data. They have been widely employed in recent years for the definition of cell cycle regulated (or periodically expressed) subsets of the genome in a number of different organisms. These have driven the development of various computational methods for...
Conference Paper
Full-text available
In this paper, we propose a collaborative decision support platform that supports the product manager in defining the contents of a product release. The platform allows interactive and collaborative decision making by facilitating the exchange of information about product features among individual autonomous stakeholders, providing reputation-enhan...
Article
Gene expression microarrays are the most commonly available source of high-throughput biological data. Each microarray experiment is supposed to measure the gene expression levels of a set of genes in a number of different experimental conditions or time points. Integration of results from different microarray experiments to the specific analysis i...
Article
Full-text available
A novel integration approach targeting the combination of multi-experiment time series expression data is proposed. A recursive hybrid aggregation algorithm is initially employed to extract a set of genes, which are eventually of interest for the biological phenomenon under study. Next, a hierarchical merge procedure is specifically developed for t...
Article
This paper proposes a novel data transformation method aiming at multi-purpose data standardization and inspired by gene-centric clustering approaches. The idea is to perform data standardization via template matching of each expression profile with the rest of the expression profiles employing dynamic time warping (DTW) alignment algorithm to meas...
Conference Paper
The accurate estimation of missing values is important for efficient use of DNA microarray data since most of the analysis and clustering algorithms require a complete data matrix. Several imputation algorithms have already been proposed in the biological literature. Most of these approaches identify, in one or another way, a fixed number of neighb...
Article
Gene expression microarray experiments frequently generate datasets with multiple values missing. However, most of the analysis, mining, and classification methods for gene expression data require a complete matrix of gene array values. Therefore, the accurate estimation of missing values in such datasets has been recognized as an important issue,...
Conference Paper
The contribution develops a mathematical model allowing interpretation and simulation of the phenomenon of additive-dominance heterosis as a network of interacting parallel aggregation processes. Initially, the overall heterosis potential has been expressed in terms of the heterosis potentials of each of the individual genes controlling the trait o...
Article
In this work we introduce a decision model, in the form of a recursive aggregation algorithm, that attempts to mimic a multi-step ranking process of a set of alternatives in a multi-criteria and multi-expert decision making environment. The main idea is rather intuitive. Each alternative is initially assigned a list of values, representing the grou...
Article
In the framework of interval decision making, the available information is vague and numerically imprecise, and decision situations are modelled by imprecise probabilities and utilities that are simply represented by suitable intervals and comparisons. Alternatives are therefore evaluated in terms of interval expected utilities, which are then used...
Article
This work suggests a set of concepts and procedures for detecting dynamic conflicts in conceptual schemata. Standard modal logic is used for expressing properties of schemata or specifications. The considered model is enriched with correspondence assertions expressing relationships between different properties described in the formal specifications...
Article
Conflict detection and analysis are of high importance, e.g., when integrating conceptual schemata, such as UML-Specifications, or analysing goal-fulfilment of sets of autonomous agents. In general, models for this introduce unnecessarily complicated frameworks with several disadvantages regarding semantics as well as complexity. This paper demonst...
Conference Paper
In the framework of interval decision making, the available information is vague and numerically imprecise and the decision situation is modelled by imprecise probabilities and utilities that are simply represented by suitable intervals and comparisons. Alternatives are therefore evaluated in terms of interval expected utilities, which are then use...
Article
Full-text available
In this paper standard modal logic is suggested as a formal framework for modelling and analysing different aspects of schema integration. Schemata are represented by sets of modal first-order formulae and interpreted in terms of standard models of modal logic. An approach based on correspondence assertions, i.e. expressing relationships between di...
Article
We introduce a nonparametric recursive aggregation process called Multilayer Aggregation (MLA). The name refers to the fact that at each step the results from the previous one are aggregated and thus, before the final result is derived, the initial values are subjected to several layers of aggregation. Most of the conventional aggregation operators...
Article
Full-text available
In this paper, we use standard modal logic as a formal framework for modelling and analyzing schemata. A main problem considered is how to combine two schemata into an integrated schema that has the same information capacity as the original ones. To formalize this, we translate the concept of weak dominance into the language of modal logic. It i...
Conference Paper
In this work, we consider a situation in which a decision making agent has to choose between a finite set of strategies, having access to the opinions of a finite set of autonomous agents. Moreover, the decision making agent is allowed to assign different credibilities to the statements made by the agents. We present this decision situation in terms of K...
Article
A modal logic interpretation of Dempster-Shafer theory is developed in the framework of multivalued models of modal logic, i.e. models in which in any possible world an arbitrary number (possibly zero) of atomic propositions can be true. Several approaches to conditioning in multivalued models of modal logic are presented.
Article
Modal logic interpretations of plausibility and belief measures are developed based on the observation that the accessibility relation in a model of modal logic, regarded as a multivalued mapping, induces a plausibility measure and a belief measure on the set of possible worlds.