
Stefanos Ougiaroglou - M.Sc., Ph.D.
- Assistant Professor at International Hellenic University
About
57
Publications
13,043
Reads
379
Citations
Additional affiliations
September 2015 - present
September 2014 - February 2015
September 2007 - July 2009
Publications (57)
RSSI-based proximity positioning is a well-established technique for indoor localization, featuring simplicity and cost-effectiveness, requiring low-price and off-the-shelf hardware. However, it suffers from low accuracy (in NLOS traffic), noise, and multipath fading issues. In large complex spaces, such as museums, where heavy visitor traffic is e...
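On the distance side of RSSI positioning, the standard log-distance path-loss model converts a received signal strength into a rough distance estimate. The sketch below is a generic illustration, not necessarily the model used in this work; the reference power at 1 m and the path-loss exponent are assumed values:

```python
def rssi_to_distance(rssi, rssi_at_1m=-40.0, path_loss_exp=2.0):
    """Log-distance path-loss model: estimate distance (metres) from RSSI (dBm).

    rssi_at_1m and path_loss_exp are illustrative defaults; real deployments
    calibrate both per environment (NLOS and multipath shift them heavily).
    """
    return 10 ** ((rssi_at_1m - rssi) / (10 * path_loss_exp))
```

For example, with these defaults a reading of -40 dBm maps to 1 m and -60 dBm to 10 m, which illustrates why small RSSI fluctuations translate into large distance errors.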
Signature forgery detection remains a challenge in the field of biometric security. The goal is to develop automated detection systems capable of distinguishing genuine signatures from forged ones with high accuracy. Traditional signature verification methods, which rely on human judgment, are not only error-prone but also lack the speed and scalab...
Reduction through Homogeneous Clustering (RHC) and its editing variant (ERHC) represent effective methods for reducing data in the context of instance-based classification. Both RHC and ERHC are based on an iterative k-means clustering procedure that builds homogeneous clusters. Therefore, they are inappropriate for data reduction tasks that need t...
Reducing the size of the training set, which involves replacing it with a condensed set, is a widely adopted practice to enhance the efficiency of instance-based classifiers while trying to maintain high classification accuracy. This objective can be achieved through the use of data reduction techniques, also known as prototype selection or generat...
Partition-based clustering is widely applied over diverse domains. Researchers and practitioners from various scientific disciplines engage with partition-based algorithms relying on specialized software or programming libraries. Addressing the need to bridge the knowledge gap associated with these tools, this paper introduces kClusterHub, an AutoM...
K-Means clustering finds many applications in different domains. Researchers and practitioners utilize K-Means through specialized software or libraries of programming languages, which implies knowledge of these tools. This paper presents Web-K-Means, a user-friendly web application that simplifies the process of running the K-Means clustering a...
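For readers unfamiliar with the algorithm behind such tools, Lloyd's K-Means iteration can be sketched in a few lines. This is a minimal pure-Python illustration, unrelated to the Web-K-Means implementation itself:

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Basic Lloyd's algorithm: assign each point to its nearest centre,
    then move each centre to the mean of its cluster; repeat."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # random initial centres
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[i].append(p)
        # empty clusters keep their old centre
        centers = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers, clusters

pts = [(0.0, 0.0), (0.2, 0.1), (9.8, 9.9), (10.0, 10.0)]
centers, clusters = kmeans(pts, k=2)
```

On this toy input the two centres converge to roughly (0.1, 0.05) and (9.9, 9.95), one per visually obvious group.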
Reducing the size of the training set, that is, replacing it with a condensing set, while maintaining the classification accuracy as much as possible is a very common practice to speed up instance-based classifiers. Data reduction techniques, also known as prototype selection or generation algorithms, can be used to accomplish this. There are num...
Collaborative filtering has proved to be one of the most popular and successful rating prediction techniques over the last few years. In collaborative filtering, each rating prediction, concerning a product or a service, is based on the rating values that users that are considered “close” to the user for whom the prediction is being generated have...
A very common practice to speed up instance-based classifiers is to reduce the size of their training set, that is, replace it with a condensing set, while keeping accuracy as unaffected as possible. This can be achieved by applying a Prototype Selection or Generation algorithm, also referred to as Data Reduction Techniques. One can f...
The Reduction by Space Partitioning (RSP3) algorithm is a well-known data reduction technique. It summarizes the training data and generates representative prototypes. Its goal is to reduce the computational cost of an instance-based classifier without penalty in accuracy. The algorithm keeps on dividing the initial training data into subsets until...
Featured Application
Early diagnosis and warning mechanisms are essential in every health condition. The research described in this paper can provide the means for the development of medical assistance applications.
Abstract
The correlation between the kind of cesarean section and post-traumatic stress disorder (PTSD) in Greek women after a trauma...
Large volumes of training data introduce high computational cost in instance-based classification. Data reduction algorithms select or generate a small (condensing) set of representative training prototypes from the available training data. The Reduction by Space Partitioning algorithm is one of the most well-known prototype generation algorithms t...
Dear Colleagues,
Nowadays, data analysis and mining are being used in numerous everyday tasks to solve practical problems. This research field has attracted the interest of both academia and industry. Therefore, the research community has contributed algorithms, techniques and tools for the prediction of future situations, discovery of clusters wi...
Numerous Prototype Selection and Generation algorithms for instance-based classifiers and single-label classification problems have been proposed in the past and are available in the literature. They build a small set of prototypes that represents the initial training data as well as possible. This set is called the condensing set and has the benef...
The effectiveness of the k-NN classifier is highly dependent on the value of the parameter k that is chosen in advance and is fixed during classification. Different values are appropriate for different datasets and parameter tuning is usually inevitable. A dataset may include simultaneously well-separated and not well-separated classes as well as n...
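The fixed-k baseline that this abstract refers to is easy to make concrete. The sketch below is a generic k-NN vote, not the paper's adaptive method; it shows exactly where the fixed parameter k enters:

```python
import math
from collections import Counter

def knn_predict(training, x, k):
    """Fixed-k k-NN: majority vote among the k nearest training instances.
    training is a list of (point, label) pairs; k is chosen in advance."""
    neighbours = sorted(training, key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

training = [((0, 0), "a"), ((1, 0), "a"), ((5, 5), "b"), ((6, 5), "b")]
```

Because k is global and fixed, the same value is applied in dense and sparse regions alike, which is the sensitivity the abstract describes.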
This paper presents a web application for Association Rules Mining (ARM). It utilizes Apriori that is the most widely used algorithm for this type of data mining tasks. The web application is called WebApriori and offers a modern responsive web interface and a web service to scientific communities working in the field of ARM. It is also appropriate...
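The level-wise search at the core of Apriori, which WebApriori exposes through its interface, can be sketched as follows. This is a simplified illustration (join step only, no subset pruning, and rule generation from the frequent itemsets is omitted), not WebApriori's own code:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise frequent-itemset mining: candidates of size k+1 are
    built only from itemsets of size k that are already frequent."""
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}

    def support(itemset):
        return sum(itemset <= t for t in transactions) / len(transactions)

    frequent = {}
    k_sets = [frozenset([i]) for i in items]
    while k_sets:
        k_sets = [s for s in k_sets if support(s) >= min_support]
        for s in k_sets:
            frequent[s] = support(s)
        k = len(k_sets[0]) + 1 if k_sets else 0
        # join step: union pairs of frequent k-sets, keep size-(k+1) candidates
        k_sets = list({a | b for a, b in combinations(k_sets, 2) if len(a | b) == k})
    return frequent

baskets = [["bread", "milk"], ["bread", "butter"],
           ["bread", "milk", "butter"], ["milk"]]
freq = apriori(baskets, min_support=0.5)
```

On this toy basket data, {bread, milk} and {bread, butter} survive at support 0.5, while {milk, butter} and the full triple fall below the threshold.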
Data reduction aims to reduce the number of training instances in order to speed up classifier training. Data reduction techniques do that by collecting a small set of representative prototypes from the original patterns. The Reduction by finding Homogeneous Clusters algorithm is a simple data reduction technique that recursively utilizes k-means clustering to build a s...
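The recursive idea can be sketched as follows. This is a simplified pure-Python illustration of the homogeneous-clustering scheme seeded with class centroids; it is not the authors' implementation, and it assumes the classes are separable so that the recursion terminates:

```python
import math
from collections import defaultdict

def centroid(points):
    return tuple(sum(c) / len(points) for c in zip(*points))

def rhc(data, iters=5):
    """Homogeneous cluster -> one prototype (its centroid).
    Mixed cluster -> k-means with k = number of classes present,
    seeded by the class centroids, then recurse on each cluster."""
    labels = {y for _, y in data}
    if len(labels) == 1:
        return [(centroid([x for x, _ in data]), labels.pop())]
    by_class = defaultdict(list)
    for x, y in data:
        by_class[y].append(x)
    centers = [centroid(pts) for pts in by_class.values()]  # one seed per class
    clusters = [[] for _ in centers]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for x, y in data:
            i = min(range(len(centers)), key=lambda j: math.dist(x, centers[j]))
            clusters[i].append((x, y))
        centers = [centroid([x for x, _ in cl]) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return [p for cl in clusters if cl for p in rhc(cl, iters)]

data = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b")]
prototypes = rhc(data)
```

On this separable toy set the recursion stops at depth one and returns a single centroid prototype per class.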
Data reduction, achieved by collecting a small subset of representative prototypes from the original patterns, aims at alleviating the computational burden of training a classifier without sacrificing performance. We propose an extension of the Reduction by finding Homogeneous Clusters algorithm, which utilizes the k-means method to propose a set o...
In this paper, we investigate the effect of parallelism on two data reduction algorithms that use k-Means clustering in order to find homogeneous clusters in the training set. By homogeneous, we refer to clusters where all instances belong to the same class label. Our approach divides the training set into subsets and applies the data reduction alg...
A well-known and adaptable classifier is the k-Nearest Neighbor (kNN) that requires a training set of relatively small size in order to perform adequately. Training sets can be reduced in size by using conventional data reduction techniques. Unfortunately, these techniques are inappropriate in streaming environments or when executed in devices with...
Nowadays, large volumes of training data are available from various data sources and streaming environments. Instance-based classifiers perform adequately when they use only a small subset of such datasets. Larger data volumes introduce high computational cost that prohibits the timely execution of the classification process. Conventional prototype...
Neural Networks and Support Vector Machines (SVMs) are two of the most popular and efficient supervised classification models. However, in the context of large datasets many complexity issues arise due to high memory requirements and high computational cost. In the context of the application of Data Mining algorithms, data reduction techniques atte...
The k Nearest Neighbor is a popular and versatile classifier but requires a relatively small training set in order to perform adequately, a prerequisite not satisfiable with the large volumes of training data that are nowadays available from streaming environments. Conventional Data Reduction Techniques that select or generate training prototypes a...
Although Support Vector Machines (SVMs) are considered effective supervised learning methods, their training procedure is time-consuming and has high memory requirements. Therefore, SVMs are inappropriate for large datasets. Many Data Reduction Techniques have been proposed in the context of dealing with the drawbacks of $k$-Nearest Neighbor classi...
Like many other classifiers, k-NN classifier is noise-sensitive. Its accuracy highly depends on the quality of the training data. Noise and mislabeled data, as well as outliers and overlaps between data regions of different classes, lead to less accurate classification. This problem can be dealt with by adopting either a large k value or by pre-pro...
The efficiency of the k-Nearest Neighbour classifier depends on the size of the training set as well as the level of noise in it. Large datasets with high level of noise lead to less accurate classifiers with high computational cost and storage requirements. The goal of editing is to improve accuracy by improving the quality of the training dataset...
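Wilson's ENN editing, the classic instance of this idea, removes every training instance that disagrees with the majority of its k nearest neighbours. The sketch below is a minimal illustration of that rule; the editing algorithms studied in the paper may differ:

```python
import math
from collections import Counter

def wilson_editing(training, k=3):
    """ENN-style editing: keep an instance only if the majority label of
    its k nearest neighbours (excluding itself) matches its own label."""
    edited = []
    for i, (x, y) in enumerate(training):
        others = [p for j, p in enumerate(training) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(p[0], x))[:k]
        majority = Counter(lbl for _, lbl in neighbours).most_common(1)[0][0]
        if majority == y:
            edited.append((x, y))
    return edited

# a mislabeled "b" sits inside the "a" region and gets edited out
data = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "b"),
        ((10, 10), "b"), ((10, 11), "b"), ((11, 10), "b")]
edited = wilson_editing(data, k=3)
```

The noisy instance at (1, 1) is surrounded by "a" neighbours and is removed, smoothing the boundary between the two regions.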
A popular and easy-to-implement classifier is the k-Nearest Neighbour (k-NN). However, sequentially searching for nearest neighbours in large datasets leads to inefficient classification because of the high computational cost involved. This article presents an adaptive hybrid and cluster-based method for speeding up the k-NN classifier. The pro...
Although the k-NN classifier is a popular classification method, it suffers from the high computational cost and storage requirements it involves. This paper proposes two effective cluster-based data reduction algorithms for efficient k-NN classification. Both have low preprocessing cost and can achieve high data reduction rates while maintaining...
A widely used time series classification method is the single nearest neighbour. It has been adopted in many time series classification systems because of its simplicity and effectiveness. However, the efficiency of the classification process depends on the size of the training set as well as on data dimensionality. Although many speed-up methods f...
Data reduction is a common preprocessing task in the context of the k nearest neighbour classification. This paper presents WebDR, a web-based application where several data reduction techniques have been integrated and can be executed on-line. WebDR allows the performance evaluation of the classification process through a web interface. Therefor...
Although the k-NN classifier is considered to be an effective classification algorithm, it has some major weaknesses that may render its use inappropriate for some application domains and/or datasets. The first one is the high computational cost involved (all distances between each unclassified item and all training data must be computed). Althou...
Data reduction techniques improve the efficiency of k-Nearest Neighbour classification on large datasets since they accelerate the classification process and reduce storage requirements for the training data. IB2 is an effective prototype selection data reduction technique. It selects some items from the initial training dataset and uses them as...
Editing is a crucial data mining task in the context of k-Nearest Neighbor classification. Its purpose is to improve classification accuracy by improving the quality of training datasets. To obtain such datasets, editing algorithms try to remove noisy and mislabeled data as well as smooth the decision boundaries between the discrete classes. In thi...
The k-NN classifier is a widely used classification algorithm. However, exhaustively searching the whole dataset for the nearest neighbors is prohibitive for large datasets because of the high computational cost involved. The paper proposes an efficient model for fast and accurate nearest neighbor classification. The model consists of a no...
The k-Nearest Neighbor (k-NN) classification algorithm is one of the most widely-used lazy classifiers because of its simplicity and ease of implementation. It is considered to be an effective classifier and has many applications. However, its major drawback is that when sequential search is used to find the neighbors, it involves high computationa...
Data reduction improves the efficiency of the k-NN classifier on large datasets since it accelerates the classification process and reduces storage requirements for the training data. IB2 is an effective data reduction technique that selects some training items from the initial dataset and uses them as representatives (prototypes). Contrary to many oth...
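IB2's incremental rule is simple enough to sketch: an instance joins the condensing set only if the prototypes collected so far misclassify it. The following is a minimal 1-NN illustration, not the paper's implementation:

```python
def nearest_label(prototypes, x):
    """1-NN lookup over the current condensing set (squared Euclidean)."""
    best, best_d = None, float("inf")
    for px, py in prototypes:
        d = sum((a - b) ** 2 for a, b in zip(px, x))
        if d < best_d:
            best, best_d = py, d
    return best

def ib2(training):
    """IB2: seed with the first instance, then keep an instance only if
    the condensing set built so far misclassifies it."""
    prototypes = [training[0]]
    for x, y in training[1:]:
        if nearest_label(prototypes, x) != y:
            prototypes.append((x, y))
    return prototypes

# toy 1-D two-class data: class 0 near 0.0, class 1 near 10.0
data = [((0.0,), 0), ((0.5,), 0), ((10.0,), 1), ((9.5,), 1), ((0.2,), 0)]
condensed = ib2(data)
```

Instances already classified correctly by the growing prototype set are discarded, which is why IB2 is order-dependent and noise-sensitive, a point the surrounding literature addresses with editing.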
The one-nearest neighbour classifier is a widely-used time series classification method. However, its efficiency depends on the size of the training set as well as on data dimensionality. Although many speed-up methods for fast time series classification have been proposed, state-of-the-art, non-parametric data reduction techniques have not been ex...
Data reduction is very important, especially when using the k-NN classifier on large datasets. Many prototype selection and generation algorithms have been proposed aiming to condense the initial training data as much as possible and keep the classification accuracy at a high level. The Prototype Selection by Clustering (PSC) algorithm is one of the...
Although the k-Nearest Neighbor classifier is one of the most widely-used classification methods, it suffers from the high computational cost and storage requirements it involves. These major drawbacks have constituted an active research field over the last decades. This paper proposes an effective data reduction algorithm that has low preprocessin...
A well known classification method is the k-Nearest Neighbors (k-NN) classifier. However, sequentially searching for the nearest neighbors in large datasets downgrades its performance because of the high computational cost involved. This paper proposes a cluster-based classification model for speeding up the k-NN classifier. The model aims to reduc...
The k-Nearest Neighbor (k-NN) classifier is a widely-used and effective classification method. The main k-NN drawback is that it involves high computational cost when applied on large datasets. Many Data Reduction Techniques have been proposed in order to speed-up the classification process. However, their effectiveness depends on the level of nois...
Many researchers have focused on the mining of educational data stored in databases of educational software and Learning Management Systems. The goal is the knowledge discovery that can help educators to support their students by managing effectively educational units, redesigning student’s activities and finally improving the learning outcome. A b...
This paper proposes a hybrid method for fast and accurate Nearest Neighbor Classification. The method consists of a non-parametric cluster-based algorithm that produces a two-level speed-up data structure and a hybrid algorithm that accesses this structure to perform the classification. The proposed method was evaluated using eight real-life datase...
Many students do not manage to complete their higher education studies because they chose a university department whose curriculum consists of courses outside their interests. Selecting their studies with more caution and knowledge might lead to better decision making. This paper presents a web-based decision support syste...
Some of the most commonly used classifiers are based on the retrieval and examination of the k Nearest Neighbors of unclassified instances. However, since the size of datasets can be large, these classifiers are inapplicable when the time-costly sequential search over all instances is used to find the neighbors. The Minimum Distance Classifier is a...
A well-known technique for broadcast program construction is the Broadcast Disks technique. However, in the Broadcast Disks approach there are some important disadvantages. For example, some parts of the broadcast program remain empty during the construction procedure, and the disk relative frequencies have to be selected very carefully. This paper g...
A well-known technique for broadcast program construction is the Broadcast Disks. However, it has important disadvantages, as for example that the broadcast program construction procedure leaves some parts of the broadcast program empty. This paper proposes a new approach for the construction of the broadcast program. Specifically, it presents thre...
Classification based on k-nearest neighbors (kNN classification) is one of the most widely used classification methods. The number k of nearest neighbors used for achieving a high accuracy in classification is given in advance and is highly dependent on the data set used. If the size of the data set is large, the sequential or binary search of NNs is i...