About
181
Publications
32,804
Reads
1,642
Citations
Introduction
Piotr Kulczycki graduated in Electrical Engineering from the AGH University of Science and Technology, and in Applied Mathematics from the Jagiellonian University. He currently holds professorships at the Systems Research Institute of the Polish Academy of Sciences, where he is the Head of the Centre of Information Technology for Data Analysis Methods, and at the AGH University of Science and Technology, where he is the Head of the Division for Information Technology and Systems Research.
Current institution
Additional affiliations
October 2000 - present
October 2014 - present
October 1992 - September 1993
Publications (181)
The study tested an alternative method for evaluating the habitat preferences of wisents (Bison bonasus) based upon a possibly unbiased dataset: presence-only data. Data on the spatial distribution of free-ranging wisents (3055 individual presence records) were collected over a period of 10 years in the Bieszczady Mountains, south-eastern Poland. U...
This book presents a wide and comprehensive spectrum of issues and problems related to fractional-order dynamical systems. It is meant to be a full-fledged, comprehensive presentation of many aspects related to the broadly perceived fractional-order dynamical systems, which constitute an extension of the traditional integer-order-type descriptions. T...
Recent growth in interest concerning streaming data has been forced by the expansion of systems successively providing current measurements and information, which enables their ongoing, consecutive analysis. The subject of this research is the determination of a density function characterizing potentially changeable distribution of streaming data....
LHCb at CERN, Geneva is a world-leading high energy physics experiment dedicated to searching for New Physics phenomena. The experiment is undergoing a major upgrade and will rely entirely on a flexible software trigger to process the data in real-time. In this paper a novel approach to reconstructing (detecting) long-lived particles using a new pa...
The subject of this study is three fundamental procedures of contemporary data analysis: outlier detection, clustering and classification. These issues are considered in a conditional approach – the introduction of specific (e.g., current) values to the model allows, in practice, a significantly precise description of the reality under research. Th...
This book presents a wide and comprehensive range of issues and problems in various fields of science and engineering, from both theoretical and applied perspectives. The desire to develop more effective and efficient tools and techniques for dealing with complex processes and systems has been a natural inspiration for the emergence of numerous fie...
The subject of the study is three fundamental procedures of contemporary data analysis: outlier detection, clustering and classification. The issue is considered in a conditional approach – the introduction of specific (e.g. current) values to the model allows, in practice, a significantly precise description of the reality under research. The same met...
Many contemporary automatic control applications require parametric identification, taking into account the effects of estimation errors unavoidable in practice. The subject of this chapter is a procedure enabling effective calculation of an optimal (in the sense of minimal expected losses) estimator of a parameter...
Animal grouping is a very complex process that is triggered by a number of factors: habitat, social structure, season, predator pressure, etc. Proper identification of sites where natural concentrations of animals occur is crucial for the explanation of such behaviour and for species and habitat management. Therefore, the selection of methods all...
The Cuttlefish Algorithm, a modern metaheuristic procedure, is a very recent solution to a broad range of optimization tasks. The aim of the article is to outline the Cuttlefish Algorithm and to demonstrate its usability in data mining problems. In this paper, we apply this metaheuristic procedure to a clustering problem, with the Calinski-Harabas...
The images obtained by X-Ray or computed tomography (CT) may be contaminated with different kinds of noise or show lack of sharpness, too low or high intensity and poor contrast. Such image deficiencies can be induced by adverse physical conditions and by the transmission properties of imaging devices. A number of enhancement techniques in image pr...
Analyzing astronomical observations represents one of the most challenging tasks of data exploration. This is largely due to the volume of the data acquired using advanced observational tools. While other challenges typical for the class of Big Data problems - like data variety - are also present, dataset size represents the most significant obstacl...
In the practice of data analysis, the methodological variety of specific algorithms causes problems for multifaceted research, often leading to laborious interpretations and time-consuming studies. This paper presents the concept of methodically unified procedures, based on kernel estimators, for three fundamental tasks: outlier detectio...
This book highlights a broad range of modern information technology tools, techniques, investigations and open challenges, mainly with applications in systems research and computational physics. Divided into three major sections, it begins by presenting specialized calculation methods in the framework of data analysis and intelligent computing. In...
Extracting useful information from astronomical observations represents one of the most challenging tasks of data exploration. This is largely due to the volume of the data acquired using advanced observational tools. While other challenges typical for the class of big data problems (like data variety) are also present, the size of datasets represe...
The aim of this article is to present research involving the employment of intelligent methods for image analysis, particularly, the binarization process. In this case, the Flower Pollination Algorithm was used to optimize the internal parameters of the Niblack binarization algorithm. As a criterion for the quality of the proposed solution, the mor...
Nature inspired metaheuristics were found to be applicable in deriving best solutions for several optimization tasks, and clustering represents a typical problem which can be successfully tackled with these methods. This paper investigates certain techniques of cluster analysis based on two recent heuristic algorithms mimicking natural processes: t...
The task of detecting atypical (rare) elements is of major significance in the field of medical problems and its conditions seem to be specific in practice. Such elements, mostly concerned with pathology, are very different in nature and their set is often small in size with a low level of representativeness. A frequency approach was applied in the...
This book constitutes the refereed post-conference proceedings of the 6th International Symposium on Computational Modeling of Objects Presented in Images, CompIMAGE 2018, held in Cracow, Poland, in July 2018. The 16 revised full papers presented in this book were carefully reviewed and selected from 30 submissions. The papers cover the following top...
The discovery of atypical elements has become one of the most important challenges in data analysis and exploration. At the same time it is not an easy matter, with difficult conditions, and not even strictly defined. This paper presents a ready-to-use procedure for identifying atypical elements, in the sense of rarely occurring. The issue is conside...
This paper presents a proposal of a model error mitigation technique based on analysing the error distribution of the original model and creating an additional model that tempers the error impact in the particular domain areas identified as the most sensitive. Both models are then combined into a single ensemble model. The idea is demonstrated on the tri...
The subject of this chapter is the presentation of a coherent concept of establishing the methodology of kernel estimators for the three main tasks of data analysis: identification/detection of atypical elements (outliers), clustering, and classification. The application of a uniform apparatus for all three basic problems facilitates comprehension...
Nowadays, with the rapid development of digital image processing, there has been a notable increase in elaborating advanced tools for studying the internal structure of objects. This may be very helpful in characterizing certain morphological traits of grains, as well as in quantifying the differences between them. The current research was carried...
This e-book contains conference material from two concurrent conferences:
− 3rd Conference on Information Technology, Systems Research and Computational Physics (ITSRCP'18),
− 6th International Symposium CompIMAGE’18 – Computational Modeling of Objects Presented in Images: Fundamentals, Methods, and Applications (CompIMAGE'18),
which has been org...
The subject of this paper is a procedure for the identification (detection, discovery) of atypical elements, understood in the sense that they occur rarely. A result of the procedure is the generation of a rating as to whether an examined observation should be classed as atypical, given in classic two-valued form (deterministic, sharp), as well as...
Automated classification systems have allowed for the rapid development of exploratory data analysis. Such systems reduce the dependence on human intervention in obtaining the analysis results, especially when inaccurate information is under consideration. The aim of this paper is to present a novel approach, based on neural networks, for use in cla...
This paper presents a ready-to-use procedure for detecting atypical (rarely occurring) elements, in one- and multidimensional spaces. The issue is considered through a conditional approach. The application of nonparametric concepts frees the investigated procedure from distributions of describing and conditioning variables. Ease of interpretation a...
The Bayes approach is arguably the classification method most used in unspecialized applications, thanks to its robustness, simplicity, and interpretability. The main problem here is establishing proper probability values. This paper deals with adapting the above method for cases where the classified data is of interval type, with changing environm...
The problem of identifying atypical elements in a data set presents many difficulties at every stage of analysis. For instance, it is not clear which traits should distinguish such elements and, what is more, we cannot know their natural pattern in advance; even if it did exist, it would by its nature be significantly limited. The subject of the...
A broad spectrum of modern Information Technology (IT) tools, techniques, main developments and still open challenges is presented. Emphasis is on new research directions in various fields of science and technology that are related to data analysis, data mining, knowledge discovery, information retrieval, clustering and classification, decision mak...
The task of detecting atypical elements is one of the fundamental tasks of contemporary data analysis, finding applications in numerous problems in practically all areas of science and engineering. As an example, in the classic approach of automatic control, e.g. fault detection problems, the appearance of an unusual value of a vector describi...
Dealing with astronomical observations represents one of the most challenging areas of big data analytics. Besides the huge variety of data types and the dynamics related to continuous data flow from multiple sources, handling enormous volumes of data is essential. This paper provides an overview of methods aimed at reducing both the number of features/attrib...
This paper describes a new approach to metaheuristic-based data clustering by means of the Krill Herd Algorithm (KHA). In this work, KHA is used to find the centres of the cluster groups. Moreover, the number of clusters is set at the beginning of the procedure, and during the subsequent iterations of the optimization algorithm, particular solutions are...
The massive amounts of data processed by information systems raise the importance of detailed database performance analysis. Column-oriented data stores are becoming increasingly popular in big data appliances. This paper identifies database performance factors on the basis of empirical studies on a custom implementation. To summarize the research,...
The task of clustering, that is, the division of data into homogeneous groups, represents one of the elementary problems of contemporary data mining. Cluster analysis can be approached through a variety of methods based on statistical inference or heuristic techniques. Recently, algorithms employing novel metaheuristics are of special interest – as they can effect...
A study was conducted to develop a methodology for wheat variety discrimination and identification by way of image analysis techniques. The main purpose of this work was to determine a crucial set of parameters with respect to wheat grain morphology which best differentiate wheat varieties. To achieve better performance, the study was done by...
This paper deals with the popularity of given names in the United States, for the period 1885–2009. Based on the data obtained from the website of U.S. Social Security Administration, it was demonstrated that the fashion of naming babies after the incumbent American president passed away in the ’60s. At the same time, however, examples were given,...
Many of today’s specialized applicational tasks are obliged to consider the influence of inevitable errors in the identification of parameters appearing in a model. Favourable results can also be achieved through measuring, and then accounting for definite (e.g. current) values of factors which show a significant reaction to the values of those par...
The aim of this paper is to present a novel method of data sample reduction that can be applied, in particular, to the classification of interval type imprecise information. Its concept is based on the sensitivity method, inspired by artificial neural networks, while the goal is to increase the number of apposite classifications, and, consequently,...
This paper describes the basic components of a research project aimed at the application of natural computing metaheuristics to optimize the horizontal scaling of databases. Column oriented databases were selected for the project because of their unique properties. A mathematical model has been created in order to align the problem of horizontal sc...
The paper’s subject is classification with nonstationary patterns. The attribute space is finite-dimensional, while its coordinates in particular may be continuous, binary, discrete, categorical in character, or also a combination of these. The number of patterns is not methodologically limited. Use of the Bayes approach minimizes the expected valu...
This paper contains a complete procedure for calculating the value of a conditional quantile estimator. The concept is based on the nonparametric kernel estimator method, which frees the algorithm from the random variables’ distributions. The procedure was worked out in a ready-to-use form – specific formulas for functions and the parameter used we...
Pore space study has been utilized as a general method for defining soil structures. This is because the characteristics particular to pore space affect the majority of the physical and physicochemical soil parameters relevant to plant growth. This paper presents an image segmentation approach for detecting the soil pore structures that have been s...
The paper deals with the classification task of interval information, when processed data is gradually displaced, i.e. they originate from a nonstationary environment. The procedure worked out is characterized by its many practical properties: ensuring the minimum expected value of misclassifications; allowing influence on the probability of errors...
The study was aimed at verifying the existence of a global innovation process by applying Latent Growth Curve Modelling to a short time series of key field indicators. Setting a matrix of time series in the form of a structural model enabled testing functional forms of the processes' dynamics and verifying its dependence on initial levels of the an...
Classification of data streams is currently a very important task. Datasets characterized by constant influx of data are predominantly massive and often have various types of features. Even more challenging is to classify evolving streams. Various approaches have been proposed to deal with this problem. In this paper we will present a new method ba...
The paper deals with the issue of reducing dimension and size of a data set (random sample) for purposes of exploratory data analysis procedures. The concept of the algorithm investigated here is based on linear transformation to a space of smaller dimension, while keeping as much as possible the same distances between particular elements. Elements...
This publication deals with the applicational aspects and possibilities of the Complete Gradient Clustering Algorithm—the classic procedure of Fukunaga and Hostetler, prepared to a ready-to-use state, by providing a full set of procedures for defining all functions and the values of parameters. Moreover, it describes how a possible change in those...
A universal method of dimension and sample size reduction, designed for exploratory data analysis procedures, constitutes the subject of this paper. The dimension is reduced by applying linear transformation, with the requirement that it has the least possible influence on the respective locations of sample elements. For this purpose an original ve...
The subject of Bayes classification of imprecise multidimensional information of interval type by means of patterns defined through precise data (i.e. deterministic or sharp) is investigated here. To this end, the statistical kernel estimators methodology was applied, which makes the resulting algorithm independent of the pattern shape. In addition, elements...
This paper is dedicated to the problem of estimating a vector of parameters when losses resulting from their under- and overestimation are asymmetric and mutually correlated. The issue is considered from an additional conditional aspect, where particular coordinates of conditioning variables may be continuous, binary, discrete or categorized...
The subject of the presented research is a complete neural procedure for classifying inaccurate information given in the form of an interval vector. For a task formulated in this way, the basic functionality of the Probabilistic Neural Network was extended to interval-type information. As a consequence, a new type of neural network has...
Many contemporary specialized research tasks require accounting for the influence of errors, unavoidable in practice, in identifying the parameters appearing in a model. Positive results can also be obtained by measuring, and then accounting for, the specific (e.g. current) values of those factors which significantly affect the value...
The subject of this paper is the classification task with nonstationary patterns, where new elements of the patterns are supplied successively. The number of patterns is not limited by assumption. Use of the Bayes approach minimizes the expected value of losses arising from misclassifications, while the statistical kernel estimators methodology frees...
In many scientific and practical tasks, the classical concepts for parameter identification are satisfactory and generally applied with success, although many specialized problems necessitate the use of methods created with specifically defined assumptions and conditions. This paper investigates the method of parameter identification for the case w...
This paper investigates a possibility of supplementing standard dimensionality reduction procedures, used in the process of knowledge extraction from multidimensional datasets, with topology preservation measures. This approach is based on an observation that not all elements of an initial dataset are equally preserved in its low-dimensional embedd...
The paper deals with the classification task, where patterns are nonstationary. The method ensures the minimum expected value of misclassifications and is independent of patterns' shapes. This procedure eliminates elements of patterns with insignificant or even negative influence on the results' accuracy. Appropriate modifications follow the classi...
Seeds dataset, used in:
M. Charytanowicz, J. Niewczas, P. Kulczycki, P.A. Kowalski, S. Lukasik, S. Zak, 'A Complete Gradient Clustering Algorithm for Features Analysis of X-ray Images', in: Information Technologies in Biomedicine, Ewa Pietka, Jacek Kawa (eds.), Springer-Verlag, Berlin-Heidelberg, 2010, pp. 15-24.
See also: https://archive.ics.uci....
The aim of this publication is to present the applicational aspects and properties of the complete form of the gradient clustering algorithm, as well as to illustrate them for specific practical tasks in the field of systems analysis. A basic feature of this algorithm is that it does not require the number of clusters to be fixed arbitrarily, which enables...
The aim of this paper is to present a Complete Gradient Clustering Algorithm, its applicational aspects and properties, as well as to illustrate them with specific practical problems from the subject of bioinformatics (the categorization of grains for seed production), management (the design of a marketing support strategy for a mobile phone operat...
The subject of this paper is multidimensional data analysis, carried out by supplementing standard feature extraction procedures with appropriate measures of preservation of the set's topological structure. This approach is motivated by the observation that not all elements of the original set are properly preserved during reduction in the representation...
The current growth in the capabilities and ubiquity of computer systems enables equally intensive development of one of the main fields of contemporary information technology: highly specialized procedures for data analysis and exploration. The subject of this paper is kernel estimators, one of the leading concepts of the nonparametric estimation methodology used here...
The subject of this paper is a statistical fault detection system with the scope of detection, diagnosis and prognosis. It was designed using the fundamental procedures of data analysis and exploration: recognizing atypical elements (outliers), clustering, and classification, based on the nonparametric methodology of kernel estimators. Employing a...
A gradient clustering algorithm, based on the nonparametric methodology of statistical kernel estimators, expanded to its complete form, enabling implementation without particular knowledge of the theoretical aspects or laborious research, is presented here. The possibilities of calculating tentative optimal parameter values, and then – based on il...
The subject of the investigation presented here is Bayes classification of imprecise multidimensional information of interval type by means of patterns defined through precise data, e.g. deterministic or sharp. For this purpose the statistical kernel estimators methodology was applied, which makes the resulting algorithm independent of the patter...
The subject of this paper is parametric identification for the case where losses resulting from estimation errors can be described in polynomial form, with an additional asymmetry representing the different consequences of negative and positive errors. Most importantly, the method developed accounts for the conditionality of the parameter under study, that is, its significant dependence...
This paper deals with dimensionality and sample length reduction applied to the tasks of exploratory data analysis. The proposed technique relies on a distance-preserving linear transformation of a given dataset to a lower-dimensional feature space. The coefficients of the feature transformation matrix are found using Fast Simulated Annealing - an algorithm i...
A complete gradient clustering algorithm formed with kernel estimators
The aim of this paper is to provide a gradient clustering algorithm in its complete form, suitable for direct use without requiring a deeper statistical knowledge. The values of all parameters are effectively calculated using optimizing procedures. Moreover, an illustrative anal...
The aim of this paper is to provide a gradient clustering algorithm in its complete form, suitable for direct use without requiring a deeper statistical knowledge. The values of all parameters are effectively calculated using optimizing procedures. Moreover, an illustrative analysis of the meaning of particular parameters is shown, followed by the...
The aim of this paper is to present a novel method of data sample reduction for the classification of interval information. Its concept is based on sensitivity analysis, inspired by artificial neural networks, while the goal is to increase the number of proper classifications and, primarily, calculation speed. The presented procedure was tested for the...
Methods based on kernel density estimation have been successfully applied to various data mining tasks. Their natural interpretation, together with suitable properties, makes them an attractive tool in, among others, clustering problems. In this paper, the Complete Gradient Clustering Algorithm has been used to investigate a real data set of grains. T...
Modern ICT networks increasingly use radio transmission methods. These reduce the costs of building network infrastructure, and the growing performance of wireless communication allows it to be used even where high information transfer efficiency is required. The band of usable...
The parameter identification for problems where losses arising from overestimation and underestimation are different and can be described by an asymmetrical and polynomial function is investigated in this paper. The Bayes decision rule, which allows potential losses to be minimized, is used. Calculation algorithms are based on the nonparametric methodology o...
The subject of the presented research is a decision-support task concerning the determination of a strategy for dealing with a corporate client, a subscriber of a mobile phone network. The research was carried out on a real database obtained from one of the Polish GSM network operators. The aforementioned strategy was defined using...
At present, statistical kernel estimators constitute the dominant – in practice – method of nonparametric estimation. It allows the useful characterization of probability distributions without arbitrary assumptions regarding their membership in a fixed class. In this paper their use for the basic tasks of data analysis and exploration, i.e. identifi...
The subject of this paper is the reduction of the dimension and size of a random sample, intended for exploratory data analysis procedures defined using the methodology of statistical kernel estimators. The concept is based on a linear transformation of the space, with the matrix coefficients determined using a metaheuristic...
ERRATUM to the paper:
Piotr Kulczycki, Karina Daniel
“Metoda wspomagania strategii marketingowej operatora telefonii komorkowej”
Przeglad Statystyczny, vol. 56, no. 2, pp. 116-134, 2009
ERRATUM to the paper:
Piotr Kulczycki, Aleksander Mazgaj
“Parameter Identification for Asymmetrical Polynomial Loss Function”
Information Technology and Control, vol. 38, no. 1, pp. 51-60, 2009
Together with the dynamic development of modern computer systems, the possibilities of applying refined methods of nonparametric estimation to control engineering tasks have grown just as fast. This broad and complex theme is presented in this paper for the case of estimating the density of a random variable's distribution. Nonparametric methods allo...
Data clustering is at present a commonly used technique for extracting fuzzy system rules from experimental data. Detailed studies in the field have shown that using this method results in a significantly reduced structure of the fuzzy identification system, while maintaining its high modelling efficiency. In this paper a c...
The parameter identification for problems where losses arising from overestimation and underestimation are different and can be described by an asymmetrical and polynomial function is investigated here. The Bayes decision rule, which allows potential losses to be minimized, is used. Calculation algorithms are based on the nonparametric methodology of statis...
This paper presents the concept of a fault detection system covering detection, diagnosis, and the associated prognosis. To this end, procedures of data analysis and exploration based on the nonparametric method of kernel estimators were applied. This method allows the useful characterization of probability distributions without arbitrary as-...