About
Publications: 59
Reads: 12,030
Citations: 478
Publications (59)
The present study describes the data sets produced in Warsaw, Poland with the aim of developing tools and methods for implementing human-centred and data-driven solutions that enhance the sustainable mobility transition. This study focuses on school commutes and alternatives to private cars for children's drop-off and pick-up from prima...
Demand for sustainable mobility is particularly high in urban areas. Hence, there is a growing need to predict when people will decide to use different travel modes with an emphasis on environmentally friendly travel modes. As travel mode choice (TMC) is influenced by multiple factors, in a growing number of cases machine learning methods are used...
Travel mode choice (TMC) prediction, which can be formulated as a classification task, helps in understanding what makes citizens choose different modes of transport for individual trips. This is also a major step towards fostering sustainable transportation. As behaviour may evolve over time, we also face the question of detecting concept drift in...
Promoting sustainable transportation necessitates understanding what makes people select individual travel modes. Hence, classifiers are trained to predict travel modes, such as the use of private cars vs bikes for individual journeys in the cities. In this work, we focus on parking-related factors to propose how survey data, including spatial data...
The unprecedented interest in sustainable transport modes for urban areas raises the question of what makes citizens select environmentally friendly transport modes such as public transport rather than private cars. While travel surveys are conducted to document real transport mode choices, they can also shed light on how these choices are made. In...
Unlabelled data appear in many domains and are particularly relevant to streaming applications, where even though data is abundant, labelled data is rare. To address the learning problems associated with such data, one can ignore the unlabelled data and focus only on the labelled data (supervised learning); use the labelled data and attempt to leve...
Evaluation methods for data stream classification have frequently been focused on how available data are used for learning a model and for its performance assessment, with major emphasis on the difference between predicted and true labels. More recently, growing interest in delayed labelling evaluation has resulted in the evaluation of multiple pre...
Public transport systems are expected to reduce pollution and contribute to sustainable development. However, disruptions in public transport such as delays may negatively affect mobility choices. To quantify delays, aggregated data from vehicle location systems are frequently used. However, delays observed at individual stops are caused inter alia...
Open data portals are used to make a growing number of government data resources public. However, difficulties in preprocessing and integrating multiple possibly incomplete open data resources hinder the potential of open data re-use for software development. While incomplete data sets can be imputed in an offline process, this is not the case for...
The way open data resources of varied type and volume are used by software applications remains only partly known. In this study, following CRoss-Industry Standard Process for Data Mining, we propose a methodology for collecting and analyzing access data describing the use of open data resources by individual software applications. The methodology...
A large portion of the stream mining studies on classification rely on the availability of true labels immediately after making predictions. This approach is well exemplified by the test-then-train evaluation, where predictions immediately precede true label arrival. However, in many real scenarios, labels arrive with non-negligible latency. This r...
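As a rough illustration of the evaluation setting described above, the following sketch contrasts test-then-train evaluation (labels available immediately) with delayed labelling. It is a minimal sketch under stated assumptions, not the procedure used in the publication: the toy `MajorityClassifier`, the function name `prequential_accuracy`, and the choice to measure latency as a number of stream instances are all hypothetical.

```python
from collections import Counter, deque


class MajorityClassifier:
    """Toy online learner (assumption): predicts the most frequent label seen so far."""

    def __init__(self, default=0):
        self.counts = Counter()
        self.default = default

    def predict(self, x):
        if not self.counts:
            return self.default
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, latency=0):
    """Test-then-train accuracy when each label arrives `latency` instances late.

    With latency=0 this is the classical test-then-train loop:
    predict for an instance, then immediately learn from its true label.
    """
    model = MajorityClassifier()
    pending = deque()  # entries: (step at which the label arrives, prediction, label, x)
    correct = total = 0
    for t, (x, y) in enumerate(stream):
        # Test: the prediction is made before the true label is known.
        pending.append((t + latency, model.predict(x), y, x))
        # Train: consume every label whose latency has elapsed.
        while pending and pending[0][0] <= t:
            _, y_hat, y_true, x_old = pending.popleft()
            correct += int(y_hat == y_true)
            total += 1
            model.learn(x_old, y_true)
    # Labels still outstanding at the end of the stream are only scored.
    for _, y_hat, y_true, _ in pending:
        correct += int(y_hat == y_true)
        total += 1
    return correct / total if total else 0.0


stream = [(None, 1)] * 4  # four instances, all of class 1
immediate = prequential_accuracy(stream, latency=0)
delayed = prequential_accuracy(stream, latency=2)
```

On this toy stream the delayed setting scores lower (0.25 versus 0.75), because the model keeps predicting its default class until the first labels arrive.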
For many streaming classification tasks, the ground truth labels become available with a non-negligible latency. Given this delayed labelling setting, after the instance data arrives and before its true label is known, the online classifier model may change. Hence, the initial prediction can be replaced with additional periodic predictions graduall...
The explosive increase in volume, velocity, variety, and veracity of data generated by distributed and heterogeneous nodes such as IoT and other devices continuously challenges the state of the art in big data processing platforms and mining techniques. Consequently, it reveals an urgent need to address the ever-growing gap between this expected exasca...
In this work, we describe an urban Internet of Things (IoT) architecture, grounded in big data patterns and focused on the needs of cities and their key stakeholders. First, the architecture of the dedicated platform USE4IoT (Urban Service Environment for the Internet of Things), which gathers and processes urban big data and extends the Lambda arc...
In spite of the diversity of solutions developed in the Internet of Things (IoT) domain, some features are shared by numerous IoT deployments and the data they process. These include incompleteness and latency in data transmission from multiple distributed objects. Among others, the systems tracking the location of vehicles are affected by these pr...
Many modern applications use services and data made available by provisioning platforms of third parties. The question arises if the use of individual services and data resources such as open data by novel applications can be predicted. In particular, whether initial software development efforts such as application development during hackathons can...
Indoor positioning methods make it possible to estimate the location of a mobile object in a building. Many of these methods rely on the fingerprinting approach. First, signal strength data is collected in a number of reference indoor locations. Frequently, the vectors of the strength of the signals emitted by WiFi access points acquired in this way ar...
Experimental data sets that include tool settings, tool and machine-tool behavior, and surface roughness data for milling processes are usually of limited size, due mainly to the high costs of machining tests. This fact restricts the application of machine-learning techniques for surface roughness prediction in industrial settings. The primary obje...
Indoor positioning systems answer the need for ubiquitous localisation systems. Frequently, indoor positioning relies on machine learning models developed based on the training data composed of WiFi received signal strength (RSS) vectors observed in different indoor locations. However, this requires expensive collection of RSS vectors in precisely...
The systems monitoring the location of public transport vehicles rely on wireless transmission. The location readings from GPS-based devices are received with some latency caused by periodical data transmission and temporal problems preventing data transmission. This negatively affects identification of delayed vehicles. The primary objective of th...
Route planning makes direct use of geographic data and provides beneficial recommendations to the public. In the real world, the schedule of transit vehicles is dynamic and delays occur. Incorporating these dynamic schedule changes into multi-modal route computation is difficult and requires a lot of computational resources. Our approa...
This book provides the reader with a comprehensive selection of cutting-edge algorithms, technologies, and applications. The volume offers new insights into a range of fundamentally important topics in network architectures, network security, and network applications. It serves as a reference for researchers and practitioners by featuring research...
Owing to ever-growing communication systems, modern networks currently encompass a wide range of solutions and technologies, including wireless and wired networks, and provide a basis for network systems from multiple partly overlapping domains such as the Internet of Things (IoT), cloud services, and network applications. This appears in numerous...
Machine-learning techniques frequently predict the results of machining processes, based on pre-determined cutting tool settings. By doing so, key parameters of a machined product can be predicted before production begins. Nevertheless, a prediction model cannot capture all the features of interest under real-life industrial conditions. Moreover, c...
Indoor Positioning Services (IPS) estimate the location of devices, frequently mobile terminals, in a building. Many IPS algorithms rely on the Received Signal Strength (RSS) of radio signals observed at a terminal. However, these signals are noisy due to both the impact of the surrounding environment, such as the presence of other persons, and li...
The primary objective of the work is the preliminary investigation of the adoption of Open Data and Open API telecommunication functions by software developers. The analysis is based on the statistical data collected during developer contests. Based on Open API Hackathon and Business Intelligence API Hackathon contests, the interest of software dev...
As a response to the increasing number of cyber threats, novel detection and prevention methods are constantly being developed. One of the main obstacles hindering the development and evaluation of such methods is the shortage of reference data sets. What is proposed in this work is a way of testing methods detecting network threats. It includes a...
A wide range of opportunities are emerging in the micro-system technology sector for laser micro-machining systems, because they are capable of processing various types of materials with micro-scale precision. However, few process datasets and machine-learning techniques are optimized for this industrial task. This study describes the process param...
The future of Location Based Services largely depends on the accuracy of positioning techniques. In the case of indoor positioning, fingerprinting-based solutions are frequently developed. The well-known k Nearest Neighbours method is frequently used in this case. However, when the objective is to detect the floor on which a mobile terminal is located,...
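For orientation, the k Nearest Neighbours baseline mentioned above can be sketched as follows. This is a generic illustration of fingerprinting with a majority vote over floors, not the method the publication proposes for floor detection. The function name `knn_floor`, the radio map values, and the three-access-point layout are hypothetical.

```python
from collections import Counter
import math


def knn_floor(fingerprints, rss, k=3):
    """Estimate the floor by majority vote among the k closest fingerprints.

    fingerprints: list of (rss_vector, floor) pairs collected at reference points.
    rss: RSS vector observed by the terminal, same access-point order.
    """
    # Rank reference fingerprints by Euclidean distance in signal space.
    nearest = sorted(fingerprints, key=lambda fp: math.dist(fp[0], rss))[:k]
    votes = Counter(floor for _, floor in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical radio map: RSS in dBm from three access points, with floor labels.
radio_map = [
    ((-40, -70, -90), 0),
    ((-42, -68, -88), 0),
    ((-45, -72, -91), 0),
    ((-80, -50, -60), 1),
    ((-82, -48, -62), 1),
    ((-79, -52, -58), 1),
]

floor_a = knn_floor(radio_map, (-41, -69, -89), k=3)
floor_b = knn_floor(radio_map, (-81, -49, -61), k=3)
```

In this made-up radio map the two floors are well separated in signal space, so the vote is unanimous; real RSS data is noisier and the choice of k matters more.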
The vigorous expansion of wind energy power generation over the last decade has also entailed innovative improvements to surface roughness prediction models applied to high-torque milling operations. Artificial neural networks are the most widely used soft computing technique for the development of these prediction models. In this paper, we concent...
Surface roughness generation is influenced by many complex and interrelated factors. Moreover, in real industrial conditions many different milling tools have to be used to create a final product. Hence, the acquisition of experimental data used to set up artificial intelligence models of individual tools is a complicated task. The aim of this pape...
One of the key applications of statistical analysis and data mining is the development of classification and prediction models. In both cases, significant improvements can be attained by limiting the number of model inputs. Potential benefits of data reduction methods are largely hindered by the fact that the selection of an appropriate feature...
The ever-growing volume of network traffic results in the need for even more efficient data processing in Intrusion Detection Systems. In particular, the raw network data has to be transformed and largely reduced to be processed by data mining models. The primary objective of this work is to control the dimensionality reduction (DR) of network flow...
The advent of modern low-cost monitoring and wireless transmission systems results in unprecedented availability of measurement data potentially available in near real-time mode. In particular, some of the remote meter reading systems can be used to collect data on an hourly or even sub-hourly basis. This allows the utility companies to model and p...
Soft computing techniques are frequently used to develop data-driven prediction models. When modelling of an industrial process is planned, experiments in a real production environment are frequently required to collect the data. As a consequence, in many cases the experimental data sets contain only a limited number of valuable records acquired in e...
This paper presents a simple automatic system for small and middle Internet companies selling goods. The system combines temporal sales data with its geographical location and presents the resulting information on a map. Such an approach to data presentation should facilitate understanding of sales structure. This insight might be helpful in genera...
A soft computing system used to optimize deep drilling operations under high-speed conditions in the manufacture of steel components is presented. The input data includes cutting parameters and axial cutting force obtained from the power consumption of the feed motor of the milling centres. Two different coolant strategies are tested: traditional w...
The World Wide Web (WWW) is a vast source of information, and the problem of information overload is more acute than ever. Due to noise in the WWW, it is becoming hard to find usable information. Real-estate listings are frequently available through different real estate agencies and published on different web sites. As a consequence, differences in price and...
Modern utility companies manage extensive distribution systems to provide multiple consumers with water, heat and electrical power. At the same time, significant savings can be achieved by combining monitoring systems with modelling applications used to optimize the distribution systems. Thus, in the case of pipeline systems, the problem of iden...
Load forecasting plays an important role in modern utilities. However, further improvements can be expected by predicting the load at a consumer level. The latter approach has become available with the advent of low-cost monitoring and transmission systems. Still, due to the limited number of monitored clients, the way groups of consumers should be...
Numerous techniques of artificial intelligence have been used for building prediction models. One such task is the prediction of heat consumption in a district heating system. Not only is it required for ensuring sufficient heat production, but it is also necessary to avoid substantial heat loss due to overestimated demand for heat. The work pr...
The advent of the Internet has strongly influenced modern software systems. Existing intranet solutions are being gradually replaced with www services available everywhere and at any time. The availability of the wide area network has resulted in unprecedented opportunities of remote and distributed cooperation of large groups of people and organiz...
The primary objective of the work was to check the relation between household socio-economic characteristics and the corresponding food purchasing capabilities. The analysis has been based on the data collected in the national survey “Social Diagnosis 2000”. In order to search for possible dependencies between variables gathered in the survey, diff...
The purpose of short-term load forecasting is to optimise the power supply volume over a short time horizon. There is no straightforward mapping rule between the type of time period and the resulting power consumption. Still, the overall efficiency of the power system inevitably relies on a good prediction model. Our paper illustrates a nove...
Evolutionary artificial neural networks have gradually evolved as a new field at the junction of artificial neural networks and evolutionary algorithms. Rapid growth of both fields could be observed in the last few years. However, limitations in neural networks techniques can diminish the potential of neural networks. Some of these limitations can...
Among different models of neural networks, multilayer perceptrons play an important role. Most training methods, including back-propagation, concentrate on weight adjustment only. Still, the performance of the network strongly depends on its architecture. In our paper, an algorithm based on evolutionary programming is proposed. Unlike most other metho...
Evolutionary computation has been widely applied as an alternative to the traditional methods of multilayer neural network design. The results of many simulations show that evolutionary algorithms can outperform standard training strategies, including back-propagation and its modifications. The algorithm, described in this paper, summarises the r...
The Prisoner's Dilemma is an important model not only for game theory, but also for psychology, economics and political science. The genetic algorithm has been used for a few years both to evolve successful strategies and to simulate the behaviour of independent agents. In this paper the solution comprising the collaborative combination of neural network and g...
In this paper, a new approach to treating incomplete data is proposed. It is based on the evolution of imputation strategies built using both non-parametric and parametric imputation methods. Genetic algorithms and multilayer perceptrons have been applied to develop a framework for constructing the imputation strategies addressing multiple in...
In many technical issues, the processes of interest could be precisely modelled if only all the relevant information were available. On the other hand, detailed modelling is frequently not feasible due to the cost of acquiring appropriate data. The paper discusses the way self-organising maps and multilayer perceptrons can be used to develop two-st...
District heating companies are responsible for delivering the heat produced in central heat plants to the consumers through a pipeline system. At the same time they are expected to keep the total heat production cost as low as possible. Therefore, there is a growing need to optimise heat production through better prediction of customers' needs. The...