Marcin Paprzycki

Instytut Badań Systemowych Polskiej Akademii Nauk | IBSPAN · Intelligent Systems

D.Sc.

About

538
Publications
195,512
Reads
4,694
Citations
Additional affiliations
January 2013 - March 2013
The University of Aizu
Position
  • Professor
August 2001 - August 2005
Oklahoma State University - Tulsa
Position
  • Professor (Assistant)
August 1990 - August 1997
University of Texas of the Permian Basin
Position
  • Assistant and Associate Professor

Publications (538)
Article
Full-text available
Modern systems often employ decentralised and distributed approaches. This can be attributed, among others, to the increasing complexity of system processes, which go beyond the capabilities of singular components. Additionally, with the growth in demand for system automation and high-level coordination, solutions belonging to the decentralised Art...
Preprint
Full-text available
The landscape of computing technologies is changing rapidly, straining existing software engineering practices and tools. The growing need to produce and maintain increasingly complex multi-architecture applications makes it crucial to effectively accelerate and automate software engineering processes. At the same time, artificial intelligence (AI)...
Preprint
Full-text available
Handling heterogeneity and unpredictability are two core problems in pervasive computing. The challenge is to seamlessly integrate devices with varying computational resources in a dynamic environment to form a cohesive system that can fulfill the needs of all participants. Existing work on systems that adapt to changing requirements typically focu...
Preprint
This comprehensive survey serves as an indispensable resource for researchers embarking on the journey of fake news detection. By highlighting the pivotal role of dataset quality and diversity, it underscores the significance of these elements in the effectiveness and robustness of detection models. The survey meticulously outlines the key features...
Article
Full-text available
Over the years, RDF streaming has been explored in research and practice from many angles, resulting in a wide range of RDF stream definitions. This variety presents a major challenge in discussing and integrating streaming systems due to a lack of a common language. This work attempts to address this critical research gap by systematizing RDF stre...
Preprint
Recently, multiple applications of machine learning have been introduced. They include various possibilities arising when image analysis methods are applied to, broadly understood, video streams. In this context, a novel tool, developed for academic educators to enhance the teaching process by automating, summarizing, and offering prompt feedback o...
Article
Full-text available
Recent years have been characterized by increasing interest in graph computations. This trend can be related to the large number of potential application areas. Moreover, increasing computational capabilities of modern computers allowed turning theory of graph algorithms into explorations of best methods for their actual realization. These factors,...
Article
Full-text available
Currently, deploying machine learning workloads in the Cloud-Edge-IoT continuum is challenging due to the wide variety of available hardware platforms, stringent performance requirements, and the heterogeneity of the workloads themselves. To alleviate this, a novel, flexible approach for machine learning inference is introduced, which is suitable...
Article
Full-text available
In this contribution, a novel optimization approach, derived from the behavioral patterns exhibited by Duroc pig herds, is proposed. In the developed metaheuristic, termed Artificial Duroc Pigs Optimization (ADPO), Ordered Fuzzy Numbers (OFN) have been applied to articulate and elucidate the behavioral dynamics of the pig herd. A series of experime...
Conference Paper
With the rising popularity of artificial intelligence-based solutions, it is becoming important not only to deploy machine learning models/pipelines with good accuracy, but also to be able to control and manage their documentation and information related to monitoring, performance tracking, etc. Moreover, crucial aspects of data that is to be use...
Article
Full-text available
Reddit is the largest topically structured social network. Existing literature, reporting results of Reddit-related research, considers different phenomena, from social and political studies to recommender systems. The most common techniques used in these works include natural language processing, e.g., named entity recognition, as well as graph n...
Article
Full-text available
Cloud infrastructures operate in highly dynamic environments, and today, energy-focused optimization has become crucial. Moreover, the concept of extended cloud infrastructure, which, among others, uses green energy, has started to gain traction. This introduces a new level of dynamicity to the ecosystem, as “processing components” may “disappear” and “com...
Article
Research into fake news detection has a long history, although it gained significant attention following the 2016 US election. During this time, the widespread use of social media and the resulting increase in interpersonal communication led to the extensive spread of ambiguous and potentially misleading news. Traditional approaches, relying solely...
Article
This work concerns automation of the training process, using modern information technologies, including virtual reality (VR). The starting point is an observation that automotive and aerospace industries require effective methods of preparation of engineering personnel. In this context, the technological process of preparing operations of a CNC num...
Article
Full-text available
Fall accidents in industrial and construction environments require an immediate reaction, to provide first aid. Shortening the time between the fall and the relevant personnel being notified can significantly improve the safety and health of workers. Therefore, in this work, an IoT system for real-time fall detection is proposed, using the ASSIST-I...
Article
Full-text available
As the largest open social medium on the Internet, Reddit is widely studied in the scientific literature. Due to its structured form and division into topical subfora (subreddits), conducted research often concerns connections and interactions between users and/or whole, subreddit-structure-based communities. Overall, the relations between communit...
Article
Full-text available
Continuous, real-time monitoring of occupational health and safety in high-risk workplaces such as construction sites can substantially improve the safety of workers. However, introducing such systems in practice is associated with a number of challenges, such as scaling up the solution while keeping its cost low. In this context, this work investi...
Chapter
Recently, cloud computing has emerged as a key way of delivering computing resources. Hence, research has focused on optimizing the use of cloud resources. The following contribution presents an agent-based Extended Green Cloud Simulator, motivated by the Green Edge Processing project of the cloud company, CloudFerro. The simulator serves as a digital tw...
Article
Pan-sharpening is a procedure to fuse the spatial detail of high-resolution multispectral images (HR-MSI) and low-resolution hyperspectral images (LR-HSI) to produce HR-MSI. Due to the increase in high-resolution satellites, methods based on pan-sharpening are increasingly utilized all over the world. However, the majority of techniques consider pan-sh...
Article
Full-text available
Nowadays, natural language processing (NLP) is one of the most popular areas of, broadly understood, artificial intelligence. Therefore, every day, new research contributions are posted, for instance, to the arXiv repository. Hence, it is rather difficult to capture the current "state of the field" and thus, to enter it. This brought the idea of ap...
Chapter
Modern programming languages are very complex, diverse, and non-uniform in their structure, code composition, and syntax. Therefore, it is a difficult task for computer science students to retrieve relevant code snippets from large code repositories, according to their programming course requirements. To solve this problem, an AI-based approach is...
Chapter
The concept of extended cloud requires efficient network infrastructure to support ecosystems reaching from the edge to the cloud(s). Standard network load balancing delivers static solutions that are insufficient for the extended clouds, where network loads change often. To address this issue, a genetic algorithm based load optimizer is proposed a...
Article
Full-text available
Agent-based computing remains an active field of research with the goal of building (semi-)autonomous software for dynamic ecosystems. Today, this task should be realized using dedicated, specialized frameworks. Over almost 40 years, multiple agent platforms have been developed. While many of them have been “abandoned”, others remain active, and ne...
Article
Multicollinearity occurs when there is a high level of correlation between the independent variables. This correlation creates a problem because the independent variables should be independent. The higher the degree of correlation, the more complex the problems faced while fitting the model and interpreting the results. In this paper, we have...
Preprint
Full-text available
RDF data streaming has been explored by the Semantic Web community from many angles, resulting in multiple task formulations and streaming methods. However, for many existing formulations of the problem, reliably benchmarking streaming solutions has been challenging due to the lack of well-described and appropriately diverse benchmark datasets. Exi...
Chapter
Mathematical models are used to study and predict the behavior of a variety of complex systems - engineering, physical, economic, social, environmental. Sensitivity studies are nowadays applied to some of the most complicated mathematical models from various intensively developing areas of applications. Sensitivity analysis is a modern promising te...
Preprint
Full-text available
The concept of extended cloud requires efficient network infrastructure to support ecosystems reaching from the edge to the cloud(s). Standard approaches to network load balancing deliver static solutions that are insufficient for the extended clouds, where network loads change often. To address this issue, a genetic algorithm based load optimizer...
Chapter
Availability of large amounts of annotated data is one of the pillars of deep learning success. Although numerous big datasets have been made available for research, this is often not the case in real-life applications (e.g. companies are not able to share data due to GDPR or concerns related to intellectual property rights protection). Federated le...
Chapter
Recently, it has been stipulated that training larger and larger models, using ever-increasing datasets, is not sustainable in the long run. Hence, the idea of Frugal Artificial Intelligence has been put forward. While there are many ways to make AI frugal, this contribution focuses on two of them, namely neural network pruning and binarization. Exper...
Chapter
While development of very large models is the core of today’s artificial intelligence, concerns about the cost of model training are raised very often. In this context, active learning is pointed to as a method to maximize model quality, while minimizing the amount of resources needed to train it. The aim of this contribution is to systematically compare per...
Chapter
Nowadays, cataracts are one of the prevalent eye conditions that may lead to vision loss. Precise and prompt recognition of the cataract is the best method to prevent/treat it in early stages. Artificial intelligence-based cataract detection systems have been considered in multiple studies. There, different deep learning algorithms have been used t...
Chapter
In practical realizations of Federated Learning ecosystems, the parties that cooperate during the training process, and later use the trained/global model, may consist of competing institutions. This can result in incentives for malicious behavior, which can infringe on the safety and data privacy of other participants. Additionally, even in cas...
Article
With the recent advancements in technology, there has been a tremendous growth in the usage of images captured using satellites in various applications, like defense, academics, resource exploration, land-use mapping, and so on. Certain mission-critical applications need images of higher visual quality, but the images captured by the sensors normal...
Chapter
While actual deployments of fifth generation (5G) networks are in their initial stages and the actual need for 5G in our daily lives remains an open question, their potential to deliver high speed, low latency, and dependable communication services remains promising. Nevertheless, sixth generation (6G) networks have been proposed as a way to enhanc...
Article
Full-text available
Next Generation Internet of Things (NGIoT) addresses the deployment of complex, novel IoT ecosystems. These ecosystems are related to different technologies and initiatives, such as 5G/6G, AI, cybersecurity, and data science. The interaction with these disciplines requires addressing complex challenges related to the implementation of flexible so...
Preprint
Full-text available
Detecting Personal Protective Equipment in images and video streams is a relevant problem in ensuring the safety of construction workers. In this contribution, an architecture enabling live image recognition of such equipment is proposed. The solution is deployable in two settings -- edge-cloud and edge-only. The system was tested on an active cons...
Chapter
Full-text available
For many years, it was claimed that semantics should provide the foundation of knowledge management in the enterprise. Today, it is easy to realize that this vision did not materialize. The aim of this work is to critically analyse the state of the art of the use of semantic technologies in the enterprise and to attempt to diagnose key problem(s). Keyword...
Article
Full-text available
There are many areas where conventional supervised machine learning does not work well, for instance, in cases with a large, or systematically increasing, number of countably infinite classes. Zero-shot learning has been proposed to address this. In generalized settings, the zero-shot learning problem represents real-world applications where test i...
Chapter
Full-text available
The vast body of scientific publications presents an increasing challenge of finding those that are relevant to a given research question, and making informed decisions on their basis. This becomes extremely difficult without the use of automated tools. Here, one possible area for improvement is automatic classification of publication abstracts acc...
Chapter
Abundance of vastly heterogeneous, high-volume/high-velocity data producers/consumers, predominantly caused by proliferation of IoT-based solutions, results in an urgent need for efficient semantic interoperability solutions. Hence, the need to solve the problems of domain understanding, domain formal representation, and expression of mappings betw...
Chapter
Researchers studying group behavior, social dynamics, or epidemiology lack an easy-to-use tool to run large-scale simulations. This contribution introduces a domain specific language (Agents Assembly; AASM) and a toolset for creating and running scalable simulations, using containerized environment. The proposed language supports describing abstrac...
Chapter
Full-text available
Reusing ontologies in practice is still very challenging, especially when multiple ontologies are (jointly) involved. Moreover, despite recent advances, the realization of systematic ontology quality assurance remains a difficult problem. In this work, the quality of thirty biomedical ontologies, and the Computer Science Ontology are investigated,...
Chapter
Full-text available
The aim of this contribution is to analyse practical aspects of the use of REST APIs and gRPC to realize communication tasks in applications in microservice-based ecosystems. On the basis of performed experiments, classes of communication tasks, for which given technology performs data transfer more efficiently, have been established. This, in turn...
Conference Paper
In this work, the development of virtual reality software for “industrial applications” is considered. It is argued that, in this context, the vast experience from the development of computer games cannot be used directly. Especially, the specific nature of solutions dedicated to industrial applications requires taking into account their specificit...
Chapter
Full-text available
New requirements, posed by the Next Generation IoT, demand design of novel reference architectures, providing foundation for implementation of Internet of Things (IoT) ecosystems. Building on cloud-native concepts (e.g. microservices, virtualisation, and containerization), a flexible architecture that answers requirements present in recent IoT depl...
Chapter
Full-text available
The Management and Orchestration framework (MANO) is the main element of the Network Function Virtualization paradigm. It is in charge of managing the life cycle of virtualized functions, from instantiation to manageability, live configuration, and termination. This kind of framework was originally designed to orchestrate network functions over vir...
Preprint
Full-text available
The aim of this contribution is to analyse practical aspects of the use of REST APIs and gRPC to realize communication tasks in applications in microservice-based ecosystems. On the basis of performed experiments, classes of communication tasks, for which given technology performs data transfer more efficiently, have been established. This, in turn...
Preprint
Full-text available
Currently, in mid-2022, one of the important trends in machine learning is to move away from monster-size models, which need petabytes of data to train and, during training, use gigawatts of energy. This movement (called Frugal AI) is also caused by the rapid growth of IoT deployments. There, intelligence at the edge involves resource-constrained models. Th...
Preprint
Full-text available
Federated learning (FL) was proposed to facilitate the training of models in a distributed environment. It supports the protection of (local) data privacy and uses local resources for model training. Until now, the majority of research has been devoted to "core issues", such as adaptation of machine learning algorithms to FL, data privacy protectio...
Preprint
Full-text available
With the ongoing, gradual shift of large-scale distributed systems towards the edge-cloud continuum, the need arises for software solutions that are universal, scalable, practical, and grounded in well-established technologies. Simultaneously, semantic technologies, especially in the streaming context, are becoming increasingly important for enabli...
Preprint
Full-text available
Availability of large amounts of annotated data is one of the pillars of deep learning success. Although numerous big datasets have been made available for research, this is often not the case in real-life applications (e.g. companies are not able to share data due to GDPR or concerns related to intellectual property rights protection). Federated le...
Preprint
Full-text available
One of the important problems in federated learning is how to deal with unbalanced data. This contribution introduces a novel technique designed to deal with label skewed non-IID data, using adversarial inputs, created by the I-FGSM method. Adversarial inputs guide the training process and allow the Weighted Federated Averaging to give more importa...
Chapter
Zero-shot learning is applied, for instance, when properly labeled training data is not available. A number of zero-shot algorithms have been proposed. However, since none of them seems to be an “overall winner”, development of a meta-classifier(s) combining “best aspects” of individual classifiers can be attempted. In this context, state-of-the-ar...
Preprint
Full-text available
Reusing ontologies in practice is still very challenging, especially when multiple ontologies are involved. Moreover, despite recent advances, systematic ontology quality assurance remains a difficult problem. In this work, the quality of thirty biomedical ontologies, and the Computer Science Ontology, are investigated from the perspective of a pra...