Janusz Sosnowski
Warsaw University of Technology · Institute of Computer Science
Prof.
About
197 Publications
18,408 Reads
947 Citations
Introduction
Professor in Computer Science and Computer Engineering, Head of the Division for Computer Architectures and Software Engineering within the Institute of Computer Science, Warsaw University of Technology. Member of the IEEE Computer Society and Reliability Society, and of the Test Technology Technical Council; member of the Euromicro board of directors. Member of IFAC and POLSPAR (2008-). Program Committee member of many international conferences; reviewer for many international conferences and journals.
Publications (197)
The Issue Tracking System (ITS) repositories are rich sources of software development documentation that are useful in assessing the status and quality of software projects. An original model is proposed for tracing issue handling activities and their impact on project progress. As opposed to classical data mining of software repositories, we consi...
Software project development and maintenance activities have been reported in various repositories. The data contained in these repositories have been widely used in various studies on specific problems, e.g., predicting bug appearance, allocating issues to developers, and identifying duplicated issues. Developed analysis schemes are usually based...
Issue tracking systems comprise diverse data which are useful in evaluating or improving software development processes. Revealing and interpreting this information is a challenging problem which needs appropriate algorithms supported with relevant tools. For this purpose, we use text mining schemes adapted to the specificity of the software reposi...
Software reliability depends on the performed tests. Bug detection and diagnosis are based on test outcome (oracle) analysis. Most practical test reports do not provide sufficient information for localizing and correcting bugs. We have found the need to extend the space of test result observation in data and time perspectives. This resulted in trac...
In many embedded systems, we face the problem of correlating signals characterising device operation (e.g., performance parameters, anomalies) with events describing internal device activities. This leads to the investigation of two types of data: time series, representing signal periodic samples in a background of noise, and sporadic event logs. T...
Performance monitoring and anomaly detection are major issues in designing and maintaining electronic devices and systems. In recent years, they have become more difficult due to the increased complexity of hardware and software. Hence, an important point is to collect representative signal samples and reveal characteristic features allowing to evaluate...
Wireless network nodes working in industrial environments or mounted on mobile machines are exposed to frequent and unpredictable temperature changes, which impact local clock accuracy and result in synchronization problems (time offset). These issues have to be taken into account when optimizing the network to avoid higher energy consumption and tra...
Context
Tracing reports for software repositories have attracted many researchers. Most of them have focused on defect analysis and development processes in relation to open source programs. There exists a gap between open source and industrial software projects, which, in particular, relates to different schemes for creating software repositories...
The paper discusses problems related to testing software applications in a mobile system environment. We present the possibilities and limitations of available test support tools and propose three original programs improving this process. In the experimental studies we analyze test coverage for a set of representative open source projects. This research...
Event logs provide the capability to gain insight into system operation under the real workload. They have been widely used to detect anomalies, evaluate dependability (including security), analyze service time, etc. The paper outlines log analysis schemes described in the literature and presents a new approach which takes into account specificity o...
Energy consumption has become a dominant issue for wireless Internet of Things (IoT) networks with battery-powered nodes. The prevailing mechanism for reducing energy consumption is duty-cycling. In this technique the node sleeps most of the time and wakes up only at selected moments to extend the lifespan of nodes up to 5–10 years. Unfortunate...
Operational/functional problems in computer systems can be identified by monitoring and exploring performance metrics. These metrics can also be used to evaluate system activity profiles and manage relevant infrastructure (hardware and software). The critical point is finding features that make it possible to distinguish normal from abnormal system...
The fast-growing market related to the Internet of Things creates new demands for wireless sensor networks. Here we face the problem of optimizing energy consumption in many miniature electronic devices powered from local batteries, with a required node lifespan of up to 5-10 years. The transmission processes constitute the primary source of energy con...
Agile software development and management based on the Scrum model is gaining popularity. It may differ significantly from classical development schemes widely discussed in the literature. An interesting issue is to evaluate project progress and the effectiveness of task handling in Scrum-based development. This process can be supported with an advan...
In software development and testing an interesting issue is checking correlations of observed software defects with various product and process metrics. Such analysis is helpful in predicting potential defects and optimizing testing processes. In the paper we present results of deeper studies in this area; they involve many metrics and various...
The paper deals with the problem of computer performance evaluation based on program benchmarks. We outline the drawbacks of existing techniques and propose a significant enhancement using available performance counters. This new approach required the development of supplementary tools to control benchmarking processes and combine them with various me...
Computer systems generate large amounts of event logs related to various operational aspects (positive and negative). Extracting useful information from them (e.g. targeted at dependability and resilience issues) is a challenging problem widely discussed in the literature and still needing deeper studies. We have developed a new holistic approach u...
In the paper we discuss the problem of evaluating disc performance with benchmarks. In particular, we concentrate on assessing benchmark properties. For this purpose we have developed a benchmark management platform which allows us to enhance the benchmark execution process with monitoring performance counters. The developed methodology and tool do not...
Context
Although many papers have been published on software development and maintenance processes, there is still a need for deeper exploration of software repositories related to real projects to evaluate these processes.
Objective
The aim of this study is to present and evaluate different schemes of handling problems (bugs) during software deve...
This paper presents software methods of improving fault tolerance in embedded systems. These methods have been adapted to a telemetry system dedicated to tracking vehicles for logistics purposes. The developed telemetry system allows us to monitor vehicle position and some technical parameters via GSM communication. It comprises the capability of r...
Various event and performance logs are available in computer systems. They are considered a useful data source characterizing system operational profiles and possible anomalies. In the literature, different classes of logs are usually analyzed separately and targeted at some specific problems. We have developed analysis schemes and tools which faci...
The development and maintenance of software is a complex process which needs systematic monitoring of appearing problems (bugs) and appropriate actions to resolve them. These processes are supported by various tools which may create repositories comprising a lot of data. An important issue is to explore these data and derive information useful to e...
The paper deals with the problem of monitoring software development and maintenance processes. In particular, we concentrate on data reported in software bug repositories. These data characterize the progress and effectiveness of the above mentioned processes. To analyze these data we have developed a special program which extracts information from...
The paper deals with the problem of monitoring reliability issues in embedded systems. In particular, we concentrate on data reported during development, testing, operation in the field and service. For this purpose a special tool has been developed which provides the capability to collect data on-line and perform various analyses. The usefulness o...
Contemporary computer systems provide mechanisms for monitoring various performance parameters (e.g. processor or memory usage, disc or network transfers), which are collected and stored in performance logs. An important issue is to derive characteristic features describing normal and abnormal behavior of the systems. For this purpose we use variou...
This is a short paper (work in progress) dealing with monitoring the operation of embedded systems in the field. It covers our experience with a satellite and some industrial controllers.
The paper deals with the problem of analyzing the performability of creating a cloud environment within a server, taking into account various hardware and software configurations. Within the hardware we consider the impact of using SATA and SSD discs for holding data and program codes of the involved processes. Within software we checked single and multip...
This is a handbook devoted to system dependability issues such as: testing and diagnostics of digital systems and circuits (boundary scan test, BIST, deterministic and pseudorandom testing, memory and CPU testing, diagnostic models), testing and reliability of software (SRGM models), error detection and fault tolerance (massive and partial redundanc...
In the paper we present our experience with development and maintenance of complex software systems. In particular, we concentrate on monitoring related development, testing and debugging processes. We have analyzed the contents of collected reports (provided by different tools) covering many projects and defined several metrics and statistics help...
Many systems implemented in FPGAs are based on embedded processor cores (so called soft cores). Testing such systems is a challenging task due to possible faults in functional blocks, configuration memory and relevant circuitry. The paper deals with software-based self-test schemes taking into account an important requirement on test memory and tim...
Context
Although many papers have been published on software development and defect prediction techniques, problem reports in real projects quite often differ from those described in the literature. Hence, there is still a need for deeper exploration of case studies from industry.
Objective
The aim of this study is to present the impact of fine-gr...
Due to specific conditions for electronic equipment in satellites and high launching costs, dependability issues of satellite subsystems are of great importance. This paper presents the PPLD-PSU subsystem designed for the Polish payload of the BRITE-PL Hevelius microsatellite. Developing software for this system we have assured some dependability requirements r...
In the paper we discuss how a single node communication interface failure in a time-triggered system can be used to model a DoS-type attack. Moreover, we present a design approach based on active detection of common DoS characteristics, which can serve as a template for attack detection. This approach is feasible in time-triggered systems because of...
The paper presents the extent of fault effects in FPGA based systems and concentrates on transient faults (induced by single event upsets – SEUs) within the configuration memory of FPGA. An original method of detailed analysis of fault effect propagation is presented. It is targeted at microprocessor based FPGA systems using the developed fault inj...
Recently, much attention has been paid to assessing system operation by collecting and analyzing various system or application features. This facilitates evaluating quality of services, detecting or even predicting problems, etc. The paper presents our experience with monitoring data repository operation. It is based on collected system/application event and per...
This paper presents the methodology of monitoring software testing and debugging processes during system development and usage. We concentrate on control metrics related to these problems and consider two development models related to practical projects. Based on the collected data we show the usefulness of the presented approach to control softwa...
Developments of technologies play an important role in the success and competitiveness of firms. This can be assured by means of innovation, which improves the quality of products and services, expands the markets and attracts customers. This paper outlines factors having direct impact on enterprise innovativeness. Referencing to classical approaches we sh...
The paper deals with the problem of creating a knowledge data base on system dependability and resilience created on the basis of available system and application logs. Special tools to collect and analyse this data from many systems have been developed. Taking into account a wide spectrum of various logs we explore them locally and globally. This...
During system exploitation and maintenance an important issue is to evaluate its operational profile and detect occurring anomalies or situations which may lead to such anomalies (anomaly prediction issue). To resolve these problems we have studied the capabilities of standard event and performance logs which are available in computer systems. In p...
In many applications based on embedded systems we have the problem of limited access for servicing. During the exploitation of such systems, various errors can appear in hardware or software. Many of these errors can be eliminated (e.g. single event upsets), avoided or repaired (e.g. software bugs) by reprogramming the system part...
The paper deals with problems related to run-time system monitoring based on event and performance logs. Specially developed tools and experimental results are presented. Full text is available at http://isim.wzim.sggw.pl/resources/ISIM_XVI_2012.pdf
This paper deals with the problem of analyzing application event logs in relevance to dependability evaluation. We present the significance of application logs as a valuable source of information on operational profiles, anomalies and errors. They can enhance classical approaches based on monitoring system logs and performance variables. Keywords;...
This paper presents the methodology of monitoring a mail server in order to assess its dependability and detect various anomalies. It is based on collecting and analysing various events stored in system logs and continuous monitoring of system resource usage. A special program has been developed and practically verified to deal with these problems...
The paper deals with the problem of improving dependability in industrial embedded systems. This problem is considered in relevance to the developed gas flow computer. It is implemented around ARM microcontroller which performs complex measurements and calculations of gas flow with embedded software based self-test mechanisms (SBST) assuring fault...
Resolving complex problems on cluster systems we have to take into account threats related to system dependability. We faced this problem in relevance to the developed project of the KASKADA platform targeted at managing heavy multimedia processing in a supercomputer environment (Galera cluster in Intel technology). Having analyzed the experience o...
Developing software for mobile platforms we face the problem of dealing with various erroneous situations, transient faults, and component incompatibilities which influence their operations. This results in the need to embed error detection mechanisms and software procedures for handling them. This problem has been appreciated by Samsung. As the conseq...
In many microcontroller applications the impact of transient faults (electromagnetic disturbances, cosmic radiation, etc.) on their operation has to be taken into account. The paper presents a new methodology of testing transient fault robustness in microcontrollers. It is based on the developed fault injection platform which is coupled to the test...
High complexity and dynamicity of computer systems result in various anomalies related to failures and usage problems. Detection and diagnosis of anomalies is still a challenging practical problem. The paper presents a project on a multidimensional and wide range monitoring of a big population of heterogeneous systems.
The paper presents some experience with developing a satellite power controller. Due to high probability of transient faults (in particular caused by cosmic radiation) we had to check their impact on the controller operation. For this purpose we have developed a special test bed. It comprises some universal fault simulator (FITS), a software model...
The paper presents a new technique of simulating faults in embedded systems via JTAG interface. Experimental results are presented for ARM architecture.
The paper deals with the problem of analysing event logs in Unix systems.
In the paper, we discuss an original methodology of dependability evaluation dedicated for safety-critical embedded systems. It is based on a fault simulation technique known as Software Implemented Fault Injection (SWIFI). This methodology combines functional and structural models to achieve higher modeling accuracy than existing approaches. The m...
The paper presents the dependability comparison of software implementation of the explicit and numerical Generalized Predictive Control (GPC) Model Predictive Control (MPC) algorithms. The investigated GPC algorithms are implemented for a control system of a multivariable chemical reactor – a process with strong cross-couplings. The fault sensitivi...
The paper presents our experience with developing tests for microcontroller based embedded systems. We use application specific tests. They are integrated with the implemented application (program) and available on-line error detection mechanisms. The effectiveness of this approach has been analyzed in simulation experiments and referenced to some...
The paper deals with the problem of evaluating fault robustness of the software implemented Dynamic Matrix Control (DMC) Model Predictive Control (MPC) algorithms. Numerical and explicit implementations of the DMC algorithms are considered. It is shown that faults affecting the algorithms can provoke undesirable behaviour or even destabilize the pr...
The paper presents a new software library supporting the development of fault-robust applications. The main goals of the proposed software hardening mechanisms are: usage simplicity for the programmer, independence from the development tool, effectiveness in terms of fault coverage, and low static and dynamic overheads. The paper describes implemente...
The paper deals with the problem of adapting software implemented fault injection technique (SWIFI) to evaluate dependability of reactive microcontroller systems. We present an original methodology of disturbing controller operation and analyzing fault effects taking into account reactions of the controlled object and the impact of the system envir...
The paper deals with the problem of evaluating system dependability based on event logs and performance counters. For this purpose some special tools have been developed which also include some data exploration capabilities. The usefulness of these tools has been illustrated in relevance to some real monitored systems. Full copy of this paper is a...
The paper deals with the problem of testing CPUs in embedded systems taking into account application properties. Based on the developed original software tools we have analysed the coverage of CPU functionality and operational stresses for many benchmark programs. The experimental results confirmed the need of introducing application driven testin...
Methods of fault-hardening software implementations of the numerical Model Predictive Control (MPC) algorithms are discussed in the paper. The fault sensitivity of the non-fault-hardened algorithm implementations and the effectiveness of the fault hardening procedures are verified in experiments with a software implemented fault injector. These e...
This paper presents a new approach to detecting abnormal situations in computer systems based on integrated monitoring, which covers performance parameters as well as other operational reports or logs. It takes into account different time perspectives and analysis goals. The presented methodology was implemented in Windows based systems. Its us...
The paper deals with the problem of developing built-in self-tests for microprocessors. We consider available hardware mechanisms improving testability and software-based self-testing. These approaches are supplemented with application-based testing and an illustration of testing a car immobiliser.
The paper describes the problem of evaluating dependability of computer systems with on-line monitoring mechanisms. The main features of possible measurements (related to system operation) and the scope of collected data are shortly outlined. On the basis of this survey we formulate problems of selecting and processing the collected data in relevan...
Testing peripheral interfaces of computers is a challenging problem given their complexity and multilayer structure. This paper presents our experience with a new systematic approach to generating tests for interfaces and checking testability features. It was verified for real interfaces.
In real-time safety-critical systems, it is important to predict the impact of faults on their operation. For this purpose we have developed a test bed based on software implemented fault injection (SWIFI). Faults are simulated by disturbing the states of registers and memory cells. Analyzing reactive and embedded systems with SWIFI tools is a new...
The paper presents an approach to improve the dependability of software implementation of the explicit DMC (Dynamic Matrix Control) Model Predictive Control (MPC) algorithm. The investigated DMC algorithm is implemented for a control system of a rectification column - a process with strong cross-couplings and significant time delays. The control pl...
The paper addresses the problem of creating a comprehensive fault injection environment, which integrates and improves various simulation and supplementary functions. This is illustrated with experimental results.
Testing cache memories within the computer system environment is based on using processor instructions, which involve cache operations intermixed with RAM memory accesses. Applying test patterns to the cache and checking its behavior needs sophisticated instruction sequences. We simplify these sequences by means of the available on-chip performance...
Selecting and designing bus standards for safety-critical applications requires careful analysis of error-detection and fault-handling mechanisms. This analysis must be based on the revision of standard specifications and an experimental evaluation that covers representative fault classes. The traditional approach to evaluating a bus design measure...
The paper studies dependability of software implementation of the explicit DMC (Dynamic Matrix Control) Model Predictive Control (MPC) algorithm applied for a rectification column. The process with two inputs and two outputs with strong cross-couplings and significant time delays is studied. The algorithm's control law is calculated off-line. Depe...
Text of a talk given at the session dedicated to Prof. Zdzisław Pawlak, Zakopane 2006.
Software implemented fault injection technique is gaining much interest in evaluating system dependability. For complex software applications fault injection experiments take a lot of time. In the paper we present an innovative approach to fault injection by performing it in LAN distributed environment. The paper presents the system architecture,...
The paper deals with the problem of handling detected faults in computer systems. We present software procedures targeted at fault detection, fault masking and error recovery. They are discussed in the context of standard PC Windows and Linux environments. Various aspects of checkpointing and recovery policies are studied. The presented considerati...
The paper presents a study of generating test sequences for the SCSI (small computer systems interface) interface. For this purpose a special conformance test generator has been developed. The paper describes the conformance testing algorithm and the conditions that are assumed for the tests. The test generation method is based on UIO (unique input out...
This paper discusses some dependability problems (e.g. message delays) related to clock drifts in the processing nodes of real-time networks based on event and time triggered interfaces. We show that having detected the clock drift, it is reasonable to delay the shutdown of the network node whose clock synchronization algorithm is not able to track...
The paper deals with the problem of creating a specialized data warehouse for collecting and analyzing experimental results, which relate to system dependability evaluation using fault injections into running programs. The developed data warehouse with embedded data mining capabilities facilitates identifying factors influencing fault susceptibilit...
The paper deals with the problem of evaluating system dependability using software implemented fault injectors (SWIFIs). In particular we describe methods of improving functionality and performance in SWIFI injectors. We discuss problems related to experiment scheduling and simulation result interpretation. The presented considerations are based on our...
The paper studies dependability of software implementation of the explicit DMC (Dynamic Matrix Control) Model Predictive Control (MPC) algorithm applied for a rectification column. The process with two inputs and two outputs with strong cross-couplings and significant time delays is studied. The algorithm's control law is calculated off-line. Depen...
This paper studies dependability of software implementation of DMC (Dynamic Matrix Control) and GPC (Generalised Predictive Control) Model Predictive Control (MPC) algorithms. Explicit formulation of algorithms is considered in which the control laws are calculated off-line. Dependability is evaluated using software implemented fault injection app...