Krzysztof J. Stencel

University of Warsaw | UW · Institute of Informatics

Prof.

About

101
Publications
27,700
Reads
536
Citations
Additional affiliations
February 2008 - present
Nicolaus Copernicus University
Position
  • Professor (Associate)
October 1995 - present
Polish-Japanese Academy of Information Technology
Position
  • Professor (Associate)
November 1993 - present
University of Warsaw
Position
  • Professor (Associate)

Publications (101)
Article
Full-text available
Upon receiving a new bug report, developers need to find its cause in the source code. Bug localization can be helped by a tool that ranks all source files according to how likely they include the bug. This problem was thoroughly examined by numerous scientists. We introduce a novel adaptive bug localization algorithm. The concept behind it is base...
Conference Paper
Full-text available
Nowadays, companies must inevitably analyze the available data and extract meaningful knowledge. As an essential prerequisite, Extract-Transform-Load (ETL) requires significant effort, especially for Big Data. The existing solutions fail to formalize, integrate and evaluate the ETL process for Big Data in a scalable and cost-effective way. In this...
Article
Full-text available
Code reviews consist in proof-reading proposed code changes in order to find their shortcomings such as bugs, insufficient test coverage or misused design patterns. Code reviews are conducted before merging submitted changes into the main development branch. The selection of suitable reviewers is crucial to obtaining high-quality reviews. In th...
Conference Paper
Full-text available
Analytic queries can exhaust resources of the DBMS at hand. Since the nature of such queries can be foreseen, a database administrator can prepare the DBMS so that it serves such queries efficiently. Materialization of partial results (aggregates) is perhaps the most important method to reduce the resource consumption of such queries. The number of...
Conference Paper
Full-text available
In this article we present a new algorithm for creating simplicial Vietoris-Rips complexes that is easily parallelizable using computation models like MapReduce and Apache Spark. The algorithm does not involve any computation in homology spaces.
Conference Paper
Full-text available
Medical research initiatives more and more often involve processing considerable amounts of data that may evolve during the project. These data should be preserved and aggregated for the purpose of future analyses beyond the lifetime of a given research project. This paper discusses the challenges concerned with the construction of the storage mana...
Conference Paper
Full-text available
Although querying hierarchies and networks is one of the common tasks in numerous business applications, the SQL standard did not acquire appropriate features until its 1999 edition. Furthermore, neither relational algebra nor calculus offers them. Since the announcement of the abovementioned standard, various database vendors have introduced SQL:1999 recurs...
Conference Paper
Full-text available
Fortunately, the industry has eventually abandoned the old “one-size fits all” relational dream and started to develop task-oriented storage solutions. Nowadays, in a big project a devotion to a single persistence mechanism usually leads to suboptimal architectures. A combination of appropriate storage engines is often the best solution. However, s...
Conference Paper
Full-text available
Code reviews constitute an important activity in software quality assurance. Although they are essentially based on human expertise and scrupulosity, they can also be supported by automated tools. In this paper we present such a solution integrated with code review tools. It is based on an SVM classifier that indicates potentially buggy changes. We...
Conference Paper
Full-text available
Functional dependencies have been used in query optimisation for decades. Moreover, if two domains have a natural ordering of their elements, a functional dependency of them can potentially preserve these orderings, i.e. be a monotonic function. This monotonicity can be exploited by query optimizers. Recently, such monotonic functional dependenci...
Conference Paper
Full-text available
GitHub is one of the most popular repository sites. It is a place where contributors come together to share code, ideas, thoughts and report issues. By using topic modelling applied to comments we are able to mine plenty of interesting information. Three aspects of an open source project mostly attracted our attention: the existence of a “Core Team...
Article
Full-text available
Development and maintenance of understandable and modifiable software is very challenging. Good system design and implementation requires strict discipline. The architecture of a project can sometimes be exceptionally difficult to grasp by developers. A project's documentation gets outdated in a matter of days. These problems can be addressed using...
Conference Paper
Full-text available
Modern approaches to data analysis often require an intense integration of data from multiple data sources. The gap between utilized data models and schemata of pulled data requires a significant effort to unify and deliver a clean view of an integrated data grid. This paper includes a discussion of a data model that challenges the most severe probl...
Conference Paper
Full-text available
Database management systems use numerous optimization techniques to accelerate complex analytical queries. Such queries have to scan enormous amounts of records. The usual technique to reduce their run-time is the materialization of partial aggregates of base data. In previous papers we have proposed the concept of metagranules, i.e. partially orde...
Article
Full-text available
Three new simple O(n log n) time algorithms related to repeating factors are presented in the paper. The first two algorithms employ only a basic textual data structure called the Dictionary of Basic Factors. Despite their simplicity these algorithms not only detect existence of powers (in particular, squares) in a string but also find all pr...
Article
Full-text available
Unified State Model (USM) is a single data model that allows conveying objects of major programming languages and databases. USM exploits and emphasizes common properties of their data models. USM is equipped with mappings from these data models onto it. With USM at hand, we have faced the next natural research question whether numerous query languag...
Article
Full-text available
Graphics Processing Units (GPU) have significantly more applications than just rendering images. They are also used in general-purpose computing to solve problems that can benefit from massive parallel processing. However, there are tasks that either hardly suit GPU or fit GPU only partially. The latter class is the focus of this paper. We elaborat...
Article
Full-text available
Analytic database queries are exceptionally time consuming. Decision support systems employ various execution techniques in order to accelerate such queries and reduce their resource consumption. Probably the most important of them consists in materialization of partial results. However, any introduction of derived objects into the database schema...
Article
Full-text available
Recursive data structures are often used in business applications. They store data on e.g. corporate hierarchies, product categories and bill-of-material. Therefore, recursive queries as introduced by SQL:1999 or formerly implemented by Oracle constitute a useful facility for application programmers. Unfortunately, recursive queries are not impleme...
Conference Paper
Full-text available
Tree-shaped data often occur in business applications, e.g. a corporate hierarchy or a categorization of products. A natural class of analytic queries posed to such data consists of aggregate queries over subtrees. Evaluation of such queries in large data sets requires a significant amount of time. In this paper we focus on dedicated data structures...
Conference Paper
Full-text available
By an architecture of a software system we mean the fundamental organization of the system embodied in its components, their relationships to one another and to the system's environment. It also encompasses principles governing the system's design and evolution. Architectures of complex systems are obviously complex as well. The goal of our researc...
Article
Full-text available
Spreadsheets are among the most commonly used applications for data management and analysis. Perhaps they are even among the most widely used computer applications of all kinds. However, the spreadsheet paradigm of computation still lacks sufficient analysis. In this article we demonstrate that a spreadsheet can play the role of a relational databa...
Article
Full-text available
The software architecture is typically defined as the fundamental organization of the system embodied in its components, their relationships to one another and to the system's environment. It also encompasses principles governing the system's design and evolution. In order to manage the architecture of a large software system the architect needs a h...
Conference Paper
Full-text available
Persistent data of most business applications contain recursive data structures, i.e. hierarchies and networks. Processing such data stored in relational databases is not straightforward, since the relational algebra and calculus do not provide adequate facilities. Therefore, it is not surprising that initial SQL standards do not contain recursion...
Conference Paper
Full-text available
Modern software systems are inherently complex. Their maintenance is hardly possible without precise up-to-date documentation. It is often tricky to document dependencies among software components by only looking at the raw source code. We address these issues by researching new software analysis and visualization tools. In this paper we focus on s...
Conference Paper
Full-text available
Design patterns codify general solutions to frequently encountered design problems. They also facilitate writing robust and readable code. Their usage happens to be particularly profitable if the documentation of the resulting system is lost, inaccurate or out of date. In reverse engineering, detection of instances of design patterns is extremely h...
Article
Full-text available
Nowadays, even small systems contain numerous components with complex dependencies. These components differ in importance and quality. Some of them get deprecated over time and can be removed from the project. This leads to the question relevant for all architects: which parts of my source code are still alive? To answer this question we harness th...
Article
Full-text available
Graphics Processing Units (GPU) have significantly more applications than just rendering images. They are also used in general-purpose computing to solve problems that can benefit from massive parallel processing. However, there are tasks that either hardly suit GPU or fit GPU only partially. The latter class is the focus of this paper. We elaborate...
Article
Full-text available
We propose a new optimization method for object-oriented queries. The method enables pushing selection conditions before structure constructors, joins and quantifiers. A less general variant of this method is known from relational systems and SQL as pushing a selection before a join. If a query involves a selection whose predicates refer to proper...
Article
Full-text available
The mismatch between relational databases and object-oriented programming languages has been significantly mitigated by the use of object-relational mapping. However, the querying facilities available in such mapping systems are still inferior when compared to the features of a fully-fledged relational DBMS. In our research we aim at enriching obje...
Conference Paper
Full-text available
In the modern world, data storage is based on databases. The enormous diversity of approaches and techniques, caused by the wide field of database applications, brings every system manager to the point where integration is required. In this paper we want to address the problems of heterogeneous data integration. We discuss a solution that can unify t...
Article
Full-text available
In the era of Web 2.0 and the apparent dawn of Web 3.0, web pages are dynamic and personalized. As a result, the load of web servers rapidly increases. Moreover, the upcoming load boost is impossible to predict. Although deceptively funny, the term success-tolerant architectures has been coined. A number of web services actually failed becaus...
Article
Full-text available
In this paper we discuss the misunderstandings that have arisen over the years around the broadly defined term of the object-relational impedance mismatch. It occurs in various aspects of database application programming. There are three concerns judged the most important: mismatching data models, mismatching binding times and mismatching object lif...
Conference Paper
Full-text available
For a software-intensive system, software quality measures how well the software is designed and how well the software conforms to that design, whereas architecture of a software system is typically defined as the fundamental organization of the system embodied in its components, their relationships to each other and the environment, and the princi...
Chapter
Full-text available
We present the relational database schema aimed at efficient storage and querying of parsed scientific articles, as well as entities corresponding to researchers, institutions, scientific areas, et cetera. An important requirement for the proposed model is to operate with various types of entities, but with no increase of schema’s complexity....
Article
Full-text available
In the prequel paper we introduced the Unified State Model (USM), i.e. a single model that allowed conveying objects of popular programming languages and databases. That model exploited and emphasized common properties of all these objects. We showed mappings between those popular data models and USM. Our natural next research goal is the Unified Q...
Article
Full-text available
Hierarchical and graph data structures are common in practical application development. In order to query such data, one can use SQL:1999 recursive queries based on Common Table Expressions. Nowadays, numerous relational database management systems implement them. However, some popular systems, e.g. MySQL, still lack this useful feature. In this pa...
Article
Full-text available
For a software-intensive system, software architecture is typically defined as the fundamental organization of the system embodied in its components, their relationships to one another and to the system's environment, and the principles governing the system's design and evolution. In this paper we propose a unified approach to the problem of managi...
Conference Paper
Full-text available
The following paper presents the effects of combining two technologies: object-relational mapping and SQL’s recursive queries. Both technologies are widely used in modern software, and yet, modern ORM systems still lack the support for recursive database querying. The currently used methods for querying graph and hierarchical structures are either...
Conference Paper
Full-text available
The architecture of a software system is typically defined as the organization of the system, the relationships among its components and the principles governing their design. By including artifacts corresponding to software engineering processes, the definition gets naturally extended into the architecture of a software system and process. In this...
Article
Full-text available
For a software system, its architecture is typically defined as the fundamental organization of the system incorporated by its components, their relationships to one another and their environment, and the principles governing their design. If contributed to by the artifacts corresponding to engineering processes that govern the system's evolution, t...
Conference Paper
Full-text available
Data processing often results in the generation of a lot of temporary structures. They cause an increase in processing time and resource consumption. This especially concerns databases since their temporary data are huge and often they must be dumped to secondary storage. This situation has a serious impact on the query engine. An interesting techniqu...
Conference Paper
Full-text available
Indexing virtually integrated distributed, heterogeneous and defragmented resources is a serious challenge that has so far not even been considered in the database literature. However, it is difficult to imagine that very large integrated resources (millions or billions of objects) can be processed without indexes. This paper presents the pioneering app...
Conference Paper
Full-text available
Since databases became bottlenecks of modern web applications, several techniques of caching data have been proposed. Caching data helps to resolve the problem of database scalability; however, it introduces the additional problem of consistency maintenance. We are going to develop an existing caching model for automatic consistency maintenance...
Conference Paper
Full-text available
Since databases became bottlenecks of modern web applications, several techniques of caching data have been proposed. This paper expands the existing caching model for automatic consistency maintenance of the cached data and data stored in a database. We propose a dependency graph which provides a mapper between update statements in a relational da...
Conference Paper
Full-text available
Precise modelling of behaviour is an area where programming meets modelling, and textual syntax competes with a visual one. By developing a UML based platform-independent framework, we aimed to find a visual syntax aid to make the language more approachable to stakeholders, while taking advantage of existing UML syntax intuitions and offering a tru...
Conference Paper
Full-text available
We show a novel execution method of queries over structural data. We present the idea in detail on SBQL (a.k.a. AOQL)—a powerful language with clean semantics. SBQL stands for the Stack-Based Query Language. The stack used in its name and semantics is a heavy and centralised structure which makes parallel and stream processing unfeasible. We propos...
Conference Paper
Full-text available
The relational model is the basis for most modern databases, while SQL is the most commonly used query language. However, there are data structures and computational problems that cannot be expressed using SQL-92 queries. Among them are those concerned with the bill-of-material and corporate hierarchies. A newer standard, called SQL-99, introdu...
Conference Paper
Full-text available
When a query jointly addresses very large and very small collections it may happen that an iteration caused by a query operator is driven by a large collection and in each cycle it evaluates a subquery that depends on an element of a small collection. For each such element the result returned by the subquery is the same. In effect, such a subquery...
Conference Paper
Full-text available
Detection of design patterns is an important part of reverse engineering. Availability of patterns provides for a better understanding of code and also makes analysis more efficient in terms of time and cost. In recent years, we have observed a continual improvement in the field of automatic detection of design patterns in source code. Existing app...
Conference Paper
Full-text available
D-CUBED is a semantic code query system for Java. Its focus is on capturing the semantics of an analyzed program. It provides rich support to investigate the call flow and data flow of a program by using static analysis techniques with the custom model of symbolic instances. The usage scenarios of D-CUBED include: (1) detection of design patterns w...
Conference Paper
Full-text available
We present a study of different implementation variants of the Singleton pattern and propose an intuitive definition of this pattern expressed as a first-order logic formula. We also show that our method for automatically detecting design patterns can be used to detect instances of the Singleton with respect to this definition. We also provide data...
Conference Paper
Full-text available
Although the specification of OCL mentions “query language” as one of its possible applications, there have been rather few efforts in that direction. However, the problem becomes central when applying MDA to data intensive application modelling is considered. Recently added UML elements of Actions and Structured Activities make it possible to represent...