Jan Krzysztof Miziołek’s research while affiliated with University of Warsaw and other places


Publications (12)


Permutation Based XML Compression
  • Conference Paper

December 2015

·

55 Reads

Lecture Notes in Business Information Processing

·

Jan Krzysztof Miziołek

·

Tyler Corbin

An XML document D often has a regular structure, i.e., it is composed of many similarly named and structured subtrees. Therefore, the entropy of a tree's structuredness should be relatively low, and such trees should be highly compressible once transformed into an intermediate form. This idea underlies permutation-based XML-conscious compressors. One such compressor is XSAQCT, whose compressible form is called an annotated tree. While XSAQCT has proved useful for various applications, it had never been shown to be a lossless compressor. This paper provides the formal background for the definition of an annotated tree and a formal proof that the compression is lossless. It also shows properties of annotated trees that are useful for various applications and discusses a measure of compressibility under this approach, followed by experimental results showing the compressibility of annotated trees.
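The idea of merging similarly named subtrees can be illustrated with a small sketch. It is a simplified approximation, not the paper's formal definition of an annotated tree: it assumes that each distinct tag path becomes one node, annotated with one child count per occurrence of its parent. The function name `build_annotated_tree` and the dictionary layout are hypothetical, and the sketch ignores text, attributes, and the ordering of interleaved siblings, so by itself it is not lossless.

```python
import xml.etree.ElementTree as ET


def build_annotated_tree(xml_text):
    """Map each distinct tag path to a list of counts, one per parent occurrence.

    Simplified illustration only: text, attributes and the relative order of
    differently named siblings are ignored.
    """
    root = ET.fromstring(xml_text)
    root_path = f"/{root.tag}"

    # Pass 1: collect, for every path, the distinct child tags seen anywhere.
    child_tags = {}

    def collect(elem, path):
        tags = child_tags.setdefault(path, [])
        for child in elem:
            if child.tag not in tags:
                tags.append(child.tag)
            collect(child, f"{path}/{child.tag}")

    collect(root, root_path)

    # Pass 2: for each parent occurrence (in document order), record how many
    # children of each known tag it has, writing 0 when a tag is absent.
    annotations = {root_path: [1]}

    def annotate(elem, path):
        for tag in child_tags[path]:
            count = sum(1 for child in elem if child.tag == tag)
            annotations.setdefault(f"{path}/{tag}", []).append(count)
        for child in elem:
            annotate(child, f"{path}/{child.tag}")

    annotate(root, root_path)
    return annotations


if __name__ == "__main__":
    doc = "<a><b><c/><c/></b><b/><b><c/></b></a>"
    for path, counts in build_annotated_tree(doc).items():
        print(path, counts)
```

In this sketch the three `b` subtrees collapse into a single `/a/b` node, and the annotation of `/a/b/c` has one entry per `b` occurrence. Regular documents therefore yield few nodes with long, repetitive integer sequences, which is the kind of input that general-purpose compressors handle well.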


Succinct Role Based Access Control Policies for XML Documents

September 2014

·

37 Reads

·

Scott Durno

·

Jan Krzysztof Miziołek

·

[...]

·

sdiwc organization

The popularity of role-based access control (RBAC) policies within industry has generated considerable interest in the research community. Since XML has become a de facto standard for data representation, most RBAC policies are expressed in XML. Although XML documents can be very large, no succinct implementations for these policies exist. This paper describes a novel implementation (not previously proposed) for schema-less and streamed XML documents to provide authorized users with the results of queries on compressed documents. The designer of the policy does not need to be aware of any implementation details. Results of this research will be essential for industry, which could take advantage of efficient implementations of RBAC policies.


Networked XML Compression by Encoding Pre-Order Traversals

July 2014

·

25 Reads

Lecture Notes in Business Information Processing

The advantages of the eXtensible Markup Language, XML, come at a cost, especially for huge datasets or when used on small mobile devices. Several known XML-conscious compressors used in real-time environments compress data during data streaming. This paper presents a study of new real-time algorithms that exploit local structural redundancies of pre-order traversals of an XML tree. These algorithms focus on reducing the overhead of streaming data while maintaining load balancing between the sender and receiver. Our algorithms have similar or better performance than existing algorithms, while emphasizing low memory and processing overheads.


Parallelization of Permuting XML Compressors

May 2014

·

3 Reads

·

1 Citation

Lecture Notes in Computer Science

The verbose nature of XML results in overheads in storage and network transfers, which may be overcome by using parallel computing. This paper presents four permuting parallel XML compressors, based on an existing XML compressor, called XSAQCT. Tests were performed on multi-core machines using a test suite incorporating XML documents with various characteristics, and results were analyzed to find upper bounds given by Amdahl’s law, the actual speedup, and compression ratios.
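The upper bound mentioned above follows from the standard statement of Amdahl's law: if a fraction p of the work can be parallelized over n workers, the speedup is at most 1 / ((1 - p) + p / n). The short sketch below evaluates that bound; the function names and the sample fractions are illustrative assumptions and are not taken from the paper's measurements.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `parallel_fraction` of the work scales
    perfectly across `workers` and the remainder stays serial."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def fraction_of_bound(actual_speedup: float, parallel_fraction: float, workers: int) -> float:
    """How close a measured speedup comes to the Amdahl bound (1.0 = at the bound)."""
    return actual_speedup / amdahl_speedup(parallel_fraction, workers)


if __name__ == "__main__":
    # Hypothetical values: 80% of the compressor's work parallelizes.
    for cores in (2, 4, 8):
        print(cores, round(amdahl_speedup(0.8, cores), 3))
```

Comparing a measured speedup against this bound, as `fraction_of_bound` does, separates losses due to the serial part of a compressor from losses due to overheads such as thread coordination.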


Column-oriented Database Systems and XML Compression

January 2014

·

10 Reads

·

2 Citations

The verbose nature of XML requires data compression, which makes it more difficult to efficiently implement querying. At the same time, the renewed industrial and academic interest in Column-Oriented DBMS (column-stores) resulted in improved efficiency of queries in these DBMS. Nevertheless, there has been no research on relations between XML compression and column-stores. This paper describes an existing XML compressor and shows the inherent similarities between its compression technique and column-stores. Efficiency of compression is tested using specially designed benchmark data.


Figure 1: (a) XML document D; (b) the annotated tree T_{A,D} representing D.
Figure 2: Overview of the first version.
Table 2: Analysis of results using various Java versions.
Figure 3: Comparison of Java and OpenMP.

Parallelization of an XML Data Compressor on Multi-cores
  • Conference Paper
  • Full-text available

October 2011

·

139 Reads

·

1 Citation

Lecture Notes in Computer Science

Because of a growing interest in using XML for massive complex data, there has been considerable research on designing XML compressors. This paper presents our research aimed at building parallel XML compressors, using Java and OpenMP (with C++). Our findings show that OpenMP is the preferred choice, achieving better results than Java on a multi-core platform.



Figure 3.1. System overview. 
Figure 4.1. Testing scalability of Algorithms 4.1 and 4.2. 
Figure 4.2. Testing scalability in the presence of a growing number of parameters. 
Parameterized Role-Based Access Control Policies for XML Documents

December 2009

·

122 Reads

·

9 Citations

Information Security Journal: A Global Perspective

Role-based access control (RBAC) policies are often used to provide access to fragments of static XML documents. Existing implementations of such policies often disseminate a single document encrypted with multiple cryptographic keys. However, most existing approaches are subject to role proliferation, especially in large organizations where the number of defined roles may reach several hundred. In such circumstances, correctly administering access control becomes much more difficult and error-prone. In this article, we present a novel approach to RBAC that supports role parameterization to mitigate the potential for role proliferation. Our approach supports the association of user-specific and/or session-specific credentials (i.e., parameters) with roles. We first define parameterized RBAC (PRBAC), and then provide an algorithm for generating the minimal set of keys required to enforce a particular parameterized policy. We present another algorithm for efficiently encrypting an XML document in a single pass, using a technique that disguises the original structure of hidden subtrees. Finally, we include a key distribution algorithm that ensures each user receives only those keys that are needed for decrypting accessible fragments of the document. We analyze the complexity of our implementation and provide experiments to demonstrate its scalability.


Schema‐level access control policies for XML documents

November 2009

·

55 Reads

International Journal of Web Information Systems

Purpose: The purpose of this paper is to consider the secure publishing of XML documents, where a single copy of an XML document is disseminated and a stated role‐based access control policy (RBACP) is enforced via selective encryption. It describes a more efficient solution than previously proposed approaches, in which both policy specification and key generation are performed once, at the schema‐level. In lieu of the commonly used super‐encryption technique, in which nodes residing in the intersection of multiple roles are encrypted with multiple keys, it describes a new approach called multi‐encryption that guarantees each node is encrypted at most once.
Design/methodology/approach: This paper describes two alternative algorithms for key generation and single‐pass algorithms for multi‐encrypting and decrypting a document. The solution typically results in a smaller number of keys being distributed to each user.
Findings: The paper proves the correctness of the presented algorithms, and provides experimental results indicating the superiority of multi‐encryption over super‐encryption, in terms of encryption and decryption time requirements. It also demonstrates the scalability of the approach as the size of the input document and complexity of the schema‐level RBACP are increased.
Research limitations/implications: An extension of this work involves designing and implementing re‐usability of keyrings when a schema or ACP is modified. In addition, more flexible solutions for handling cycles in schema graphs are possible. The current solution encounters difficulty when schema graphs are particularly deep and broad.
Practical implications: The experimental results indicate that the proposed approach is scalable, and is applicable to scenarios in which XML documents conforming to a common schema are to be securely published.
Originality/value: This paper contributes to the efficient implementation of secure XML publication systems.


XSAQCT: XML Queryable Compressor

August 2009

·

116 Reads

·

9 Citations

XML (Extensible Markup Language) is a meta-language (developed by the World Wide Web Consortium, W3C, in 1996), which represents semi-structured data using markups. While the use of XML facilitates the interchange and access of data, its verbose nature tends to considerably increase the size of a data file. This increase in size limits applications of XML, in particular because of the time needed to store large data files and because of space constraints on mobile devices. Besides storing (possibly compressed) XML data, one is also interested in being able to query them in order to obtain specific information, such as the information pertaining to all patients who visited the emergency room of a specific hospital in the last year. There are two reasons for querying a compressed XML file: querying a compressed XML file is generally faster than completely decompressing it and then querying it, and portable devices may not have disk space available for a complete decompression of the XML file. There are many known XML-aware compressors, i.e. compressors that can take advantage of XML syntax. Some of these XML compressors are grammar-free; in other words, the information available to the compressor is limited to the XML document. Other XML compressors are grammar-based, i.e. the compressor is aware of the grammar for which the input document is valid. Grammar-based compressors may produce better results, in terms of both compression rate and time, than grammar-free compressors because they can take advantage of information available in the grammar, but in many applications the grammar is not known and so this approach is not always practical. In the case of the widely used Wratislavia corpus [Skibinski et al, 2007], out of seven XML documents, only two provide an XML Schema (enwikibooks and enwikinews), two reference a DTD (shakespeare and dblp), while the others use no schema.
Finally, even if an XML Schema is provided, it may define elements that never actually appear in the XML document to be compressed. In this paper, we describe a queryable, grammar-free XML compressor, called XSAQCT (pronounced exact). Our technique borrows from other XML compressors in that it separates the document structure from the text values and attribute values (collectively called data values), which make up the content of the document. What is new in our technique is that we first encode the document to succinctly store information about the input document. Next, we apply the appropriate back-end data compressors to the container that stores the document structure and to the containers storing the data values (the type of the data, derived from the containers, may be used to guide the choice of back-end compressors used for various containers). It is well known that, on average, the structure of the XML document represents between 10 and 20 percent of the size of the entire document, and the remaining 80 to 90 percent represents text and attribute values. Since the main focus of our work is on queryable compression, our encoding of the document structure supports lazy decompression, i.e. during the querying process of the compressed document, we decompress “as little as possible”. Well-known XML compressors differ in their use of container granularity; some compressors use a single container, while others tend to create many separate containers for related values. The former approach is based on the premise that standard data compressors achieve better results when they get large data sets, but it requires complete decompression in order to perform a query. On the other hand, the latter approach may suffer from poor compression ratios, but it requires the decompression of only a few (possibly just one) containers.
In our approach, we attempt to strike a balance between these two extremes, using containers that are large enough to be compressed effectively, while the container structure does not require a full decompression to answer a query. In addition, while our design supports lazy decompression, it is designed to support future extensions and performs operations directly on compressed data, without any decompression. In what follows, we provide a more detailed description of XSAQCT.
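The separation of structure from data values, and the effect of container granularity on lazy decompression, can be sketched as follows. This is a rough illustration under stated assumptions and not XSAQCT's actual encoding: it assumes one container per tag path (and per attribute path), uses `zlib` as a stand-in for the back-end compressor, and introduces the hypothetical helpers `split_into_containers`, `compress_containers` and `read_container`. It drops whitespace-only text and tail text, so it is not a lossless round trip.

```python
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict

SEPARATOR = "\x00"  # assumed never to occur inside a data value


def split_into_containers(xml_text):
    """Return (structure, containers): a pre-order list of tag events and a
    mapping from tag/attribute path to the data values found at that path."""
    root = ET.fromstring(xml_text)
    structure = []
    containers = defaultdict(list)

    def visit(elem, parent_path):
        path = f"{parent_path}/{elem.tag}"
        structure.append(f"<{elem.tag}")
        for name, value in elem.attrib.items():
            containers[f"{path}/@{name}"].append(value)
        text = (elem.text or "").strip()
        if text:
            containers[path].append(text)
        for child in elem:
            visit(child, path)
        structure.append(">")

    visit(root, "")
    return structure, dict(containers)


def compress_containers(structure, containers):
    """Compress the structure and every data container independently."""
    packed = {"#structure": zlib.compress(" ".join(structure).encode("utf-8"))}
    for path, values in containers.items():
        packed[path] = zlib.compress(SEPARATOR.join(values).encode("utf-8"))
    return packed


def read_container(packed, path):
    """Lazy access: decompress only the one container a query needs."""
    return zlib.decompress(packed[path]).decode("utf-8").split(SEPARATOR)


if __name__ == "__main__":
    doc = ("<patients><patient id='1'><name>Ann</name></patient>"
           "<patient id='2'><name>Bob</name></patient></patients>")
    structure, containers = split_into_containers(doc)
    packed = compress_containers(structure, containers)
    print(read_container(packed, "/patients/patient/name"))
```

A query for patient names touches only the `/patients/patient/name` container here, while a single-container design would have to decompress all data values first. Very fine-grained containers, on the other hand, give the back-end compressor little data to work with, which is the trade-off the abstract describes.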


Citations (5)


... Additionally, in the column-store environment, the contents of a single entity are often stored in many locations, which then requires additional logic to combine the attributes for joining and grouping (this is exactly how many permuting XML compressors work). Also, in [Fry, 2011] and [Corbin et al., 2013] XSAQCT was compared to many different XML-database engines using the Berkeley DB key-value database, and the results were promising. Figure 2 depicts the architectural layout of a modified XSAQCT. ...

Reference:

Column-Oriented Database Systems and XML Compression
Parallelization of Permuting XML Compressors
  • Citing Conference Paper
  • May 2014

Lecture Notes in Computer Science

... The authors of [17] provide a fine comparison between column-oriented database systems and XML; their work explains the relationships between XML compressors and column-stores. They illustrate that a permuting XML compressor, called XSAQCT, with a DBMS back-end has essentially the same functionality as a column-store (while ignoring things such as SQL joins), including a specific kind of compression. They also test the compression ratio achieved with their compressor: experiments were performed on an XML corpus, and the tests showed very good results, which makes the work strong and applicable as an alternative to plain XML. They also describe the existing XML compressor, showing the inherent similarities between its compression technique and column-stores. ...

Column-oriented Database Systems and XML Compression
  • Citing Article
  • January 2014

... This paper investigates succinct, client-based implementation of RBAC policies for schema-less XML documents. The policy for the XML document D can specify occurrences of individual nodes of D, entire subtrees of D or specific text elements in D. The compression process is based on an XML compressor, called XSAQCT, see [19] and [20], (for details, see Section II). However, the designer of policies does not need to be aware of inner-workings of this compressor, or the encryption tools. ...

XSAQCT: XML Queryable Compressor

... There are other online compressors, e.g., TREECHOP [12], but XSAQCT has a number of distinctive features, such as it is queryable using lazy decompression (i.e., with minimal decompression) and updateable [7], and finally it can be parallelized to execute faster on multi-core machines [13]. Various possible educational applications of XSAQCT are described in [14]. Similarly to TREECHOP, XSAQCT supports both the compression where the decompressor's output is exactly the same as the original input (including the white space), and generating a canonicalized [15] XML document. ...

Updateable Educational Applications based on Compressed XML Documents.
  • Citing Conference Paper
  • January 2011

... More precisely, the problem lies in handling the task of assigning a single role to each user, which can be complicated in this case. Role parameterization has been proved to be an efficient solution in these types of scenarios [65]. The global proposed RBAC/WoT architecture, as explained in the paper, can be summarized in of permission that will grant privileges to the users and eventually the Objects. ...

Parameterized Role-Based Access Control Policies for XML Documents

Information Security Journal: A Global Perspective