C. Oswald’s research while affiliated with Indian Institute of Information Technology, Design and Manufacturing, Kancheepuram and other places


Publications (21)


Smart Multimedia Compressor—Intelligent Algorithms for Text and Image Compression
  • Article

November 2021 · 22 Reads · 7 Citations · The Computer Journal

C Oswald · B Sivaselvan

A number of text compression algorithms have been proposed in past decades; they have been very effective and usually operate on conventional character/word-based approaches. This research explores a novel data compression perspective of data mining, and the paper focuses on a novel frequent sequence/pattern mining approach to text compression. The work attempts to exploit longer-range correlations between words in languages to achieve better text compression. We propose a novel and efficient method, referred to as Universal Huffman Tree-based encoding, which compresses any word-level text in a universal manner across corpora and domains. The major contribution of this work is the avoidance of code table communication to the decoder. Simulation results over benchmark datasets indicate that Universal Huffman encoding employing frequent sequence mining achieves a [20%, 89%] improvement in compression in reduced time. The paper also contributes a usable interface for data compression that employs the proposed frequent sequence mining-based compression algorithm. The interface supports features such as feedback, consistency, usability, navigation, visual appeal, performance, and accessibility on par with existing compression software. The work results in an intelligent data compression tool built from a knowledge engineering perspective.
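
As a rough illustration of the frequent-sequence idea (a minimal sketch only, not the authors' Universal Huffman Tree; the bigram restriction and min-support threshold are simplifying assumptions): mine frequent word bigrams, parse the text greedily over those patterns, and Huffman-code the resulting symbol stream.

```python
import heapq
from collections import Counter

def huffman_codes(freqs):
    """Build a Huffman code table from a symbol -> frequency mapping."""
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tick = len(heap)                       # unique tiebreaker for the heap
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (f1 + f2, tick, merged))
        tick += 1
    return heap[0][2]

def tokenize(words, frequent):
    """Greedy left-to-right parse: emit a frequent bigram where one
    matches, otherwise a single word."""
    out, i = [], 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if len(pair) == 2 and pair in frequent:
            out.append(pair)
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out

# Mine frequent word bigrams (support threshold of 2 is a toy value).
text = "to be or not to be that is the question to be or not to be".split()
bigrams = Counter(zip(text, text[1:]))
frequent = {p for p, c in bigrams.items() if c >= 2}
tokens = tokenize(text, frequent)
codes = huffman_codes(Counter(tokens))
encoded = "".join(codes[t] for t in tokens)
print(f"{len(tokens)} symbols, {len(encoded)} bits")
```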


An efficient and novel data clustering and run length encoding approach to image compression

January 2021 · 70 Reads · 3 Citations · Concurrency and Computation: Practice and Experience

The paper explores the domain of lossy compression, specifically incorporating data mining techniques into the process of image encoding. Clustering is employed to group similar pixels in the image, and the cluster labels are then used to compress the image. The proposed approach replaces the Discrete Cosine Transform phase of conventional JPEG with a combination of clustering and Run Length Encoding so as to handle redundant data in the image effectively. Simulation with respect to benchmark data indicates improved compression (42.5%) relative to existing solutions. Image quality metrics such as PSNR and structural similarity have also been evaluated, and it is observed that the proposed approach achieves a significant compression ratio with negligible loss in visual quality.
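
A minimal sketch of the general idea, assuming a toy grayscale image, a hand-rolled 1-D k-means, and run-length coding of the label map (the paper's actual pipeline inside JPEG is more elaborate):

```python
import numpy as np

def kmeans_1d(values, k=2, iters=20, seed=0):
    """A tiny 1-D k-means over pixel intensities; returns labels and centers."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(values, size=k, replace=False).astype(float)
    for _ in range(iters):
        labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels, centers

def rle(seq):
    """Run-length encode a 1-D label array as (label, run_length) pairs."""
    runs, prev, n = [], seq[0], 1
    for x in seq[1:]:
        if x == prev:
            n += 1
        else:
            runs.append((int(prev), n))
            prev, n = x, 1
    runs.append((int(prev), n))
    return runs

img = np.array([[ 10,  12,  11, 200],
                [  9,  10, 201, 199],
                [  8, 202, 200, 198],
                [205, 201, 199, 197]], dtype=np.uint8)
labels, centers = kmeans_1d(img.ravel())
print("centers:", centers.round(1))
print("runs:", rle(labels))
# Decoding replaces each label by its cluster center: a lossy reconstruction.
```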



Frequent Sequence Mining Approach to Video Compression: Third International Conference, ICC3 2017, Coimbatore, India, December 14-16, 2017, Proceedings

September 2018 · 13 Reads · 1 Citation · Communications in Computer and Information Science

This work provides an approach of using frequent sequence mining in video compression. This paper focuses on reducing redundancies in a video by mining frequent sequences and then replacing it by the sequence identifiers. If we consider a video file as a sequence of raw RGB pixel values, we can observe a lot of redundancies and patterns/sequences that are repeated throughout the video. Redundant information and repeating sequences take up unnecessary space. The main motive of this system is to reduce these redundancies by employing data mining and coding techniques. The high cost of time and space required for mining sequences from large videos are reduced by dividing the video into multiple small blocks. Simulations of the proposed algorithm show a significant reduction in redundant parts of the video.
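
The block-dictionary idea can be sketched as follows; the byte stream, block size, and support threshold are illustrative stand-ins for real frame data, and the paper's sequence mining is richer than this exact-match dictionary:

```python
from collections import Counter

def block_encode(data, block=4, min_supp=2):
    """Split a raw byte stream into fixed-size blocks; blocks occurring at
    least min_supp times go into a dictionary and are emitted as short ids."""
    blocks = [bytes(data[i:i + block]) for i in range(0, len(data), block)]
    freq = Counter(blocks)
    frequent = [b for b, c in freq.most_common() if c >= min_supp]
    table = {b: i for i, b in enumerate(frequent)}
    stream = [("ref", table[b]) if b in table else ("raw", b) for b in blocks]
    return table, stream

# A toy "frame" stream with repeating pixel runs.
data = bytes([1, 2, 3, 4, 9, 9, 9, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 9, 9])
table, stream = block_encode(data)
print(len(table), "dictionary entries")
print(stream)
```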


Hierarchical Clustering Approach to Text Compression

July 2018 · 103 Reads · 3 Citations · Advances in Intelligent Systems and Computing

C. Oswald · V. Akshay Vyas · K. Arun Kumar · [...] · B. Sivaselvan

A novel data compression perspective is explored in this paper, with focus on a new text compression algorithm based on a clustering technique from Data Mining. Huffman encoding is enhanced through clustering, a non-trivial phase in the field of Data Mining, for lossless text compression. The seminal hierarchical clustering technique has been modified in such a way that an optimal number of words (patterns, i.e., sequences of characters with a space as suffix) is obtained. These patterns are employed in the encoding process of our algorithm instead of the single-character code assignment approach of conventional Huffman encoding. Our approach is built on an efficient cosine similarity measure, which maximizes the compression ratio. Simulation of the proposed technique over benchmark corpora clearly shows the gain in compression ratio and time of our work in relation to conventional Huffman encoding.
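
A toy sketch of the clustering step, assuming character-count vectors and a similarity threshold as the stopping rule (the paper's modified hierarchical clustering and its coupling to Huffman coding are more involved):

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse character-count vectors."""
    dot = sum(u[c] * v[c] for c in u)
    norm = lambda w: math.sqrt(sum(x * x for x in w.values()))
    return dot / (norm(u) * norm(v))

def agglomerate(words, threshold=0.8):
    """Repeatedly merge the most cosine-similar pair of clusters until no
    pair exceeds the threshold; a cluster's vector is the sum of its words'."""
    clusters = [[w] for w in words]
    vecs = [Counter(w) for w in words]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(vecs[i], vecs[j])
                if s > best:
                    best, pair = s, (i, j)
        if best < threshold:
            break
        i, j = pair
        clusters[i] += clusters.pop(j)
        vecs[i] = vecs[i] + vecs.pop(j)   # Counter addition sums counts
    return clusters

print(agglomerate(["the", "then", "them", "cat", "act", "tack"]))
```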


Figures and tables (not shown): Figure 1, flow chart of GA78; Table 1, simulation results of various text corpora; Figure 2, graph of text T for GA78; Figure 3, min_supp vs. compression ratio for bible; Figure 4, min_supp vs. time for bible.

Text and Image Compression based on Data Mining Perspective
  • Article
  • Full-text available

June 2018 · 479 Reads · 2 Citations · Data Science Journal

Data Compression has been one of the enabling technologies of the ongoing digital multimedia revolution for decades, resulting in renowned algorithms such as Huffman Encoding, LZ77, Gzip, RLE, and JPEG. Researchers have looked into character/word-based approaches to text and image compression, missing the larger aspect of pattern mining from large databases. The central theme of our compression research is the compression perspective of Data Mining, as suggested by Naren Ramakrishnan et al., wherein efficient versions of seminal text/image compression algorithms are developed using various Frequent Pattern Mining (FPM)/clustering techniques. This paper proposes a family of novel and hybrid efficient text and image compression algorithms employing efficient data structures such as hash tables and graphs. We retrieve an optimal set of patterns through pruning, which is efficient in terms of database scans and storage space by reducing the code table size. Moreover, a detailed analysis of time and space complexity is performed for some of our approaches, and various text structures are proposed. Simulation results over various sparse/dense benchmark text corpora indicate 18% to 751% improvement in compression ratio over other state-of-the-art techniques. In image compression, our results show up to 45% improvement in compression ratio and up to 40% improvement in image quality efficiency.
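
One common pruning idea consistent with this description is to keep only closed patterns, i.e., patterns with no longer pattern of equal support; the following sketch (with illustrative thresholds) shows how much a code table can shrink:

```python
from collections import Counter

def frequent_ngrams(words, max_n=3, min_supp=2):
    """One scan of the token stream, counting all n-grams up to max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return {p: c for p, c in counts.items() if c >= min_supp}

def closed_only(patterns):
    """Keep only closed patterns: no one-token extension has equal support.
    (By support monotonicity, checking length + 1 extensions is enough.)"""
    def extended_with_same_support(p, c):
        return any(len(q) == len(p) + 1 and patterns[q] == c and
                   (q[1:] == p or q[:-1] == p) for q in patterns)
    return {p: c for p, c in patterns.items()
            if not extended_with_same_support(p, c)}

words = "a b a b c a b a b c a b".split()
pats = frequent_ngrams(words)
print(len(pats), "frequent patterns ->", len(closed_only(pats)), "closed")
```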


An optimal text compression algorithm based on frequent pattern mining

June 2018 · 191 Reads · 25 Citations · Journal of Ambient Intelligence and Humanized Computing

Data Compression as a research area has been explored in depth over the years, resulting in Huffman Encoding, LZ77, LZW, Gzip, RAR, etc. Much of the research has focused on conventional character/word-based mechanisms without looking at the larger perspective of pattern retrieval from dense and large datasets. We explore the compression perspective of Data Mining suggested by Naren Ramakrishnan et al., wherein Huffman Encoding is enhanced through frequent pattern mining (FPM), a non-trivial phase in the Association Rule Mining (ARM) technique. The paper proposes a novel frequent pattern mining-based Huffman Encoding algorithm for text data and employs a hash table in the process of frequent pattern counting. The proposed algorithm operates on a pruned set of frequent patterns and is also efficient in terms of database scans and storage space by reducing the code table size. The optimal (pruned) set of patterns is employed in the encoding process instead of the character-based approach of conventional Huffman. Simulation results over 18 benchmark corpora demonstrate an improvement in compression ratio ranging from 18.49% over sparse datasets to 751% over dense datasets. It is also demonstrated that the proposed algorithm achieves pattern space reduction ranging from 5% over sparse datasets to 502% over dense corpora.
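
A small sketch of the counting and parsing steps as described, using a Python dict as the hash table; the tokenization, maximum pattern length, and support threshold are illustrative assumptions:

```python
from collections import defaultdict

def count_patterns(words, max_len=3):
    """Tally every n-gram occurrence (n <= max_len) in a hash table
    (a Python dict) during a single pass over the token stream."""
    table = defaultdict(int)
    for i in range(len(words)):
        for n in range(1, max_len + 1):
            if i + n <= len(words):
                table[tuple(words[i:i + n])] += 1
    return table

def parse(words, patterns, max_len=3):
    """Greedy longest-match parse against the pruned pattern set; each
    emitted symbol would then receive a Huffman code."""
    out, i = [], 0
    while i < len(words):
        for n in range(max_len, 0, -1):
            cand = tuple(words[i:i + n])
            if len(cand) == n and (n == 1 or cand in patterns):
                out.append(cand)
                i += n
                break
    return out

words = "the quick brown fox jumps over the lazy dog the quick brown fox".split()
pruned = {p for p, c in count_patterns(words).items() if c >= 2 and len(p) > 1}
symbols = parse(words, pruned)
print(len(words), "words parsed into", len(symbols), "symbols;",
      len(pruned), "multi-word patterns in the code table")
```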


Frequent Pattern Mining Guided Tabu Search

February 2018 · 42 Reads · 1 Citation · Communications in Computer and Information Science

The paper focuses on the search perspective of Data Mining. Genetic Algorithms and Tabu Search are two of the evolutionary search strategies in the literature. As part of this work, we apply Data Mining in the context of Tabu Search to search for the global optimum in a guided fashion. Sequence Pattern Mining (SPM) in general, and a variant of it, namely Maximal Sequence Pattern (MSP) mining, are incorporated into Tabu Search. A limitation of conventional Tabu Search is that the convergence rate depends on the initial population. The work proposes to arrive at an initial population by employing SPM, and hence achieve improved convergence in relation to conventional Tabu Search. The proposed algorithm has been tested on the N-Queens problem, and empirical results indicate approximately 21% improved convergence.
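
A compact tabu-search sketch for N-Queens in permutation encoding; note the initial state below is a plain random shuffle standing in for the paper's SPM-mined seed:

```python
import random

def conflicts(cols):
    """Diagonal conflicts for a permutation encoding (one queen per row)."""
    n = len(cols)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if abs(cols[i] - cols[j]) == j - i)

def tabu_search(n=8, iters=500, tenure=10, seed=0):
    random.seed(seed)
    cols = list(range(n))       # placeholder seed; the paper mines it via SPM
    random.shuffle(cols)
    best, tabu = list(cols), {}
    for t in range(iters):
        if conflicts(best) == 0:
            break
        move, move_cost = None, None
        for i in range(n):
            for j in range(i + 1, n):
                if tabu.get((i, j), -1) >= t:
                    continue            # this swap is still tabu
                cols[i], cols[j] = cols[j], cols[i]
                c = conflicts(cols)
                cols[i], cols[j] = cols[j], cols[i]
                if move_cost is None or c < move_cost:
                    move, move_cost = (i, j), c
        if move is None:
            break                       # every move tabu; stop the sketch
        i, j = move
        cols[i], cols[j] = cols[j], cols[i]
        tabu[move] = t + tenure
        if conflicts(cols) < conflicts(best):
            best = list(cols)
    return best, conflicts(best)

solution, cost = tabu_search()
print(solution, "conflicts:", cost)
```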


A Frequent and Rare Itemset Mining Approach to Transaction Clustering

February 2018 · 31 Reads · 6 Citations · Communications in Computer and Information Science

Data clustering is the unsupervised learning procedure of grouping related objects based on similarity measures; intra-cluster similarity is maximized and inter-cluster similarity is minimized. Distance-based similarity measures are employed in clustering algorithms such as k-means and k-medoids. This work explores the scope of guided clustering, wherein frequent and rare patterns are employed in the clustering process. The work focuses on choosing better cluster centers by employing variants of frequent and rare itemsets, namely the Maximal Frequent Itemset (MFI) and Minimal Rare Itemset (MRI). The literature contains several instances of association rule-based classification, and the effort here is an MFI/MRI-based clustering. The proposed model employs MFI and MRI in the process of choosing cluster centers. The proposed algorithm has been tested over benchmark datasets and compared with centroid-based hierarchical clustering and large-items-based transaction clustering algorithms, and the results indicate improvement in terms of cluster quality.
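
An illustrative sketch of MFI-seeded clustering, assuming naive itemset enumeration and Jaccard assignment; the paper's MFI/MRI-based center selection and quality evaluation go beyond this toy version:

```python
from itertools import combinations

def frequent_itemsets(txns, min_supp=2, max_size=3):
    """Naive enumeration: an itemset is frequent if it is contained in at
    least min_supp transactions."""
    items = sorted({i for t in txns for i in t})
    found = []
    for k in range(1, max_size + 1):
        for cand in combinations(items, k):
            s = set(cand)
            if sum(s <= t for t in txns) >= min_supp:
                found.append(s)
    return found

def maximal(itemsets):
    """Maximal Frequent Itemsets: frequent sets with no frequent superset."""
    return [s for s in itemsets if not any(s < t for t in itemsets)]

def assign(txn, seeds):
    """Attach a transaction to the seed with the highest Jaccard overlap."""
    jaccard = lambda a, b: len(a & b) / len(a | b)
    return max(range(len(seeds)), key=lambda i: jaccard(txn, seeds[i]))

txns = [{"milk", "bread"}, {"milk", "bread", "eggs"}, {"milk", "eggs"},
        {"beer", "chips"}, {"beer", "chips", "salsa"}]
seeds = maximal(frequent_itemsets(txns))
print("seeds:", seeds)
print("labels:", [assign(t, seeds) for t in txns])
```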



Citations (12)


... It is commonly used in animated pictures [32], [33]. ...

Reference:

Image Compression Using Neural Networks: A Review
An efficient and novel data clustering and run length encoding approach to image compression
  • Citing Article
  • January 2021

Concurrency and Computation Practice and Experience

... Many initiatives [Gunta et al. (2018); de Souza Filho et al. (2019)] have focused on usability tests, by proposing frameworks that strengthen the design and the evaluation phase with the implementation of a set of heuristics to make the life cycle of the design shorter, and the evaluation faster and easier. These heuristics are based particularly on the recall of tools to keep the user focused with the relevant, contextual and judicious use of external rewards, socialization, etc. ...

Gamification Paradigm for WebApps Design Framework
  • Citing Conference Paper
  • February 2018

... These tools simplify the decision-making process by providing real-time product insights and support. The model also prioritizes saving resources by minimizing the chances of overstocking or running out of a product using artificial intelligence's predictive capabilities [3]. AI helps financial management by automating data entry, spending classification, fraud detection, and risk assessment. ...

A Novel Gamification Approach to Recommendation Based Mobile Applications
  • Citing Conference Paper
  • December 2017

... Again, lossy compression algorithms such as Huffman coding give relatively good quality as well as compression rates with images, but a blocky look in reconstructed images [5]. The reverse is the case with LZW, in which compressed image data quality is retained at the expense of small size decreases [6]. ...

Text and Image Compression based on Data Mining Perspective

Data Science Journal

... To avoid the social panic caused by the release of rare disease data, we should protect those special clinical data and preprocess the original data before releasing medical data to other departments or the outside world. To summarize, rare itemset mining [10,29,37] has attracted much attention, and privacy-preserving rare itemset mining deserves to be studied as well [11]. ...

A Frequent and Rare Itemset Mining Approach to Transaction Clustering
  • Citing Chapter
  • February 2018

Communications in Computer and Information Science

... TS suffers from sensitivity to parameter settings, difficulty in escaping local optima, and high memory requirements. TS strategies are refined and applied to pattern mining at large but rarely to HUIM (Avula et al. 2018). ...

Frequent Pattern Mining Guided Tabu Search
  • Citing Chapter
  • February 2018

Communications in Computer and Information Science

... On the basis of the analysis of the nature of syllables, this method implements encoding by counting repeated syllables and setting special symbols, such as spaces and English characters. Oswald et al. [17] proposed an algorithm that further reduces text redundancy by finding patterns in the text and using these patterns with the LZ78 algorithm. Bharathi et al. [18] proposed an incremental compression method. ...

A Graph-Based Frequent Sequence Mining Approach to Text Compression
  • Citing Chapter
  • November 2017

Lecture Notes in Computer Science

... One of the classic compression algorithms that does not lose information during its work was invented by David Huffman and is widely used in many fields [16]. The output of this method is codes of different lengths, where the shortest codes are allocated to the most frequent symbols and the longest codes to the less frequent ones, so that the size of the data to be compressed is reduced [17,18]. ...

An optimal text compression algorithm based on frequent pattern mining

Journal of Ambient Intelligence and Humanized Computing

... All these recent techniques have not approached the problem of Text Compression from the perspective of Data Mining, which does not give an optimal set of patterns (Köppl and Sadakane, 2016; Pratas et al., 2016). Oswald et al. (Oswald et al., 2015a, b; Oswald et al., 2016) have shown that text compression by Frequent Pattern Mining (FPM)/Clustering techniques is better than conventional Huffman encoding (CH). Lossy image compression techniques include JPEG, JPEG2000, Chroma Subsampling, Transform Coding, Fractal Compression, Vector Quantization, Block Truncation, etc. (Wallace, 1991). ...

Hierarchical Clustering Approach to Text Compression
  • Citing Chapter
  • July 2018

Advances in Intelligent Systems and Computing

... There are many types of lossless text compression. They include the Burrows-Wheeler transform, Huffman coding, arithmetic coding, run-length coding, Deflate, Lempel-Ziv 77 (LZ77), Lempel-Ziv-Welch (LZW), GNU zip (Gzip), Bzip2, Brotli, and many more [1,11]. Some statistical methods assign a shorter binary code of variable length to the most frequently repeated characters, and examples of this method are Huffman and arithmetic coding. ...

An Efficient Text Compression Algorithm - Data Mining Perspective
  • Citing Conference Paper
  • January 2015

Lecture Notes in Computer Science