
Altan Mesut
Trakya University · Department of Computer Engineering
PhD
40
Publications
8,414
Reads
95
Citations
Publications (40)
Learning-based data compression methods have gained significant attention in recent years. Although these methods achieve higher compression ratios compared to traditional techniques, their slow processing times make them less suitable for compressing large datasets, and they are generally more effective for short texts rather than longer ones. In...
Saving space when storing generated data is important, and compression algorithms are used to achieve this. Stored data is compressed once but accessed many times for searching. For this reason, the biggest disadvantage of compressed data is that it must be decompressed before it can be used. This disadvantage can be eliminated by using a fast...
The complexity and dimensions of deep learning models are increasing. Along with the growing complexity, vector databases have been proposed to store high-dimensional data required by the models. Vector databases aim to store high-dimensional vectors and perform similarity calculations on these vectors. In this study, the insertion and query perfor...
In this study, a method called Word-based Static Dictionary Compression (WSDC), which can compress short texts at a high ratio, and a model that uses iterative clustering to create static dictionaries used in this method are proposed. The number of static dictionaries to be created can vary by running the k-Means clustering algorithm iteratively acc...
B-Tree based text indexes used in MongoDB are slower than other structures such as inverted indexes. In this study, it has been shown that the full-text search speed can be increased significantly by indexing a structure in which each different word in the text is included only once. The Multi-Stream Word-Based Compression Algorithm (MWCA)...
Using data compression algorithms in data transmission and storage provides advantages in terms of time and storage cost. There are several methods for compressing texts created in natural language, which is one of the most produced data types. Many traditional methods are not successful in compressing short texts. Compressing short texts requires d...
Still images have a significant share in the data created today. These images take up a lot of space in their raw form. Image compression algorithms provide a much smaller representation of images in order to solve this problem. The JPEG group offered many alternatives to the JPEG algorithm, which became insufficient due to the increase in resoluti...
Using compression algorithms on image data has become a necessity. JPEG, one of the most widely used image compression algorithms, performs lossy compression on an image, and the loss varies according to the given quality factor. At low quality factor values, the file size gets smaller and artifacts become noticeable. At high quality factors, quality of...
In this article, we present a novel word-based lossless compression algorithm for text files using a semi-static model. We named this method the ‘Multi-stream word-based compression algorithm (MWCA)’ because it stores the compressed forms of the words in three individual streams depending on their frequencies in the text and stores two dictionaries...
Compressing image files after splitting them into a certain number of parts can increase the compression ratio. The ratio can be increased further by compressing each part of the image with a different algorithm, because each algorithm gives different compression ratios at different complexity values. In this study, statistical compression...
In this study, a new semi-static data compression model that has a fast coding process and allows compressed pattern matching is introduced. The proposed model is named the tagged word-based compression algorithm (TWBCA) because it uses word-based coding and a word-based compressed matching algorithm. The model has two phases. In the fi...
We propose a new compression algorithm that compresses plain texts by using a dictionary-based model and a compressed string-matching approach that can be used with the compressed texts produced by this algorithm. The compression algorithm (CAFTS) can reduce the size of the texts to approximately 41% of their original sizes. The presented compresse...
Two-dimensional still images can be made to take up less space in digital storage by using lossy or lossless compression methods. For this purpose, methods such as JPEG, JPEG2000, and JPEG XR, which are effective on photographic images, or GIF and PNG, which give good results on images with low complexity (entropy), are used...
In the literature, several methods are available for combining low-spatial-resolution multispectral images with low-spectral-resolution panchromatic images to obtain a high-resolution multispectral image. One of the most common problems encountered in these methods is the spectral distortion introduced during the merging process. At the same time, the spectral quality of...
With the advancement of technology, the need to secure data in digital environments has emerged. Encryption and steganography techniques are generally used to provide information security. These two methods can be used on their own, or they can be used together to increase security. Encryption...
We developed a fast text compression method based on multiple static dictionaries and named it STECA (Static Text Compression Algorithm). This algorithm is language dependent because of its static structure; however, it owes its speed to that structure. To evaluate the encoding and decoding performance of STECA with different languages,...
In this paper, a new lossless data compression method based on digram coding is introduced. This method uses semi-static dictionaries: all characters used and the most frequently used two-character blocks (digrams) in the source are found and inserted into a dictionary in the first pass, compression is performed in the s...
The aim of this work is to provide PDA and smartphone users with captions and summaries of the latest news taken from particular web sites. A web content mining application named Cep Gazetecisi was developed for this purpose. This application removes unnecessary data in the source HTML file and transfers the clean data to a mobile device. By us...
In this paper, a new approach to dictionary-based lossless compression is introduced. In our two-pass compression algorithm (SSDC), the most frequently used two-character blocks (digrams) are found in the source file in the first pass, and they are inserted into free spaces in the ASCII table that are unused by the document in the second pass. In our m...
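The two-pass digram substitution described for SSDC can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`build_table`, `encode`, `decode`), the counting of overlapping digrams, the use of all 256 byte values rather than only the ASCII range as candidate codes, and the greedy left-to-right replacement are all assumptions made for illustration.

```python
from collections import Counter


def build_table(data: bytes) -> dict:
    """First pass (assumed form): map frequent digrams to unused byte values."""
    used = set(data)
    # Byte values that never occur in the input are free to serve as codes.
    free_codes = [b for b in range(256) if b not in used]
    # Count every overlapping two-byte block in the input.
    counts = Counter(data[i:i + 2] for i in range(len(data) - 1))
    top = counts.most_common(len(free_codes))
    return {digram: code for (digram, _), code in zip(top, free_codes)}


def encode(data: bytes, table: dict) -> bytes:
    """Second pass (assumed form): greedy left-to-right digram replacement."""
    out = bytearray()
    i = 0
    while i < len(data):
        code = table.get(data[i:i + 2])
        if code is not None:
            out.append(code)      # two input bytes become one code byte
            i += 2
        else:
            out.append(data[i])   # literal byte, copied unchanged
            i += 1
    return bytes(out)


def decode(packed: bytes, table: dict) -> bytes:
    """Expand code bytes back to digrams; other bytes are literals."""
    inverse = {code: digram for digram, code in table.items()}
    out = bytearray()
    for b in packed:
        out += inverse.get(b, bytes([b]))
    return bytes(out)


sample = b"abracadabra abracadabra"
table = build_table(sample)
packed = encode(sample, table)
restored = decode(packed, table)
```

Because the codes are byte values that never occur in the input, a decoder can tell codes from literals without escape markers. The sketch does not count the space needed to store the table alongside the packed data, which a real file format would have to include.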
The vast majority of data on the Internet is in HTML format. Because HTML is a language developed for displaying data, processing a document in this format to extract the necessary information requires considerable effort. In this study, HTML pages downloaded from the Internet by Web Robots...
String matching has an important place in text processing. Many different algorithms have been developed for string matching. In string matching algorithms, the alphabet used, the preprocessing phase of the algorithm (if any), and the number of comparisons (attempts) made to find the string are important parameters that affect the performance of the algorithm. In this study...
In this study, a word-based string matching algorithm named WordMatch is presented. This algorithm performs pattern matching over natural language text documents compressed with a word-based text compression algorithm that we developed. Since the compression algorithm is based on words, only word-based string matching can be done on compressed tex...
A new word-based natural language text transform algorithm that improves the compression ratio of compression algorithms is presented in this paper. The algorithm, named KESIDAL, uses word-based static dictionaries. The main criterion for selecting the words for these dictionaries is the gain of these words in compression ratio. It was seen from ou...
Object-oriented programming and distributed computing techniques have made a significant impact on modern software development over the past ten years. Older generation client/server architectures, such as RPC (Remote Procedure Call), do not have an object-oriented model. In these architectures, the client must be aware of where the server is located...
In this work, a word-rank based crawler that performs link analysis according to the words of a page is developed to show how different crawling strategies affect performance. The test results show that the word-rank based crawler can eliminate web pages that include advertisements and other types of inessential data. Theref...
In this work, we investigate how the performance of lossless compression algorithms changes when they are used with different natural languages. For this purpose, we prepared 8 different text corpora which have the same size for every natural language. We chose the most popular dictionary based and statistical based algorithms for m...
In this study, the changes in the performance of string matching algorithms when they are used with different natural languages are investigated. For this purpose, 8 text corpora of the same size are prepared for 8 different natural languages. 6 different string matching algorithms are used with these corpora and the searching...
The application we developed in the .NET environment for web usage mining, which we named "Web Sunucusu Analizcisi - WSA" (Web Server Analyzer), produces information that helps web site designers evaluate the usage and effectiveness of the sites they design. WSA uses web server access log files as data; from this data...
Advancing technology has raised the need to protect our important data. To this end, techniques such as secret sharing schemes are used together with encryption or data hiding techniques to increase data security. In this way, our important data becomes more resistant to malicious attacks. In this study, we exam...
The aim of this thesis is to examine the lossy and lossless data compression methods in widespread use today, to identify the differences between older methods and recently developed ones, and to investigate in what ways the new methods improve on their predecessors. In the thesis...