Deepti Gupta
Research interests
-
Fuzzy Clustering, Information Retrieval, Crawler, Neural Network, Search Engine, Fuzzy Logic
Publications
-
Data Mining and Analysis of COS Markers in Burma Agrimony
01/2011; Vol 4 No. 1:682.
Conserved ortholog set (COS) markers are important molecular tools for comparative genome analysis among species in the post-genomics era. They are single-copy genes that share a common ancestor among closely related and divergent plant species. In this study, we identified 163 COS markers from Burma Agrimony by mining a database of expressed sequence tags (ESTs), the dbEST resource of NCBI. This effort will accelerate the development of molecular markers for this lesser-studied medicinal plant of the Asteraceae family. The markers were developed by identifying Burma Agrimony ESTs that have a single best match in the assembled Arabidopsis genome. The identified COS markers were annotated and assigned to functional role categories, and were mapped in silico onto the Arabidopsis chromosomes.
-
Prediction of the Query of Search Engine Using Back Propagation Algorithm
International Journal on Computer Science and Engineering (IJCSE). 01/2011; Vol. 3 No. 5 May 2011:1946.
Users depend on search engines for information; a search engine therefore needs a prediction system that anticipates the next query a user will submit. Web mining techniques such as neural networks can be used for this purpose. In this paper, a novel approach to predicting the upcoming query for a search engine is presented. The approach helps the search engine predict the domain of the upcoming query, so that it can keep the most relevant web pages in its repository.
-
Retrieval of Web Documents Using a Fuzzy Hierarchical Clustering
International Journal of Computer Applications. 01/2010;
The World Wide Web holds a huge amount of information that is retrieved using information retrieval tools such as search engines. The page repository of a search engine contains the web documents downloaded by the crawler. This repository contains a variety of web documents from different domains. In this paper, a technique called “Retrieval of Web documents using a fuzzy hierarchical clustering” is proposed that creates clusters of web documents using fuzzy hierarchical clustering.
-
A Novel Architecture for Domain Specific Parallel Crawler
Indian Journal of Computer Science and Engineering. 01/2010;
The World Wide Web is an interlinked collection of billions of documents formatted using HTML. Because of the growing and dynamic nature of the web, traversing and handling all the URLs in web documents has become a challenge, so it has become imperative to parallelize the crawling process. The crawling process is parallelized in the form of an ecology of crawler workers that download information from the web in parallel. This paper proposes a novel parallel crawler architecture based on domain-specific crawling, which makes the crawling task more effective and scalable and shares the load among the different crawlers, which download in parallel the web pages related to different domain-specific URLs.
-
A Novel Indexing Technique for Web Documents using Hierarchical Clustering
International Journal of Computer Science and Network Security. 01/2009; Vol 9 No 9:168.
The information on the WWW is growing at an exponential rate; therefore, search engines are required to index the downloaded Web documents more efficiently. Web mining techniques like clustering can be used for this purpose. In this paper, a novel technique for indexing documents is proposed that not only indexes the documents more efficiently but also uses hierarchical clustering to organize the information based on a similarity measure and fuzzy string matching. This technique keeps related documents in the same cluster so that searching for documents becomes more efficient in terms of time complexity.
Following (32)
-
Russell J Frohardt
St. Edwards University -
Valerie J H Powell
Robert Morris University -
Pooja Khandelwal
Jamia Millia Islamia -
John Durrett
Texas State University-San Marcos -
Dr. Jyoti Chaudhary
High Performance Computing Research Lab