H. Abolhassani

Publications (3)

  • S. Kharazmi, A.F. Nejad, H. Abolhassani
    ABSTRACT: The growing use of Web-based information retrieval systems, such as general-purpose search engines, and the dynamic nature of the Web make it necessary to maintain these systems continually. Crawlers facilitate this process by following hyperlinks in Web pages to automatically download new and updated pages. Freshness (recency) is one of the important maintenance factors for Web search engine crawlers, and maintaining it takes weeks to months. Many large Web crawlers start from seed pages, fetch every link from them, and repeat this process continually without any policies that would help them crawl better and improve their performance. We believe that data mining techniques can help improve the freshness parameter by extracting knowledge from crawling data. In this paper we propose a Web crawler that uses knowledge extracted by data mining techniques as crawling policies. For this purpose we include a component that collects additional crawling information. The crawler starts with non-preferential crawling. After a few crawls, it is trained by applying mining techniques to the crawling data, and it then uses the resulting policies for preferential crawling to improve freshness time. Our research showed that crawling with the determined policies achieves better freshness than generic general-purpose Web crawlers.
    Internet Technology and Secured Transactions, 2009. ICITST 2009. International Conference for; 12/2009
  • ABSTRACT: The growth of database application usage requires database management systems (DBMS) that are accessible, reliable, and dependable. One approach to meeting these requirements is a replication mechanism. Replication mechanisms can be divided into various categories. Some related works consider two categories, heterogeneous and homogeneous; however, the majority classify them into three groups: physical, trigger-based, and log-based schema. Log-based replication mechanisms are the most widely used category among DBMS vendors. Adapting such an approach to heterogeneous systems is a complex task because the other end lacks an understanding of the log. Semantic technologies provide a suitable framework for addressing heterogeneity problems in large-scale and dynamic resources. In this paper we introduce a new approach that tackles the replication problem in a heterogeneous environment by utilizing ontologies.
    Advances in Databases, Knowledge, and Data Applications, 2009. DBKDA '09. First International Conference on; 04/2009
  • Proceedings of the Third International Conference on Weblogs and Social Media, ICWSM 2009, San Jose, California, USA, May 17-20, 2009; 01/2009