Conference Paper

An agent-based search engine based on the Internet search service on the CORBA

Authors:
  • National Yang Ming Chiao Tung University

Abstract

Search services are important tools on the World Wide Web, but standard Web search engines are far from ideal. Many researchers have therefore implemented multi-engine search services (MESS) using meta-brokers. However, these MESS make it difficult to integrate a new search engine, and applications that need search capabilities also find them difficult to use. In this paper we propose an Internet search service (ISS) based on CORBA. We follow the style of the Common Object Service Specification (COSS) to define the interface of the ISS, so that any search engine can easily be integrated into the multi-search service, and application programs can query it directly. In addition, two search engine agents are implemented in our project, one for Yahoo and the other for AltaVista. Programmers can use this interface to code their own search engine agents or to query the search service from their applications. Finally, we build a heterogeneous search engine agent based on this architecture.
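To make the interface idea concrete, here is a minimal Java sketch of what a COSS-style ISS contract might look like; the type and method names (InternetSearchService, QueryResult, search, engineName) are illustrative assumptions, since the paper's actual IDL definitions are not reproduced here. A per-engine agent (for Yahoo or AltaVista, say) would implement this contract, and client applications would program against it rather than against any particular engine.

```java
// Hypothetical sketch of a COSS-style Internet Search Service (ISS) contract.
// All names are illustrative only; the paper's actual IDL is not shown here.
import java.util.List;

public interface InternetSearchService {
    /** A single hit returned by an underlying engine agent. */
    record QueryResult(String title, String url, String snippet, double score) {}

    /** Submit a keyword query; maxHits caps the merged result size. */
    List<QueryResult> search(String keywords, int maxHits);

    /** Each engine agent (e.g. one for Yahoo, one for AltaVista) names itself. */
    String engineName();
}
```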


... To overcome the above-mentioned challenges, we use clustering to resolve the ambiguity of the user query, and we use a personalized clustering technique to manage cluster size. 2.0 RELATED WORK 2.1 Query Clustering A great challenge for search engines is to understand the search intent behind users' queries. Several traditional approaches to query understanding focus on exploiting information such as users' explicit feedback, implicit feedback, user profiles, thesauri, snippets, and anchor texts. ...
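As a rough illustration of the clustering idea described in this excerpt, the sketch below groups result snippets by term overlap so that different senses of an ambiguous query fall into different clusters. The greedy single-pass scheme and the Jaccard threshold are invented for illustration and are not the cited personalized clustering technique.

```java
// Illustrative sketch: cluster result snippets by term overlap so that
// different senses of an ambiguous query separate. Thresholds are invented.
import java.util.*;

public class SnippetClusterer {
    static Set<String> terms(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().split("\\W+")));
    }
    static double jaccard(Set<String> a, Set<String> b) {
        Set<String> inter = new HashSet<>(a); inter.retainAll(b);
        Set<String> union = new HashSet<>(a); union.addAll(b);
        return union.isEmpty() ? 0 : (double) inter.size() / union.size();
    }
    public static List<List<String>> cluster(List<String> snippets, double threshold) {
        List<List<String>> clusters = new ArrayList<>();
        for (String s : snippets) {
            List<String> best = null; double bestSim = threshold;
            for (List<String> c : clusters) {
                // compare against the cluster's first member as a cheap stand-in for a centroid
                double sim = jaccard(terms(s), terms(c.get(0)));
                if (sim >= bestSim) { best = c; bestSim = sim; }
            }
            if (best == null) { best = new ArrayList<>(); clusters.add(best); }
            best.add(s);
        }
        return clusters;
    }
}
```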
... The style of COSS (Common Object Service Specification) is used to define the ISS interface. A heterogeneous full-text search engine agent prototype can be implemented by integrating the above agents [1]. Batool Arzanian et al. propose a multi-agent architecture based on fuzzy concept networks for personalizing a meta-search engine. ...
... Is it true that data can be integrated for the standardization of data definitions and data structures by using a common conceptual schema across a collection of data sources?
  • Text Mining: indexing, retrieving, extraction, clustering [15], [9], [16], [17], [18], [19]
  • Syntactical: sample pseudonymized [20], [21], [22], [23]
  • Middleware: tier level with tree [24], [25], [26]
  • XML Document: hierarchical structure [27], [28], [29]; set similarity join [30], [31], [32]
  • Keyword-based Searching: heterogeneous type [33]
  • Elaboration: heterogeneous data sources [34]
  • Web Mining: data management [5], [35], [36]
  • Collaborating & Integrating: agent [37], [38], [39], [40]
With respect to the work by [13], four data sources with multiple audit streams from diverse cyber sensors are presented: (i) raw network traffic, (ii) netflow data, (iii) system calls, and (iv) output alerts from an IDS. Unfortunately, we assume this method cannot be effective against new intrusion threats. ...
... In the concept of collaborating and integrating with agents, these methods are essentially processes that function on behalf of other processes and users. Agent methods developed alongside the Web in the early 1990s and have grown rapidly since crawler systems were first used in search engines [37], [38]. Agents may be self-describing; they may be decentralized, distributed, autonomous, and heterogeneous. Various agent architectures have been proposed [39], [40]. ...
Article
Full-text available
Currently, much information exists in textual form; this information can be correlated and is appropriate for solving a particular problem. It could be data from the web, library data, logs, and past information stored as archives; such data can form patterns of specific information. Given a collection of datasets, we examine a sample of the data and look for patterns that may exist between certain pattern methods over time. In this paper, we present a data mining approach to collecting scattered information from the routine updates regularly issued by providers or the security community. This paper addresses problems, existing theories, and possible future research in this field.
... The major goal of this study was to design and implement a multi-threaded object request broker that is fully compliant with CORBA 2.0. The motivations for this work are as follows. First, it will support related work in our group, such as work on the concurrency control service, the object transaction service [8], and the heterogeneous full-text search engine agent [9], the aim of which is to develop a multi-threaded version. Second, prior to beginning this work, no commercial multi-threaded object request brokers were available. ...
Article
Full-text available
Multi-threaded programming is a well-known technique for improving the performance of applications. In a CORBA environment, clients can invoke shared remote objects. If these objects are single-threaded, the performance of large distributed applications suffers. This paper presents a detailed description of the design and implementation of a multi-threaded Object Request Broker (ORB) on CORBA. The ORB was implemented on top of Windows NT and the underlying TCP protocol. The system's performance in both one-way and two-way requests is compared with that of a well-known commercial product, the IONA Orbix.
Conference Paper
The distributed object-oriented computing model is the next logical step in developing distributed applications. In recent years, several object models have been proposed, such as COM/DCOM, CORBA, and JavaBeans. In CORBA, announced by the OMG, the object request broker is a software bus that connects applications and object components. In addition, multi-threaded programming is a well-known technique for improving the performance of applications. In a CORBA environment, clients can invoke shared remote objects. If those objects are single-threaded, system performance suffers in large distributed applications. We describe in detail the design and implementation of a multi-threaded object request broker based on CORBA. Our ORB was implemented atop Windows NT and the underlying TCP protocol. Finally, we compare our system's performance with IONA's Orbix, a well-known commercial product, in both one-way and two-way requests.
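A minimal sketch of the thread-per-request dispatch such an ORB might use, assuming a plain TCP listener and a worker pool; the actual GIOP/IIOP protocol, request demarshalling, and object adapter are omitted, and the port number and pool size are arbitrary.

```java
// Minimal sketch of thread-per-request dispatch over TCP, the concurrency
// pattern a multi-threaded ORB relies on. Wire protocol details are omitted.
import java.io.*;
import java.net.*;
import java.util.concurrent.*;

public class MiniOrbServer {
    public static void main(String[] args) throws IOException {
        ExecutorService workers = Executors.newFixedThreadPool(16); // pool size is arbitrary
        try (ServerSocket listener = new ServerSocket(9000)) {
            while (true) {
                Socket client = listener.accept();    // one TCP connection per client
                workers.submit(() -> handle(client)); // requests served concurrently
            }
        }
    }
    static void handle(Socket client) {
        try (client;
             BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            String request = in.readLine();  // stand-in for a marshalled request
            out.println("reply:" + request); // two-way: send a reply; one-way would skip this
        } catch (IOException ignored) {}
    }
}
```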
Article
Full-text available
In this paper we have tried to model, design, and test a prototype Farsi/English search engine. The engine must cope with the features of the web medium, such as heterogeneity, volatility, and a huge amount of unstructured worldwide information. These features, as well as the rapid advance of technology, challenge the effectiveness of classical Information Retrieval (IR) techniques. Although a growing number of sites with Farsi language support exist, little research has addressed the computational linguistic aspects of this language, particularly in the fields of thesaurus construction and stemming. We have tried to draw on past research experience in designing the Farsi/English search engine. Unicode appears sufficiently capable of providing a conclusive environment in this respect, especially regarding the indexing and searching of web pages. Many common Farsi code pages are in use and must be converted into Unicode in order to cover most of the existing Farsi web pages. To handle the complexity of the system's analysis and design and to generate a visual, easy-to-scale model, the Unified Modeling Language (UML) was used; to assure scalability, distributed functionality, and reliability, we adopted proven industrial solutions: a relational database for managing the web-page indices, and clustering techniques to balance the high workload on the user interface and index management unit. We chose the Common Object Request Broker Architecture (CORBA) because of the system's distributed object-oriented design and our agent-oriented plans for the future, applying CORBA design ideas to build a scalable, platform-independent framework.
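The code-page conversion step can be illustrated with Java's standard charset machinery; windows-1256 is used here as an example legacy Arabic-script code page, since the paper does not enumerate the specific Farsi code pages it converts.

```java
// Sketch of normalizing a legacy code page to Unicode before indexing.
// windows-1256 is an assumed example; the paper's actual code-page list is not given.
import java.nio.charset.Charset;

public class CodePageNormalizer {
    public static String toUnicode(byte[] raw, String legacyCharset) {
        return new String(raw, Charset.forName(legacyCharset)); // decode to a Unicode String
    }
    public static void main(String[] args) {
        // "سلام" encoded in windows-1256
        byte[] pageBytes = {(byte) 0xD3, (byte) 0xE1, (byte) 0xC7, (byte) 0xE3};
        System.out.println(toUnicode(pageBytes, "windows-1256"));
    }
}
```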
Chapter
Full-text available
The Zone Routing Protocol (ZRP) is a hybrid routing protocol that proactively maintains routes within a local region of the network (which we refer to as the routing zone). Here, we describe the motivation for ZRP and its architecture, as well as the query control mechanisms, which are used to reduce the amount of traffic in the route discovery procedure. In this paper, we address the issue of configuring the ZRP to provide the best performance for a particular network at any time. Through NS2 simulation, we draw conclusions about the performance of the protocol. Keywords: Zone Routing Protocol, routing zone, query control mechanisms
Chapter
Most of the resources present in grids are underutilized these days. Therefore, one of the most important issues is the best utilization of grid resources based on user requests. The intelligent agent architecture proposed to handle this issue consists of four main parts. We discuss the need for and functionality of such an agent and propose a solution for resource sharing that addresses the problems faced by today's grids. A J2EE-based solution is developed as a proof of concept for the proposed technique. This paper addresses issues such as resource discovery, performance, security, and decentralized resource sharing, which are of concern in current grid environments. Keywords: grid, resource sharing, intelligent agent, decentralization
Conference Paper
Studies surrounding MPEG-7 have so far concentrated only on the normative components. While feature extraction has been an active research subject in the content-based retrieval domain, work on proper search engines is scarce. This paper presents SOLO, an MPEG-7 search tool prototype being developed at the University of Sydney. SOLO is built on the meta-search engine and mobile code paradigms and is equipped with computational intelligence technology.
Article
Full-text available
The Distributed Information Search COmponent (DISCO) is a prototype heterogeneous distributed database that accesses underlying data sources. The DISCO prototype currently focuses on three central research problems in the context of these systems. First, since the capabilities of the data sources differ, transforming queries into subqueries on the data sources is difficult. We call this the weak data source problem. Second, since each data source performs operations in a generally unique way, the cost of performing an operation may vary radically from one wrapper to another. We call this the radical cost problem. Finally, existing systems behave rudely when attempting to access an unavailable data source. We call this the ungraceful failure problem. DISCO copes with these problems as follows. For the weak data source problem, the database implementor defines precisely the capabilities of each data source. For the radical cost problem, the database implementor (optionally) defines cost information for some of the operations of a data source; the mediator uses this cost information to improve its cost model. To deal with ungraceful failures, queries return partial answers. A partial answer contains the part of the final answer to the query that was produced by the available data sources. The current working prototype of DISCO implements these solutions and operates over a collection of wrappers that access information both in files and on the World Wide Web.
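The partial-answer behavior can be sketched as follows: fan the query out to all wrappers, keep whatever the available sources return, and report the unavailable ones. The Wrapper interface, thread pool, and 2-second timeout are invented details for illustration, not DISCO's actual mechanism.

```java
// Sketch of the "partial answer" idea: query every wrapper, keep the results of
// the available sources, and note the ones that failed. Details are invented.
import java.util.*;
import java.util.concurrent.*;

public class PartialAnswerQuery {
    interface Wrapper { List<String> query(String q) throws Exception; }

    public static List<String> run(Map<String, Wrapper> wrappers, String q) throws InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();
        Map<String, Future<List<String>>> pending = new HashMap<>();
        wrappers.forEach((name, w) -> pending.put(name, pool.submit(() -> w.query(q))));
        List<String> partial = new ArrayList<>();
        for (var e : pending.entrySet()) {
            try {
                partial.addAll(e.getValue().get(2, TimeUnit.SECONDS));
            } catch (ExecutionException | TimeoutException ex) {
                System.err.println("source unavailable, answer is partial: " + e.getKey());
            }
        }
        pool.shutdownNow();
        return partial; // the part of the final answer produced by available sources
    }
}
```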
Article
Full-text available
This article describes and evaluates SavvySearch, a metasearch engine designed to intelligently select and interface with multiple remote search engines. The primary metasearch issue examined is the importance of carefully selecting and ranking remote search engines for user queries. We studied the efficacy of SavvySearch's incrementally acquired metaindex approach to selecting search engines by analyzing the effect of time and experience on performance. We also compared the metaindex approach to the simpler categorical approach and showed how much experience is required to surpass the simple scheme.
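A toy version of the incrementally acquired metaindex might look like the following: a per-(term, engine) weight nudged by click feedback and summed to rank engines for a query. The update constants and data layout are invented; SavvySearch's actual scheme differs in detail.

```java
// Toy sketch of an incrementally acquired metaindex: per-(term, engine) weights
// updated by feedback and used to rank engines. Constants are invented.
import java.util.*;

public class MetaIndex {
    private final Map<String, Map<String, Double>> weight = new HashMap<>(); // term -> engine -> score

    public void feedback(String term, String engine, boolean useful) {
        weight.computeIfAbsent(term, t -> new HashMap<>())
              .merge(engine, useful ? 0.1 : -0.1, Double::sum); // reward or penalize the engine
    }
    public List<String> rankEngines(Collection<String> engines, List<String> queryTerms) {
        return engines.stream() // highest summed weight across the query's terms first
            .sorted(Comparator.comparingDouble((String e) -> -queryTerms.stream()
                .mapToDouble(t -> weight.getOrDefault(t, Map.of()).getOrDefault(e, 0.0))
                .sum()))
            .toList();
    }
}
```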
Article
Document sources are available everywhere, both within the internal networks of organizations and on the Internet. Even individual organizations use search engines from different vendors to index their internal document collections. These search engines are typically incompatible in that they support different query models and interfaces, they do not return enough information with the query results for adequate merging of the results, and finally, in that they do not export metadata about the collections that they index (e.g., to assist in resource discovery). This paper describes STARTS, an emerging protocol for Internet retrieval and search that facilitates the task of querying multiple document sources. STARTS has been developed in a unique way. It is not a standard, but a group effort coordinated by Stanford's Digital Library project, and involving over 11 companies and organizations. The objective of this paper is not only to give an overview of the STARTS protocol proposal, but also to discuss the process that led to its definition.
Conference Paper
Many sources on the Internet and elsewhere rank the objects in query results according to how well these objects match the original query. For example, a real-estate agent might rank the available houses according to how well they match the user's preferred location and price. In this environment, "meta-brokers" usually query multiple autonomous, heterogeneous sources that might use varying result-ranking strategies. A crucial problem that a meta-broker then faces is extracting from the underlying sources the top objects for a user query according to the meta-broker's ranking function. This problem is challenging because these top objects might not be ranked high by the sources where they appear. In this paper we discuss strategies for solving this "meta-ranking" problem. In particular, we present a condition that a source must satisfy so that a meta-broker can extract the top objects for a query from the source without examining its entire contents. Not only is this condition necessary but it is also sufficient, and we show an algorithm to extract the top objects from sources that satisfy the given condition.
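Under (roughly) the paper's condition, namely that each source emits objects in non-increasing order of the meta-broker's own scoring function, the top objects can be extracted with a simple k-way merge that never scans a source past its contribution, as in this hedged sketch.

```java
// Sketch of extracting the global top-k via a k-way merge, assuming each source
// streams objects in non-increasing order of the meta-broker's scoring function.
import java.util.*;

public class MetaRanker {
    record Scored(String id, double metaScore) {}

    public static List<Scored> topK(List<Iterator<Scored>> sources, int k) {
        // priority queue holds the current head of each source stream, best first
        PriorityQueue<Map.Entry<Scored, Iterator<Scored>>> heads =
            new PriorityQueue<>((a, b) -> Double.compare(b.getKey().metaScore(), a.getKey().metaScore()));
        for (Iterator<Scored> it : sources)
            if (it.hasNext()) heads.add(Map.entry(it.next(), it));
        List<Scored> top = new ArrayList<>();
        while (top.size() < k && !heads.isEmpty()) {
            var best = heads.poll();
            top.add(best.getKey());
            if (best.getValue().hasNext()) heads.add(Map.entry(best.getValue().next(), best.getValue()));
        }
        return top; // no source is examined past its contribution to the top k
    }
}
```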
Article
The dozens of existing search tools and the keyword-based search model have become the main obstacles to accessing the ever-growing WWW. Various ranking algorithms, which are used to evaluate the relevance of documents to the query, have turned out to be impractical, because the information given by the user is too sparse for good estimation. In this paper, we propose a new approach to searching under the multi-engine search architecture to overcome these problems. It includes clustering of the search results and extraction of co-occurrence keywords, which, together with the user's feedback, better refine the query during the search process. In addition, our system constructs a concept space to gradually customize the search tool to the user's usage.
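The co-occurrence extraction step can be sketched as follows: count the terms appearing in result snippets alongside the query term and offer the most frequent ones as refinement candidates. The counting is deliberately naive, and the names and thresholds are invented.

```java
// Naive sketch of co-occurrence keyword extraction for query refinement:
// terms that frequently appear near the query term become refinement candidates.
import java.util.*;
import java.util.stream.*;

public class CoOccurrence {
    public static List<String> refinements(List<String> snippets, String queryTerm, int top) {
        String q = queryTerm.toLowerCase();
        Map<String, Long> counts = snippets.stream()
            .map(String::toLowerCase)
            .filter(s -> s.contains(q))                       // keep snippets containing the query term
            .flatMap(s -> Arrays.stream(s.split("\\W+")))
            .filter(w -> w.length() > 3 && !w.equals(q))      // drop short tokens and the term itself
            .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
        return counts.entrySet().stream()                     // most frequent co-occurring terms first
            .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
            .limit(top).map(Map.Entry::getKey).toList();
    }
}
```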
Conference Paper
Finding the right information on the World Wide Web is becoming a fundamental problem, since the amount of global information that the WWW contains is growing at an incredible rate. In this paper, we present a novel method to extract from a web object its "hyper" informative content, in contrast with current search engines, which only deal with the "textual" informative content. This method is not only valuable per se, but is shown to considerably increase the precision of current search engines. Moreover, it integrates smoothly with existing search engine technology, since it can be implemented on top of every search engine, acting as a post-processor, thus automatically transforming a search engine into its corresponding "hyper" version. We also show how the hyper information can be usefully employed to address the search engine persuasion problem.
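A minimal sketch of the post-processing idea, assuming a depth-1 link neighborhood and an invented damping factor: a page's "hyper" score blends its own textual score with the scores of the pages it links to, on top of whatever textual scores the underlying engine produced.

```java
// Sketch of a "hyper" score post-processor: blend a page's own textual score
// with (damped) scores of its linked pages. Depth-1 and damping are invented.
import java.util.*;

public class HyperScore {
    public static double hyper(String url,
                               Map<String, Double> textualScore,
                               Map<String, List<String>> outLinks,
                               double damping) {
        double own = textualScore.getOrDefault(url, 0.0);
        double linked = outLinks.getOrDefault(url, List.of()).stream()
            .mapToDouble(u -> textualScore.getOrDefault(u, 0.0))
            .sum();
        return own + damping * linked; // runs on top of any engine's textual scores
    }
}
```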
Article
The Internet Softbot (software robot) is a fully implemented AI agent developed at the University of Washington. It uses a Unix shell and the World-Wide Web to interact with a wide range of Internet resources. Effectors include ftp, telnet, mail, and numerous file manipulation commands; sensors include Internet facilities such as archie, gopher, netfind, and many more. The softbot is designed to incorporate new facilities into its repertoire as they become available.
Article
Migration is the movement of an active entity from one machine to another during execution. Such migration may be used for dynamic load balancing, with the aim of gaining more performance from a group of processors than schemes that simply allocate processes to processors at run time. Schemes providing object migration also offer object persistence, improved fault tolerance, and potentially more efficient remote object invocation (RPC). The survey covers systems providing process migration over both modified and unmodified UNIX and various experimental operating systems. Task migration over two modern microkernel-based operating systems is followed by a section on a number of object migration facilities with objects of varying granularity.
Article
Amalthaea is an evolving, multiagent ecosystem for personalized filtering, discovery, and monitoring of information sites. Amalthaea's primary application domain is the World-Wide Web, and its main purpose is to assist its users in finding interesting information. Two different categories of agents are introduced in the system: filtering agents that model and monitor the interests of the user, and discovery agents that model the information sources. A market-like ecosystem where the agents evolve, compete, and collaborate is presented: agents that are useful to the user or to other agents reproduce, while low-performing agents are destroyed. Results from various experiments with different system configurations and varying ratios of user interests versus agents in the system are presented. Finally, issues such as fine-tuning the initial parameters of the system and establishing and maintaining equilibria in the ecosystem are discussed. Keywords: Agents, Evolution, Information Filtering, W...
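A toy sketch of the market-like loop described above: agents earn fitness when their output is useful, the weakest are destroyed, and the fittest reproduce with mutation. The fitness bookkeeping, reward constants, and mutation noise here are invented; Amalthaea's actual ecosystem is far richer.

```java
// Toy evolutionary loop: reward useful agents, destroy low performers,
// and let the fittest reproduce with mutation. All constants are invented.
import java.util.*;

public class AgentEcosystem {
    static class Agent { double fitness = 1.0; double profile = Math.random(); }

    public static void step(List<Agent> agents, Random rnd) {
        agents.forEach(a -> a.fitness += (rnd.nextDouble() < a.profile) ? 0.2 : -0.1); // reward usefulness
        agents.removeIf(a -> a.fitness <= 0);                                          // destroy low performers
        agents.stream().max(Comparator.comparingDouble((Agent a) -> a.fitness)).ifPresent(best -> {
            Agent child = new Agent();                                                 // fittest agent reproduces
            child.profile = Math.min(1.0, Math.max(0.0, best.profile + rnd.nextGaussian() * 0.05));
            agents.add(child);
        });
    }
}
```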
Article
Standard Web search services, though useful, are far from ideal. There are over a dozen different search services currently in existence, each with a unique interface and a database covering a different portion of the Web. As a result, users are forced to repeatedly try and retry their queries across different services. Furthermore, the services return many responses that are irrelevant, outdated, or unavailable, forcing the user to manually sift through the responses in search of useful information. This paper presents the MetaCrawler, a fielded Web service that represents the next level up in the information "food chain." The MetaCrawler provides a single, central interface for Web document searching. Upon receiving a query, the MetaCrawler posts the query to multiple search services in parallel, collates the returned references, and loads those references to verify their existence and to ensure that they contain relevant information. The MetaCrawler is sufficiently lightweight to reside on a user's machine, which facilitates customization, privacy, sophisticated filtering of references, and more. The MetaCrawler also serves as a tool for comparing diverse search services. Using the MetaCrawler's data, we present a "Consumer Reports" evaluation of six Web search services: Galaxy [5], InfoSeek [1], Lycos [15], Open Text [20], WebCrawler [22], and Yahoo [9]. In addition, we report on the most commonly submitted queries to the MetaCrawler. Keywords: MetaCrawler, WWW, World Wide Web, search, multi-service, multi-threaded, parallel, comparison. This paper appears in the Proceedings of the 1995 World Wide Web Conference.
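The MetaCrawler pattern of fan-out, collation, and verification can be sketched as below, assuming placeholder engine stubs; real HTML scraping and relevance filtering are omitted, and the HEAD-request check merely verifies that a reference still resolves.

```java
// Sketch of the MetaCrawler pattern: fan a query out to several services in
// parallel, collate and dedupe the URLs, then verify each reference resolves.
import java.net.URI;
import java.net.http.*;
import java.util.*;
import java.util.concurrent.*;

public class MiniMetaCrawler {
    interface Engine { List<String> search(String query) throws Exception; }

    public static List<String> metaSearch(List<Engine> engines, String query) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(engines.size());
        List<Future<List<String>>> futures = new ArrayList<>();
        for (Engine e : engines) futures.add(pool.submit(() -> e.search(query)));

        Set<String> collated = new LinkedHashSet<>();            // dedupe while keeping order
        for (Future<List<String>> f : futures)
            try { collated.addAll(f.get(5, TimeUnit.SECONDS)); } catch (Exception ignored) {}
        pool.shutdown();

        HttpClient http = HttpClient.newHttpClient();
        List<String> live = new ArrayList<>();
        for (String url : collated) {                            // verify references still exist
            var head = HttpRequest.newBuilder(URI.create(url))
                .method("HEAD", HttpRequest.BodyPublishers.noBody()).build();
            try {
                if (http.send(head, HttpResponse.BodyHandlers.discarding()).statusCode() < 400)
                    live.add(url);
            } catch (Exception ignored) {}
        }
        return live;
    }
}
```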
Query By Images Content Home Page
  • IBM, Inc.
IBM, Inc. Query By Images Content Home Page. http://wwwqbic.almaden.ibm.com/~qbic/qbic.html.
The Virtual Tourist Home Page
  • Brandon Plewe
Brandon Plewe, The Virtual Tourist Home Page. http://wings.buffalo.edu/world.
WWW Home Pages Harvest Broker
  • Michael Schwartz
Michael Schwartz et al., WWW Home Pages Harvest Broker. http://town.hall.org/Harvest/broker/www-home-pages/.
Orbix Programmer's Guide
"Orbix Programmer's Guide", IONA Technologies Ltd., November 1994.
Inside OLE
  • K. Brockschmidt
K. Brockschmidt, "Inside OLE," 2nd ed., Microsoft Press, Redmond, Washington (1995).