Copyright © 2018 Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
International Journal of Engineering & Technology, 7 (4.33) (2018) 1-4
International Journal of Engineering & Technology
Website: www.sciencepubco.com/index.php/IJET
Research paper
A Comparative Evaluation of Search Engines on Finding
Specific Domain Information on the Web
Azilawati Azizan1*, Zainab Abu Bakar2, Nurazzah Abd Rahman3, Suraya Masrom1, Nurkhairizan Khairuddin1
1Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Perak Branch, Tapah Campus, Tapah Road,
35400 Perak, Malaysia
2Al-Madinah International University, Shah Alam, 40100 Selangor, Malaysia
3Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, 40450 Selangor, Malaysia
*Corresponding author E-mail: azila899@perak.uitm.edu.my
Abstract
Recently, search engines have provided a truly impressive search service, especially for finding general information on the Web. However, the question arises: do search engines perform equally well when seeking domain-specific information such as medical, geographical or agricultural information? To address this issue, an experiment was conducted to test the effectiveness of today's search engines in searching for information in a specific domain. Four search engines were selected for the experiment, namely Google, Bing, Yahoo and DuckDuckGo. As the specific domain, we chose to test information about durian, a popular fruit in Southeast Asia. The precision metric was used to evaluate retrieval effectiveness. The findings show that Google outperformed the other three search engines. Nevertheless, the mean average precision of 0.51 achieved by Google is still too low to satisfy either the researchers or the information seekers.
Keywords: Search engine evaluation; Precision; Specific domain; Durian.
1. Introduction
The Web has become the largest unorganized repository of data and information [1]. In fact, it has now turned into an information deluge, which makes searching for information more challenging. Hence, it is not easy to find a piece of information without the assistance of a search engine. Search engines have also become a primary need, since searching is now a daily routine. Therefore, we need very good search engines that can fulfill users' needs.
The purpose of this experiment is to evaluate the effectiveness of commercial search engines in searching for domain-specific information. The research is thus done to confirm the need for improvements in this searching technology. The findings from this experiment also show that the general problem statement in the Information Retrieval field (to retrieve all documents relevant to a user query while retrieving as few non-relevant documents as possible) is still relevant today [2].
This paper is organized as follows: Section 2 reviews several previous works related to search engine evaluation. Section 3 describes the methodology employed to evaluate the relevance of the search results in terms of precision. Section 4 presents and discusses the results, and the last section concludes the paper, including the issues and challenges of searching the Web.
2. Related Works
Many studies on search engine effectiveness have been carried out by researchers worldwide, and most of them are comparative studies that test effectiveness using general-topic queries. Among the comparisons are keyword-based search engines against semantic-based search engines [3-4], commercial search engines against dedicated search engines [5], and English search engines against search engines in other languages [6-7]. Some researchers have also compared effectiveness using short and long queries [8], natural language queries [9], reformulated queries [10], and more.
Even so, comparative studies involving specific-domain search are still scarce. Among the available publications is the research by [11], which evaluated search engine effectiveness in the health information domain. The authors compared general search engines (Google, Bing, Yahoo, Sapo) with health-specific search engines (MedlinePlus, SapoSaude, WebMD) and found that the general search engines surpassed all the health-specific ones, with Google achieving the highest precision in the top ten results.
The study in [12] evaluated three search engines' application programming interfaces (APIs) for finding geographic web services. The authors chose Google, Bing and Yahoo and reported that discovering geographic web services with a search engine does not require the use of advanced search operators. They also reported that Yahoo outperformed the other search engines in discovering geographic web services.
The work in [13] compared the performance of four international search engines (Google, Yahoo, Altavista, Exalead) and four Greek search engines (Google.gr, In.gr, Robby.gr, Find.gr) from the point of view of Greek librarians. It concluded that most librarians were satisfied with, and preferred to use, the international search engines.
Given this limited work on comparing search effectiveness for specific-domain search, we decided to conduct an experiment comparing the current popular search engines in finding information in the fruit domain. The intention of this study is to motivate researchers and search engine providers towards producing better search technology in the future.
3. Methodology
The methodology employed in this experiment adopts a common approach used in many search engine evaluation studies. Generally, the first step is to select the search engines, after which a list of search queries is identified [14]. The queries may be chosen with a variety of features, such as simple, complex, natural language or multi-language queries. The next step is to submit the queries to the chosen search engines and record all the search results. Before the analysis is made, the researcher first identifies the eligible persons and resources to perform the relevance judgment. Lastly, the analysis is made based on the standard evaluation measures, namely the precision and recall metrics. Many other evaluation measures can also be used, such as Mean Average Precision (MAP), Average Precision at n (P@n), R-Precision, Precision Histogram, Mean Reciprocal Rank (MRR), E-Measure and F-Measure [2]. However, the most widely used measurement is the standard precision measure.
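To make this workflow concrete, the short Python sketch below shows one way the judged results could be organised and scored. It is our own illustration under simple assumptions (binary relevance judgments stored per engine/query pair), not the procedure or code actually used in this experiment.

```python
from collections import defaultdict

# judged[(engine, query_id)] holds the manual relevance judgments for the
# top-20 links returned by that engine for that query, in rank order
# (True = 'relevant', False = 'not relevant').
judged = defaultdict(list)

# Toy entry with made-up judgments for one engine/query pair:
judged[("Google", "Q5")] = [True, True, False, True, False] + [False] * 15

def precision_top_k(flags, k=20):
    """Precision over the first k judged links."""
    return sum(flags[:k]) / k if k else 0.0

print(precision_top_k(judged[("Google", "Q5")]))  # 0.15 for this toy example
```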
In order to find out how search engines perform when searching for domain-specific information, we decided to conduct a comparative experiment among the popular search engines. The selection of the search engines to be tested is therefore based on the most popular and most successful search engines as rated by several search engine optimization websites, such as Search Engine Watch [15], Search Engine Journal [16] and Alexa.com [17]. These sites list more than 10 popular search engines based on traffic statistics, market share and user responses, and Google has consistently been ranked first. For that reason, we chose Google and the next three top-ranking search engines: Bing, Yahoo and DuckDuckGo. Table 1 shows the list and URLs of the selected search engines.
Table 1: Selected Search Engine for the Experiment
No.   Search Engine   URL
1     Google          https://www.google.com/
2     Bing            https://www.bing.com/
3     Yahoo           https://yahoo.com/
4     DuckDuckGo      https://duckduckgo.com/
3.1. Search queries and domain
Prior to the selection of the queries, a survey was conducted through online forums and groups, social media, and the Google suggestion feature (Google Instant). The purpose of the survey was to get a general idea of the questions that users often ask about durian. Besides producing a collection of queries, the survey also indirectly revealed the information users commonly want to find about durian. Initially, 290 queries about durian were collected from the survey. To validate the queries and facts about the durian domain, we collaborated with domain experts, namely the durian experts from MARDI (the Malaysian Agricultural Research and Development Institute) and durian farmers. Finally, we decided to run the pre-test using only 8 queries that had been identified and validated by MARDI as questions very commonly asked by users about durian. The list of queries is shown in Table 2.
Each query listed in Table 2 was submitted to each of the selected search engines and the results were captured. Each search retrieved a huge number of results, but only the top 20 results (links) were analyzed. This is because many studies in the search behavior field report that most Web users inspect only the top 10 search results [18], and it is relatively uncommon for a user to inspect beyond the top 20 results [19].
Table 2: Test Queries
Query Number   Queries
Q1             List of insect pests that attack the durian tree
Q2             When is the durian season in Malaysia
Q3             What are the varieties of durian in Malaysia
Q4             What are the characteristics of good quality durian
Q5             How to plant durian
Q6             How to control durian tree disease
Q7             What are the products of durian
Q8             What are the side effects of eating durian to health
3.2. Evaluation criteria
To maintain the evaluation quality of web search engines, human judgements are typically used to indicate which results are relevant for a given query [2]. Therefore, in this experiment, all the links (search results) were evaluated using human relevance judgment. The judgment was made based on the facts provided by MARDI, and each link was classified as 'relevant' or 'not relevant'. These steps were repeated until all the queries had been run on all four selected search engines. In total, 640 links were evaluated by the same author so that the judgments are consistent. In addition, all searches and evaluations were performed within a minimal time span to ensure a stable performance measurement of the search engines.
For the purpose of retrieval evaluation, the standard precision and recall metrics were used to evaluate retrieval quality. Precision is defined as the fraction of the retrieved documents that are relevant, as formulated in (1), while recall is the fraction of the relevant documents in the collection that are retrieved, as shown in (2).

Precision = (No. of relevant documents retrieved) / (No. of documents retrieved)    (1)

Recall = (No. of relevant documents retrieved) / (No. of relevant documents in the collection)    (2)

In the case of evaluating commercial search engines, the recall value is practically impossible to calculate, since we do not know the total number of relevant documents in the entire search engine collection. Therefore, we measured only precision, treating the links retrieved (search results) as the documents retrieved.
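As a minimal illustration of equation (1), the following Python sketch (ours, not part of the original experiment) computes precision from a list of binary relevance judgments; recall is left out because, as noted above, the number of relevant documents in the whole collection is unknown.

```python
def precision(judgments):
    """Fraction of retrieved links that were judged relevant.

    judgments: booleans for the retrieved links, in rank order
    (True = 'relevant', False = 'not relevant').
    """
    if not judgments:
        return 0.0
    return sum(judgments) / len(judgments)

# Example: 8 relevant links among 20 retrieved -> precision 0.4
print(precision([True] * 8 + [False] * 12))
```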
Comparing the retrieval performance of different algorithms or methods over a set of queries commonly uses the average precision values as in (3),

$\bar{P}(r) = \sum_{i=1}^{N_q} \frac{P_i(r)}{N_q}$    (3)

where $\bar{P}(r)$ is the average precision at recall level $r$, $P_i(r)$ is the precision at recall level $r$ for the i-th query, and $N_q$ is the total number of queries.
A single-value summary of the evaluation can be presented using the Mean Average Precision (MAP). The mean average precision over a set of queries is defined as in (4),

$\mathrm{MAP} = \frac{1}{N_q} \sum_{i=1}^{N_q} AP_i$    (4)

where $\sum_{i=1}^{N_q} AP_i$ is the summation of the average precision obtained over the relevant documents of each query and $N_q$ is the total number of queries.
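Equation (4) amounts to taking the mean of the per-query average precision values. The sketch below illustrates this; the eight values are hypothetical placeholders for the eight test queries, not the measured data.

```python
def mean_average_precision(average_precisions):
    """Mean of the average precision values obtained over a set of queries."""
    if not average_precisions:
        return 0.0
    return sum(average_precisions) / len(average_precisions)

# Hypothetical per-query average precision values for eight queries
example_ap = [0.60, 0.40, 0.70, 0.20, 0.55, 0.35, 0.50, 0.65]
print(round(mean_average_precision(example_ap), 3))  # 0.494
```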
It is common practice when evaluating search engines to measure the precision of the search results at the top positions of the ranking. Typically, precision is measured at cut-off points 5, 10 and 20 [2], meaning that the precision value is calculated once 5, 10 and 20 documents (links) have been seen. In practice, this is written as precision at 5 (P@5), precision at 10 (P@10) and precision at 20 (P@20). In our evaluation, we considered P@15 as an additional value to be analysed.
4. Results and discussion
The number of relevant links retrieved by each search engine for each query is shown in Table 3. The data show that Google retrieved the most relevant links, followed by Yahoo, DuckDuckGo and Bing. Google retrieved 61 relevant links out of 160 retrieved links overall, which gives 38.13%. Google surpasses Yahoo by 3.75%, Yahoo surpasses DuckDuckGo by 3.13%, while DuckDuckGo surpasses Bing by a very thin margin of 0.62%. The data in Table 3 also show that query number 8 (Q8) has the highest total of relevant links retrieved, while Q4 has the lowest across all search engines.
Table 3: Relevant Links Retrieved for each Query and Search Engines
Query                  Google   Bing   Yahoo   DuckDuckGo
Q1                     4        4      4       4
Q2                     5        3      4       4
Q3                     14       9      11      10
Q4                     3        1      2       1
Q5                     10       8      11      11
Q6                     5        7      5       7
Q7                     8        2      6       3
Q8                     12       15     12      10
Total                  61       49     55      50
Relevant Retrieved %   38.13    30.63  34.38   31.25
Mean of relevant       7.63     6.13   6.88    6.25
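As a quick sanity check on the summary rows of Table 3, the snippet below recomputes each engine's total, 'relevant retrieved %' (out of 8 queries x 20 links = 160 retrieved links per engine) and mean of relevant links from the per-query counts; the numbers are copied from the table itself, and half-up rounding mirrors the table's convention.

```python
from decimal import Decimal, ROUND_HALF_UP

def round2(x):
    """Round half up to two decimals, as in Table 3."""
    return Decimal(str(x)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Per-query relevant-link counts (Q1..Q8) taken from Table 3.
counts = {
    "Google":     [4, 5, 14, 3, 10, 5, 8, 12],
    "Bing":       [4, 3, 9, 1, 8, 7, 2, 15],
    "Yahoo":      [4, 4, 11, 2, 11, 5, 6, 12],
    "DuckDuckGo": [4, 4, 10, 1, 11, 7, 3, 10],
}

RETRIEVED_PER_ENGINE = 8 * 20  # 8 queries, top-20 links each

for engine, per_query in counts.items():
    total = sum(per_query)
    pct = round2(100 * total / RETRIEVED_PER_ENGINE)
    mean = round2(total / len(per_query))
    print(f"{engine}: total={total}, relevant%={pct}, mean={mean}")
# e.g. Google: total=61, relevant%=38.13, mean=7.63 (matching Table 3)
```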
The average precision for all search engines is compared and portrayed as a line graph in Figure 1. Bing clearly has the lowest line in the graph, while identifying the highest line is more difficult because the lines for Google, Yahoo and DuckDuckGo do not differ much.
Fig. 1: Average Precision Over Retrieved Link-i
Since the average precision graph in Figure 1 does not help much in distinguishing the search engines, we analysed the results at several cut-off points. Table 4 shows the average precision at cut-off points 5, 10, 15 and 20 for all search engines.
Table 4: Average Precision at n
Search Engine   P@5     P@10    P@15    P@20
Google          0.650   0.450   0.417   0.381
Bing            0.425   0.425   0.325   0.306
Yahoo           0.575   0.475   0.392   0.344
DuckDuckGo      0.550   0.413   0.367   0.313
The values show that Google outperformed the others at three cut-off points, namely P@5, P@15 and P@20, while Yahoo has the highest average precision at cut-off point 10. The comparison among all the search engines can be seen clearly in Figure 2. There is quite a significant difference in precision values at cut-off point 5, while at cut-off point 10 all search engines achieve almost similar values.
Fig. 2: A Graph for Average Precision at n
To summarize the results, the mean average precision (MAP) for each search engine over all queries was calculated and is illustrated in Figure 3. Google achieved a mean value of 0.505, followed by Yahoo (0.488), DuckDuckGo (0.440) and Bing (0.403). The difference between the highest (Google) and the lowest (Bing) mean value is 0.102, which corresponds to a 20.2% difference.
Fig. 3: A Graph for Mean Average Precision (MAP)
Many comparative studies have reported that Google outperforms other search engines [7, 11]. For example, Deka's evaluation [20] reported that Google had the highest rate of performance, followed by Yahoo, Live, Ask and AOL. The findings of our experiment are similar: Google is at the top rank, followed by Yahoo and the other search engines. This confirms that Google consistently surpasses its competitors.
5. Conclusion
The results show that Google surpasses the precision of the other search engines at three cut-off points (P@5, P@15, P@20), while Yahoo has the highest precision at cut-off 10. Many other researchers have also reported Google outperforming in their experiments; for example, [19] reported that Google outperformed Hakia, with a mean precision of 0.64 compared with Hakia's 0.54 for general topic search. In our experiment, however, Google achieved a lower mean average precision of 0.51 for specific domain search (durian fruit information). We therefore conclude that even though Google outperformed the other search engines, the mean precision value of 0.51 achieved by Google when finding specific domain information, particularly durian fruit information, is still unsatisfactory: Google reaches only half of the perfect mean value of 1.0. This analysis also reveals how search engines differ in their responses when seeking specific domain information, such as fruit information (e.g. durian), on the Web.
Acknowledgement
We would like to thank Mr. Bahari Mohd Nasaruddin (Director of MARDI Perak, Malaysia), Mr. Muhamad Afiq Tajol Ariffin (Senior Scientist-Senior Research Officer, Horticultural Center, MARDI Sintok, Kedah, Malaysia) and the durian farmers in Kedah and Perak for their collaboration, and also Universiti Teknologi MARA for the financial support of this project.
References
[1] R. Baeza-Yates (2003), Information retrieval in the Web: Beyond current search engines. Int. J. Approx. Reason. 34(2-3), 97-104.
[2] R. Baeza-Yates & B. Ribeiro-Neto (1999), Modern Information Retrieval. ACM Press / Addison Wesley.
[3] J. Singh (2013), A comparative study between keyword and semantic based search engines. Proceedings of the International Conference on Cloud, Big Data and Trust, pp. 130-134.
[4] D. Tümer, M. A. Shah & Y. Bitirim (2009), An empirical evaluation on semantic search performance of keyword-based and semantic search engines: Google, Yahoo, Msn and Hakia. Proceedings of the Fourth Int. Conf. Internet Monit. Prot., pp. 51-55.
[5] Y. Peng & D. He (2006), Direct comparison of commercial and academic retrieval system: An initial study. Proceedings of the International Conference on Information and Knowledge Management, pp. 1-2.
[6] Y. Bitirim & A. K. Görür (2017), A comparative evaluation of popular search engines on finding Turkish documents for a specific time period. Teh. Vjesn. - Tech. Gaz. 24, 565-569.
[7] J. Zhang, W. Fei & T. Le (2013), A comparative analysis of the search feature effectiveness of the major English and Chinese search engines. Online Inf. Rev. 37, 217-230.
[8] A. K. Mariappan & V. S. Bharathi (2012), A comparative study on the effectiveness of semantic search engine over keyword search engine using TSAP measure. Proceedings of the International Conference on E-Governance and Cloud Computing Services, pp. 4-6.
[9] N. Hariri (2013), Do natural language search engines really understand what users want? A comparative study on three natural language search engines and Google. Online Inf. Rev. 37, 287-303.
[10] A. Azizan, Z. A. Bakar & S. A. Noah (2014), Analysis of retrieval result on ontology-based query reformulation. Proceedings of the IEEE International Conference on Computer, Communication, and Control Technology, pp. 244-248.
[11] C. T. Lopes & C. Ribeiro (2011), Comparative evaluation of web search engines in health information retrieval. Online Inf. Rev. 35, 869-892.
[12] F. J. Lopez-Pellicer, A. J. Florczyk, R. Béjar, P. R. Muro-Medrano & F. Javier Zarazaga-Soria (2011), Discovering geographic web services in search engines. Online Inf. Rev. 35, 909-927.
[13] E. Garoufallou (2012), Evaluating search engines: A comparative study between international and Greek SE by Greek librarians. Program 46, 182-198.
[14] D. Hawking, N. Craswell, P. Bailey & K. Griffiths (2001), Measuring search engine quality. Inf. Retr. Boston 4, 33-59.
[15] Search Engine Watch (2018). https://searchenginewatch.com/.
[16] Search Engine Journal (2018). https://www.searchenginejournal.com/.
[17] Alexa.com - Top Sites (2018). https://www.alexa.com/topsites/category/Computers/Internet/Searching/Search_Engines.
[18] B. J. Jansen, D. L. Booth & A. Spink (2009), Patterns of query reformulation during web searching. Journal of the American Society for Information Science and Technology 60(7), 1358-1371.
[19] M. Andago, T. P. L. Phoebe & B. A. M. Thanoun (2010), Evaluation of a semantic search engine against a keyword search engine using first 20 precision. Int. J. Adv. Sci. Arts 1(2), 55-63.
[20] S. K. Deka & N. Lahkar (2010), Performance evaluation and comparison of the five most used search engines in retrieving web resources. Online Inf. Rev. 34, 757-771.