The Role of Backlinks in Search Engine Ranking

© 2013, IJARCSSE All Rights Reserved Page | 596
Volume 3, Issue 4, April 2013 ISSN: 2277 128X
International Journal of Advanced Research in
Computer Science and Software Engineering
Research Paper
Available online at: www.ijarcsse.com
Ayush Jain1, Meenu Dave2
1M.Tech Scholar, Department of CSE, Jagan Nath University, Jaipur, India
2Assistant Professor, Department of CSE, Jagan Nath University, Jaipur, India
Abstract: Search engines are designed to crawl and index web pages efficiently in order to produce better search results.
Link building contributes greatly to a website's popularity. The number of backlinks plays an important role in a
website's ranking in search engines. A backlink is any link received by a website from another website. This paper
describes techniques for making backlinks and their effect on search engine ranking.
Keywords: Backlink, Crawler, PageRank, Ranking, Search Engines, SERP, Websites.
I. Introduction
The number of websites is growing day by day. Users of the World Wide Web rely heavily on search engines, which play a
critical role. Search engines work in two steps: the first step is crawling and the second is indexing. Search engines have
automated programs called robots or spiders that use the hyperlink structure to crawl through pages; after the crawling
process, each webpage is indexed by the search engine. Search engines such as Google, Yahoo and Bing crawl through web
pages and assess them according to their on-page content and the quality and number of links from other websites (known
as backlinks). As the number of backlinks increases, the ranking in search engines also tends to improve.
II. Background
Search Engine: A search engine is an internet tool that searches an index of documents for a particular keyword. It
extracts information from the World Wide Web and presents the results in the form of a SERP (search engine results page).
Many search engines are available for this purpose, such as Yahoo, Google and Bing, but the most popular search engine is
Google [1].
Search Engine Working:
The working of a search engine is divided into two parts:
(1.) Crawling the Web: Search engines have automated programs called robots or spiders that use the hyperlink structure to
crawl the pages and documents that make up the World Wide Web [2]. A crawler is a program that visits websites and reads
their pages in order to create entries for a search engine index.
(2.) Indexing: Once the pages have been crawled, their content can be indexed and stored in the search engine's large
database [3].
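The indexing step can be illustrated with a minimal sketch of an inverted index, which maps each keyword to the pages
that contain it. The page URLs and texts below are hypothetical sample data, and real search engines use far more
elaborate index structures; this only shows the idea of searching an index for a keyword.

```python
from collections import defaultdict

# Hypothetical crawled pages: URL -> page text (illustrative data only).
PAGES = {
    "http://a.example/": "backlinks improve search engine ranking",
    "http://b.example/": "search engines crawl and index web pages",
    "http://c.example/": "a crawler follows links between pages",
}


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, keyword):
    """Return the URLs indexed under a single keyword, sorted for stable output."""
    return sorted(index.get(keyword.lower(), set()))


index = build_index(PAGES)
print(search(index, "search"))
print(search(index, "pages"))
```

A query is answered by a single lookup in the index instead of a scan over every stored page.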
Figure 1: Search Engine Diagram
Crawler: A Crawler is a program that retrieves web pages, commonly for use by the search engines [4].
Jain et al., International Journal of Advanced Research in Computer Science and Software Engineering 3(4),
April - 2013, pp. 596-599
Types of Web Crawlers:
1. Server Side Crawlers: Server side crawlers are business oriented, scalable, reliable and resource hungry, e.g. Google,
AltaVista etc.
2. Client Side Crawlers: Client side crawlers are more customer oriented, have much smaller requirements and need
guidance to proceed, e.g. Teleport Pro, Web Snake etc.
A crawler starts with the URL of an initial page p0. It retrieves p0, extracts the URLs it contains, and adds them to a queue
of URLs to be visited. How often a crawler visits a website, and what it is permitted to fetch, depend on many factors such
as robots.txt, the .htaccess file, web servers, firewalls, the sitemap and especially the content of the website. Designing a
crawler is a very difficult process. The crawler uses some important parameters and metrics for crawling the entire web.
1. Backlink Count: The value of I(p) is the number of links to p that appear over the entire web. We use IB(p) to refer to
this importance metric. Intuitively, a page p that is referred to by many pages is more important than one that is seldom
referenced. The IB(p) metric treats all links equally.
2. PageRank: PageRank was developed by Google founders Larry Page and Sergey Brin. PageRank is a link analysis
algorithm, used by Google, that assigns a numeric weight to each element of a hyperlinked set of documents [5]. The
PageRank value reported to users varies from 0 to 10 and indicates the importance of a page. Some official websites, such
as those of Apple, Google, Microsoft, Macromedia and NASA [6], have PageRank 10.
In backlink count all links are treated equally, but if one link comes from the Google blog and another from an ordinary
webpage, the importance of the two links differs. The Google blog is more important than the other page (it has a higher
IB count), so a link from it carries more weight.
3. Forward Link Count: It is represented as IF(p) and counts the number of links that originate from p. A web page with
many outbound links can be considered important, since it may act as a directory of other pages.
4. Location Metric: It is represented as IL(p). If URL u leads to p, then IL(p) is a function of u. For example, a URL
ending with .edu may be treated as more important than a URL in any other domain.
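The four metrics above can be compared on a small example. The following is a minimal sketch, not the method of any
particular search engine: the link graph is hypothetical, the location metric is a toy 0/1 function of the host name, and
PageRank is computed in its simplified power-iteration form with an assumed damping factor of 0.85 on a graph in which
every page has at least one outbound link.

```python
from urllib.parse import urlparse

# Hypothetical link graph: page URL -> list of pages it links to.
SITE = "http://site.example.com/"
BLOG = "http://blog.example.org/"
HUB = "http://hub.example.edu/"
LONELY = "http://lonely.example.net/"
LINKS = {
    HUB: [SITE, BLOG],
    BLOG: [SITE],
    SITE: [HUB],
    LONELY: [BLOG],
}


def backlink_count(links):
    """IB(p): number of links pointing to p, every link counted equally."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def forward_link_count(links):
    """IF(p): number of links that leave p."""
    return {page: len(targets) for page, targets in links.items()}


def location_metric(url, preferred_suffix=".edu"):
    """IL(p): toy function of the URL alone (1.0 if the host ends with the suffix)."""
    host = urlparse(url).hostname or ""
    return 1.0 if host.endswith(preferred_suffix) else 0.0


def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration; assumes every page has an outlink."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, targets in links.items():
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


ib = backlink_count(LINKS)
pr = pagerank(LINKS)
print(ib[SITE], ib[BLOG])          # both pages receive two backlinks
print(pr[SITE] > pr[BLOG])         # PageRank still separates them
```

In this toy graph SITE and BLOG have the same backlink count, but SITE receives its links from better-ranked pages, so
its PageRank is higher. This is the difference between treating all links equally and weighting each link by the
importance of its source.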
The Crawler Methodology:
Any crawler maintains a set of pages S that are to be visited and a set of links L retrieved from the page currently being
parsed. A graph is constructed as the pages are parsed; it shows the links from one page to another and is later used
for indexing.
When programming a crawler, data structures such as stacks and queues are used for S and L. The crawler process is
known as C-proc [7].
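The methodology above can be sketched as a breadth-first crawl over a small in-memory "web". Everything here is an
illustrative assumption: the WEB dictionary stands in for real HTTP fetching, the robots.txt rules are made up, and the
user-agent name toy-crawler is hypothetical. The queue plays the role of S, the list of links read from the current page
plays the role of L, and the returned dictionary is the link graph.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory "web": URL -> URLs linked from that page.
WEB = {
    "http://site.example/": ["http://site.example/a", "http://site.example/private/x"],
    "http://site.example/a": ["http://site.example/b", "http://site.example/"],
    "http://site.example/b": [],
    "http://site.example/private/x": ["http://site.example/secret"],
}

# Hypothetical robots.txt rules for the toy site.
ROBOTS = RobotFileParser()
ROBOTS.parse(["User-agent: *", "Disallow: /private/"])


def crawl(start, web, robots, agent="toy-crawler"):
    """Breadth-first crawl that returns the link graph of the visited pages."""
    to_visit = deque([start])       # S: pages waiting to be visited
    seen = {start}
    graph = {}
    while to_visit:
        page = to_visit.popleft()
        if not robots.can_fetch(agent, page):
            continue                # skip pages the site disallows
        links = web.get(page, [])   # L: links retrieved from the current page
        graph[page] = list(links)
        for link in links:
            if link not in seen:
                seen.add(link)
                to_visit.append(link)
    return graph


graph = crawl("http://site.example/", WEB, ROBOTS)
print(sorted(graph))
```

Using a queue gives breadth-first order; replacing it with a stack would give depth-first order with the same overall
structure.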
Backlinks: Backlinks are links from other websites to your website. Backlinks, as shown in figure 2, are important for SEO
(search engine optimization), because search engine algorithms give credit to a website that has a large number of
backlinks. As the number of backlinks increases, the website's popularity and search engine ranking tend to increase.
Figure 2: Backlinks
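Since a backlink is a link that arrives from another website, internal links between pages of the same site do not
count. The following minimal sketch groups a hypothetical list of (source page, target page) pairs by target host and
keeps only the links whose source is a different host; the URLs are illustrative sample data.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical crawl output: (source page, target page) link pairs.
LINK_PAIRS = [
    ("http://blog.example.org/post", "http://shop.example.com/"),
    ("http://forum.example.net/t/1", "http://shop.example.com/item"),
    ("http://shop.example.com/", "http://shop.example.com/item"),  # internal link
    ("http://blog.example.org/post", "http://forum.example.net/"),
]


def backlinks_by_site(pairs):
    """Group links by target host, keeping only links that come from another host."""
    result = defaultdict(set)
    for source, target in pairs:
        source_host = urlparse(source).hostname
        target_host = urlparse(target).hostname
        if source_host != target_host:
            result[target_host].add(source_host)
    return result


sites = backlinks_by_site(LINK_PAIRS)
print(sorted(sites["shop.example.com"]))
```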
Techniques of Making Backlinks:
There are some techniques for the creation of backlinks as shown in figure 3.
Figure 3: Link Building Techniques
1. Article Submission: Submitting articles to reputed article directories is one of the best techniques for making backlinks.
Examples of reputed directories are www.sooperarticles.com, www.hubpages.com etc.
2. Directory Submission: Submitting a website link to reputed web directories and internet directories is termed
Directory Submission. In this process, the URL, title, description and other information about the website are
submitted. Examples: Dmoz (www.dmoz.org), Yahoo (www.dir.yahoo.com) etc.
3. Comment Posting: In this technique, backlinks are created by commenting on different blogs and websites. Blog
commenting is most effective on blogs and websites in the same niche.
4. Forum Posting: Forum Posting is a link building technique in which website links are attached to good forum
posts. The aim is to obtain backlinks from discussions in forums of a related niche.
5. Press Release Submission: In Press Release Submission, the latest news is submitted to related websites.
6. Social Bookmarking: Social Bookmarking is a method to organize, store, manage and search for bookmarks of
online resources. Several social bookmarking websites are available, such as Digg.com, Stumbleupon.com etc.
7. Classified Submission: Classified ads are online ads placed on classified-ad websites. This is also a good method of
getting backlinks.
8. Video Submission: Backlinks can also be created by submitting videos to video sharing websites such as YouTube,
Vimeo etc. [8].
Importance of PageRank and Anchor Text in Search Engine Ranking: For the search engine ranking of a website, anchor
text and PageRank are really important factors. Anchor text is written as <a href="url of webpage">keyword</a>. As the
number of backlinks whose anchor text is a particular keyword increases, the probability of ranking for that keyword also
increases. A link from a page with higher PageRank has greater importance than a link from a page with lower PageRank.
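To see which keywords a page's backlinks carry, the anchor text of each link can be extracted and counted. The sketch
below uses Python's standard html.parser module; the class name AnchorCollector, the helper anchor_text_counts and the
sample HTML are hypothetical and only illustrate the idea.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an <a href=...> element.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


def anchor_text_counts(html, target_url):
    """Count the lowercase anchor texts of links that point to target_url."""
    parser = AnchorCollector()
    parser.feed(html)
    return Counter(text.lower() for href, text in parser.anchors if href == target_url)


SAMPLE = (
    '<p><a href="http://shop.example.com/">cheap shoes</a> and '
    '<a href="http://shop.example.com/">Cheap Shoes</a> or '
    '<a href="http://other.example.org/">click here</a></p>'
)
counts = anchor_text_counts(SAMPLE, "http://shop.example.com/")
print(counts)
```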
Link Practices that should be avoided: Backlinks from websites in a similar niche are important for better search engine
ranking, but some webmasters focus on the number of backlinks instead of their quality, which is not a good optimization
strategy. Reciprocal link exchange should also be avoided. In a link exchange, one webmaster places a link on his website
that points to another webmaster's website, and vice versa. Backlinks from websites hosted on the same IP address should
be avoided, as they can seriously hurt the search engine ranking of all the websites involved [9][10].
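A webmaster could check a backlink list for two of the practices named above with a minimal sketch like the following.
The records, site names and IP addresses are hypothetical sample data (the addresses come from documentation ranges), and
the checks are simple illustrations rather than the spam detection methods of [9][10].

```python
from collections import defaultdict

# Hypothetical backlink records: (linking site, linked site, IP address of linking site).
BACKLINKS = [
    ("a.example", "mysite.example", "203.0.113.10"),
    ("mysite.example", "a.example", "198.51.100.5"),
    ("b.example", "mysite.example", "203.0.113.10"),
    ("c.example", "mysite.example", "192.0.2.44"),
]


def reciprocal_pairs(backlinks):
    """Return the pairs of sites that link to each other."""
    edges = {(src, dst) for src, dst, _ in backlinks}
    return sorted({tuple(sorted(e)) for e in edges if (e[1], e[0]) in edges})


def shared_ip_sources(backlinks, target):
    """Return the IP addresses that host more than one site linking to target."""
    by_ip = defaultdict(set)
    for src, dst, ip in backlinks:
        if dst == target:
            by_ip[ip].add(src)
    return {ip: sorted(sites) for ip, sites in by_ip.items() if len(sites) > 1}


print(reciprocal_pairs(BACKLINKS))
print(shared_ip_sources(BACKLINKS, "mysite.example"))
```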
III. Conclusion
For a good search engine ranking, the most important factor is content. Content should be original, not copied. Apart
from content, links received from other websites with higher PageRank, called backlinks, are really important. Backlinks
should come from websites in a relevant niche. As the number of backlinks on a particular keyword increases, the search
engine ranking for that keyword also tends to increase.
References and URLs:
[1] Sergey Brin and Lawrence Page, “The Anatomy of a Large-Scale Hypertextual Web Search Engine”. Online.
Available: http://infolab.stanford.edu/~backrub/google.html
[2] Junghoo Cho, Hector Garcia-Molina, Lawrence Page, “Efficient Crawling Through URL Ordering”, Department of
Computer Science, Stanford University. Online. Available: http://ilpubs.stanford.edu:8090/347/1/1998-51.pdf
[3] Carlos Castillo, Kumar Chellapilla, Brian D. Davison, “Adversarial Information Retrieval on the Web”. Online.
Available: http://www.sigir.org/forum/2008J/2008j-sigirforum-castillo.pdf
[4] Ms. S. Latha Shanmugavadivu, Dr. M. Rajaram, “Semantic Based Multiple Web Search Engine”, International
Journal on Computer Science and Engineering (IJCSE), Vol. 02, No. 05, 2010, pp. 1722-1728.
[5] Shesh Narayan Mishra, Alka Jaiswal, Asha Ambhaikar, “An Integrated Approach of Topic-Sensitive PageRank and
Weighted PageRank for Web Mining”, International Journal of Advanced Research in Computer Science, Volume 3,
No. 3, May-June 2012.
[6] “The PageRank Citation Ranking: Bringing Order to the Web”. Online. Available:
http://www.cis.upenn.edu/~mkearns/teaching/NetworkedLife/pagerank.pdf
[7] Hemant Balakrishnan, “A Study on Web Crawlers”, Center of Parallel Computing, School of Computer Science,
University of Central Florida. Online. Available: http://www.cs.ucf.edu/~hemant/WebCrawlers.htm
[8] “SEO Tutorial”. Online. Available: http://www.webconfs.com/seo-tutorial/
[9] Andras A. Benczur, Karoly Csalogany, Tamas Sarlos, Mate Uher, “SpamRank: Fully Automatic Link Spam
Detection”, work in progress.
[10] Alexandros Ntoulas, Marc Najork, Mark Manasse, Dennis Fetterly, “Detecting Spam Pages Through Content
Analysis”. Online. Available:
http://www.ra.ethz.ch/cdstore/www2006/develwww2006.ecs.soton.ac.uk/programme/files/pdf/3052.pdf
... Another variable that can affect the WIF of a website is the backlink. Backlink of a web site is the link from another website to computer science school's website [12]. The website with a higher number of backlink count shows that the website has higher popularity among the user. ...
... The backlink should come from the same context of the website. The increase of backlink on a certain keyword will increase the search engine ranking on that keyword [12]. The computer science school's website with the greatest number of backlinks should have a higher position in search engine ranking. ...
... Lecturers should be encouraged to upload a video of their teaching in class in video sharing website such as YouTube [12]. ...
Article
This paper aims to investigate 18 web domains of computer science and information technology academic websites of Malaysia universities.We collected more than two million web pages. A webometric analysis was used to explore the number of web pages, inbound links, the web impact factor (WIF) and link relationships. The results show Fakulti Teknologi dan Sains Maklumat (FTSM), Universiti Kebangsaan Malaysia (UKM) has the highest number of webpages while Fakulti Teknologi Kreatif dan Warisan (FTKW), Universiti Malaysia Kelantan (UMK) has the largest WIF score. Pearson’s rank correlation coefficient was used to detect the relationship between institutions subdomain age and WIF. Correlations point out that there is scant relationship between subdomain age and WIF score across all 18 Malaysia selected schools [ r =−.076, n = 18, p < .0005]. This is due to WIF are highly dependent on the quality of the content to attract backlinks and Google crawler algorithm that changes from time to time for the number of web pages. Subdomain age is independent to the year of establishment of the schools. These findings can be used as a guide to the implementation of university web content strategy.
... For good search engine ranking of website, the really important factor is content. Content should be original, not copied [18]. The first step is to list keywords that match the pecel mbok bean sambel then find how much Search Volume is worth, and SEO Difficulty through the Uber Suggest tool. ...
... Backlinks should be in relevant niche websites. As well as backlinks on keyword increase, search engine ranking will increase on that keyword [18]. Business. ...
Article
Full-text available
In new era digital now, internet is the important think for various activities including in the process to improve business promotion. One way to increase promotion through online is to improve digital marketing strategies using Search Engine Optimization (SEO) techniques. The aim of this research is to implementation SEO to improve digital marketing in F&B industry. Developing method that used are On Page and Off Page SEO to optimize the search engine. The Researcher using literature analysis and deep interview to get the data from SMEs in F&B Industry. To create the implementation of SEO, the researcher undertakes case studies from SMEs in F&B Industry who already run the business and need to improve the Digital Marketing through SEO. However, it is important to note that SEO optimization is not a one-time task but rather an ongoing process. It requires constant attention and effort to maintain high search engine rankings and ensure the continued success of online business. Beside website, to improve the business promotion the researcher also use tools in Google Business and E-Commerce to optimize the SEO.The finding suggest that the implementation of Search Engine Optimization can make impact for improvement in visibility search engines, attract more visitors, and ultimately drive business growth in the area Google Business, E-Commerce and website owned by SMEs F&B Industry. It is important to remain vigilant in implementing SEO techniques to maintain high search engine rankings and ensure the continued success of SMEs online business.
... Backlinks play a major role for getting ranked in search engines; which links to a webpage from another webpage. If webpage has a greater number of backlinks, the probability of ranking in search engine increases [5]. To assess how the top government and private Indian colleges and international colleges fare against each other, it is fundamental to analyze how much search engine optimized their websites are and find the correlation between the ranking of an institute and its website [6]. ...
Chapter
In this research work, we have examined the top five NIRF-ranked government and private universities along with five worldwide prestigious universities. Our study is focused on analyzing some of the SEO parameters such as referring domains, organic and inorganic search results, backlink analysis, keyword density, and backlink gap. These are some of the most important factors that contribute the effective outcome of SERP in SEO and enhancement of website ranking in the search engine result page (SERP). In this research, we have used some of the online SEO tools present in the search engines such as SEMrush, Similar Web, and Uber Suggest. It helps to examine the websites of the top worldwide prestigious universities. From our findings, we found that “IIT Roorkee” leads 50.92% in organic search and 2.53% leads in referral traffic search compared to other top ranking IIT’s. In private universities, “Vellore Institute of Technology” (VIT) receives 3.725 million users as monthly visits, and in global universities, Imperial College leads 1.46% in Paid search and “Massachusetts Institute of Technology” (MIT) leads with 4.13% in social media traffic. These are some of the outcomes of our findings. In this paper, we will be analyzing the various factors of SEO on top national and international universities’ website using SEO tools.
... The best way to make backlinks is to submit articles to reputed directories and submit links to reputed web directories and internet directories. Posting comments, posting in forums, posting press releases, posting on social networking sites, and posting classified ads are also ways to create backlinks [14]. ...
... oder .gov) als Faktoren in das Pageranking einfließen(Jain, Dave 2013;Ziakis, Vlachopoulou, Kyrkoudis, Karagkiozidou 2019). Die Backlink-Statistik wurde mit dem kommerziellen Dienst von www.ahrefs.com ...
Article
Soziale Medien ermöglichen es ihrem Publikum, Informationen zu liken, zu kommentieren und zu teilen. Nutzerinnen und Nutzer werden so selbst zum Informations-Gatekeeper, der Aufmerksamkeiten im Netzwerk auf bestimmte Themen und Ereignisse lenkt. Solche „Audience Gatekeeping“-Vorgänge wurden bisher in der Politik, kaum aber in der Wirtschaft untersucht. Gelegenheit dazu bietet die Insolvenz des Lieferkettenfinanzierers Greensill Capital im Frühling 2021. Anders als es politische Untersuchungen nahelegen, führte Audience Gatekeeping bei diesem Wirtschaftsereignis nicht zu einer alternativen Themendarstellung, die von der journalistischen Berichterstattung abwich, sondern stärkte die bestehende Informationshierarchie mit wenigen internationalen Leitmedien, die das Thema strukturierten. Gemeinsam mit den politischen Untersuchungen ist jedoch der Befund, dass Suchmaschinen und deren hierarchische Informationsauflistungen das Audience Gatekeeping beeinflussen und dass sich aus der Aufmerksamkeitslenkung qua Verlinkung eine extreme Ungleichverteilung zugunsten einer Handvoll Titel ergibt: Von insgesamt 943 verlinkten Medien erhielten drei Medientitel ein Viertel aller Links, während die Hälfte aller Quellen nur einmal verlinkt wurde. Einer proportionalen Power-Law-Verteilung, wie sie in verschiedenen Studien zur Medienaufmerksamkeit festgestellt wurde, folgt dieses Verteilmuster jedoch nur in abgeschwächter Form.
... Further, SEO focus on off-page optimization (link purchases and link building) or on-page optimization (focusing on high quality content and its presentation including website structure, multimedia, keyword management, accessibility and portability) [34]. The focus of this study would be on the former majorly link building, search engines often prioritize pages based on the number of back links to it [27]. The process of link building is known to improve the SERP/page rank of the page on the search engine. ...
Article
Full-text available
With the increased digital usage, web visibility has become critically essential for organizations when catering to a larger audience. This visibility on the web is directly related to web searches on search engines which is often governed by search engine optimization techniques liked link building and link farming amongst others. The current study identifies metrics for segregating websites for the purpose of link building for search engine optimization as it is important to invest resources in the right website sources. These metrics are further used for detecting websites outliers for effective optimization and subsequent search engine marketing. Two case studies of knowledge management portals from different domains are used having 1682 and 1070 websites respectively for validation of the proposed approach. The study evolutionary intelligence by proposing a k-means chaotic firefly algorithm coupled with k-nearest neighbor outlier detection for solving the problem. Factors like Page Rank, Page Authority, Domain Authority, Alexa Rank, Social Shares, Google Index and Domain Age emerge significant in the process. Further, the proposed chaotic firefly variants are compared to K-Means integrated firefly algorithm, bat algorithm and cuckoo search algorithm for accuracy and convergence showing comparable accuracy. Findings indicate that the convergence speeds are higher for proposed chaotic firefly approach for tuning absorption and attractiveness coefficients resulting in faster search for optimal cluster centroids. The proposed approach contributes both theoretically and methodologically in the domain of vendor selection for identifying genuine websites for avoiding investment on untrustworthy websites.
... The backlinks created must be relevant to the niche website. If the backlink on a keyword increases then the ranking of keywords on search engines will also increase [7]. ...
... There are several freelancing platforms including Blogmint, Influencer, Upwork and Craiglist that offer freelancers to build content on topics that may be utilized for generating back links and keywords for the customer website [9,10]. These techniques attract traffic to the customer website and artificially boost the website rank. ...
Conference Paper
In the current scenario, with the exponential increase in the use of internet, organizations are continuously thriving for visibility on the web. This has opened new avenues in influencer marketing. Several portals encourage these marketers to build content for the purpose of digital marketing. However, the content building process produces a lot of spam within these websites when done in bulk. This is often done in order to establish their presence by using techniques including article spinning and keyword stuffing. This study thus attempts to identify these spam websites using a dataset comprising 2751 websites using bio inspired outlier detection approaches. We use publically available key performance indicators (KPIs) through which websites that create spam content to boost the amount of text in the domain are identified. A hybrid wolf search algorithm (WSA) and bat algorithm (BA) integrated with K-means are used to classify these websites into spam. Findings indicate that metrics including Domain Authority, Page Authority, Moz Rank, Links In, External Equity Links, Spam Score, Alexa Rank, Citation Flow, Trust Flow, External Back Links, Referred Domains, SemRush URL Links and SemRush Hostname Links play an important role in identifying spam. The proposed approach may prove beneficial in segregating spam influencer websites for effective influencer marketing.
Article
This research focuses on optimizing religious moderation messages through Search Engine Optimization (SEO) on NU Online, an Islamic website platform in Indonesia. Using a constructivist paradigm with a descriptive qualitative approach, this study explores how religious moderation messages are constructed and disseminated through Computer-Mediated Communication (CMC) and SEO. Data was collected through interviews, observations, and documentation, with a specific analysis on the popular article titled "Moderation of Religion and Its Urgency" on NU Online. The results show that NU Online has effectively implemented keyword research and On-Page SEO but is lacking in Off-Page SEO implementation. This indicates that the religious moderation messages are not yet fully optimized on this website. These findings underscore the importance of comprehensive SEO strategies to enhance the visibility and dissemination of religious moderation messages. The study recommends increasing the use of Off-Page SEO on Islamic websites. This includes the development of models or practical guides to optimize online religious content, encompassing aspects of keyword research, inclusive content, and effective campaign strategies. These implications are crucial for expanding the reach and impact of religious moderation messages, especially in the current digital era. The study also opens opportunities for further research on the influence of SEO on the dissemination of religious messages on the internet. The implementation of effective SEO strategies is expected to enhance the effectiveness of disseminating religious moderation messages through NU Online and other Islamic websites in Indonesia.
Article
Full-text available
Los buscadores son el principal punto de acceso a los contenidos de los sitios web. El SEO es la práctica encaminada al aumento de la cantidad y calidad de tráfico hacia un sitio web a través de los resultados de búsqueda orgánicos procedentes de los buscadores. El trabajo SEO busca satisfacer ciertos factores de posicionamiento que tienen en cuenta los algoritmos de los buscadores en la ordenación de los resultados de búsqueda. En los últimos años hemos visto como estos algoritmos han ido virando hacia factores y señales orientados a priorizar aquellos resultados que mejor satisfacen la intención de búsqueda que se esconde tras la palabra clave utilizada, ofreciendo también la mejor experiencia de usuario posible en la página de destino. Tras un análisis bibliográfico de los factores relacionados con el análisis de la intención de búsqueda y los factores relacionados con la mejora de la experiencia de usuario desde un punto de vista SEO en el buscador de Google, se recogen un conjunto de acciones y estrategias que pueden implementarse con el objetivo de mejorar el posicionamiento de las páginas de un sitio web.
Article
Full-text available
In this paper we study in what order a crawler should visit the URLs it has seen, in order to obtain more “important” pages first. Obtaining important pages rapidly can be very useful when a crawler cannot visit the entire Web in a reasonable amount of time. We define several importance metrics, ordering schemes, and performance evaluation measures for this problem. We also experimentally evaluate the ordering schemes on the Stanford University Web. Our results show that a crawler with a good ordering scheme can obtain important pages significantly faster than one without.
Article
Full-text available
The ubiquitous use of search engines to discover and access Web content shows clearly the success of information retrieval algorithms. However, unlike controlled collections, the vast majority of Web pages lack an authority asserting their quality. This openness of the Web has been the key to its rapid growth and success, but this openness is also a major source of new adversarial challenges for information retrieval methods.
Conference Paper
Full-text available
In this paper, we continue our investigations of "web spam": the injection of artificially-created pages into the web in order to influence the results from search engines, to drive traffic to certain pages for fun or profit. This paper considers some previously-undescribed techniques for automatically detecting spam pages, examines the effectiveness of these techniques in isolation and when aggregated using classification algorithms. When combined, our heuristics correctly identify 2,037 (86.2
Article
Full-text available
With the tremendous growth of information available to end users through the Web, search engines come to play ever a more critical role. Nevertheless, because of their general-purpose approach, it is always less uncommon that obtained result sets provide a burden ofuseless pages. The next-generation Web architecture, represented by the Semantic Web, provides the layered architecture possibly allowing overcoming this limitation. Several search engines have been proposed, which allow increasing information retrieval accuracy by exploiting a key content of Semantic Web resources, that is, relations. To make the Semantic Web work, well-structured data andrules are necessary for agents to roam the Web [2]. XML and RDF are two important technologies: we can create our own structures by XML without indicating what they mean; RDF uses sets of triples which express basic concepts [2]. DAML is the extension of XML and RDF The aim of this project is to develop a search engine based on ontologymatching within the Semantic Web. It uses the data in Semantic Web form such as DAML or RDF. When the user input a query, the program accepts the query and transfers it to a machine learning agent. Then the agent measures the similarity between different ontology’s, and feedback the matched item to the user.
Article
In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine, the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want. Keywords: World Wide Web, Search Engines, Information Retrieval, PageRank, Google.
Conference Paper
Spammers intend to increase the PageRank of certain spam pages by creating a large number of links pointing to them. We propose a novel method based on the concept of personalized PageRank that detects pages with an undeserved high PageRank value without the need of any kind of white or blacklists or other means of human intervention. We assume that spammed pages have a biased distribution of pages that contribute to the undeserved high PageRank value. We define SpamRank by penalizing pages that originate a suspicious PageRank share and personalizing PageRank on the penalties. Our method is tested on a 31 M page crawl of the .de domain with a manually classified 1000-page stratified random sample with bias towards large PageRank values.
[5] Shesh Narayan, Mishra Alka Kaiswal, Asha Ambhaikar, "An Integrated Approach of Topic-Sensitive PageRank And Weighted PageRank For Web Mining", International Journal of Advanced Research in Computer Science, Volume 3, No. 3, May-June 2012.
[6] "The PageRank Citation Ranking: Bringing Order to the Web". [Online]. Available: http://www.cis.upenn.edu/~mkearns/teaching/NetworkedLife/pagerank.pdf
[7] Hemant Balakrishnan, "A Study On Web Crawlers", Center of Parallel Computing, School of Computer Science, University of Central Florida. [Online]. Available: http://www.cs.ucf.edu/~hemant/WebCrawlers.htm
[8] "SEO Tutorial". [Online]. Available: http://www.webconfs.com/seo-tutorial/
[9] Andras A. Benczur, Karoly Csalogany, Tamas Sarlos, Mate Uher, "SpamRank - Fully Automatic Link Spam Detection", work in progress.
[3] Carlos Castillo, Kumar Chellapilla, Brian D. Davison, "Adversarial Information Retrieval on the Web". [Online]. Available: http://www.sigir.org/forum/2008J/2008j-sigirforum-castillo.pdf
[4] Ms. S. Latha Shanmugavadivu, Dr. M. Rajaram, "Sementic Based Multiple Web Search Engine", (IJCSE) International Journal on Computer Science and Engineering, Vol. 02, No. 05, 2010, 1722-1728.