Lawrence Page's research while affiliated with Stanford University and other places

Publications (9)

Article
The importance of a Web page is an inherently subjective matter, which depends on the reader's interests, knowledge and attitudes. But there is still much that can be said objectively about the relative importance of Web pages. This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human...
Conference Paper
In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages...
Article
In this paper we study in what order a crawler should visit the URLs it has seen, in order to obtain more “important” pages first. Obtaining important pages rapidly can be very useful when a crawler cannot visit the entire Web in a reasonable amount of time. We define several importance metrics, ordering schemes, and performance evaluation measures...
Article
In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages...

Citations

... In theory and algorithm, Brin and Page ensured that the value of initial value (1) did not affect the convergence of node estimation value, and did not change the final ranking relation of important value [38], [39]. Therefore, the (1) value and (1) value in formula (12) and formula (14) are taken to facilitate the operation of (1) = [1,1, ⋯ ⋯ ,1] value and (1) = [1,1, ⋯ ⋯ ,1] value, respectively. ...
... Influence maximization (Kempe & Kleinberg, 2003) covers various optimization methods for selecting a subset of nodes in a network, commonly referred to as seed nodes to maximize the total exposure. Common heuristics to identify influential nodes use graph-based centrality measures, such as Betweenness Centrality (Brandes, 2001) or PageRank (Page et al., 1998). However, these algorithms merely seek to maximize connectivity and neither consider differences among users nor the existence of rival forces. ...
... An importance measure could be based on the number of downloads or an explicit rating system by users. Our system allows using such "base" measures but also computes importance in the style of PageRank [7]. A mashlet acquires importance from its use in important GPs; GPs similarly acquire importance from using important mashlets. ...
... common technical method of IR is to map textual language into symbol vectors which can be easily manipulated mathematically. The result set generated by IR is a rank ordered list of documents which likely contain information that the user has specified. Examples of common IR systems include Web search engines such as Google (http://www.google.com) [57] and Altavista (http://www.altavista.com), both of which use schemes that include tf-idf. The IR systems provide fast, but not always accurate, answers to the questions posed by Web users. Search engines are designed to handle generic collections of text, based on word frequency, regardless of the content of the collections. Articles ab ...
... The number of references (citations) to a thing is evidence of its importance; many Nobel Prizes are assigned according to this fact. Considering this, we can say that highly linked pages are more "important" than pages with few in-links [36]. L. Page and S. Brin proposed the Page Rank algorithm in [8, 36, 37] that calculates the importance of web pages using the link structure of the web. ...
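The link-based notion of importance in the snippet above can be illustrated with a small power-iteration sketch. This is only a minimal example under stated assumptions, not the implementation from the cited papers: the function name `pagerank`, the damping value 0.85, the handling of pages without out-links, and the four-page toy graph are all hypothetical choices made for illustration. The optional `init` argument lets a reader check that the starting vector does not change the result.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200, init=None):
    """Power-iteration sketch of PageRank.

    links: dict mapping each page to the list of pages it links to.
    damping, tol, max_iter, init: illustrative parameters (assumptions).
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    if init is None:
        rank = {p: 1.0 / n for p in pages}
    else:
        total = sum(init[p] for p in pages)
        rank = {p: init[p] / total for p in pages}
    for _ in range(max_iter):
        # Rank held by pages with no out-links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Each page passes an equal share of its rank to every page it links to.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new[q] += damping * rank[p] / len(outs)
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: C is linked from A, B and D; D has no in-links.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph the most heavily linked page, C, ends up with the highest score, and D, which nothing links to, keeps only the baseline share.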
... Our problem setting can be seen as a crawling problem. Developing efficient web crawlers is a long-standing problem in the literature [11,12,14,15,20,51]. In particular, focused crawling [5,12,26,29,43] is relevant to our problem setting. ...
... where d is a damping factor (Page et al., 1999) Triangles (Sá and Prudêncio, 2011) Technological similarity (Adamic and Adar, 2003) Common neighbor (Zhou et al., 2009) Technological context ...
... The PageRank calculation considers the in-degree of a given premise and the in-degree of its neighbors. Here a Google PageRank measure was used [58].
Betweenness: A node-level network metric giving the extent to which a node lies on paths connecting other pairs of nodes, defined by the number of geodesics (shortest paths) going through a node [57].
Clustering coefficient: Measures the degree to which nodes in a network tend to cluster together (i.e., if A → B and B → C, what is the probability that A → C), with a range of values between zero and one. Here, we implemented the global cluster coefficient, where the number of closed triplets (or 3 × triangles) in the network was divided by the total number of triplets (both open and closed) [57].
Giant weakly connected component (GWCC): The proportion of nodes that are connected in the largest component when directionality of movement is ignored [57].
Giant strongly connected component (GSCC): The proportion of nodes that are connected in the largest component when directionality of movement is considered [57].
Centralization: A general method for calculating a graph-level centrality score based on a node-level centrality measure. ...
... In this setup a strategy which is influential is one which is widely played. As a test for the ability of trophic level to predict influence we compare the probability of a strategy being played with the trophic level ranking of a node and then compare this to ranking by traditional centrality metrics such as PageRank [43,44]. This is shown in figure 4 for a sample of strategy networks of differing incoherences. ...