Harry Halpin

The University of Edinburgh, Edinburgh, Scotland, United Kingdom

Publications (61) · 2.02 Total impact

  • Alexandre Monnin, Harry Halpin
    ABSTRACT: The advent of the Web is one of the defining technological events of the twentieth century, yet its impact on the fundamental questions of philosophy has not yet been explored, much less systematized. The Web, as today implemented on the foundations of the Internet, is broadly construed as the space of all items of interest identified by URIs. Originally a space of linked hypertext documents, today the Web is rapidly evolving as a universal platform for data and computation. Even swifter is the Web-driven transformation of many previously unquestioned philosophical concepts of privacy, belief, intelligence, cognition, and even embodiment in surprising ways. The ensuing essays in this collection hope to explore the philosophical foundation of the World Wide Web and open the debate on whether or not the changes caused by the Web to technology and society warrant the creation of a philosophy of the Web.
    12/2013; ISBN: 978-1-118-70018-1
  • Harry Halpin, Alexandre Monnin
    ABSTRACT: This is the first interdisciplinary exploration of the philosophical foundations of the Web, a new area of inquiry that has important implications across a range of domains. Contains twelve essays that bridge the fields of philosophy, cognitive science, and phenomenology. Tackles questions such as the impact of Google on intelligence and epistemology, the philosophical status of digital objects, ethics on the Web, semantic and ontological changes caused by the Web, and the potential of the Web to serve as a genuine cognitive extension. Brings together insightful new scholarship from well-known analytic and continental philosophers, such as Andy Clark and Bernard Stiegler, as well as rising scholars in “digital native” philosophy and engineering. Includes an interview with Tim Berners-Lee, the inventor of the Web.
    12/2013; Wiley-Blackwell, ISBN: 978-1-118-70018-1
  • Harry Halpin, Alexandre Monnin
    12/2013; ISBN: 978-1-118-70018-1
  • Harry Halpin, Fiona McNeill
    ABSTRACT: The world is increasingly full of data. Organisations, governments and individuals are creating increasingly large data sources, and in many cases making them publicly available. This offers massive potential for interaction and mutual collaboration. But using this data often creates problems. Those creating the data will use their own terminology, structure and formats for the data, meaning that data from one source will be incompatible with data from another source. When presented with a large, unknown data source, it is very difficult to ascribe meaning to the terms of that data source, and to understand what is being conveyed. Much effort has been invested in data interpretation prior to run-time, with large data sources being matched against each other off-line. But data is often used dynamically, and so to maximise the value of the data it is necessary to extract meaning from it dynamically. We therefore postulate that an essential component of utilising the world of data in which we increasingly live is the development of the ability to discover meaning on the go in large, heterogeneous data. This paper provides an overview of the current state-of-the-art, reviewing the aims and achievements in different fields which can be applied to this problem. We take a brief look at cutting edge research in this field, summarising four papers published in the special issue of the AI Review on Discovering Meaning on the go in Large Heterogenous Data, and conclude with our thoughts about where research in this field is going, and what our priorities must be to enable us to move closer to achieving this goal.
    Artificial Intelligence Review 08/2013; 40(2). · 1.57 Impact Factor
  • ABSTRACT: An increasing amount of structured data on the Web has attracted industry attention and renewed research interest in what is collectively referred to as semantic search. These solutions exploit the explicit semantics captured in structured data such as RDF for enhancing document representation and retrieval, or for finding answers by directly searching over the data. These data have been used for different tasks and a wide range of corresponding semantic search solutions have been proposed in the past. However, it has been widely recognized that a standardized setting to evaluate and analyze the current state-of-the-art in semantic search is needed to monitor and stimulate further progress in the field. In this paper, we present an evaluation framework for semantic search, analyze the framework with regard to repeatability and reliability, and report on our experiences in applying it in the Semantic Search Challenge 2010 and 2011.
    Web Semantics: Science, Services and Agents on the World Wide Web. 01/2013; 21:14–29.
  • Workshop on Discovering Meaning On the Go in Large Heterogeneous Data 2011 (LHD-11), Barcelona, Spain, July 16, 2011; 01/2011
  • Harry Halpin, Victor Lavrenko
    ABSTRACT: Relevance feedback is one method for creating a 'virtuous cycle', as put by Baeza-Yates, between semantics and search. Previous approaches to search have generally considered the Semantic Web and hypertext Web search to be entirely disparate, indexing and searching over different domains. While relevance feedback has traditionally improved information retrieval performance, relevance feedback is normally used to improve rankings of a single data-set. Our novel approach is to use relevance feedback from hypertext Web search to improve the retrieval of Semantic Web data. We also inspect whether relevance feedback from Semantic Web data can improve hypertext Web search results. In both cases, an evaluation based on certain kinds of informational queries (abstract concepts, people, and places) selected from a query log and checked by human judges shows that relevance feedback works: relevance feedback from hypertext Web search can improve the retrieval of Semantic Web data, and vice versa. We evaluate our work over a wide range of algorithms, and show it improves baseline performance on these queries for deployed systems as well, such as the Semantic Search engine FALCON-S and the commercial Web search engine Yahoo! search.
    J. Web Sem. 01/2011; 9:474-489.
  • Harry Halpin, Valentina Presutti
    ABSTRACT: One of the major events that has caused a resurgence in the use of formal ontologies is the advent of the Semantic Web, which seeks to do for knowledge representation what the Web did for hypertext. Yet while the field of formal ontologies is well-understood, the nature of the Web is rather surprisingly cloaked in mystery. Unlike formal computer science, the Web is constructed mostly out of informally and operationally defined terms built from various specifications, in particular IETF RFCs and W3C Recommendations. In order to better understand the nature of the ‘Web’ in the Semantic Web, we created a formal ontology called the ‘Identity of Resources on the Web’ (IRW) ontology. The primary goal of the Semantic Web is to use URIs as a universal space to name anything, expanding from using URIs for web-pages to URIs for “real objects and imaginary concepts”, as phrased by Berners-Lee. This distinction has often been tied to the distinction between information resources, such as web-pages and multimedia files, and other kinds of Semantic Web ‘non-information’ resources used in Linked Data. This issue of defining the relationship between URIs and resources is not a mandarin metaphysical matter, but has technical repercussions: the W3C has recommended not to use the same URI for information resources and the resources needed to denote ‘non-information resources’ for the Semantic Web. Basing our work on the normative specifications of the W3C and IETF, we model the relationship between resources and representations formally in an ontology called IRW (Identity and Reference on the Web). From our point of view, IRW is a beautiful ontology. In this paper we motivate why we consider it as such through the identification of a number of criteria on which we based our evaluation.
    Applied Ontology. 01/2011; 6:263-293.
  • Harry Halpin, Victor Lavrenko
    ABSTRACT: We investigate the possibility of using structured data to improve search over unstructured documents. In particular, we use relevance feedback to create a 'virtuous cycle' between structured data from the Semantic Web and web-pages from the hypertext Web. Previous approaches have generally considered searching over the Semantic Web and hypertext Web to be entirely disparate, indexing and searching over different domains. Our novel approach is to use relevance feedback from hypertext Web results to improve Semantic Web search, and results from the Semantic Web to improve the retrieval of hypertext Web data. In both cases, our evaluation is based on certain kinds of informational queries (abstract concepts, people, and places) selected from a real-life query log and checked by human judges. We show our relevance model-based system is better than the performance of real-world search engines for both hypertext and Semantic Web search, and we also investigate Semantic Web inference and pseudo-relevance feedback.
    IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, July 16-22, 2011; 01/2011
  • ABSTRACT: The primary problem confronting any new kind of search task is how to bootstrap a reliable and repeatable evaluation campaign, and a crowd-sourcing approach provides many advantages. However, can these crowd-sourced evaluations be repeated over long periods of time in a reliable manner? To demonstrate, we investigate creating an evaluation campaign for the semantic search task of keyword-based ad-hoc object retrieval. In contrast to traditional search over web-pages, object search aims at the retrieval of information from factual assertions about real-world objects rather than searching over web-pages with textual descriptions. Using the first large-scale evaluation campaign that specifically targets the task of ad-hoc Web object retrieval over a number of deployed systems, we demonstrate that crowd-sourced evaluation campaigns can be repeated over time and still maintain reliable results. Furthermore, we show how these results are comparable to expert judges when ranking systems and that the results hold over different evaluation and relevance metrics. This work provides empirical support for scalable, reliable, and repeatable search system evaluation using crowdsourcing.
    Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011, Beijing, China, July 25-29, 2011; 01/2011
  • Harry Halpin
    ABSTRACT: We examine a crucial question for the World Wide Web: What does a Uniform Resource Identifier (URI) mean? Crucial for the next-generation Semantic Web, can it refer to things outside web-pages? The Web is a universal information space for naming and accessing information via URIs. However, the classical philosophical problems of meaning and reference that have been the source of debate within the philosophy of language return when the Web is given as the foundation for a knowledge representation with the Semantic Web. Debates on the Semantic Web about the meaning and referential status of a URI are explored as analogues to debates about the meaning and reference of names in the philosophy of language. Three main positions are inspected: the logical position, as exemplified by the descriptivist theory of reference, the direct reference position, as exemplified by Putnam and Kripke’s causal theory of reference, and a Wittgensteinian position that views URIs as a public language, as exemplified by Web search engines. These positions show that debates within the philosophy of language are alive and well on the Web, and so in the philosophy of computer science.
    Minds and Machines 01/2011; 21:153-178. · 0.46 Impact Factor
  • Harry Halpin, Patrick J. Hayes
    Proceedings of the WWW2010 Workshop on Linked Data on the Web, LDOW 2010, Raleigh, USA, April 27, 2010; 01/2010
  • ABSTRACT: In Linked Data, the use of owl:sameAs is ubiquitous in interlinking data-sets. There is, however, ongoing discussion about its use, and potential misuse, particularly with regard to interactions with inference. In fact, owl:sameAs can be viewed as encoding only one point on a scale of similarity, one that is often too strong for many of its current uses. We describe how referentially opaque contexts that do not allow inference exist, and then outline some varieties of referentially-opaque alternatives to owl:sameAs. Finally, we report on an empirical experiment over randomly selected owl:sameAs statements from the Web of data. This theoretical apparatus and experiment shed light upon how owl:sameAs is being used (and misused) on the Web of data.
    The Semantic Web - ISWC 2010 - 9th International Semantic Web Conference, ISWC 2010, Shanghai, China, November 7-11, 2010, Revised Selected Papers, Part I; 01/2010
  • AI Magazine. 01/2010; 31:115-122.
  • ABSTRACT: In contrast to traditional search, semantic search aims at the retrieval of information from factual assertions about real-world objects rather than searching over web-pages with textual descriptions. One of the key tasks to address in this context is ad-hoc object retrieval, i.e. the retrieval of objects in response to user-formulated keyword queries. Despite the significant commercial interest, this kind of semantic search has not been evaluated in a thorough and systematic manner. In this work, we discuss the first evaluation campaign that specifically targets the task of ad-hoc object retrieval. We also discuss the submitted systems, the factors that contributed to positive results and the potential for future improvements in semantic search.
    01/2010
  • Harry Halpin, Tom Baker
    Linked Data Meets Artificial Intelligence, Papers from the 2010 AAAI Spring Symposium, Technical Report SS-10-07, Stanford, California, USA, March 22-24, 2010; 01/2010
  • Dirk Bollen, Harry Halpin
    ABSTRACT: Most tagging systems support the user in the tag selection process by providing tag suggestions, or recommendations, based on a popularity measurement of tags other users provided when tagging the same resource. In this paper we investigate the influence of tag suggestions on the emergence of power law distributions as a result of collaborative tag behavior. Although previous research has already shown that power laws emerge in tagging systems, the reason why power law distributions emerge is not understood empirically. The majority of theories and mathematical models of tagging found in the literature assume that the emergence of power laws in tagging systems is mainly driven by the imitation behavior of users when observing tag suggestions provided by the user interface of the tagging system. This imitation behavior leads to a feedback loop in which some tags are reinforced and get more popular, which is also known as the 'rich get richer' or preferential attachment model. We present experimental results that show that the power law distribution forms regardless of whether or not tag suggestions are presented to the users. Furthermore, we show that the real effect of tag suggestions is rather subtle; the resulting power law distribution is 'compressed' if tag suggestions are given to the user, resulting in a shorter long tail and a 'compressed' top of the power law distribution. The results of this experiment show that tag suggestions by themselves do not account for the formation of power law distributions in tagging systems.
    04/2009
  • Harry Halpin, Valentina Presutti
    ABSTRACT: The primary goal of the Semantic Web is to use URIs as a universal space to name anything, expanding from using URIs for webpages to URIs for "real objects and imaginary concepts," as phrased by Berners-Lee. This distinction has often been tied to the distinction between information resources, like webpages and multimedia files, and non-information resources, which are everything from real people to abstract concepts like 'the integers.' Furthermore, the W3C has recommended not to use the same URI for information resources and non-information resources, and several communities like the Linked Data initiative are deploying this principle. The definition put forward by the W3C, that information resources are things whose "essential nature is information" is a difficult distinction at best. For example, would the text of Moby Dick be an information resource? While this problem could safely be ignored up until recently, with the rise of Linked Data and projects like OKKAM, it appears that this problem should be modelled formally. An ontology called IRW (Identity and Reference on the Web) of various types of resources and their relationships, both for the hypertext Web and the Semantic Web, is presented. It builds upon Information Object Lite (an extension of DOLCE Ultra Lite for describing information objects) and IRE (an earlier ontology of and aligns with other work in this area. This ontology can be used as a tool to make the Semantic Web more self-describing and to allow inference to be used to test for membership in various classes of resources.
    The Semantic Web: Research and Applications, 6th European Semantic Web Conference, ESWC 2009, Heraklion, Crete, Greece, May 31-June 4, 2009, Proceedings; 01/2009
  • Harry Halpin, Valentina Presutti
    ABSTRACT: The primary goal of the Semantic Web is to use URIs as a universal space to name anything, expanding from using URIs for webpages to URIs for "real objects and imaginary concepts," as phrased by Berners-Lee. This distinction has often been tied to the distinction between information resources, like webpages and multimedia files, and non-information resources, which are everything from real people to abstract concepts like 'the integers.' Furthermore, the W3C has recommended not to use the same URI for information resources and non-information resources, and several communities like the Linked Data initiative are deploying this principle. The definition put forward by the W3C, that information resources are things whose "essential nature is information" is a difficult distinction at best. For example, would the text of Moby Dick be an information resource? While this problem could safely be ignored up until recently, with the rise of Linked Data and projects like OKKAM, it appears that this problem should be modelled formally. An ontology called IRW (Identity and Reference on the Web) of various types of resources and their relationships, both for the hypertext Web and Linked Data, is presented. It builds upon Information Object Lite (an extension of DOLCE Ultra Lite for describing information objects) and IRE (an earlier ontology of and aligns with other work in this area. This ontology can be used as a tool to make Linked Data more self-describing and to allow inference to be used to test for membership in various classes of resources.
    01/2009
  • Harry Halpin, Henry S. Thompson
    ABSTRACT: Relevance feedback from hypertext Web searches considerably improves the performance of a semantic Web search engine, creating a mutually beneficial relationship between the hypertext and semantic Webs.
    IEEE Intelligent Systems. 01/2009; 24:27-31.