Weaving a Social Data Web
with Semantic Pingback
Sebastian Tramp, Philipp Frischmuth, Timofey Ermilov, and Sören Auer
Universität Leipzig, Institut für Informatik, AKSW,
Postfach 100920, D-04009 Leipzig, Germany,
{lastname}@informatik.uni-leipzig.de
http://aksw.org
Abstract. In this paper we tackle some of the most pressing obstacles
of the emerging Linked Data Web, namely the quality, timeliness and
coherence as well as direct end user benefits. We present an approach
for complementing the Linked Data Web with a social dimension by
extending the well-known Pingback mechanism, which is a technological
cornerstone of the blogosphere, towards a Semantic Pingback. It is based
on the advertising of an RPC service for propagating typed RDF links
between Data Web resources. Semantic Pingback is downward compatible
with conventional Pingback implementations, thus allowing resources on
the Social Web to be connected and interlinked with resources on the
Data Web. We demonstrate its usefulness by showcasing use cases of the
Semantic Pingback implementations in the semantic wiki OntoWiki and
the Linked Data interface for database-backed Web applications Triplify.
Introduction
Recently, the publishing of structured, semantic information as Linked Data has
gained much momentum. A number of Linked Data providers meanwhile publish
more than 200 interlinked datasets amounting to 13 billion facts1. Despite this
initial success, there are a number of substantial obstacles, which hinder the
large-scale deployment and use of the Linked Data Web. These obstacles are
primarily related to the quality, timeliness and coherence of Linked Data as well
as to providing direct benefits to end users. In particular for ordinary users of
the Internet, Linked Data is not yet sufficiently visible and (re-) usable. Once
information is published as Linked Data, authors hardly receive feedback on its
use and the opportunity of realizing a network effect of mutually referring data
sources is currently unused.
1 http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics
In this paper we present an approach for complementing the Linked Data
Web with a social dimension. The approach is based on an extension of the
well-known Pingback technology [9], which is one of the technological cornerstones of
the overwhelming success of the blogosphere in the Social Web. The Pingback
mechanism enables bi-directional links between weblogs and websites in general
as well as author/user notifications in case a link has been newly established.
It is based on the advertising of a lightweight RPC service, in the HTTP or
HTML header of a certain Web resource, which should be called as soon as a
link to that resource is established. The Pingback mechanism enables authors
of a weblog entry or article to obtain immediate feedback, when other people
reference their work, thus facilitating reactions and social interactions. It also
allows backlinks from the original article to comments on or references to the
article elsewhere on the Web to be published automatically, thus facilitating
timeliness and coherence of the Social Web. As a result, the distributed network of
social websites using the Pingback mechanism (such as the blogosphere) is much more
tightly and promptly interlinked than conventional websites, which yields a network
effect that is one of the major success factors of the Social Web.
With this work we aim to apply this success of the Social Web to the Linked
Data Web. We extend the Pingback mechanism towards a Semantic Pingback,
by adding support for typed RDF links on Pingback clients, servers and in the
autodiscovery process.
When an RDF link from a Semantic Pingback enabled Linked Data resource
is established with another Semantic Pingback enabled Linked Data resource, the
latter one can be automatically enriched either with the RDF link itself, with
an RDF link using an inverse property or additional information. When the
author of a publication, for example, adds bibliographic information including
RDF links to co-authors of this publication to her semantic wiki, the co-authors’
FOAF profiles can be enriched with backlinks to the bibliographic entry in an
automated or moderated fashion. The Semantic Pingback supports provenance
through tracking the lineage of information by means of a provenance vocabulary.
In addition, it allows a variety of spam prevention measures to be implemented.
Semantic Pingback is completely downward compatible with conventional
Pingback implementations, thus allowing resources on the Social Web to be
seamlessly connected and interlinked with resources on the Data Web. A weblog
author can, for example, refer to a certain Data Web resource, while the
publisher of this resource can be notified immediately and rdfs:seeAlso links can
be automatically added to the Data Web resource. In order to facilitate the
adoption of the Semantic Pingback mechanism we developed three complemen-
tary implementations: a Semantic Pingback implementation was included into
the semantic data wiki OntoWiki, we added support for Semantic Pingbacks to
the Triplify database-to-RDF mapping tool and provide a standalone implemen-
tation for the use by other tools or services.
The paper is structured as follows: We describe the requirements which
guided the development of Semantic Pingback in section 1. We present an archi-
tectural overview including communication behaviour and autodiscovery algo-
rithms of our solution in section 2. A description of our implementations based
on OntoWiki and Triplify as well as the standalone software is given in section 5.
Finally, we survey related work in section 6 and conclude with an outlook on
future work in section 7.
1 Requirements
In this section we discuss the requirements, which guided the development of
our Semantic Pingback approach.
Semantic links. The conventional Pingback mechanism propagates untyped
(X)HTML links between websites. In addition the Semantic Pingback mechanism
should be able to propagate typed links (e.g. OWL object properties) between
RDF resources.
Use RDFa-enhanced content where available. Since most traditional we-
blog and wiki systems are able to create semantically enriched content based
on RDFa annotations2, these systems should be able to propagate typed links
derived from the RDFa annotations to a Semantic Pingback server without any
additional modification or manual effort.
Downward compatibility with conventional Pingback servers. Conventional
Pingback servers should be able to receive and accept requests from Semantic
Pingback clients. Thus, widely used Social Web software such as WordPress
or Serendipity can be pinged by a Linked Data resource to announce the
referencing of one of their posts. A common use case for this is a Linked Data
SIOC [4] comment which replies and refers to a blog post or wiki page on the
Social Web. Such a SIOC comment typically uses the sioc:reply_of object
property to establish a link between the comment and the original post3.
Downward compatibility for conventional Pingback clients. Conven-
tional Pingback clients should be able to send Pingbacks to Semantic Pingback
servers. Thus, a blogger can refer to any pingback-enabled Linked Data resource
in any post of her weblog. Hence, the conventional Pingback client should be
able to just send conventional Pingbacks to the Linked Data server. Unlike a
conventional Pingback server, the Semantic Pingback server should not create
a comment with an abstract of the blog post within the Linked Data resource
description. Instead an additional triple should be added to the Linked Data
resource, which links to the referring blog post.
2 This should be possible at least manually by using the system's HTML source editor,
but can be supported by extensions as, for example, described in [6] for Drupal.
3 Since SIOC is a very generic vocabulary, people can also use more specific relations
such as disagreesWith or alternativeTo from the Scientific Discourse
Relationships Ontology [5].
Support Pingback server autodiscovery from within RDF resources.
The conventional Pingback specification keeps the requirements on the client
side at a minimum, thus supporting the announcement of a Pingback server
through a <link> element in an HTML document. Since the Semantic Pingback
approach aims at applying the Pingback mechanism to the Web of Data, the
autodiscovery process should be extended in order to support the announcement
of a Pingback server from within RDF documents.
[Figure 1: diagram with a resource layer (linking resource / source, linked resource / target) and an RPC layer (Pingback client / link propagator, Pingback server / link receiver). Arrows: 1 (typed) linking, 2 observes, 3 autodiscovery, 4 RPC request, 5 fetches, 6 (updates), 7 (notifies); the linked resource announces the Pingback server.]
Fig. 1. Architecture of the Semantic Pingback approach.
Provenance tracking. In order to establish trust on the Data Web it is
paramount to preserve the lineage of information. The Semantic Pingback mech-
anism should incorporate the provenance tracking of information, which was
added to a knowledge base as result of a Pingback.
Spam prevention. Another aspect of trust is the prevention of unsolicited
proliferation of data. The Semantic Pingback mechanism should enable the in-
tegration of measures to prevent spamming of the Data Web. These measures
should incorporate methods based on data content analysis and social relation-
ship analysis.
2 Architectural Overview
The general architecture of the Semantic Pingback approach is depicted in Fig-
ure 1. A linking resource (depicted in the upper left) links to another (Data)
Web resource, here called linked resource (arrow 1). The linking resource can be
either a conventional Web resource (e.g. wiki page, blog post) or a Linked Data
resource. Links originating from Linked Data resources are always typed (based
on the used property), links from conventional Web resources can be either un-
typed (i.e. plain HTML links) or typed (e.g. by means of RDFa annotations).
The Pingback client (lower left) is either integrated into the data/content man-
agement system or realized as a separate service, which observes changes of the
Web resource (arrow 2). Once the establishment of a link has been noted, the Pingback
client tries to autodiscover a Pingback server from the linked resource (arrow 3).
If the autodiscovery was successful, the respective Pingback RPC server is called
(arrow 4), with the parameters linking resource (i.e. source) and linked resource
(i.e. target). In order to verify the received request (and to obtain information
about the type of the link in the semantic case), the Pingback server fetches (or
dereferences) the linking resource (arrow 5). Subsequently, the Pingback server
can perform a number of actions (arrows 6,7), such as updating the linked re-
source (e.g. adding inverse links) or notifying the publisher of the linked resource
(e.g. via email). This approach is compatible with the conventional Pingback
specification [9].
The following scenario, which was introduced in the above mentioned speci-
fication, illustrates the chain of communication steps executed for a single Ping-
back request:
1. Alice posts to her blog. The post she has made (source resource) includes a
link to a post on Bob’s blog (target resource).
2. Alice’s blogging system (Pingback client) contacts Bob’s blogging system
(Pingback server) and propagates that a link to a post inside Bob’s environ-
ment was established.
3. Bob’s blogging system then verifies that Alice’s post indeed includes a link
to Bob and adds a link back to Alice’s post on his original post.
4. Readers of Bob’s article can follow this link to Alice’s post to read her
opinion.
This scenario as well as the general architecture introduce four components,
which we now describe in more detail:
Pingback client. Alice’s blogging system comprises the Pingback client. The
Pingback client establishes a connection to the Pingback server on a certain
event (e.g. on submitting a new blog post) and starts the Pingback request.
Pingback server. Bob’s blogging system acts as the Pingback server. The
Pingback server accepts Pingback requests via XML-RPC and reacts as configured
by the owner. In most cases, the Pingback server saves information about
the Pingback in conjunction with the target resource.
Target resource. Bob’s article is called the target resource and is identified
by the target URI. The target resource can be either a web page or an RDF
resource, which is accessible through the Linked Data mechanism. A target re-
source is called pingback-enabled, if a Pingback client is able to glean information
about the target resource’s Pingback server (see section 3.1 for autodiscovery of
Pingback server information).
[Figure 2: sequence diagram with the participants Source Publisher, Source, Pingback Client, Pingback Server, Target and Target Publisher. Messages: updates, observes, scan for links, links, server autodiscovery, header or document, XML-RPC request (ping), fetch and check, document with link(s) to target, updates, informs, XML response.]
Fig. 2. Sequence diagram illustrating the (Semantic) Pingback workflow.
Source resource. Alice’s post is called the source resource and is identified by
the source URI. Like the target resource, the source resource can be either
a web page or an RDF resource. The source resource contains some relevant
information chunks regarding the target resource.
These information chunks can belong to one or more of the following cate-
gories:
– An untyped (X)HTML link in the body of the web page (this does not apply
to Linked Data resources).
– A (possibly RDFa-encoded) RDF triple linking the source URI with the
target URI through an arbitrary RDF property. That is, the extracted source
resource model contains a direct relation between the source and the target
resource. This relation can be directed either from the source to the target
or in the opposite direction.
– A (possibly RDFa-encoded) RDF triple where either the subject or the
object of the triple is the target resource. This category represents additional
information about the target resource including textual information (e.g.
an additional description) as well as assertions about relations between the
target resource and a third resource. This last category will most likely
appear only in RDFa-enhanced web pages, since Linked Data endpoints are less
likely to return triples describing foreign resources.
Depending on these categories, a Semantic Pingback server will handle the
Pingback request in different ways. We describe this in more detail later in
section 4.
Figure 2 illustrates the complete life-cycle sequence of a (Semantic) Pingback.
Firstly, the source publisher updates the source resource, which is observed by
a Pingback client. The Pingback client then scans the source resource for links
(typed or untyped) to other resources. Each time the client detects a suitable link,
it tries to determine a Pingback server by means of an autodiscovery process.
Once a Pingback server was determined, the client pings that server via an XML-
RPC request. Section 3 contains a more detailed description of these steps. Since
the requested Pingback server only receives the source and target URIs as input,
it tries to gather additional information. At least the source document is fetched
and (possibly typed) links are extracted. Furthermore the target resource is
updated and the publisher of the target resource is notified about the changes.
In section 4 the server behavior is described in more detail. Finally, the Pingback
server responds with an XML result.
3 Client Behavior
One basic design principle of the original Pingback specification is to keep the
implementation requirements of a Pingback client as simple as possible. Con-
sequently, Pingback clients do not even need an XML/HTML parser for basic
functionality. There are three simple actions to be followed by a Pingback client:
(1) Determine suitable links to external target resources, (2) detect the Pingback
server for a certain target resource and (3) send an XML-RPC POST request via
HTTP to that server. Conventional Pingback clients would naturally detect (untyped)
links by scanning HTML documents for <a> elements and use the href
attribute to determine the target. Semantic Pingback clients will furthermore
derive suitable links by examining RDFa annotated HTML or RDF documents.
Both conventional and Semantic Pingback clients are able to communicate with
a Semantic Pingback server, since the Semantic Pingback uses exactly the same
communication interface. In particular, we did not change the remote procedure
call, but we introduce a third possible autodiscovery mechanism for Semantic
Pingback clients in order to allow the propagation of server information from
within RDF documents. On the one hand, this enables the publisher of a re-
source to name a Pingback server, even if the HTTP header cannot be modified.
On the other hand, this allows caching and indexing of Pingback server infor-
mation in a Semantic Web application.
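The remote procedure call itself can be illustrated with a minimal sketch. It assumes the method name pingback.ping with the two parameters source URI and target URI, as defined in the conventional Pingback specification [9]; the helper names build_ping_payload and send_ping as well as the example URIs are illustrative and not part of any implementation described here.

```python
import xmlrpc.client


def build_ping_payload(source_uri: str, target_uri: str) -> str:
    """Serialize a pingback.ping(sourceURI, targetURI) call as an XML-RPC request body."""
    return xmlrpc.client.dumps((source_uri, target_uri), methodname="pingback.ping")


def send_ping(server_url: str, source_uri: str, target_uri: str):
    """Send the ping to a discovered Pingback server (this performs a network call)."""
    proxy = xmlrpc.client.ServerProxy(server_url)
    return proxy.pingback.ping(source_uri, target_uri)


# Building the payload needs no network access and shows what goes over the wire.
payload = build_ping_payload("http://example1.org/Source", "http://example2.org/Target")
```

Since the request carries only the two URIs, the same call serves conventional and Semantic Pingback servers alike; the link type is recovered by the server when it fetches the source.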
3.1 Server autodiscovery
The server autodiscovery is a protocol followed by a Pingback client to determine
the Pingback server of a given target resource. The Pingback mechanism sup-
ports two different autodiscovery mechanisms which can be used by the Pingback
client:
– an HTTP header attribute X-Pingback and
– a link element in the HTML head with a relation attribute rel="pingback".
Both mechanisms interpret the respective attribute value as URL of a Ping-
back XML-RPC service, thus enabling the Pingback client to start the request.
The X-Pingback HTTP header is the preferred autodiscovery mechanism
and all Semantic Pingback servers must implement it in order to achieve the
required downward compatibility. We define an additional autodiscovery method
for Linked Data resources which is based on RDF and integrates better with
Semantic Web technologies.
Therefore, we define an OWL object property service4, which is part of
the Pingback namespace and links an RDF resource with a Pingback XML-RPC
server URL. The advantage compared to an HTTP header attribute is that this
information can be stored along with a cached resource in an RDF knowledge
base. Another benefit is that different resources identified by hash URIs can
be linked with different Pingback servers. However, a disadvantage (as for the
HTML link element too) is that Pingback clients need to retrieve and parse the
document instead of requesting the HTTP header only.
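The three autodiscovery mechanisms can be combined into one lookup, sketched below under simplifying assumptions: the response headers and body have already been fetched, the HTML link element has the fixed form rel="pingback" followed by href, and the RDF document is serialized as N-Triples so that a regular expression suffices. A real client would use an RDF parser for the third step; the function name discover_server is hypothetical.

```python
import re
from typing import Mapping, Optional

PINGBACK_SERVICE = "http://purl.net/pingback/service"

# Fixed-form <link> announcement; a regular expression keeps the client simple.
LINK_RE = re.compile(r'<link\s+rel="pingback"\s+href="([^"]+)"\s*/?>', re.IGNORECASE)


def discover_server(target_uri: str, headers: Mapping[str, str], body: str = "") -> Optional[str]:
    """Return the Pingback server URL announced for target_uri, or None."""
    # 1. Preferred mechanism: the X-Pingback HTTP header.
    for name, value in headers.items():
        if name.lower() == "x-pingback":
            return value.strip()
    # 2. The link element in the HTML head.
    match = LINK_RE.search(body)
    if match:
        return match.group(1)
    # 3. A pingback:service triple about the target (N-Triples assumed here).
    triple_re = re.compile(
        r"<%s>\s+<%s>\s+<([^>]+)>\s*\." % (re.escape(target_uri), re.escape(PINGBACK_SERVICE))
    )
    match = triple_re.search(body)
    return match.group(1) if match else None
```

Checking the header first means that the document body only has to be retrieved and parsed when the cheaper mechanism yields nothing.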
4 Server Behavior
While the communication behavior of the server is completely compatible with
the conventional Pingback mechanism (as described in [9]), the manipulation of
the target resource and other request handling functionality (e.g. sending email
notifications) is implementation and configuration dependent. Consequently, in
this section we focus on describing guidelines for the important server-side
manipulation and request handling issues: spam prevention, backlinking and
provenance tracking.
4.1 Spam Prevention
At some point every popular service on the Internet, be it Email, Weblogs,
Wikis, Newsgroups or Instant Messaging, had to face increasing abuse of their
communication service by sending unsolicited bulk messages indiscriminately.
Each service dealt with the problem by implementing technical as well as orga-
nizational measures, such as black- and whitelists, spam filters, captchas etc.
The Semantic Pingback mechanism prevents spamming by the following ver-
ification method. When the Pingback Server receives the notification signal, it
automatically fetches the linking resource, checking for the existence of a valid
incoming link or an admissible assertion about the target resource. The Ping-
back server defines, which types of links and information are admissible. This
can be based on two general strategies:
– Information analysis. Regarding an analysis of the links or assertions, the
Pingback server can, for example, dismiss assertions which have logical
implications (such as domain, range or cardinality restrictions), but allow label
and comment translations into other languages.
– Publisher relationship analysis. This can be based, e.g., on the trust level of
the publisher of the linking resource. One way to determine the trust level
is to resolve foaf:knows relationships from the linked resource publisher to
the linking resource publisher.
4 http://purl.net/pingback/service
If admissible links or assertions exist, the Pingback is recorded successfully,
e.g. by adding the additional information to the target resource and notifying
its publisher. This makes Pingbacks less prone to spam than e.g. trackbacks5.
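Both strategies can be sketched as simple filters over triples, here represented as (subject, predicate, object) tuples. The deny list of predicates and the restriction to direct foaf:knows statements are assumptions made for illustration; which links and assertions are admissible remains a configuration decision of each server.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Hypothetical deny list: predicates with logical implications for the target.
DENIED_PREDICATES = {
    RDFS + "domain",
    RDFS + "range",
    OWL + "cardinality",
    OWL + "minCardinality",
    OWL + "maxCardinality",
}


def admissible_triples(triples):
    """Information analysis: drop assertions whose predicate is on the deny list."""
    return [t for t in triples if t[1] not in DENIED_PREDICATES]


def is_trusted(target_publisher, source_publisher, knows_triples):
    """Publisher relationship analysis: a direct foaf:knows link from the
    publisher of the linked resource to the publisher of the linking resource."""
    return (target_publisher, FOAF_KNOWS, source_publisher) in set(knows_triples)
```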
In order to allow conventional Pingback servers (e.g. WordPress) to receive
links from the Data Web, this link must be represented in a respective HTML
representation of the linking resource (managed by the Pingback client) at least
as an untyped (X)HTML link. This enables the server to verify the given source
resource even without being aware of Linked Data and RDF.
4.2 Backlinking
The initial idea behind propagating links from the publisher of the source re-
source to the publisher of the target resource is to automate the creation of
backlinks to the source resource. In typical Pingback enabled blogging systems,
a backlink is rendered in the feedback area of a target post together with the
title and a short text excerpt of the source resource.
To retrieve all required information from the source resource for verifying the
link and gather additional data, a Semantic Pingback server will follow these
three steps:
1. Try to fetch an RDF representation (e.g. RDF/XML) of the source resource
by requesting Linked Data with a suitable HTTP Accept header.
2. If this is not possible, the server should try to gather an RDF model from
the source resource employing an RDFa parser.
3. If this fails, the server should at least verify the existence of an untyped
(X)HTML link in the body of the source resource.
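The three steps form a fallback chain, which the following sketch makes explicit. The actual fetching and parsing is passed in as callables (fetch_rdf, parse_rdfa and has_html_link are hypothetical placeholders), so that only the order of the attempts is shown; the Accept value application/rdf+xml is one possible choice for requesting RDF/XML.

```python
import urllib.request


def rdf_request(source_uri):
    """Step 1: build a request that asks for RDF/XML via content negotiation."""
    return urllib.request.Request(source_uri, headers={"Accept": "application/rdf+xml"})


def gather_source_data(source_uri, target_uri, fetch_rdf, parse_rdfa, has_html_link):
    """Return (kind, triples) for the first step that yields usable data."""
    triples = fetch_rdf(rdf_request(source_uri))      # step 1: Linked Data
    if triples:
        return "rdf", triples
    triples = parse_rdfa(source_uri)                  # step 2: RDFa in the page
    if triples:
        return "rdfa", triples
    if has_html_link(source_uri, target_uri):         # step 3: untyped link
        return "html", []
    return "unverified", []
```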
Depending on the category of data which was retrieved from the source re-
source, the server can react in different ways:
– If there is only an untyped (X)HTML link in the source resource, this link can
be recorded as an RDF triple with a generic RDF property like dc:references
or sioc:links_to in the server's knowledge base.
– If there is at least one direct link from the source resource to the target
resource, this triple should be added to the server's knowledge base.
– If there is any other triple in the source resource where either the subject or
the object of the triple corresponds to the target resource, the target resource
can be linked to the source resource using the rdfs:seeAlso property.
In addition to the statements which link the source and the target resource,
metadata about the source resource (e.g. a label and a description) can be stored
as well.
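The reaction to the different categories of retrieved data can be sketched as a single decision function over (subject, predicate, object) tuples. The function name backlink_triples is hypothetical, and two choices are assumptions of this sketch: direct links are accepted in either direction, and sioc:links_to is used as the generic property for untyped links.

```python
SIOC_LINKS_TO = "http://rdfs.org/sioc/ns#links_to"
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def backlink_triples(source, target, triples, has_html_link=False):
    """Decide which triples to add to the server's knowledge base."""
    # Direct typed links between source and target are stored as they are.
    direct = [t for t in triples if (t[0], t[2]) in ((source, target), (target, source))]
    if direct:
        return direct
    # Other statements about the target: point from the target to the source.
    if any(target in (s, o) for s, _, o in triples):
        return [(target, RDFS_SEE_ALSO, source)]
    # Only an untyped (X)HTML link was verified: record a generic link.
    if has_html_link:
        return [(source, SIOC_LINKS_TO, target)]
    return []
```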
5http://en.wikipedia.org/wiki/Trackback
4.3 Provenance Tracking
Provenance information can be recorded using the provenance vocabulary [8]6.
This vocabulary describes provenance information based on data access and data
creation attributes as well as three basic provenance related types: executions,
actors and artifacts. Following the specification in [8], we define a creation guide-
line for Pingback requests, which is described in this paper, and identified by
the URI http://purl.net/pingback/Request. A specific Pingback request ex-
ecution is then performed by a Pingback data creating service, which uses the
defined creation guideline.
The following listing shows an example provenance model represented in N3:
    @prefix :     <http://purl.org/net/provenance/ns#> .
    @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix sioc: <http://rdfs.org/sioc/ns#> .

    [ a rdf:Statement ;
      rdf:subject <http://example1.org/Source> ;
      rdf:predicate sioc:links_to ;
      rdf:object <http://example2.org/Target> ;
      :containedBy [
        a :DataItem ;
        :createdBy [
          a :DataCreation ;
          :performedAt "2010-02-12T12:00:00Z" ;
          :performedBy [
            a :DataCreatingService ;
            rdfs:label "Semantic Pingback Service" ] ;
          :usedData [
            a :DataItem ;
            :containedBy <http://example1.org/Source> ] ;
          :usedGuideline <http://purl.net/pingback/Request> ] ] ;
    ] .
This provenance model describes a Pingback from http://example1.org/Source
to http://example2.org/Target. The Pingback was performed on Friday,
12 February 2010 at noon (UTC) and resulted in a single statement, which links the
source resource to the target resource using a sioc:links_to property.
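A server could produce such a provenance description from a template, as in the following sketch. It assumes that the prefix declarations of the listing are emitted separately, omits the :performedBy block for brevity, and uses the hypothetical function name provenance_n3.

```python
from datetime import datetime, timezone

# N3 template; the prefixes ":" and "rdf:" are assumed to be declared elsewhere.
TEMPLATE = """[ a rdf:Statement ;
  rdf:subject <{s}> ;
  rdf:predicate <{p}> ;
  rdf:object <{o}> ;
  :containedBy [ a :DataItem ;
    :createdBy [ a :DataCreation ;
      :performedAt "{at}" ;
      :usedData [ a :DataItem ; :containedBy <{s}> ] ;
      :usedGuideline <http://purl.net/pingback/Request> ] ] ] ."""


def provenance_n3(s: str, p: str, o: str, at: datetime) -> str:
    """Render the provenance description of one statement added by a Pingback."""
    stamp = at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return TEMPLATE.format(s=s, p=p, o=o, at=stamp)


example = provenance_n3(
    "http://example1.org/Source",
    "http://rdfs.org/sioc/ns#links_to",
    "http://example2.org/Target",
    datetime(2010, 2, 12, 12, 0, tzinfo=timezone.utc),
)
```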
5 Implementation and Evaluation
In this section we describe the implementation and evaluation of Semantic
Pingback in three different scenarios. We implemented Semantic Pingback server and
client functionality for OntoWiki in order to showcase the semantic features of
the approach. Semantic Pingback server functionality was integrated in Triplify,
thus supporting the interlinking with relational data on the Data Web. Finally,
we implemented a standalone Semantic Pingback server (also available as a
service) that can be utilized by arbitrary resources that do not provide a Pingback
service themselves.
6 The Provenance Vocabulary Core Ontology Specification is available at
http://trdf.sourceforge.net/provenance/ns.html.
5.1 OntoWiki
OntoWiki [2]7 is a tool for browsing and collaboratively editing RDF knowledge
bases. Since OntoWiki enables users to add typed links to external resources, we
integrated a Semantic Pingback client component. A recently added feature is the
ability to expose the data stored in OntoWiki via the Linked Data mechanism.
Based on that functionality, a Semantic Pingback server component was also
integrated.
OntoWiki Pingback client. The Pingback client consists of a plugin that
handles a number of events triggered when statements are added or removed
from the knowledge base. Each time a statement is added or removed, the plugin
first checks, whether:
– the subject resource is a URI inside the namespace of the OntoWiki
environment,
– the subject resource is (anonymously) accessible via the Linked Data
mechanism8 and
– the object of the statement is a resource with a de-referenceable URI outside
the namespace of the OntoWiki environment.
If the above steps are successfully passed, the plugin tries to autodiscover a
Pingback server. This process follows the algorithm described in the original
Pingback specification but adds support for target resources represented in RDF
as described in section 3.1. If a server was discovered, an XML-RPC POST request
is sent.
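The three checks performed by the client plugin amount to a simple predicate, sketched here in Python for illustration only (OntoWiki itself is not written in Python). The function name should_ping, the callable is_public and the approximation of "de-referenceable" by an http or https scheme are assumptions of this sketch.

```python
def should_ping(subject: str, obj: str, wiki_namespace: str, is_public) -> bool:
    """Decide whether a newly added statement (subject, property, obj) triggers a Pingback."""
    # The subject must be a resource of this wiki ...
    if not subject.startswith(wiki_namespace):
        return False
    # ... that is anonymously accessible as Linked Data, so the server can verify it.
    if not is_public(subject):
        return False
    # The object must be an external resource with a de-referenceable URI.
    is_http = obj.startswith(("http://", "https://"))
    return is_http and not obj.startswith(wiki_namespace)
```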
OntoWiki Pingback server. The OntoWiki Pingback server is an extension
consisting of a plugin handling some request cycle related events, as well as a
component that provides a Pingback XML-RPC service. The plugin is respon-
sible for exposing the X-Pingback HTTP-header in conjunction with the URL
of the RPC service.
The provided Pingback service initially checks, whether the target resource
is valid, i.e. is inside the namespace of the OntoWiki environment and accessible
via the Linked Data mechanism. If a valid target resource was passed, the service
takes the following steps:
7http://ontowiki.net
8This step is added to the process since OntoWiki is able to handle various access
control mechanisms and we thus ensure that the Pingback server of the target re-
source is definitely able to access either the RDF or the (X)HTML representation of
the source resource.
Fig. 3. OntoWiki backlinks are rendered in the "Instances Linking Here" side box.
The example visualizes a personal WebID with three different backlinks using different
relations.
1. The server tries to request the source resource as RDF/XML. If an
RDF/XML document is retrieved, all relevant triples are extracted.
2. If the above step fails or no relevant triples are found, the OntoWiki Pingback
server utilizes a configurable RDFa extraction service (e.g. the W3C RDFa
Distiller9), which dynamically creates an RDF/XML representation from the
source Web page.
3. If the second step fails, the source resource is requested without an additional
Accept header. If an HTML document is retrieved, all links in the document
are checked. If a link to the target resource is found, a generic triple with
the property sioc:links_to is formed, with the source resource as subject
and the target resource as object.
Relevant triples are all triples that have either the source resource as sub-
ject and the target resource as object or vice versa. If no such statements were
found, but the graph contains at least one statement that has the target
resource as subject, an rdfs:seeAlso link is established from the target resource
to the source resource.
All relevant statements are added to the knowledge base containing the tar-
get resource. By using the versioning functionality of OntoWiki, provenance
information of statements added via Pingback requests can be determined, thus
allowing the service to delete statements that are no longer contained by the
source resource.
9http://www.w3.org/2007/08/pyRdfa/
Backlinks that were established via the Pingback service are displayed in the
standard OntoWiki user interface. The "Instances Linking Here" box shows all
incoming links for a given resource in conjunction with the type of the link, as
visualized in figure 3.
5.2 Triplify
Triplify [1] enables the publication of Linked Data from relational databases. It
utilizes simple mappings to map HTTP-URLs to SQL queries and transforms
the relational result into RDF statements. Since a large quantity of currently
available web data is stored in relational databases, the number of available
Linked Data resources increases. As people start to link to those resources, it
becomes useful to notify the respective owner. Therefore, we integrated a Semantic
Pingback server into Triplify, which exposes an X-Pingback HTTP header
and handles incoming RPC requests.
The RPC service creates a new database table and stores all registered Ping-
backs persistently. Pingbacks are unique for a given source, target and relation
and hence can be registered only once. Each time the Pingback service is executed
for a given source and target, invalid Pingbacks are removed automatically.
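The storage behavior can be sketched with an in-memory SQLite database. The table name pingbacks, its columns and the helper functions are hypothetical and do not reproduce the schema used by Triplify; the sketch only illustrates uniqueness per source, target and relation and the removal of Pingbacks that are no longer valid.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the (hypothetical) Pingback table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pingbacks ("
        " source TEXT NOT NULL, target TEXT NOT NULL, relation TEXT NOT NULL,"
        " UNIQUE (source, target, relation))"
    )
    return conn


def register(conn, source, target, relation):
    """Store a Pingback; return False if the same one was registered before."""
    try:
        conn.execute(
            "INSERT INTO pingbacks (source, target, relation) VALUES (?, ?, ?)",
            (source, target, relation),
        )
        return True
    except sqlite3.IntegrityError:
        return False


def remove_invalid(conn, source, target, valid_relations):
    """Delete stored Pingbacks whose relation is no longer found in the source."""
    rows = conn.execute(
        "SELECT relation FROM pingbacks WHERE source = ? AND target = ?",
        (source, target),
    ).fetchall()
    removed = 0
    for (relation,) in rows:
        if relation not in valid_relations:
            conn.execute(
                "DELETE FROM pingbacks WHERE source = ? AND target = ? AND relation = ?",
                (source, target, relation),
            )
            removed += 1
    return removed
```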
Triplify was extended to export statements for all registered Pingbacks re-
garding a given target resource along with the instance data. The following listing
shows an excerpt of a Triplify export:
    # ...

    <post/1>
        a sioc:Post ;
        sioc:has_creator <user/1> ;
        dcterms:created "2010-02-17T05:48:11" ;
        dcterms:title "Hello world!" ;
        sioc:content "Welcome to WordPress. This is your ..." .

    # ...

    <http://blog.aksw.org/2008/pingback-test/>
        sioc:links_to <post/1> .
5.3 Standalone implementation
Since a large amount of the RDF data available on the Web is contained in plain
RDF files (e.g. FOAF files), we implemented a standalone Semantic Pingback
server10 that can be configured to also accept Pingbacks for external resources.
Based on this implementation, we offer a Semantic Pingback service at
http://pingback.aksw.org. It is sufficient to add an RDF statement to an
arbitrary web-accessible RDF document stating, by means of the pingback:service
property, that the AKSW Pingback service should be used. Once a Pingback is sent
to that service, the owner of the document is notified via email. This works
well for FOAF profiles, since the service can detect a foaf:mbox statement in
the profile, which relates the WebID to a mailto: URI. If no such statement
is found, the service looks for statements that relate the target resource via a
foaf:maker, dc:creator, sioc:has_creator or sioc:has_owner relation to a
resource for which an email address can be obtained.
10 Available at: http://aksw.org/Projects/SemanticPingBack
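The lookup order for the notification address can be illustrated with a short
sketch. It is not the code of the standalone server: triples are represented as
plain tuples with prefixed names, the function names are hypothetical, and the
email address of a related resource is assumed to be given by a foaf:mbox
statement in the same document.

```python
CREATOR_PROPERTIES = ("foaf:maker", "dc:creator", "sioc:has_creator", "sioc:has_owner")


def objects(triples, subject, predicate):
    """Return all objects of statements matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def find_owner_mailbox(triples, target):
    """Return a mailto: URI for the owner of target, or None if none is found."""
    # 1. A foaf:mbox statement directly on the target (the FOAF profile case).
    direct = objects(triples, target, "foaf:mbox")
    if direct:
        return direct[0]
    # 2. Otherwise follow the creator/owner relations to a resource with a mailbox.
    for prop in CREATOR_PROPERTIES:
        for agent in objects(triples, target, prop):
            mailboxes = objects(triples, agent, "foaf:mbox")
            if mailboxes:
                return mailboxes[0]
    return None


# Hypothetical document content.
triples = [
    ("http://example.org/me", "foaf:mbox", "mailto:me@example.org"),
    ("http://example.org/doc", "dc:creator", "http://example.org/me"),
    ("http://example.org/orphan", "dc:title", "No owner information"),
]
profile_mbox = find_owner_mailbox(triples, "http://example.org/me")
document_mbox = find_owner_mailbox(triples, "http://example.org/doc")
orphan_mbox = find_owner_mailbox(triples, "http://example.org/orphan")
```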
6 Related Work
Pingback [9] is one of three approaches which allow the automated generation
of backlinks on the Social Web. We have chosen the Pingback mechanism as the
foundation for this work, since it is widely used and less prone to spam than,
for example, Trackbacks11. Pingback supports the propagation of untyped links
only and is hence not directly applicable to the Data Web.
The PSI BackLinking Service for the Web of Data12 supports the manual
creation of backlinks on the Data Web by employing a number of large-scale
knowledge bases, for example data of the UK Public Sector Information
domain. Since it is based on crawling a fixed set of knowledge bases, it cannot
be applied to the entire Data Web. Another service, which among others is
integrated with the PSI BackLinking Service, is SameAs.org13 [7]. Unlike
Semantic Pingback, it crawls the Web of Data in order to determine URIs
describing the same resources. OKKAM [3] is a system that aims at unifying
resource identifiers by employing metadata about resources in order to match
them to entities.
The above approaches support the interlinking of resources by employing
centralized hubs, but do not support decentralized, on-the-fly backlinking,
since they are based on crawling the Data Web on a regular basis. Consequently,
the primary goal of these approaches is to reveal resource identifiers
describing the same entities rather than to interlink different resources,
which is a key feature of the Semantic Pingback approach.
7 Conclusion and Future Work
Although the Data Web is currently growing substantially, it still lacks a
network effect such as the one we could observe, for example, with the
blogosphere in the Social Web. In particular, coherence, information quality,
timeliness and direct end-user benefits are still obstacles for the Data Web to
become a Web-wide reality. With this work we aimed at extending the
technological cornerstone of the Social Web, the Pingback mechanism, and
transferring it to the Data Web. The resulting Semantic Pingback mechanism has
the potential to significantly improve the coherence of the Data Web, since
linking becomes bi-directional. With its integrated provenance and spam
prevention measures it helps to increase the information quality. Notification
services based on Semantic Pingbacks represent direct end-user benefits and
increase the timeliness. In addition, these different benefits will mutually
strengthen each other. Due to its complete downwards compatibility, our
Semantic Pingback also bridges the gap between the Social and the Data Web. We
also expect the Semantic Pingback mechanism to support the transition process
from data silos to flexible, decentralized structured information assets.
11 http://www.sixapart.com/pronet/docs/trackback_spec
12 http://backlinks.psi.enakting.org
13 http://sameas.org
Future Work. Currently the Semantic Pingback mechanism is applicable to
relatively static resources, i.e. RDF documents or RDFa annotated Web pages.
We plan to extend the Semantic Pingback mechanism in such a way that it is
also usable in conjunction with dynamically generated views on the Data Web,
i.e. SPARQL query results. This would allow end-users as well as applications
using remote SPARQL endpoints to get notified once the results of a query change.
References
1. S. Auer, S. Dietzold, J. Lehmann, S. Hellmann, and D. Aumueller. Triplify –
Lightweight Linked Data Publication from Relational Databases. In Proceedings
of the 17th International Conference on World Wide Web, WWW 2009, 2009.
2. S. Auer, S. Dietzold, and T. Riechert. OntoWiki - A Tool for Social, Semantic
Collaboration. In The Semantic Web - ISWC 2006, 5th International Semantic
Web Conference, ISWC 2006, 2006.
3. P. Bouquet, H. Stoermer, C. Niederée, and A. Mana. Entity Name System: The
Back-Bone of an Open and Scalable Web of Data. In Proceedings of the 2nd IEEE
International Conference on Semantic Computing (ICSC 2008), 2008.
4. J. Breslin, A. Harth, U. Bojars, and S. Decker. Towards Semantically-Interlinked
Online Communities. In The Semantic Web: Research and Applications, Second
European Semantic Web Conference, ESWC 2005, 2005.
5. P. Ciccarese, E. Wu, G. T. Wong, M. Ocana, J. Kinoshita, A. Ruttenberg, and
T. Clark. The SWAN biomedical discourse ontology. Journal of Biomedical Infor-
matics, 41(5):739–751, 2008.
6. S. Corlosquet, R. Cyganiak, A. Polleres, and S. Decker. RDFa in Drupal: Bringing
Cheese to the Web of Data. In Proc. of 5th Workshop on Scripting and Development
for the Semantic Web at ESWC 2009, 2009.
7. H. Glaser, A. Jaffri, and I. Millard. Managing Co-reference on the Semantic Web.
In Proceedings of the Linked Data on the Web Workshop (LDOW2009), 2009.
8. O. Hartig. Provenance Information in the Web of Data, 2009. LDOW2009, April
20, 2009, Madrid, Spain.
9. S. Langridge and I. Hickson. Pingback 1.0. Technical report, http://hixie.ch/
specs/pingback/pingback, 2002.