Article

Expressing self-referential usage policies for the Semantic Web


Abstract

Numerous forms of policies, licensing terms, and related conditions are associated with Web data and services. A natural goal for facilitating the re-use and re-combination of such content is to model usage policies as part of the data so as to enable their exchange and automated processing. This paper thus proposes a concrete policy modelling language. A particular difficulty is posed by self-referential policies such as Creative Commons ShareAlike, which mandate that derived content is published under some license with the same permissions and requirements. We present a general semantic framework for evaluating such recursive statements, show that it has desirable formal properties, and explain how it can be evaluated using existing tools. We then show that our approach is compatible with both OWL DL and Datalog, and illustrate how one can concretely model self-referential policies in these languages to obtain desired conclusions.
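As a rough illustration of the kind of recursion a ShareAlike-style condition involves (a sketch with invented policy and attribute names, using a greatest-fixpoint reading that need not coincide with the paper's actual language or semantics): a policy conforms if it grants the required permissions, imposes the required requirements, and only admits derivative licenses that themselves conform.

# Illustrative sketch only: hypothetical data model, not the paper's syntax or semantics.
REQUIRED_PERMISSIONS = {"reproduce", "distribute", "derive"}
REQUIRED_REQUIREMENTS = {"attribution", "share-alike"}

# Each (hypothetical) policy lists what it grants/requires and which policies
# it allows derivative works to be published under.
policies = {
    "CC-BY-SA": {"permits": {"reproduce", "distribute", "derive"},
                 "requires": {"attribution", "share-alike"},
                 "derivatives_may_use": {"CC-BY-SA", "CC-BY-SA-compatible"}},
    "CC-BY-SA-compatible": {"permits": {"reproduce", "distribute", "derive"},
                 "requires": {"attribution", "share-alike"},
                 "derivatives_may_use": {"CC-BY-SA", "CC-BY-SA-compatible"}},
    "CC-BY-NC": {"permits": {"reproduce", "distribute"},
                 "requires": {"attribution"},
                 "derivatives_may_use": {"CC-BY-NC"}},
}

def share_alike_conformant(policies):
    """Greatest-fixpoint evaluation: assume every policy conforms,
    then repeatedly discard policies that violate the (recursive) condition."""
    ok = set(policies)
    changed = True
    while changed:
        changed = False
        for name in list(ok):
            p = policies[name]
            if not (REQUIRED_PERMISSIONS <= p["permits"]
                    and REQUIRED_REQUIREMENTS <= p["requires"]
                    and p["derivatives_may_use"] <= ok):  # the self-reference
                ok.discard(name)
                changed = True
    return ok

print(share_alike_conformant(policies))  # {'CC-BY-SA', 'CC-BY-SA-compatible'}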


... However, the focus of this presentation is to motivate and explain the rationale behind our proposal. Formal proofs and further details are found in an extended report [20]. ...
... After enough witnesses have been included to refute all non-entailed containedIn facts, the construction of Î is completed by defining suitable extensions for conformsTo, where care is needed to do this for "unnamed" policies so that T_ct is satisfied. A full formal argument is found in the technical report [20]. ...
... On the other hand, cardinality restrictions are unproblematic even though they are usually translated using a special built-in equality predicate ≈ that we did not allow in first-order logic in Section 5. The reason is that ≈ can easily be emulated in first-order logic using a standard equality theory [20], so that all of our earlier results carry over to this extension. To apply Theorem 1 for reasoning, we still must be able to express T_ci of Definition 4 in OWL. ...
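For reference, the standard equality theory alluded to here axiomatizes ≈ as a congruence relation (a sketch of one common formulation, not necessarily the exact one used in the report):

\forall x .\; x \approx x
\forall x, y .\; (x \approx y \rightarrow y \approx x)
\forall x, y, z .\; (x \approx y \wedge y \approx z \rightarrow x \approx z)
\forall x, y, \vec{z} .\; (x \approx y \wedge P(\ldots, x, \ldots) \rightarrow P(\ldots, y, \ldots)) \quad \text{for every predicate } P \text{ and argument position}

Analogous replacement axioms are added for function symbols, if any are present.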
Conference Paper
Numerous forms of policies, licensing terms, and related conditions are associated with Web data and services. A natural goal for facilitating the reuse and re-combination of such content is to model usage policies as part of the data so as to enable their exchange and automated processing. This paper thus proposes a concrete policy modelling language. A particular difficulty is posed by self-referential policies such as Creative Commons ShareAlike, which mandate that derived content is published under some license with the same permissions and requirements. We present a general semantic framework for evaluating such recursive statements, show that it has desirable formal properties, and explain how it can be evaluated using existing tools. We then show that our approach is compatible with both OWL DL and Datalog, and illustrate how one can concretely model self-referential policies in these languages to obtain desired conclusions.
... In line with the Web of Data philosophy [11], licenses for such datasets should be specified in RDF, for instance through the Dublin Core vocabulary. In recent years, a number of approaches have been proposed to model licenses in RDF, define licensing patterns [16,15], deal with self-referential licenses [12], or define composite licenses for query results containing heterogeneously licensed material [10]. Despite such approaches, a lot of effort is still needed to enhance the association of licenses with data on the Web, and to process licensed material, possibly in an automated way. ...
... Truong et al. [18] address the issue of analyzing data contracts, based on ODRL-S again. Krötzsch and Speiser [12] present a semantic framework for evaluating ShareAlike recursive statements. Gordon [4] presents a legal prototype for analyzing open source license compatibility using the Carneades argumentation system. ...
Conference Paper
Full-text available
In the Web of Data, licenses specifying the terms of use and reuse are associated not only with datasets but also with vocabularies. However, even less support is provided for taking the licenses of vocabularies into account than for those of datasets. In particular, this paper addresses the following issue: checking the compatibility between the set of licenses assigned to the vocabularies used to constitute a dataset and the license that is intended to be associated with the dataset itself. We provide a framework called LIVE able to support data publishers in such a compatibility-checking step, taking into consideration both the licenses associated with the vocabularies and those assigned to the data.
... However, there are also some common points, such as the use of RDF for the representation of data licenses/contracts. Krötzsch and Speiser [7] present a semantic framework for evaluating ShareAlike recursive statements, developing a general policy modelling language that supports self-referential policies as expressed by CC. We address a different problem, namely the composition of the licenses constraining the data into a unique composite license. ...
Article
Full-text available
The absence of clarity concerning the licensing terms does not encourage the reuse of the data by the consumers, thus preventing further publication and consumption of datasets, at the expense of the growth of the Web of Data itself. In this paper, we propose a general framework to attach the licensing terms to the data queried on the Web of Data. In particular, our framework addresses the following issues: (i) the various license schemas are collected and aligned taking as reference the Creative Commons schema, (ii) the compatibility of the licensing terms concerning the data affected by the query is verified, and (iii) if compatible, the licenses are combined into a composite license. The framework returns the composite license as licensing term about the data resulting from the query.
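A toy Python sketch of the compatibility-checking and composition idea described above (license names follow the Creative Commons schema, but the attribute model and the two rules used here are simplifying assumptions; the framework's actual rules are richer):

# Licenses as sets of CC-style terms; a naive compatibility check and composition.
licenses = {
    "CC-BY":    {"permissions": {"Reproduction", "Distribution", "DerivativeWorks", "CommercialUse"},
                 "requirements": {"Attribution"},
                 "prohibitions": set()},
    "CC-BY-SA": {"permissions": {"Reproduction", "Distribution", "DerivativeWorks", "CommercialUse"},
                 "requirements": {"Attribution", "ShareAlike"},
                 "prohibitions": set()},
}

def compatible(a, b):
    """Naive check: no term may be permitted by one license and prohibited by the other."""
    return not (a["permissions"] & b["prohibitions"]) and not (b["permissions"] & a["prohibitions"])

def compose(a, b):
    """Naive composition: keep only jointly granted permissions,
    accumulate all requirements and prohibitions."""
    return {"permissions": a["permissions"] & b["permissions"],
            "requirements": a["requirements"] | b["requirements"],
            "prohibitions": a["prohibitions"] | b["prohibitions"]}

a, b = licenses["CC-BY"], licenses["CC-BY-SA"]
if compatible(a, b):
    print(compose(a, b))  # composite license over the query result
else:
    print("incompatible")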
Chapter
Full-text available
The increasing diffusion of linked data as a standard way to share knowledge on the Web allows users and public and private organizations to fully exploit structured data from very large datasets that were not available in the past. Over the last few years, linked data has developed into a large number of openly accessible datasets from several domains, leading to the linking open data (LOD) cloud. Similar to other types of information such as structured data, linked data suffers from quality problems such as inconsistency, inaccuracy, out-of-dateness, and incompleteness, which are frequent and imply serious limitations to the full exploitation of such data. Therefore, it is important to assess the quality of the datasets that are used in linked data applications before using them. The quality assessment allows users or applications to understand whether data is appropriate for their task at hand.
Chapter
Full-text available
In this chapter, we describe the motivations for, and development of, a rule-based policy management system that can be deployed in the open and distributed milieu of the World Wide Web. We discuss the necessary features of such a system in creating a "Policy Aware" infrastructure for the Web and argue for the necessity of such infrastructure. We then show how the integration of a Semantic Web rules language (N3) with a theorem prover designed for the Web (Cwm) makes it possible to use the Hypertext Transport Protocol (HTTP) to provide a scalable mechanism
Conference Paper
Full-text available
The processing of data is often restricted by contractual and legal requirements for protecting privacy and IPRs. Policies provide means to control how and by whom data is processed. Conditions of policies may depend on the previous processing of the data. However, existing policy languages do not provide means to express such conditions. In this work we present a formal model and language allowing for specifying conditions based on the processing history. We base the model and language on XACML.
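The abstract does not show the XACML-based syntax; the following sketch (with invented event and purpose names) merely illustrates what a condition over the processing history looks like: a request is permitted only if a particular processing step has already occurred.

# Hypothetical illustration of a history-dependent condition, not the paper's XACML-based language.
history = [
    {"action": "collect",   "subject": "hospital"},
    {"action": "anonymize", "subject": "hospital"},
]

def permits_aggregation(history):
    """Allow aggregation only if the data was anonymized at some earlier processing step."""
    return any(event["action"] == "anonymize" for event in history)

print(permits_aggregation(history))      # True
print(permits_aggregation(history[:1]))  # False: no anonymization step yet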
Conference Paper
Full-text available
Ontological metamodeling has a variety of applications, yet only very restricted forms are supported by OWL 2 directly. We propose a novel encoding scheme enabling class-based metamodeling inside the domain ontology with full reasoning support through standard OWL 2 reasoning systems. We demonstrate the usefulness of our method by applying it to the OntoClean methodology. En passant, we address performance problems arising from the inconsistency diagnosis strategy originally proposed for OntoClean by introducing an alternative technique where sources of conflicts are indicated by means of marker predicates.
Conference Paper
Full-text available
We introduce a unified framework that interrelates three different types of policies used in autonomic computing systems: action, goal, and utility function policies. Our policy framework is based on concepts from artificial intelligence such as states, actions, and rational agents. We show how the framework can be used to support the use of all three types of policies within a single autonomic component or system, and use the framework to discuss the relative merits of each type.
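A minimal sketch of the three policy types over a toy state space (state, action, and utility values are invented for illustration): an action policy prescribes an action per state, a goal policy marks states as desirable, and a utility-function policy lets the agent pick the action leading to the highest-valued state.

states = ["overloaded", "nominal", "idle"]
actions = ["add_server", "do_nothing", "remove_server"]

# Action policy: directly prescribes an action for each state.
action_policy = {"overloaded": "add_server", "nominal": "do_nothing", "idle": "remove_server"}

# Goal policy: partitions states into desirable and undesirable ones.
goal_policy = {"overloaded": False, "nominal": True, "idle": True}

# Utility-function policy: assigns a scalar value to each state.
utility_policy = {"overloaded": 0.1, "nominal": 1.0, "idle": 0.6}

# A toy transition model: which state an action is expected to lead to.
transition = {
    ("overloaded", "add_server"): "nominal",
    ("overloaded", "do_nothing"): "overloaded",
    ("overloaded", "remove_server"): "overloaded",
}

def best_action(state):
    """Under the utility-function policy, choose the action with the best successor state."""
    return max(actions, key=lambda a: utility_policy[transition.get((state, a), state)])

print(action_policy["overloaded"])  # add_server
print(best_action("overloaded"))    # add_server ('nominal' has the highest utility)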
Conference Paper
Full-text available
The Web allows users to share their work very effectively leading to the rapid re-use and remixing of content on the Web including text, images, and videos. Scientific research data, social networks, blogs, photo sharing sites and other such applications known collectively as the Social Web have lots of increasingly complex information. Such information from several Web pages can be very easily aggregated, mashed up and presented in other Web pages. Content generation of this nature inevitably leads to many copyright and license violations, motivating research into effective methods to detect and prevent such violations. This is supported by an experiment on Creative Commons (CC) attribution license violations from samples of Web pages that had at least one embedded Flickr image, which revealed that the attribution license violation rate of Flickr images on the Web is around 70-90%. Our primary objective is to enable users to do the right thing and comply with CC licenses associated with Web media, instead of preventing them from doing the wrong thing or detecting violations of these licenses. As a solution, we have implemented two applications: (1) Attribution License Violations Validator, which can be used to validate users’ derived work against attribution licenses of reused media and, (2) Semantic Clipboard, which provides license awareness of Web media and enables users to copy them along with the appropriate license metadata.
Article
Full-text available
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student submitted PDF version of thesis. Includes bibliographical references (p. 79-82). This thesis focuses on methods for detecting and preventing license violations, in a step towards policy aware content reuse on the Web. This framework builds upon the Creative Commons (CC) Rights Expression Language, which provides a very clear and widely accepted set of licenses grounded in Semantic Web technologies. These licenses are machine readable, and indicate to a person who wishes to reuse the content exactly how it should be used. An experiment on CC attribution license violations on Flickr images revealed the attribution license violation rate on the Web to be around 70-90% from samples of Websites that had at least one embedded Flickr image. Therefore, there should be robust mechanisms for detecting license violations on the Web and preventing them where possible. The primary objective is to enable the user to do the right thing instead of preventing the user from doing the wrong thing. As a solution, we have implemented (1) the "Attribution License Violations Validator" for Flickr images and (2) the more generic "Semantic Clipboard". The "Attribution License Violations Validator" can be used to validate users' work against any attribution license violation. The "Semantic Clipboard", which is implemented as a component of the Tabulator Firefox extension, allows the user to copy an image with its license metadata expressed in Resource Description Framework in annotations (RDFa) in the original source document to any other document. by Oshani Wasana Seneviratne. S.M.
Conference Paper
Full-text available
Policy comparison is useful for a variety of applications, including policy validation and policy-aware service selection. While policy comparison is somewhat natural for policy languages based on description logics, it becomes rather difficult for rule-based policies. When policies have recursive rules, the problem is in general undecidable. Still, most policies require some form of recursion to model, say, subject and object hierarchies, and certificate chains. In this paper, we show how policies with recursion can be compared by adapting query optimization techniques developed for the relational algebra. We prove soundness and completeness of our method, discuss the compatibility of the restrictive assumptions we need w.r.t. our reference application scenarios, and report the results of a preliminary set of experiments to prove the practical applicability of our approach.
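For the non-recursive base case, the containment question generalized here can be illustrated as follows (our example, not taken from the paper):

P_1(x) \leftarrow \mathit{employee}(x)
P_2(x) \leftarrow \mathit{employee}(x) \wedge \mathit{cleared}(x)

Every request granted by P_2 is also granted by P_1 (P_2 \sqsubseteq P_1), which the classical query-optimization test confirms by evaluating P_1 over the "frozen" body of P_2, i.e. the canonical database {employee(a), cleared(a)}, and obtaining P_1(a). The paper's contribution is to make such checks work in the presence of recursive rules, where the general problem is undecidable.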
Conference Paper
Full-text available
Explanations for decisions made by a policy framework allow end users to understand how the results were obtained, increase trust in the policy decision and enforcement process, and enable policy administrators to ensure the correctness of the policy. In our framework, an explanation for any statement including a policy decision is a representation of the list of reasons (known as dependencies) associated with its derivation. Dependency tracking involves maintaining the list of reasons (statements and rules) for the derivation of a new statement. In this paper, we describe our policy approach that (i) provides explanations for policy decisions, (ii) provides more efficient and expressive reasoning through the use of nested sub-rules and goal direction, and (iii) is grounded in semantic Web technologies. We discuss the characteristics of our approach and provide a brief overview of the AIR policy language that implements it. We also discuss how relevant explanation information is identified and presented to end users and describe our preliminary graphical user interface.
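A minimal sketch of dependency tracking during forward chaining (facts, rules, and names are invented; AIR's actual machinery is far richer): each derived statement records the rule and premises it came from, and the explanation is the resulting derivation tree.

facts = {"employee(alice)", "request(alice, file1)"}
rules = [
    # (rule name, premises, conclusion)
    ("r1", {"employee(alice)"}, "authorized(alice)"),
    ("r2", {"authorized(alice)", "request(alice, file1)"}, "grant(alice, file1)"),
]

dependencies = {}  # derived fact -> (rule name, premises it was derived from)

changed = True
while changed:
    changed = False
    for name, premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            dependencies[conclusion] = (name, premises)
            changed = True

def explain(fact, indent=""):
    """Print the derivation tree (the 'explanation') of a fact."""
    if fact not in dependencies:
        print(indent + fact + "  [asserted]")
        return
    rule, premises = dependencies[fact]
    print(indent + fact + "  [by " + rule + "]")
    for p in premises:
        explain(p, indent + "  ")

explain("grant(alice, file1)")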
Conference Paper
Full-text available
Data is often encumbered by restrictions on the ways in which it may be used. These restrictions on usage may be determined by statute, by contract, by custom, or by common decency, and they are used to control collection of data, diffusion of data, and the inferences that can be made over the data. In this paper, we present a data-purpose algebra that can be used to model these kinds of restrictions in various different domains. We demonstrate the utility of our approach by modeling part of the Privacy Act (5 USC §552a), which states that data collected about US citizens can be used only for the purposes for which it was collected. We show (i) how this part of the Privacy act can be represented as a set of restrictions on data usage, (ii) how the authorized purposes of data flowing through different government agencies can be calculated, and (iii) how these purposes can be used to determine whether the Privacy Act is being enforced appropriately.
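A hedged sketch of the data-purpose idea (agency names, purposes, and the two transfer functions are invented; the actual algebra is more general): each transfer between agencies maps the incoming set of authorized purposes to the set authorized downstream, and a use is permitted only if its purpose survives the composition of transfers.

initial_purposes = {"tax_administration"}

def transfer_to_statistics_bureau(purposes):
    # Statistical reuse is authorized only if the data was collected for administration.
    return purposes | ({"official_statistics"} if "tax_administration" in purposes else set())

def transfer_to_marketing_firm(purposes):
    # No purpose survives a transfer outside the authorized routine uses.
    return set()

def authorized(purposes, intended_use):
    return intended_use in purposes

p = transfer_to_statistics_bureau(initial_purposes)
print(authorized(p, "official_statistics"))                      # True
print(authorized(transfer_to_marketing_firm(p), "advertising"))  # False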
Conference Paper
Full-text available
We describe a policy language designed for pervasive computing applications that is based on deontic concepts and grounded in a semantic language. The pervasive computing environments under consideration are those in which people and devices are mobile and use various wireless networking technologies to discover and access services and devices in their vicinity. Such pervasive environments lend themselves to policy-based security due to their extremely dynamic nature. Using policies allows the security functionality to be modified without changing the implementation of the entities involved. However, along with being extremely dynamic, these environments also tend to span several domains and be made up of entities of varied capabilities. A policy language for environments of this sort needs to be very expressive but lightweight and easily extensible. We demonstrate the feasibility of our policy language in pervasive environments through a prototype used as part of a secure pervasive system.
Article
Full-text available
We study the complexity of the problem of answering queries using materialized views. This problem has attracted a lot of attention recently because of its relevance in data integration. Previous work considered only conjunctive view definitions. We examine the consequences of allowing more expressive view definition languages. The languages we consider for view definitions and user queries are: conjunctive queries with inequality, positive queries, datalog, and first-order logic. We show that the complexity of the problem depends on whether views are assumed to store all the tuples that satisfy the view definition, or only a subset of it. Finally, we apply the results to the view consistency and view self-maintainability problems which arise in data warehousing. 1 Introduction The notion of materialized view is essential in databases [34] and is attracting more and more attention with the popularity of data warehouses [28]. The problem of answering queries using materialized views [24...
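A textbook-style illustration of the problem (our example, not taken from the paper): given a relation R and the materialized view V, the query Q can be rewritten entirely in terms of V.

V(x, z) \leftarrow R(x, y) \wedge R(y, z)
Q(x, v) \leftarrow R(x, y) \wedge R(y, z) \wedge R(z, w) \wedge R(w, v)
Q'(x, v) \leftarrow V(x, z) \wedge V(z, v)

If V is assumed to store all tuples satisfying its definition, Q' is an equivalent rewriting of Q; if V may store only a subset, Q' is merely contained in Q. This is exactly the distinction whose complexity consequences the paper analyzes.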
Article
Full-text available
Model Theory. That is, we have a non-empty family I of partial isomorphisms between two models M and N, which is closed under taking restrictions to smaller domains, and where the standard Back-and-Forth properties are now restricted to apply only to partial isomorphisms of size at most k. Proof. (A complete argument is in [16].) An outline is reproduced here, for convenience. First, k-variable formulas are preserved under partial isomorphism, by a simple induction. More precisely, one proves, for any assignment A and any partial isomorphism I ∈ I which is defined on the A-values for all variables x_1, ..., x_k, that M, A ⊨ φ iff N, I ∘ A ⊨ φ. The crucial step in the induction is the quantifier case. Quantified variables are irrelevant to the assignment, so that the relevant partial isomorphism can be restricted to size at most k − 1, whence a matching choice for the witness can be made on the opposite side. This proves "only if". Next, "if" has a proof analogous to...
Article
Full-text available
Terminological knowledge representation formalisms are intended to capture the analytic relationships between terms of a vocabulary intended to describe a domain. A term whose definition refers, either directly or indirectly, to the term itself presents a problem for most terminological representation systems because it is obvious neither whether such a term is meaningful, nor how it could be handled by a knowledge representation system in a satisfying manner. After some examples of intuitively sound terminological cycles are given, different formal semantics are investigated and evaluated with respect to the examples. As it turns out, none of the different styles of semantics seems to be completely satisfying for all purposes. Finally, consequences in terms of computational complexity and decidability are discussed. 1 Introduction When trying to represent an expert's knowledge about a sufficiently complex domain we have to account for the vocabulary used in this domain [Brachman and ...
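A classic example of an intuitively sound terminological cycle from this literature (reproduced here for illustration, with assumed concept and role names) defines humans as mammals all of whose parents are again humans:

\mathit{Human} \equiv \mathit{Mammal} \sqcap \forall \mathit{hasParent}.\mathit{Human}

The question the paper examines is which extension such a cyclically defined concept should denote; least fixed-point, greatest fixed-point, and descriptive semantics give different answers.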
Book
With more substantial funding from research organizations and industry, numerous large-scale applications, and recently developed technologies, the Semantic Web is quickly emerging as a well-recognized and important area of computer science. While Semantic Web technologies are still rapidly evolving, Foundations of Semantic Web Technologies focuses on the established foundations in this area that have become relatively stable over time. It thoroughly covers basic introductions and intuitions, technical details, and formal foundations. The book concentrates on Semantic Web technologies standardized by the World Wide Web Consortium: RDF and SPARQL enable data exchange and querying, RDFS and OWL provide expressive ontology modeling, and RIF supports rule-based modeling. The text also describes methods for specifying, querying, and reasoning with ontological information. In addition, it explores topics that are clearly beyond foundations, such as tools, applications, and engineering aspects. Written by highly respected researchers with a deep understanding of the material, this text centers on the formal specifications of the subject and supplies many pointers that are useful for employing Semantic Web technologies in practice. The book has an accompanying website with supplemental information.
Conference Paper
A common practice in conceptual modeling is to separate the intensional from the extensional model. Although very intuitive, this approach is inadequate for many complex domains, where the borderline between the two models is not clear-cut. Therefore, OWL-Full, the most expressive of the Semantic Web ontology languages, allows combining the intensional and the extensional model by a feature we refer to as metamodeling. In this paper, we show that the semantics of metamodeling adopted in OWL-Full leads to undecidability of basic inference problems, due to free mixing of logical and metalogical symbols. Based on this result, we propose two alternative semantics for metamodeling: the contextual and the HiLog semantics. We show that SHOIQ, a description logic underlying OWL-DL, extended with metamodeling under either semantics is decidable. Finally, we show how the latter semantics can be used in practice to axiomatize the logical interaction between concepts and metaconcepts.
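A small example of the kind of metamodeling at issue (a commonly used illustration; the concrete axioms are ours): the name Eagle occurs both as an individual, an instance of the metaconcept Species, and as a concept with instance Harry.

\mathit{Species}(\mathit{Eagle}) \qquad \mathit{Eagle}(\mathit{Harry})

Roughly, under the contextual semantics the two occurrences of Eagle are interpreted independently, whereas under the HiLog semantics they are tied to the same underlying entity.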
Article
The use of circumscription for formalizing commonsense knowledge and reasoning requires that a circumscription policy be selected for each particular application: we should specify which predicates are circumscribed, which predicates and functions are allowed to vary, and what priorities between the circumscribed predicates are established. The circumscription policy is usually described either informally or using suitable metamathematical notation. In this paper we propose a simple and general formalism which permits describing circumscription policies by axioms, included in the knowledge base along with the axioms describing the objects of reasoning. The new formalism is illustrated by recasting some of the familiar applications of circumscription in its terms.
Article
Motivated by medical terminology applications, we investigate the decidability of an expressive and prominent description logic (DL), SHIQ, extended with role inclusion axioms of the form R ∘ S ⊑ T. It is well known that a naive such extension leads to undecidability, and thus we restrict our attention to axioms of the form R ∘ S ⊑ R or S ∘ R ⊑ R, which is the most important form of axioms in the applications that motivated this extension. Surprisingly, this extension is still undecidable. However, it turns out that by restricting our attention further to acyclic sets of such axioms, we regain decidability. We present a tableau-based decision procedure for this DL and report on its implementation, which promises to behave well in practice and provides important additional functionality in a medical terminology application.
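A typical axiom of the restricted form R ∘ S ⊑ R from the medical-terminology setting that motivates this work (our illustration, with assumed role names) propagates location along part-of:

\mathit{isLocatedIn} \circ \mathit{isPartOf} \sqsubseteq \mathit{isLocatedIn}

so that, for example, a fracture located in the shaft of the femur is also inferred to be located in the femur.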
Article
The Open Provenance Model is a model of provenance that is designed to meet the following requirements: (1) Allow provenance information to be exchanged between systems, by means of a compatibility layer based on a shared provenance model. (2) Allow developers to build and share tools that operate on such a provenance model. (3) Define provenance in a precise, technology-agnostic manner. (4) Support a digital representation of provenance for any “thing”, whether produced by computer systems or not. (5) Allow multiple levels of description to coexist. (6) Define a core set of rules that identify the valid inferences that can be made on provenance representation. This document contains the specification of the Open Provenance Model (v1.1) resulting from a community effort to achieve inter-operability in the Provenance Challenge series.
Conference Paper
Cyclic definitions are often prohibited in terminological knowledge representation languages because, from a theoretical point of view, their semantics is not clear and, from a practical point of view, existing inference algorithms may go astray in the presence of cycles. In this paper, we shall consider terminological cycles in a very small KL-ONE-based language. For this language, the effect of the three types of semantics introduced by (Nebel 1987,1989,1989a) can be completely described with the help of finite automata. These descriptions provide a rather intuitive understanding of terminologies with cyclic definitions and give insight into the essential features of the respective semantics. In addition, one obtains algorithms and complexity results for subsumption determination. As it stands, the greatest fixed-point semantics comes off best. The characterization of this semantics is easy and has an obvious intuitive interpretation. Furthermore, important constructs - such as value-restriction with respect to the transitive or reflexive-transitive closure of a role - can easily be expressed.
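For instance (a standard example in this line of work, not quoted verbatim from the paper), consider the cyclic definition

\mathit{Momo} \equiv \mathit{Man} \sqcap \forall \mathit{hasOffspring}.\mathit{Momo}

Under greatest fixed-point semantics, Momo denotes exactly the men all of whose hasOffspring-descendants are again men, i.e. a value restriction with respect to the transitive closure of hasOffspring, matching the intuition described in the abstract.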
Conference Paper
We consider a version of the relational model in which relation names may appear as arguments of other relations. Allowing relation names as arguments provides enhanced modelling capabilities, allowing some object-oriented features to be expressed within the relational model. We extend relational algebra with operators for accessing relations, and also define a relational calculus based on the logic HiLog. We prove two equivalence results between extensions of relational algebra and corresponding extensions of the relational calculus, and relate the expressive power of these extensions to that of relational algebra on any given database. Finally, we argue that the extensions proposed here are relatively easy to provide in practice, and should be expressible within modern query languages.
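A toy Python sketch of what "relation names as arguments" buys (relation and column names are invented; no claim is made about the paper's algebra operators): a relation stores relation names as values, and a query joins through them.

db = {
    "employees":   {("alice",), ("bob",)},
    "contractors": {("carol",)},
    # A relation whose second column holds *relation names*:
    "access":      {("report.pdf", "employees"), ("plan.pdf", "contractors")},
}

def who_can_access(db, document):
    """Join through the relation name stored in the 'access' relation."""
    people = set()
    for doc, relname in db["access"]:
        if doc == document:
            people |= {person for (person,) in db[relname]}
    return people

print(who_can_access(db, "report.pdf"))  # {'alice', 'bob'}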
Conference Paper
We extend the Semantic Web query language SPARQL by defining the semantics of SPARQL queries under the entailment regimes of RDF, RDFS, and OWL. The proposed extensions are part of the SPARQL 1.1 Entailment Regimes working draft which is currently being developed as part of the W3C standardization process of SPARQL 1.1. We review the conditions that SPARQL imposes on such extensions, discuss the practical difficulties of this task, and explicate the design choices underlying our proposals. In addition, we include an overview of current implementations and their underlying techniques.
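A minimal illustration of why the entailment regime matters, using plain Python tuples rather than an actual SPARQL engine (the data and the restriction to a single RDFS rule are assumptions of this sketch): the same basic graph pattern has more answers once RDFS entailment is applied.

# The answer to the pattern (?x rdf:type :Person) depends on the entailment regime.
triples = {
    (":alice", "rdf:type", ":Student"),
    (":Student", "rdfs:subClassOf", ":Person"),
}

def rdfs_subclass_closure(triples):
    """Apply only the type-propagation rule along rdfs:subClassOf (rule rdfs9)."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        for (x, p1, c) in list(closed):
            for (c2, p2, d) in list(closed):
                if p1 == "rdf:type" and p2 == "rdfs:subClassOf" and c == c2:
                    if (x, "rdf:type", d) not in closed:
                        closed.add((x, "rdf:type", d))
                        changed = True
    return closed

def answers(triples):
    return {x for (x, p, o) in triples if p == "rdf:type" and o == ":Person"}

print(answers(triples))                         # set(): no answers under simple entailment
print(answers(rdfs_subclass_closure(triples)))  # {':alice'} under RDFS entailment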
Conference Paper
Formal policies allow the non-ambiguous definition of situations in which usage of certain entities is allowed, and enable the automatic evaluation of whether a situation is compliant. This is useful for example in applications using data provided via standardized interfaces. The low technical barriers to integrating such data sources stand in contrast to the manual evaluation of natural language policies as they currently exist. Usage situations can themselves be regulated by policies, which can be restricted by the policy of a used entity. Consider for example the Google Maps API, which requires that applications using the API must be available without a fee, i.e. the application's policy must not require a payment. In this paper we present a policy language that can express such constraints on other policies, i.e. a self-policing policy language. We validate our approach by realizing a use case scenario, using a policy engine developed for our language.
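The Google Maps example from the abstract can be sketched as follows (a toy model with invented attribute names, not the paper's actual policy language): the API's policy carries a constraint that is evaluated against the policy of the consuming application.

maps_api_policy = {
    "grants": {"use_api"},
    # Meta-constraint: the policy of any application using the API must not require a fee.
    "constraint_on_user_policy": lambda app_policy: "payment" not in app_policy["requires"],
}

free_app_policy = {"requires": {"attribution"}}
paid_app_policy = {"requires": {"attribution", "payment"}}

def may_use(api_policy, app_policy):
    """Check the API policy's constraint against the using application's own policy."""
    return api_policy["constraint_on_user_policy"](app_policy)

print(may_use(maps_api_policy, free_app_policy))  # True
print(may_use(maps_api_policy, paid_app_policy))  # False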
Article
Open distributed environments such as the World Wide Web facilitate information sharing but provide limited support to the protection of sensitive information and resources. Trust negotiation (TN) frameworks have been proposed as a better solution for open environments, in which parties may get in touch and interact without being previously known to each other. In this paper we illustrate PROTUNE, a rule-based TN system. By describing PROTUNE, we will illustrate the advantages that arise from an advanced rule-based approach in terms of deployment efforts, user friendliness, communication efficiency, and interoperability. The generality and technological feasibility of PROTUNE's approach are assessed through an extensive analysis and experimental evaluations.
Patel-Schneider, P.F., Pan, Y., Glimm, B., Hitzler, P., Mika, P., Pan, J., Horrocks, I. (eds.): Proc. 9th Int. Semantic Web Conf. (ISWC'10), LNCS, vol. 6496. Springer (2010)
Dodds, L.: Rights statements on the Web of Data. Nodalities Magazine pp. 13–14 (2010)
Baader, F., Calvanese, D., McGuinness, D., Nardi, D., Patel-Schneider, P. (eds.): The Description Logic Handbook. Cambridge University Press, second edn. (2007)
Cheney, J., Gil, Y., Groth, P., Miles, S.: Requirements for Provenance on the Web. Available at http://www.w3.org/2005/Incubator/prov/wiki/User_Requirements, W3C Provenance Incubator Group (2010)
Lessig, L.: CC in Review: Lawrence Lessig on Compatibility. Available at http://creativecommons.org/weblog/entry/5709 (accessed 1st July 2011) (2005)