Towards a Threat Model for Provenance in e-Science.
ABSTRACT Scientists increasingly rely on workflow management systems to perform large-scale computational scientific experiments. These
systems often collect provenance information that is useful in the analysis and reproduction of such experiments. On the other
hand, this provenance data may be exposed to security threats which can result, for instance, in compromising the analysis
of these experiments, or in illegitimate claims of attribution. In this work, we describe our ongoing work to trace security
requirements for provenance systems in the context of e-Science, and propose some security controls to fulfill them.
Luiz M. R. Gadelha Jr.¹, Marta Mattoso¹, Michael Wilde², and Ian Foster²
¹ Computer and Systems Engineering Program, Federal University of Rio de Janeiro, Brazil
² University of Chicago / Argonne National Laboratory, USA
1 Introduction
As an important paradigm of scientific research, computer simulations are increasingly
being used to perform computational scientific experiments. As the scale of these
experiments increases, scientific workflow management systems become relevant tools
to specify, execute, and analyze them. These systems often collect provenance
information, frequently distributed across grids or remote clusters, that is useful in the
analysis and reproduction of such experiments. If the appropriate security controls are
not in place, provenance systems may be exposed to threats that compromise the
integrity, confidentiality, or availability of provenance data. In this work, we describe
our ongoing effort to trace security requirements for provenance systems in the context
of e-Science, and propose some security controls to fulfill them. The study of security
issues in provenance systems is relatively recent. However, some important security
requirements, described in section 2, have not yet been identified in related academic
work, to our knowledge.
2 Requirements for Secure Provenance Management in e-Science
The typical execution of a workflow involves specifying its flow using some mechanism,
such as a parallel scripting language or a GUI-based workflow specification tool. Later
on, it can be executed by a workflow management system; this involves selecting
appropriate computational resources, submitting tasks to these resources, and
transferring data. After the experiment is executed, scientists typically face the
challenge of analyzing a large number of output data files. Provenance systems are useful
in this context since they can help determine, for instance, which tasks were executed
to generate a particular data object, and which parameters were used for these tasks.
This provenance data is usually collected and stored during workflow execution, to
describe causal relationships between tasks and data (retrospective provenance); or
during workflow specification, to describe the planned tasks and data flow (prospective
provenance). In general, provenance data is accessed and analyzed by scientists using
a query language, such as SQL. In our ongoing threat modeling effort, we are
enumerating threats to each of these components of a provenance system. Many of these
are already taken into account by security frameworks for underlying technologies used
by provenance systems, such as databases and grids. Scientists, especially in the life
sciences, often avoid sharing details of experiments prior to publishing their results in
some academic journal or event, to assure correct attribution of scientific results.
During this interval, scientific collaboration is prevented. Therefore, security controls
that prevent illegitimate claims of attribution are an important security requirement for
provenance systems. [...] scientists can define which individuals can read or modify
provenance data.
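As an illustration of the kind of retrospective provenance query mentioned above, the following minimal sketch stores task and data dependencies in a relational database and uses SQL to recover which tasks, with which parameters, led to a given data object. The schema (tables task, used, generated), the task names, and the file names are hypothetical and chosen only for this example; they do not correspond to the schema of any particular provenance system.

```python
import sqlite3

def build_provenance_db():
    """Create a tiny in-memory retrospective provenance store (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE task (id TEXT PRIMARY KEY, program TEXT, params TEXT);
        CREATE TABLE used (task_id TEXT, data_id TEXT);       -- task consumed data
        CREATE TABLE generated (task_id TEXT, data_id TEXT);  -- task produced data
    """)
    # Illustrative records only: two chained tasks and one unrelated task.
    db.executemany("INSERT INTO task VALUES (?, ?, ?)", [
        ("t1", "align", "--gap 2"),
        ("t2", "filter", "--min 0.9"),
        ("t3", "plot", "--dpi 300"),
    ])
    db.executemany("INSERT INTO used VALUES (?, ?)", [
        ("t1", "raw.dat"), ("t2", "aligned.dat"), ("t3", "other.dat"),
    ])
    db.executemany("INSERT INTO generated VALUES (?, ?)", [
        ("t1", "aligned.dat"), ("t2", "result.dat"), ("t3", "figure.png"),
    ])
    return db

# Recursive query: start from the task that generated the data object, then
# repeatedly follow "used" edges back to the tasks that generated the inputs.
LINEAGE_QUERY = """
WITH RECURSIVE lineage(task_id) AS (
    SELECT task_id FROM generated WHERE data_id = ?
    UNION
    SELECT g.task_id
    FROM lineage l
    JOIN used u ON u.task_id = l.task_id
    JOIN generated g ON g.data_id = u.data_id
)
SELECT t.id, t.program, t.params
FROM task t JOIN lineage l ON l.task_id = t.id
ORDER BY t.id
"""

def tasks_behind(db, data_id):
    """Return (id, program, params) for every task in the lineage of data_id."""
    return db.execute(LINEAGE_QUERY, (data_id,)).fetchall()

if __name__ == "__main__":
    db = build_provenance_db()
    for row in tasks_behind(db, "result.dat"):
        print(row)
```

Since such queries run against an ordinary database, the integrity of their answers depends on the integrity of the stored records, which is why threats to the provenance store matter for the analysis of the experiment.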
This work describes our progress in defining a threat model and proposing security
controls for provenance systems in the context of e-Science. We identify the assurance of
correct attribution of scientific results as an important security requirement for these
systems. For this purpose, we proposed Kairos [2], a security architecture for
provenance that uses cryptographic timestamps [3] and digital signatures. We are currently
working on the implementation of the proposed techniques in Swift [6], a
provenance-enabled parallel scripting system. As future work, we plan to investigate
fine-grained access control techniques, and a data model to store and query security
properties of [...]
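To make the role of timestamps and signatures more concrete, the sketch below shows one simple way to bind an author and a time to a provenance record and to link successive tokens by hash, so that back-dating or altering an earlier token is detectable. It is not the Kairos protocol: the class ToyTimestampAuthority, the token fields, and the function verify_chain are assumptions made for illustration, and an HMAC with a shared key stands in for a real public-key digital signature purely to keep the example within the Python standard library.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous token" hash for the first token

def _digest(payload: dict) -> str:
    """Deterministic SHA-256 digest of a JSON-serializable dictionary."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

class ToyTimestampAuthority:
    """Hypothetical authority issuing hash-linked, authenticated tokens.

    Assumption: HMAC replaces a digital signature here; a real deployment
    would sign with a private key so that anyone can verify the token.
    """

    def __init__(self, key: bytes):
        self._key = key
        self._prev = GENESIS
        self._serial = 0

    def stamp(self, record: bytes, author: str, time: str) -> dict:
        body = {
            "serial": self._serial,
            "time": time,
            "author": author,
            "record_hash": hashlib.sha256(record).hexdigest(),
            "prev": self._prev,  # link to the previously issued token
        }
        tag = hmac.new(self._key, _digest(body).encode(), hashlib.sha256).hexdigest()
        token = dict(body, tag=tag)
        self._prev = _digest(token)
        self._serial += 1
        return token

def verify_chain(tokens, records, key: bytes) -> bool:
    """Check tags, record hashes, and hash links for a full sequence of tokens."""
    if len(tokens) != len(records):
        return False
    prev = GENESIS
    for token, record in zip(tokens, records):
        body = {k: v for k, v in token.items() if k != "tag"}
        expected = hmac.new(key, _digest(body).encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, token["tag"]):
            return False  # token fields were altered after issuance
        if body["record_hash"] != hashlib.sha256(record).hexdigest():
            return False  # provenance record does not match the token
        if body["prev"] != prev:
            return False  # token is out of order or the chain was cut
        prev = _digest(token)
    return True

if __name__ == "__main__":
    tsa = ToyTimestampAuthority(b"demo-key")
    records = [b"task t1 generated aligned.dat", b"task t2 generated result.dat"]
    tokens = [
        tsa.stamp(records[0], "alice", "2010-01-01T00:00:00Z"),
        tsa.stamp(records[1], "bob", "2010-01-02T00:00:00Z"),
    ]
    print(verify_chain(tokens, records, b"demo-key"))
```

Because each token carries the hash of its predecessor, the relative order of provenance records can be checked independently of the clock value written in any single token, which is the property relevant to resolving conflicting attribution claims.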
References
1. U. Braun, A. Shinnar, and M. Seltzer. Securing Provenance. In Proc. 3rd USENIX Workshop on Hot Topics in Security (HotSec ’08), 2008.
2. L. Gadelha and M. Mattoso. Kairos: An Architecture for Securing Authorship and Temporal Information of Provenance Data in Grid-Enabled Workflow Management Systems. In Proc. 4th IEEE International Conference on e-Science (e-Science 2008), pages 597–602, 2008.
3. S. Haber and W. Stornetta. How to Time-Stamp a Digital Document. Journal of Cryptology, 3(2):99–111, 1991.
4. R. Hasan, R. Sion, and M. Winslett. The Case of the Fake Picasso: Preventing History Forgery with Secure Provenance. In Proc. 7th USENIX Conference on File and Storage Technologies (FAST ’09), pages 1–14, 2009.
5. M. Nagappan and M. Vouk. A Model for Sharing of Confidential Provenance Information in a Query Based System. In Proc. 2nd International Provenance and Annotation Workshop (IPAW 2008), volume 5272 of LNCS, pages 62–69. Springer, 2008.
6. M. Wilde, I. Foster, K. Iskra, P. Beckman, A. Espinosa, M. Hategan, B. Clifford, and I. Raicu. Parallel Scripting for Applications at the Petascale and Beyond. IEEE Computer, 42(11):50–60, November 2009.