ABSTRACT: Provenance is becoming an important issue as a reliable estimator of data quality. However, provenance collection mechanisms in the reservoir engineering domain often result in missing provenance information. In this paper, we address the problem of predicting missing provenance information in reservoir engineering. Based on the observation that data items with specific semantic "connections" may share the same provenance, our approach annotates data items with domain entities defined in a domain ontology and represents these "connections" as sequences of relationships (also known as semantic associations) in the ontology graph. By analyzing annotated historical datasets with complete provenance information, we capture semantic associations that may imply identical provenance. A statistical analysis assigns a confidence value to each discovered association, indicating how far the association can be trusted when it is used for future provenance prediction. The semantic associations, along with their confidence measures, are then used by a voting algorithm to predict the missing provenance information. Our evaluation shows that the average precision of our approach is above 85% when one third of the provenance information is missing.
Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on; 10/2011
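The confidence-weighted voting step described in the abstract above can be illustrated with a minimal sketch. All names and the aggregation scheme here are assumptions for illustration, not the paper's actual algorithm: each semantic association that links the incomplete item to an item with known provenance casts a vote weighted by its confidence, and the provenance value with the highest total wins.

```python
from collections import defaultdict

def predict_provenance(candidates):
    """Predict missing provenance by confidence-weighted voting.

    `candidates` is a list of (provenance_value, confidence) pairs, one per
    semantic association linking the incomplete data item to an item whose
    provenance is known. The value with the highest total confidence wins.
    """
    votes = defaultdict(float)
    for provenance, confidence in candidates:
        votes[provenance] += confidence
    if not votes:
        return None  # no associations found; provenance stays missing
    return max(votes, key=votes.get)

# Example: two associations point to workflow "sim-run-42", one to "import-7".
print(predict_provenance([("sim-run-42", 0.9), ("import-7", 0.95), ("sim-run-42", 0.6)]))
# → sim-run-42
```

Summing confidences rather than counting raw votes lets one strong association outweigh several weak ones, which matches the intuition that higher-confidence associations are more trustworthy predictors.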
ABSTRACT: The growing recognition of the importance of provenance for data-intensive and multidisciplinary domains is leading to the careful collection of provenance. One consequence of this is the proliferation of provenance repositories hosted by individual organizations or communities, with limited ability to reconstruct and query provenance across them. Community standards like the Open Provenance Model (OPM) allow uniform interpretation and exchange of provenance metadata but do not prescribe query or service specifications for accessing provenance. If data reuse and sharing across institutions is not accompanied by passing provenance at the time of data exchange, we need to track and query provenance across distributed provenance repositories. In this article, we present approaches for querying over distributed provenance information and formalize two common provenance query models: the provenance retrieval query and the provenance filter query. Our problem is motivated by Smart Oilfield applications in the energy informatics domain, and we evaluate the performance of our algorithms using synthetic workflows based on the domain.
ABSTRACT: Data management and analysis have become an integral component of reservoir engineering. An important metric that determines the overall effectiveness of data analysis is data quality. Data provenance, the metadata that pertains to the derivation history of data objects, has emerged as an invaluable asset in evaluating data quality. The reservoir facilities and software systems that collect provenance information are often distributed, making it difficult to analyze provenance data. Our primary contribution in this paper is an approach for provenance information integration in reservoir engineering.
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on; 10/2010
ABSTRACT: There has been a recent push towards applying information technology principles, such as workflows, to bring greater efficiency to reservoir management tasks. These workflows are data intensive in nature, and the data is derived from heterogeneous data sources. This has placed an emphasis on the quality and reliability of data that is used in reservoir engineering applications. Data provenance is metadata that pertains to the history of the data and can be used to assess data quality. In this paper, we present an approach for collecting provenance information from application logs in the domain of reservoir engineering. In doing so, we address challenges due to 1) the lack of a workflow orchestration framework in reservoir engineering and 2) the inability of many reservoir engineering applications to collect provenance information. Our approach uses a workflow instance detection algorithm and the Open Provenance Model (OPM) to capture provenance information from the logs.
Seventh International Conference on Information Technology: New Generations, ITNG 2010, Las Vegas, Nevada, USA, 12-14 April 2010; 01/2010
ABSTRACT: The lower barrier to entry for users to create and share resources through applications like Facebook and Twitter, together with the commoditization of social Web data, has heightened issues of privacy, attribution, and copyright. These issues make it important to track the provenance of social Web data. We outline and discuss key engineering, privacy, and monetization challenges in collecting and analyzing the provenance of social Web resources.
Provenance and Annotation of Data and Processes - Third International Provenance and Annotation Workshop, IPAW 2010, Troy, NY, USA, June 15-16, 2010. Revised Selected Papers; 01/2010
ABSTRACT: In the reservoir engineering domain, experts deal with many tasks and operations, ranging from reservoir simulation to well maintenance scheduling, with the goal of maximizing oil production. These operations involve different applications and generate a large number of events at the time of their execution. Understanding these events and the relationships among them can aid domain experts in decision making. The task of aggregating and analyzing the event information generated by different applications can be difficult and time consuming. In this paper, we discuss our approach to modeling events in reservoir engineering using Semantic Web techniques. We create an extensible event ontology using the Web Ontology Language (OWL) to model events in reservoir engineering. Event relationships are encoded using the Semantic Web Rule Language (SWRL). The work presented in this paper is an initial step towards a vision of integrated field event management that uses Semantic Web technologies to create an efficient platform for event analysis.
Seventh International Conference on Information Technology: New Generations, ITNG 2010, Las Vegas, Nevada, USA, 12-14 April 2010; 01/2010
ABSTRACT: The Web 2.0 wave brings, among other aspects, the programmable Web: increasing numbers of Web sites provide machine-oriented APIs and Web services. However, most APIs are described only with text in HTML documents. The lack of machine-readable API descriptions affects the feasibility of tool support for developers who use these services. We propose a microformat called hRESTS (HTML for RESTful Services) for machine-readable descriptions of Web APIs, backed by a simple service model. The hRESTS microformat describes the main aspects of services, such as operations, inputs, and outputs. We also present two extensions of hRESTS: SA-REST, which captures the facets of public APIs important for mashup developers, and MicroWSMO, which provides support for semantic automation.
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on; 01/2009
ABSTRACT: We present work on the spatio-temporal-thematic analysis of citizen-sensor observations pertaining to real-world events. Using Twitter as a platform for obtaining crowd-sourced observations, we explore the interplay between the three dimensions in extracting insightful summaries of observations. We present our experiences in building a Web mashup application, Twitris, which facilitates the spatio-temporal-thematic exploration of social signals underlying events.
Web Information Systems Engineering - WISE 2009, 10th International Conference, Poznan, Poland, October 5-7, 2009. Proceedings; 01/2009
ABSTRACT: In this paper we address the challenges that arise due to heterogeneities across independently created and autonomously managed Web service requesters and Web service providers. Previous work in this area either involved significant human effort or, in the case of efforts seeking to provide largely automated approaches, overlooked the problem of data heterogeneities, resulting in partial solutions that would not support executable workflows for real-world problems. In this paper, we present a planning-based approach to solve both the process heterogeneity and data heterogeneity problems. We adopt a declarative approach to capture the partner specifications external to the process and demonstrate the usefulness of this approach in adding more dynamism to Web processes. Our system successfully outputs a BPEL file which correctly solves a non-trivial real-world problem in the 2006 SWS Challenge.
ABSTRACT: On the newly programmable Web, mashups are flourishing. Designers create mashups by combining components of existing Web sites and applications. Although rapid mashup proliferation offers many opportunities, a lack of standardization and compatibility poses considerable challenges. IBM Sharable Code is an online service platform for developing and sharing situational Web 2.0 applications and mashups. The platform is based on an innovative domain-specific language that streamlines and standardizes the development and deployment of applications consuming and exposing Web APIs. Parts of the DSL and the resulting applications and mashups can be shared and reused by members of the IBM Sharable Code community. In this article, the authors offer an overview of the platform's architecture and the DSL at its core.
IEEE Internet Computing 10/2008
ABSTRACT: Mediation and integration of data are significant challenges because the number of services on the Web, and heterogeneities in their data representation, continue to increase rapidly. To address these challenges we introduce a new measure, mediatability, which is a quantifiable and computable metric for the degree of human involvement in XML schema mediation. We present an efficient algorithm to compute mediatability and an experimental study to analyze how semantic annotations affect the ease of mediating between two schemas. We validate our approach by comparing mediatability scores generated by our system with user-perceived difficulty. We also evaluate the scalability of our system.
Semantic Computing, 2008 IEEE International Conference on; 09/2008
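The mediatability measure described in the abstract above can be sketched in toy form. The scoring below is an assumption for illustration only (the paper defines its own metric and algorithm): score a target schema by the fraction of its fields that can be mapped from the source schema automatically, so a lower score implies more human involvement in mediation.

```python
def mediatability(source_fields, target_fields, matches):
    """Toy mediatability score: the fraction of target-schema fields that
    can be mapped from the source schema without human involvement.

    `matches` maps a target field to the source field an automatic matcher
    resolved it to; unmatched target fields are assumed to need a human.
    This scoring is illustrative, not the paper's formal definition.
    """
    if not target_fields:
        return 1.0  # nothing to mediate
    resolved = sum(1 for f in target_fields if matches.get(f) in source_fields)
    return resolved / len(target_fields)

score = mediatability(
    {"firstName", "lastName", "zip"},
    {"givenName", "familyName", "postalCode", "phone"},
    {"givenName": "firstName", "familyName": "lastName", "postalCode": "zip"},
)
print(round(score, 2))  # → 0.75
```

In this sketch, semantic annotations would enter by enlarging `matches`: the more target fields the annotations let the matcher resolve, the higher the mediatability and the less human effort required.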
[show abstract][hide abstract] ABSTRACT: Services Research Lab at the Knoesis center and the LSDIS lab at University of Georgia have played a significant role in advancing the state of research in the are as of workflow management, semantic Web services and service oriented computing. Starting with theMETEOR workflow management system in the 90's, researchers have addressed key issues in the area o f semantic Web services and more recently, in the domain of RESTful services and Web 2.0. In this article , we present a brief discussion on the various contributions of METEOR-S including SAWSDL, publi cation and discovery of semantic Web services, data mediation, dynamic configuration and adapta tion of Web processes. We finally discuss our current and future research in the area of RESTful servic es.
ABSTRACT: Web application hybrids, popularly known as mashups, are created by integrating services on the Web using their APIs. Support for finding an API is currently provided by generic search engines or domain-specific solutions such as Google and ProgrammableWeb. Shortcomings of both these solutions, including their reliance on user tags, make the task of identifying an API challenging. Since these APIs are described in HTML documents, it is essential to look beyond the boundaries of current approaches to Web service discovery that rely on formal descriptions. In this work, we present a faceted approach to searching and ranking Web APIs that takes into consideration attributes, or facets, of the APIs as found in their HTML descriptions. Our method adopts current research in document classification and faceted search and introduces the serviut score to rank APIs based on their utilization and popularity. We evaluate classification, search accuracy, and ranking effectiveness using available APIs while contrasting our solution with existing ones.
2008 IEEE International Conference on Web Services (ICWS 2008), September 23-26, 2008, Beijing, China; 01/2008
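A utilization-and-popularity ranking of the kind the serviut score performs can be sketched as follows. The weights, normalization, and input signals here are assumptions for illustration; the paper's actual score is defined differently.

```python
def api_score(mashup_count, traffic_rank, max_rank, w_util=0.7, w_pop=0.3):
    """Toy serviut-style score combining utilization (how many mashups use
    the API) with popularity (inverted Web-traffic rank), each normalized
    to [0, 1]. Weights and normalization are illustrative assumptions.
    """
    utilization = mashup_count / (mashup_count + 1)  # saturating in [0, 1)
    popularity = 1 - (traffic_rank - 1) / max_rank   # rank 1 is best
    return w_util * utilization + w_pop * popularity

# Hypothetical APIs: (number of mashups using it, traffic rank of its site).
apis = {"maps-api": (120, 3), "geo-api": (15, 40), "tiny-api": (1, 900)}
ranked = sorted(apis, key=lambda a: api_score(*apis[a], max_rank=1000), reverse=True)
print(ranked)  # → ['maps-api', 'geo-api', 'tiny-api']
```

Combining the two signals lets a heavily reused API outrank a merely well-trafficked one, which is the intuition behind ranking by utilization rather than by search-engine relevance alone.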
ABSTRACT: Services based on the representational state transfer (REST) paradigm, a lightweight implementation of a service-oriented architecture, have found even greater success than their heavyweight siblings, which are based on the Web Services Description Language (WSDL) and SOAP. By using XML-based messaging, RESTful services can bring together discrete data from different services to create meaningful data sets; mashups such as these are extremely popular today.
IEEE Internet Computing 12/2007
ABSTRACT: The evolution of the Web 2.0 phenomenon has led to the increased adoption of the RESTful services paradigm. RESTful services often take the form of RSS/Atom feeds and AJAX-based lightweight services. The XML-based messaging paradigm of RESTful services has made it possible to compose various services together. Such compositions of RESTful services are widely referred to as mashups. In this paper, we outline the limitations of current approaches to creating mashups. We address these limitations by proposing a framework called SA-REST, which adds semantics to RESTful services. Our proposed framework builds upon the original ideas in WSDL-S, our W3C submission, which was subsequently adapted for Semantic Annotation of WSDL (SAWSDL), now a W3C proposed recommendation. We demonstrate the use of microformats for semantic annotation of RESTful services, and then the use of such semantically enabled services, with better support for interoperability, for creating dynamic mashups called SMashups.
Semantic Computing, 2007. ICSC 2007. International Conference on; 10/2007
ABSTRACT: Traditional research in service composition has assumed perfect functional matching of service capabilities against stated requirements. In real life, however, this is a myth, as borne out in customer engagements within several SOA development and deployment organizations such as IBM. In particular, variations in data, functional, and non-functional requirements present a serious hurdle to reusing existing available services and creating service compositions at run-time. Current research in semantic Web services seeks to address this problem by creating meta-models that capture the domain and later grounding the requirements and capabilities to this meta-model. Our research project, Variation-Oriented Service Composition and Adaptation (VOSCA), investigates the key research issues at modeling-time and run-time described above. Before the VOSCA vision can be realized, however, some basic notions need to be defined and clearly articulated. Hence this paper provides an initial first step by focusing on defining service variants and providing algorithms for service matching.
Services Computing, 2007. SCC 2007. IEEE International Conference on; 08/2007
ABSTRACT: We propose a semantic framework for automatically identifying events as a step towards developing an adaptive middleware for Service Oriented Architecture (SOA). Current related research focuses on adapting to events that violate certain non-functional objectives of the service requestor. Given the large number of events that can happen during the execution of a service, identifying events that can impact the non-functional objectives of a service request is a key challenge. To address this problem we propose an approach that allows service requestors to create semantically rich service requirement descriptions, called semantic templates. We propose a formal model for expressing semantic templates and for measuring the relevance of an event to both the action being performed and the non-functional objectives. This model is extended to adjust the relevance of events based on feedback from the underlying adaptation framework. We present an algorithm that utilizes multiple ontologies for identifying relevant events, and present evaluations that measure the efficiency of both the event identification and the subsequent adaptation scheme.
Web Services, 2007. ICWS 2007. IEEE International Conference on; 08/2007
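The idea of measuring an event's relevance to a semantic template, as described in the abstract above, can be sketched with a toy ontology lookup. The concept names, the ancestor map, and the overlap-based scoring are all illustrative assumptions, not the paper's formal model.

```python
def relevance(event_concepts, template_concepts, ontology):
    """Toy relevance measure: the fraction of a template's concepts that an
    event's concepts cover, where a concept is covered if it annotates the
    event directly or is an ontological ancestor of an event concept.

    `ontology` maps each concept to its set of ancestor concepts.
    """
    def covers(template_c, event_c):
        return event_c == template_c or template_c in ontology.get(event_c, set())

    hits = sum(1 for t in template_concepts
               if any(covers(t, e) for e in event_concepts))
    return hits / len(template_concepts)

# Hypothetical ontology fragment and a template's non-functional objectives.
ontology = {"ServerCrash": {"Failure", "Event"},
            "HighLatency": {"PerformanceIssue", "Event"}}
template = {"Failure", "PerformanceIssue"}
print(relevance({"ServerCrash"}, template, ontology))  # → 0.5
```

An adaptation framework could then act only on events whose relevance exceeds a threshold, and feedback from adaptation outcomes could raise or lower the effective scores over time, mirroring the feedback loop the abstract describes.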
ABSTRACT: Business processes in the global environment increasingly encompass multiple partners and complex, rapidly changing requirements. In this context it is critical that strategic business objectives align with, and map accurately to, systems that support flexible and dynamic business processes. To support the demanding requirements of global business processes, we propose a comprehensive, unifying 4 × 4 Semantic Model that uses Semantic Templates to link four tiers of implementation with four types of semantics. The four tiers are the Business Process Tier, the Workflow Enactment Tier, the Partner Services Tier, and the Middleware Services Tier. The four types of semantics are Data Semantics, Function Semantics, Non-functional Semantics, and Execution Semantics. Our model encompasses services architectures that include enterprise-class WSDL-based Web services as well as the lightweight but broadly used REST-based services.
Enterprise Information Systems, 9th International Conference, ICEIS 2007, Funchal, Madeira, June 12-16, 2007, Revised Selected Papers; 01/2007
ABSTRACT: Web service composition has quickly become a key area of research in the services-oriented architecture community. One of the challenges in composition is the existence of heterogeneities across independently created and autonomously managed Web service requesters and Web service providers. Previous work in this area either involved significant human effort or, in the case of efforts seeking to provide largely automated approaches, overlooked the problem of data heterogeneities, resulting in partial solutions that would not support executable workflows for real-world problems. In this paper, we present a planning-based approach to solve both the process heterogeneity and data heterogeneity problems. Our system successfully outputs a BPEL file which correctly solves a non-trivial real-world problem in the 2006 SWS Challenge.
ICEIS 2007 - Proceedings of the Ninth International Conference on Enterprise Information Systems, Volume SAIC, Funchal, Madeira, Portugal, June 12-16, 2007; 01/2007