J. Steven Hughes

California Institute of Technology | CIT · Jet Propulsion Laboratory

MS Computer Science

About

87
Publications
7,628
Reads
380
Citations
Citations since 2017
7 Research Items
94 Citations
Introduction
John Steven Hughes is a Principal Computer Scientist at the Jet Propulsion Laboratory and has worked on NASA’s Planetary Data System since its inception in roles ranging from project engineer to chief architect of the PDS4 Information Model. Other tasks at JPL include information model architect for the JPL Enterprise 2.0 project and editor of the Open Archival Information System (OAIS) Interoperability Framework CCSDS Blue Book. He is also ISO certified to audit and certify trustworthy digital repositories under ISO 16363:2012 and is a co-author of the OAIS Reference Model - ISO 14721:2012. He was awarded the NASA Exceptional Public Service Medal for exceptional service to NASA science missions and data archives, architecting and implementing data intensive systems, information models, and ontologies for three decades.
Additional affiliations
June 1983 - present
California Institute of Technology
Position
  • Principal Computer Scientist
Description
  • Information models and ontologies for model-driven information system architectures; Associate member of Jet Propulsion Laboratory’s Center for Data Science and Technology; Member of the Primary Trusted Digital Repository Accreditation Board (PTAB)

Publications (87)
Conference Paper
Full-text available
The aim of this paper is to describe and explain the most significant updates which have been made to the version 2 OAIS [1], which was published in 2012, from the point of view of the authors, who have all been deeply involved with the revision. These updates resulted in a draft which, at the time of writing, is the text to be submitted for Consul...
Article
Full-text available
The Planetary Data System has developed the PDS4 Information Model to enable interoperability across diverse science disciplines. The Information Model is based on an integration of International Organization for Standardization (ISO) level standards for trusted digital archives, information model development, and metadata registries. Where control...
Article
We describe here the parallels in astronomy and earth science datasets, their analyses, and the opportunities for methodology transfer from astroinformatics to geoinformatics. Using the example of hydrology, we emphasize how meta-data and ontologies are crucial in such an undertaking. Using the infrastructure being designed for EarthCube - the Virtual...
Conference Paper
Full-text available
The Planetary Data System (PDS) recently released Version 1.5 of the PDS4 Information Model, the primary component of the PDS4 Information Architecture. The Information Model is now stable and is in use by three active missions and several missions in various phases of development. The Information Model drives the PDS4 Information System using a mu...
Conference Paper
Full-text available
Research has shown that the amount of data now available often overwhelms key functions of an information system. This situation necessitates the design of information architectures that scale to meet the challenges. The Planetary Data System, a NASA funded project, has developed an information architecture for the planetary science community that...
Conference Paper
Full-text available
The goal of the Planetary Data System (PDS) is the digital preservation of scientific data for long-term use by the scientific research community. After two decades of successful operation, the PDS found itself in a new era of big data, international cooperation, distributed nodes, and multiple ways of analysing and interpreting data. A project was...
Conference Paper
Full-text available
The Open Archival Information System (OAIS) Reference Model, published as ISO 14721, has been adopted as the “de facto” standard for systems that preserve data. ISO 16363, the standard for Audit And Certification Of Trustworthy Digital Repositories, is based on ISO 14721 and contains the criteria for auditing various kinds of repositories in terms...
Conference Paper
Full-text available
The goal of the Planetary Data System (PDS) is the digital preservation of scientific data for long-term use by the scientific research community. After two decades of successful operation, the PDS found itself in a new era of big data, international cooperation, distributed nodes, and multiple ways of analysing and interpreting data. A project was...
Article
Full-text available
PDS4 represents the new planetary data archive. Migration of the Mars Phoenix data to PDS4 tests improved data user access under the new standard.
Article
The International Planetary Data Alliance (IPDA) is an international collaboration of space agencies with a mission of providing access to scientific data returned from solar system missions archived at international data centers. In order to improve access and share scientific data, the IPDA was founded to develop data and software standards. The...
Article
Full-text available
The Planetary Data System has just released the PDS4 system for first use. Its architecture comprises three principal parts: an ontology that captures knowledge from the planetary science domain, a federated registry/repository system for product identification, versioning, tracking, and storage, and a REST-based service layer for search, r...
Article
The International Planetary Data Alliance (IPDA) and the Planetary Data System (PDS) are working toward a next-generation system based on the PDS4 standard.
Article
Beta testing of the PDS4 websites for delivering the Mars Phoenix Lander atmospheric data has been reviewed by external reviewers for content and usability.
Conference Paper
For the past decade, the NASA Jet Propulsion Laboratory, in collaboration with Dartmouth University has served as the center for informatics for the Early Detection Research Network (EDRN). The EDRN is a multi-institution research effort funded by the U.S. National Cancer Institute (NCI) and tasked with identifying and validating biomarkers for the...
Article
The IPDA is an international collaboration of space agencies with a mission of providing access to scientific data returned from solar system missions archived at international data centers. In order to improve access and share scientific data, the IPDA was founded to develop data and software standards. The IPDA has focused on promoting standards...
Article
Full-text available
Introduction: The NASA Planetary Data System (PDS) is the distributed system of discipline nodes responsible for the archive of all planetary data acquired by robotic missions, manned missions, and observational campaigns through ground/space-based observation systems. Beginning late in 2012, the PDS will be publicly moving from version 3 to versio...
Article
The NASA Planetary Data System, beginning late in 2012, will be publicly moving from version 3 to version 4 of the archive. Maintaining data integrity and accessibility for past archived data is important to user confidence under the modernized system.
Article
Full-text available
Beginning late in 2012, the PDS will be moving from version 3 to 4 of its archival system. The first two missions to archive under the new system will be LADEE and MAVEN. These missions will exercise the new standards and aid in development of PDS4.
Article
To provide a framework for comparing and understanding open source software at NASA, the authors describe a set of relevant dimensions and decision points that NASA and other government agencies can use in formulating an open source strategy.
Article
Scientific discovery is largely a collaborative endeavor. From the design and execution of earth and planetary science missions to the evaluation of biomarkers that can identify a particular predisposition to cancer, the scientific community increasingly depends on multi-institutional collaboration as a key enabler of the discovery process. Science...
Article
Full-text available
A shared information model is vital for enabling correlative science, data system interoperability, and effective cross-discipline search. The use of common terminology enables scientists to communicate more precisely about their data and machines to inter-operate at levels far above the simple exchange of data structures. Furthermore research has...
Article
Science data digital repositories are entrusted to ensure that a science community's data are available and useful to users both today and in the future. Part of the challenge in meeting this responsibility is identifying the standards, policies and procedures required to accomplish effective data preservation. Subsequently a repository should be e...
Article
The Planetary Data System (PDS) has undertaken an effort to overhaul the PDS data architecture (including the data model, data structures, data dictionary, etc.) and to deploy an upgraded software system (including data services, distributed data catalog, etc.) that fully embraces the PDS federation as an integrated system while taking advantage of...
Chapter
Full-text available
Data-intensive software is increasingly prominent in today’s world, where the collection, processing, and dissemination of ever-larger volumes of data has become a driving force behind innovation in the early twenty-first century. The trend towards massive data manipulation is broad-based, and case studies can be examined in domains from politics,...
Article
Full-text available
The Planetary Data System (PDS) is in the midst of a major upgrade to its system. This upgrade is a critical modernization of the PDS as it prepares to support the future needs of both the mission and scientific community. It entails improvements to the software system and the data standards, capitalizing on newer, data system approaches. The upgra...
Chapter
Full-text available
The twenty-first century has transformed the world of science by breaking the physical boundaries of distributed organizations and interconnecting them into virtual science environments, allowing for systems and systems of systems to seamlessly access and share information and resources across highly geographically distributed areas. This e-science...
Article
Capturing, sharing, and publishing cancer biomarker research data are all fundamental challenges of enabling new opportunities to research and understand scientific data. Informatics experts from the National Cancer Institute's (NCI) Early Detection Research Network (EDRN) have pioneered a principled informatics infrastructure to capture and dissem...
Article
Software reuse has traditionally been a challenging proposition. While the allure of reusing software has great appeal to increasing stability and reducing software costs, there has been limited success in building software that can be efficiently reused. In many cases, reuse is limited to the reuse of software expertise or repurposing existing sof...
Article
The computing and storage demands force to optimize and manage complex and often conflicting software engineering challenges. Several domain-specific, independent software solutions have been developed to manage large amounts of data, including grid computing platforms, specifically data-grid software packages such as the Globus Toolkit, DSpace, a...
Article
Full-text available
In the early 1980s NASA established an advisory committee, the Planetary Science Data Steering Group. Advice from this group, along with that from the National Academies Committee on Data Management and Computation (CODMAC, 1982), established the framework of the Planetary Data System (PDS). The PDS was established in 1989 as a distributed system to...
Article
The advent of the Web and languages such as XML have brought an explosion of online science data repositories and the promises of correlated data and interoperable systems. However there have been relatively few successes in meeting the expectations of science users in the internet age. For example a Google-like search for images of Mars will retur...
Article
Experience suggests that no single search paradigm will meet all of a community’s search requirements. Traditional forms-based search is still considered critical by a significant percentage of most science communities. However, text-based and facet-based search are improving the community’s perception that search can be easy and that the data is ava...
Conference Paper
Full-text available
Scientific digital libraries serve complex and evolving research communities. Justifications for the development of scientific digital libraries include the desire to preserve science data and the promises of information interconnectedness, correlative science, and system interoperability. Shared ontologies are fundamental to fulfilling these promi...
Conference Paper
Full-text available
The dramatic increase in data in the area of cancer research has elevated the importance of effectively managing the quality and consistency of research results from multiple providers. The U.S. National Cancer Institute's Early Detection Research Network (EDRN) is a prime example of a virtual organization, sponsoring distributed, collaborative wor...
Article
The Planetary Data System (PDS) information model has been captured in an ontology based tool framework. A generated specification document now provides a basis for improving the PDS standards for use both within PDS and internationally.
Conference Paper
Full-text available
Scientific digital libraries serve complex and evolving research communities. Justifications for the development of scientific digital libraries include the desire to preserve science data and the promises of information interconnectedness, correlative science, and system interoperability. Research (1) suggests single shared ontologies are fundamen...
Conference Paper
Full-text available
Modern research requires collaboration among geographically distributed scientists. This collaborative model is transforming scientific discovery by enabling sharing and validation of data across institutions. Informatics infrastructures are being developed to support cancer research, endowing scientists with the ability to capture and share data w...
Article
Full-text available
The International Planetary Data Alliance (IPDA) is an international organization with a purpose of developing compatible archives for the capture, management and distribution of planetary science data and results. With the increasing internationalization of planetary science missions, the IPDA is focusing on developing both data and technical stan...
Article
Full-text available
The Planetary Data System (PDS) information model is a mature but complex model that has been used to capture over 30 years of planetary science data for the PDS archive. As the de-facto information model for the planetary science data archive, it is being adopted by the International Planetary Data Alliance (IPDA) as their archive data standard. H...
Article
The Information Model is the foundation on which an Information System is built. It defines the entities to be processed, their attributes, and the relationships that add meaning. The development and subsequent management of the Information Model is the single most significant factor for the development of a successful information system. A framewo...
Article
Archiving scientific data has undisputed value for current scientists and for future generations. Groups of all types have been preserving data in its original form. As we begin to move to the next generation of digital archives we are finding that bringing data back from the archive has a delayed cost associated with it that is bo...
Article
The International Planetary Data Alliance (IPDA) is a joint effort by national space exploration agencies, research institutions, and universities to establish archive standards that make it easier to share data across international boundaries.
Article
Full-text available
A goal of the International Planetary Data Alliance (IPDA) is to develop a set of archive data standards that enable the sharing of planetary science data across international agencies. To help achieve this goal, the IPDA steering committee initiated a six month project to write requirements for and draft an information model based on the Planetary...
Conference Paper
Informatics in biomedicine is becoming increasingly interconnected via distributed information services, interdisciplinary correlation, and cross-institutional collaboration. Partnering with NASA, the Early Detection Research Network (EDRN), a program managed by the National Cancer Institute, has been defining and building an informatics architectur...
Article
Full-text available
The Reference Architecture for Space Information Management (RASIM) suggests the separation of the data model from software components to promote the development of flexible information management systems. RASIM allows the data model to evolve independently from the software components and results in a robust implementation that remains viable as t...
Article
Full-text available
We describe a reference architecture for space information management systems that elegantly overcomes the rigid design of common information systems in many domains. The reference architecture consists of a set of flexible, reusable, independent models and software components that function in unison, but remain separately managed entities. The mai...
Conference Paper
Full-text available
Modern scientific research is increasingly conducted by virtual communities of scientists distributed around the world. The data volumes created by these communities are extremely large, and growing rapidly. The management of the resulting highly distributed, virtual data systems is a complex task, characterized by a number of formidable technical...
Article
The Semantic SPASE (Space Physics Archive Search and Extract) prototype demonstrates the use of semantic web technologies to capture, document, and manage the SPASE data model, support facet- and text-based search, and provide flexible and intuitive user interfaces. The SPASE data model, under development since late 2003 by a consortium of space ph...
Article
Full-text available
Successful resource discovery across heterogeneous repositories is strongly dependent on the semantic and syntactic homogeneity of the associated resource descriptions. Ideally, resource descriptions are easily extracted from pre-existing standardized sources, expressed using standard syntactic and semantic structures, and managed and acce...
Article
The science data archived by the Planetary Data System (PDS) from all planetary exploration missions prior to the 2001 Mars Odyssey mission totals approximately 5 terabytes. The currently active Mars Odyssey mission is expected to double this volume and the 2006 Mars Reconnaissance Orbiter (MRO) is expected to increase the resulting volume by a fact...
Article
Full-text available
This paper will describe several efforts now proposed to address this challenge while providing reusable, customized, scalable, and remotely deployable software packages that will meet the data-intensive requirements of the MRO era and beyond.
Conference Paper
Full-text available
The sheer amount of data produced by modern science research has created a need for the construction and understanding of "data-intensive systems", large-scale, distributed systems which integrate information. The formal nature of constructing such software systems, however, is relatively unstudied, and has been a large focus of the super-computing...
Article
Science research generates an enormous amount of data that is located in geographically distributed data repositories. The data generated by these efforts are often captured and managed without reference to any standard principles of information architecture. Interoperability and efficient search and retrieval of data products across disparate data...
Article
As the volume and diversity of science data sets continues to expand, the ability to share and correlate data across distributed heterogeneous repositories remains a serious challenge. An architectural data grid framework has been developed by the Object Oriented Data Technology (OODT) project to address this challenge and has been successfully dep...
Article
Full-text available
The Planetary Data System (PDS) Distribution Subsystem (PDS-D) provides on-demand, web-based search, retrieval, and distribution of science data products from a loosely coupled collection of distributed heterogeneous data repositories that comprise the PDS archive. The development of PDS-D was initiated when it was realized that it would be cost pr...
Article
How do you provide more than 350 scientists and researchers access to data from every instrument in Odyssey when the data is curated across half a dozen institutions and in different formats and is too big to mail on a CD-ROM anymore? The Planetary Data System (PDS) faced this exact question. The solution was to use a metadata-based middleware fram...
Conference Paper
Knowledge discovery and data correlation require a unified approach to basic data management. However, achieving such an approach is nearly impossible with hundreds of disparate data sources, legacy systems and data formats. This problem is pervasive in the space science community where data models, taxonomies and data management systems are locall...
Conference Paper
Full-text available
Knowledge discovery and data correlation require a unified approach to basic data management. However, achieving such an approach is nearly impossible with hundreds of disparate data sources, legacy systems, and data formats. This problem is pervasive in the biomedical research community where data models, taxonomies, and data management systems ar...
Article
Full-text available
Science missions and instruments continue to produce volumes of useful data and scientists depend on the data systems and tools that archive this data as a means to access and analyze it. These existing legacy systems do not interoperate well, and scientists must access each data system and its corresponding science data independently through tools...
Conference Paper
Full-text available
The Planetary Data System (PDS) is an active science data archive consisting of approximately five terabytes of peer-reviewed science data stored on compact disk media. The PDS standards require that the meta-data necessary for understanding the context under which the data were collected be included as part of the archive. The distributed inventor...
Conference Paper
Full-text available
The Planetary Data System (PDS) is an active science data archive managed by scientists for NASA's planetary science community. With the advent of the World Wide Web the majority of the archive has been placed on-line as a science digital library for access by scientists, the educational community, and the general public. The meta-data in this arch...
Conference Paper
Full-text available
The Astrophysics Data System (ADS) provides access to astronomical literature through a sophisticated search engine. Over 10,000 users retrieve almost 5 million references and read more than 25,000 full text articles per month. ADS cooperates closely with all the main astronomical journals and data centers to create and maintain a state-of-the-art...
Conference Paper
Full-text available
The Planetary Data System (PDS) Data Model was presented at the Twelfth IEEE Symposium on Mass Storage Systems. This data model is used to represent the meta-data that describes entities within the Planetary Science Community such as data sets, spacecraft, and targets, as well as the archive data products produced by the PDS such as images, spe...
Article
The Planetary Data System (PDS) requires that all science data to be included in its archive be accompanied by science metadata. Science metadata has been defined as the data about science data “that encapsulates information such as: who did what and when, device characteristics, transform definition, documentation, citations, and structure”. “I...
Conference Paper
The authors describe the object-based data model in use by the Planetary Data System (PDS). This model, based on the Object Description Language (ODL) and consisting of keyword/value statements, is both human- and computer-readable and has been used to label large volumes of science data being prepared for CD-ROM archive volumes. The use of ODL for...
Article
In this paper, we describe a prototype for CD-ROM volume design and verification. This prototype allows users to create their own model of CD volumes by modifying a prototypical model. Rule-based verification of the test volumes can then be performed later on against the volume definition. This working prototype has proven the concept of model-driv...
Article
Full-text available
The purpose of this document is to detail a study of concurrent processing by the Concurrent Processing Subgroup of the U.S. Army Intelligence Center and School (USAMS) task in association with the Advanced Computing Research Facility (ACRF) at Argonne National Laboratory (ANL). The study centered on the effect of different concurrent architectures...
Article
This paper presents some theoretical considerations for the detection of hazards in combinational switching systems. The application of fuzzy models to hazard detection in binary systems is discussed, and this leads to methods of detection for multiple zero and one hazards. Finally, a method for theorem proving applicable to the detection method is...
Article
Full-text available
In this paper, we present a preliminary study of several different electronic data movement technologies. We detail our approach to classifying the technologies included in our study and present the preliminary results of some initial performance benchmarking. Our studies suggest that highly parallel TCP/IP streaming technologies, such as GridFTP a...
Article
Full-text available
backend data systems to plug in and share their data with frontend systems providing access. The middleware encapsulates differing representations, formats, locations, and meanings of data, making data interoperable and relieving research... New instruments and communications techniques have given scientists unprecedented amounts of data, more than...
Article
Full-text available
The Planetary Data System (PDS) data model was developed in the late 1980s and models the entities and relationships of interest within the Planetary Science Community. It was developed to both prescribe the metadata to be collected for the planetary science data archive and to design the PDS Catalog, a high level inventory of the data holdings in...
Article
A domain ontology can be used to drive the development of a science information system and enable system interoperability and science data correlation. A domain ontology defines the data structures, the metadata for the science interpretation of the data, and the metadata that describes the context within which the data was captured, processed, and...