
Simon Jonathan David Cox
The Commonwealth Scientific and Industrial Research Organisation | CSIRO · Division of Land and Water
PhD
About
205 Publications
36,676 Reads
4,640 Citations
Additional affiliations
January 2013 - present
January 2013 - October 2015
July 2010 - December 2012
Education
September 1981 - March 1987
October 1980 - September 1981
September 1977 - May 1980
Publications (205)
The joint W3C (World Wide Web Consortium) and OGC (Open Geospatial Consortium) Spatial Data on the Web (SDW) Working Group developed a set of ontologies to describe sensors, actuators, samplers as well as their observations, actuation, and sampling activities. The ontologies have been published both as a W3C recommendation and as an OGC implementat...
ISO 19156:2011 defines a conceptual schema for observations, and for features involved in sampling when making observations. These provide models for the exchange of information describing observation acts and their results, both within and between different scientific and technical communities. Observations commonly involve sampling of an ultimate...
We have developed an OWL ontology for the geologic timescale, derived from a Unified Modeling Language (UML) model that formalized the practice of the International Commission for Stratigraphy (ICS) [Cox & Richard, 2005]. The UML model followed the ISO/TC 211 modeling conventions, and was the basis for an XML implementation that was integrated into...
There is increasing interest in shared geospatial information spaces. In the context of Spatial Data Infrastructures (SDI), heterogeneity of legacy systems and variety in existing standards are central challenges. Linked Data has been suggested as a potential solution in both cases. In this paper, we explain why the Linked Data approach provides n...
The Spatial Information Services Stack Vocabulary Service (SISSVoc) is a Linked Data API for accessing published vocabularies. SISSVoc provides a RESTful interface via a set of URI patterns that are aligned with SKOS. These provide a standard web interface for any vocabulary which uses SKOS classes and properties. The SISSVoc implementation provide...
This Good Practice Guidance (GPG) document provides guidance on how to calculate the extent of land degradation for reporting on United Nations (UN) Sustainable Development Goal (SDG) Indicator 15.3.1: the proportion of land that is degraded over total land area. This guidance supports implementation of the Tier I methods for Indicator 15.3.1 adopt...
We present ten simple rules that support converting a legacy vocabulary—a list of terms available in a print-based glossary or in a table not accessible using web standards—into a FAIR vocabulary. Various pathways may be followed to publish the FAIR vocabulary, but we emphasise particularly the goal of providing a globally unique resolvable identif...
We present ten simple rules that support converting a legacy vocabulary -- a list of terms available in a print-based glossary or table not accessible using web standards -- into a FAIR vocabulary. Various pathways may be followed to publish the FAIR vocabulary, but we emphasise particularly the goal of providing a distinct IRI for each term or con...
This report presents the results of an international scientific and technical process convened by the International Science Council (ISC) and the UN Office for Disaster Risk Reduction (UNDRR). The aim of the study is to define and describe hazards in order to facilitate more effective disaster risk management. The ultimate aim of the process is to...
eReefs is a comprehensive interoperable information platform that has been developed for the Great Barrier Reef (GBR) region to provide users with access to improved environmental intelligence allowing them to assess past, present, and future conditions, as well as management options to mitigate the risks associated with multiple and sometimes comp...
Interoperability across multiple scientific disciplines has generally had limited success, as most standards are specific to a single community or discipline. However, many components of scientific data are common to multiple disciplines, and if these can be identified and leveraged then valuable foundations are laid for cross-discipline interopera...
In Australia, USA and Europe, Earth and environmental science research communities are developing best practices for data management/stewardship, cyberinfrastructure development, vocabularies and common data services. Major initiatives include: AuScope, Earth Science Information Partners (ESIP), European Plate Observing System (EPOS), and EarthCub...
The accuracy of GPS measurements is now down to centimetres. Australia is sitting on one of the fastest moving plates on earth, and positions measured in the past are degrading. New technologies and techniques now enable more precise and accurate measurements of 'where' our scientific observations are made, whilst multidisciplinary sensors can reco...
The Sensor, Observation, Sample, and Actuator (SOSA) ontology provides a formal but lightweight general-purpose specification for modelling the interaction between the entities involved in the acts of observation, actuation, and sampling. SOSA is the result of rethinking the W3C-XG Semantic Sensor Network (SSN) ontology based on changes in scope and...
Australia currently ranks highly according to measures of existing Open Data indices. However, those measures tend to be qualitative and focus on open government data initiatives. They also generally exclude research data. With the increasing amounts of data being published online – through the various government and research initiatives – a c...
The Sensor, Observation, Sample, and Actuator (SOSA) ontology provides a formal but lightweight general-purpose specification for modeling the interaction between the entities involved in the acts of observation, actuation, and sampling. SOSA is the result of rethinking the W3C-XG Semantic Sensor Network (SSN) ontology based on changes in scope and...
Life sciences research, and even more specifically biodiversity sciences research, has yet to coalesce on a single system of identifiers for specimens (physical samples collected for research) or even a single set of standards for identifiers. Diverse identifier systems lead to duplication and ambiguity, which in turn lead to challenges in finding...
International Symposium on Linking Environmental Data and Samples; Canberra, Australia, 29 May to 2 June 2017
The Semantic Sensor Network (SSN) ontology is an ontology for describing sensors and their observations, the involved procedures, the studied features of interest, the samples used to do so, and the observed properties, as well as actuators. SSN follows a horizontal and vertical modularization architecture by including a lightweight but self-contai...
The OzNome initiative is seeking to connect information infrastructures across Australia and enable researchers, industry and key partners to achieve productivity gains around their discovery, access and use of data. While its origins are in earth and environmental data, the intended scope is more comprehensive. Tools and methods developed through...
Australia is currently ranked 2nd in the OKFN Global Open Data Index and, since 2013, over 7000 datasets have been published through data.gov.au. Increasing amounts of data are also being published through state-based open data initiatives via data portals such as data.nsw.gov.au and data.vic.gov.au. Recently thematic or agency based...
Physical samples are important resources for sample-based data reuse. They may be utilized in the reproduction of scientific findings, depending on their availability and accessibility. Although several solutions have been developed to curate and publish digital collections (e.g., publications and datasets), considerably less attention has been pai...
The process of sampling, observing and analyzing physical samples is not unique to the geosciences. Physical sampling (taking specimens) is a fundamental strategy in many natural sciences, typically to support ex-situ observations in laboratories with the goal of characterizing real-world entities or populations. Observations and measurements are m...
Geophysical data communities are publishing large quantities of data across a wide variety of scientific domains which are overlapping more and more. Whilst netCDF is a common format for many of these communities, it is only one of a large number of data storage and transfer formats. One of the major challenges ahead is finding ways to leverage the...
We introduce new OWL ontologies for observations and sampling features, based on the O&M conceptual model from OGC and ISO 19156. Previous efforts, (a) through the W3C SSN project, and (b) following ISO rules for conversion from UML, had dependencies on elaborate pre-existing ontologies and frameworks. The new ontologies, known as om-lite and sam-l...
http://www.scidatacon.org/2016/sessions/103/paper/172/ We have used a generic registry platform for maintenance and publication of a variety of controlled vocabularies. These range from subject classifiers used for tagging, through glossaries of technical terms appearing in a set of reports, to highly structured technical vocabularies. The registry...
(Keynote paper, opening plenary) Controlled vocabularies are required to disambiguate, and provide definitions for, symbols appearing in scientific datasets, such as units of measure, observed properties, or analytes. Key vocabularies have either not been available, or have been published in forms that don’t allow reference to individual definition...
http://www.scidatacon.org/2016/sessions/37/paper/197/ Our experiences in the design and promulgation of a number of ‘standard’ information models have identified the impact of existing community arrangements on the level of engagement, buy-in and shared visions, the rate of technical development, model complexity, and the ease of the adoption proces...
This report describes a project to develop and document a prototype national information model for vegetation site data in collaboration with key stakeholders from across Australia. The report provides a description of the rationale for the project, the engagement process, the model developed, together with recommendations for the Working Group on...
The OWL-Time ontology is an OWL-2 DL ontology [owl2-direct-semantics] of temporal concepts, for describing the temporal properties of resources in the world or described in Web pages. The ontology provides a vocabulary for expressing facts about topological relations among instants and intervals, together with information about durations, and about...
The Australian National Computational Infrastructure (NCI) manages Earth Systems data collections sourced from several domains and organisations onto a single High Performance Data (HPD) Node to further Australia’s national priority research and innovation agenda. The NCI HPD Node has rapidly established its value, currently managing over 10 PBytes...
Geochemistry observations typically follow a complex specimen preparation process after field sampling. Details of this are required to support assessment of the reliability of the data produced, and to ensure reproducibility. We have applied W3C PROV in describing complex retrieval, processing and observation processes associated with physical spe...
We have extended OWL-Time to support the encoding of temporal position in a range of reference systems, in addition to the Gregorian calendar and conventional clock. Two alternative implementations are provided: as a pure extension of OWL-Time, or as a replacement, both of which preserve the same representation for the cases originally supported by...
Many vocabularies and enumerations are used in water data applications, such as lists of instruments and procedures, observed parameters, units of measure, indicators, interpolation methods, censored values, materials, media, phase, feature-types and feature instances. Dataset interoperability is significantly assisted when these are shared by publ...
The hydro-informatics community has been at the forefront of international standards for the exchange of timeseries observational data, culminating in the development of WaterML 2.0 Part 1: Timeseries as a conceptual model and XML exchange format for the exchange and representation of hydrological timeseries observations. Recent work has delivered...
Persistent identifiers are an integral part of the Semantic Web and Linked Data applications: they enable the stable identification of digital objects and may be used as a top-level application programming interface (API) to bind multiple representations of digital objects into a single, coherent, data model. In addition to these technical tasks, p...
A number of models for observation metadata have been developed in the earth and environmental science communities, including OGC’s Observations and Measurements (O&M), the ecosystems community’s Extensible Observation Ontology (OBOE), the W3C’s Semantic Sensor Network Ontology (SSNO), and the CUAHSI/NSF Observations Data Model v2 (ODM2). In order...
This Discussion Paper specifies a potential OGC Candidate Standard for a JSON implementation of the OGC and ISO Observations and Measurements (O&M) conceptual model (OGC Observations and Measurements v2.0 also published as ISO/DIS 19156). This encoding is expected to be useful in RESTful implementations of observation services. More specifically, t...
Geochemistry observations typically follow a complex preparation process after sample retrieval from the field. Description of these is required to allow readers and other data users to assess the reliability of the data produced, and to ensure reproducibility. While laboratory notebooks are used for private record-keeping, and laboratory information...
The PROV data model is becoming accepted as a flexible and robust tool for formalizing information relating to the production of documents and datasets. Provenance stores based on the PROV-O implementation are appearing in support of scientific data workflows. However, the scope of PROV does not have to be limited to digital or information assets....
Water observation data is a key element of a water resources information system as it is commonly used in national reports, environmental impact assessments and other analysis or modeling applications. A data standard is vital to support replication, synchronization and delivery for these applications. An international standard encoding for transfe...
Shared vocabularies are a key element in geoscience data interoperability. Many organizations curate vocabularies, with most Geologic Surveys having a long history of development of lexicons and authority tables. However, their mode of publication is heterogeneous, ranging from PDFs and HTML web pages, spreadsheets and CSV, through various user-int...
netCDF is a well-known and widely used format to exchange array-oriented scientific data such as grids and time-series. We describe a new convention for encoding netCDF based on Linked Data principles called netCDF-LD. netCDF-LD allows metadata elements, given as string values in current netCDF files, to be given as Linked Data objects. netCDF-LD a...
Point time series are a key data-type for the description of real or modelled environmental phenomena. Delivering this data in useful ways can be challenging when the data volume is large, when computational work (such as aggregation, subsetting, or re-sampling) needs to be performed, or when complex metadata is needed to place data in context for...
The Foundation Spatial Data Framework (FSDF) is an Australia and New Zealand Land Information Council (ANZLIC) initiative that aims to deliver national coverage of the most current, authoritative source of fundamental spatial data. To achieve this outcome, both technical and social challenges must be addressed. The most critical issue is integrati...
Executive Summary We describe a number of issues with standards for geospatial metadata. 1. The ISO/ANZLIC metadata standard was designed primarily by map and image data managers, and reflects a provider-centric viewpoint. The metadata record targets a level of aggregation corresponding to traditional map, image or 'dataset' series, and does not sc...
A data modelling framework for production of national foundational spatial data.
For environmental datasets to be used effectively via the Internet, they must present standardized data and metadata services and link the two. The Open Geospatial Consortium's (OGC) web services (WFS, WMS, CSW, etc.) have seen widespread use over many years; however, few organizations have deployed information architectures based solely on OGC stand...
There is a growing need for increased integration across the publication, discovery, access and use of scientific datasets, including water related datasets. Scientific datasets have varying formats and are published using a variety of methods, ranging from physical media to sophisticated web service interfaces. The Network Common Data Form (NetCDF...
Observational data encodes values of properties associated with a feature of interest, estimated by a specified procedure. For water the properties are physical parameters like level, volume, flow and pressure, and concentrations and counts of chemicals, substances and organisms. Water property vocabularies have been assembled at project, agency an...
Interoperability of water quality data depends on the use of common models, schemas and vocabularies. However, terms are usually collected during different activities and projects in isolation of one another, resulting in vocabularies that have the same scope being represented with different terms, using different formats and formalisms, and publis...
The increasing global demand for freshwater is resulting in nations improving their terrestrial water monitoring and reporting systems to better understand the availability, and quality, of this valuable resource. A barrier to this is the inability of stakeholders to share information relating to water observations data: traditional hydrological in...
A set of files containing RDF representations of the International [Chrono]stratigraphic Chart, including RDF/XML and Turtle serializations of data from the versions published in 2004, 2005, 2006, 2008, 2009, 2010, 2012, 2013, 2014. To accompany publication of paper "A geologic timescale ontology and service" by SJD Cox and SM Richard Submitted to...