Article

The functional approach to data management. Modeling, analyzing and integrating heterogeneous data


... In general, these language integrations are based on (monad) comprehensions (cf. Gray et al. [14]), which can be understood as succinct notation for collection operations. The simplicity of the notation lies in the fact that many SQL operators, such as selections and projections, have immediate translations to comprehension operations, such as filter and map. ...
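As a rough illustration of the correspondence this excerpt describes, the Python sketch below writes one selection-plus-projection query twice: once with explicit filter and map, and once as a comprehension. The `Employee` record and its sample rows are invented for illustration and do not come from the cited work.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    # Hypothetical record type, invented for this sketch.
    name: str
    dept: str
    salary: int


employees = [
    Employee("Ada", "R&D", 120),
    Employee("Bob", "Sales", 80),
    Employee("Cy", "R&D", 95),
]

# SQL: SELECT name FROM employees WHERE dept = 'R&D'
# The selection (WHERE) becomes filter, the projection (SELECT) becomes map.
via_operators = list(
    map(lambda e: e.name, filter(lambda e: e.dept == "R&D", employees))
)

# The same query as a single comprehension.
via_comprehension = [e.name for e in employees if e.dept == "R&D"]
```

Both forms yield the same list, which is the sense in which the comprehension can be read as shorthand for the collection operations.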
Article
An incremental computation updates its result based on a change to its input, which is often an order of magnitude faster than a recomputation from scratch. In particular, incrementalization can make expensive computations feasible for settings that require short feedback cycles, such as interactive systems, IDEs, or (soft) real-time systems. This paper presents i3QL, a general-purpose programming language for specifying incremental computations. i3QL provides a declarative SQL-like syntax and is based on incremental versions of operators from relational algebra, enriched with support for general recursion. We integrated i3QL into Scala as a library, which enables programmers to use regular Scala code for non-incremental subcomputations of an i3QL query and to easily integrate incremental computations into larger software projects. To improve performance, i3QL optimizes user-defined queries by applying algebraic laws and partial evaluation. We describe the design and implementation of i3QL and its optimizations, demonstrate its applicability, and evaluate its performance.
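The idea of updating a result from a change to the input, instead of recomputing it, can be sketched in a few lines of Python. This is not i3QL or its API; the class `IncrementalFilteredSum` and the helper `recompute` are hypothetical names used only to contrast the two strategies on a filtered sum.

```python
class IncrementalFilteredSum:
    """Maintains sum(v for v in input if predicate(v)) under single-element changes."""

    def __init__(self, predicate):
        self.predicate = predicate
        self.total = 0

    def insert(self, value):
        # Constant-time update: only the changed element is inspected.
        if self.predicate(value):
            self.total += value

    def delete(self, value):
        # Assumes the value was inserted earlier.
        if self.predicate(value):
            self.total -= value


def recompute(values, predicate):
    """From-scratch baseline: scans the whole input on every call."""
    return sum(v for v in values if predicate(v))


def is_even(v):
    return v % 2 == 0


view = IncrementalFilteredSum(is_even)
data = []
for v in [4, 7, 10]:
    data.append(v)
    view.insert(v)

data.remove(4)
view.delete(4)
```

After these changes `view.total` agrees with `recompute(data, is_even)`, but each update touched one element, whereas the baseline scans the whole input every time.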
... For example, it does not support inheritance. The thesis uses and extends a DBMS, which is based on a functional data model [40]. The functional data model provides higher expressiveness than the relational data model, and naturally supports relational and object-oriented data. ...
... The structure of dimensions in COM can be used for navigational purposes without using joins. This makes it similar to the functional data model (FDM) (Sibley & Kerschberg, 1977; Shipman, 1981; Gray, King, & Kerschberg, 1999; Gray, Kerschberg, King, & Poulovassilis, 2004) if COM dimensions are represented as functions. The difference is that COM uses only single-valued dimensions, while multi-valued functions are modeled by means of reversed dimensions and the deprojection operator. ...
Chapter
... By applying them consecutively we can build an access path. Such an approach is close to that used in the functional data model (FDM) [Shipman, 1981; Gray et al., 1999; Gray et al., 2004]. Access paths and multidimensional queries can be defined as derived properties of concepts and then used just like normal dimensions. ...
Article
The paper describes logical navigation in the concept-oriented data model. This model explicitly and formally separates physical structure and logical structure so that each element of the model is simultaneously a collection and a combination of other elements. The physical structure is used to represent and access elements by means of references. The logical structure is used to reflect the problem domain dependencies. The two-level model considered in the paper consists of a set of concepts and a set of items. Concept structure defines the model syntax while item structure defines its semantics. The paper shows how the properties of the model can be used for logical navigation, where we do not need to specify join conditions or other complicated query parameters.
... The third assumption is that the hierarchical multidimensional structure of the model can be used for automating data access and logical navigation. The mechanism of access paths and queries in the COM is very close to that used in the functional data model (FDM) [9,10,19]. In section 2 we define the model and section 3 describes what is meant by dimensionality. ...
Conference Paper
In the paper we describe the problem of grouping and aggregation in the concept-oriented data model. The model is based on ordering its elements within a hierarchical multidimensional space. This order is then used to define all its main properties and mechanisms. In particular, it is assumed that elements positioned higher are interpreted as groups for their lower level elements. Two operations of projection and de-projection are defined for one-dimensional and multidimensional cases. It is demonstrated how these operations can be used for multidimensional analysis.
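One possible reading of projection and de-projection, sketched in Python under stated assumptions: a dimension is treated as a single-valued function from lower-level elements to higher-level ones, projection as the image of a set under that function, and de-projection as its preimage. The data (`product_of`, `amount`) and the function names are hypothetical and are not taken from the paper.

```python
# Hypothetical data: each order item references exactly one product,
# so `product_of` plays the role of a single-valued dimension.
product_of = {"item1": "apple", "item2": "pear", "item3": "apple"}
amount = {"item1": 3, "item2": 5, "item3": 4}


def project(items, dimension):
    """Higher-level elements referenced by the given lower-level items."""
    return {dimension[i] for i in items}


def deproject(targets, dimension):
    """Lower-level items whose dimension value lies in `targets`."""
    return {i for i, t in dimension.items() if t in targets}


# Grouping with aggregation: each product acts as a group for the items
# that reference it, and the aggregate is computed over its de-projection.
totals = {
    p: sum(amount[i] for i in deproject({p}, product_of))
    for p in project(product_of.keys(), product_of)
}
```

Here the higher-level elements (products) serve as groups for their lower-level elements (items), which matches the grouping interpretation given in the abstract. No join condition is written out, because the dimension itself carries the relationship.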
Article
Both the increasing number of GPS-enabled mobile devices and geographic crowd-sourcing initiatives, such as OpenStreetMap, are determinants for the large amount of vector spatial data that is currently being produced. On the other hand, the automatic generation of raster data by remote sensing devices and environmental modeling processes has always led to very large datasets. Currently, huge data generation rates are reached by improved sensor observation systems and data processing infrastructures. As an example, the Sentinel Data Access System of the Copernicus Program of the European Space Agency (ESA) published 38.71 TB of data per day during 2020. This paper shows how adopting a new spatial data model that includes multi-resolution parametric spatial data types enables an efficient implementation of a large-scale distributed spatial analysis system for integrated vector-raster data lakes. In particular, the proposed implementation outperforms the state-of-the-art Spark-based spatial analysis systems by more than one order of magnitude during vector-raster spatial join evaluation.
Article
Very large amounts of geospatial data are generated daily by many observation processes in different application domains. The amount of data produced is increasing due to advances in the use of modern automatic sensing devices and in the facilities available to promote crowdsourced data collection initiatives. Spatial observation data includes both data about conventional entities and samplings over multi-dimensional spaces. Existing observation data management solutions lack declarative specification of spatio-temporal analytics. On the other hand, current data management technologies miss observation data semantics and fail to integrate the management of entities and samplings in a single data modeling solution. This paper presents the design of a framework that enables declarative spatio-temporal analysis over large warehouses of observation data. It integrates the management of entities and samplings within a simple data model based on the well-known mathematical concept of a function. Observation data semantics are incorporated into the model with appropriate metadata structures.
Conference Paper
In the paper the concept-oriented data model (COM) is described from the point of view of its hierarchical and multidimensional properties. The model consists of two levels: syntactic and semantic. At the syntactic level each element is defined as a combination of its superconcepts. At the semantic level each item is defined as a combination of its superitems. Such a definition has several general interpretations, such as a hierarchical coordinate system or a multidimensional categorization schema. The described approach can be applied to very different problems in dimensional modelling, including database systems, knowledge-based systems, ontologies, complex categorizations, knowledge sharing and the Semantic Web.
Conference Paper
The integration of database and programming languages is made difficult by the different data models and type systems prevalent in each field. Functional-object query languages contribute to bridge this gap by letting software developers write declarative queries without imposing any specific execution strategy. Although some query optimizers support this paradigm, Java provides no means to embed queries in a seamless and typesafe manner. Interestingly, the benefits of such grammar extension (compile-time type inference and checking, user-friendly syntax) can alternatively be achieved with a compiler plugin as discussed in this paper for the LINQ query language and two Java compilers (from Sun and Eclipse). A prototype confirms the benefits of the approach by automating at compile-time (a) the parsing of LINQ queries nested in Java, (b) their analysis for well-formedness, and (c) their rewriting into statements to build Abstract Syntax Trees (ASTs). The technique is also applicable to other languages (JPQL, XQuery) which are handled nowadays by a Java compiler as uninterpreted strings, being thus prone to runtime exceptions due to breaches of static semantics.
Conference Paper
We are developing a multi-database architecture to provide integrated access to heterogeneous, distributed databases. The work described here is motivated by the desire to have RDF/RDFS collections as component data resources in this system, along with relational and other databases. To achieve this, the RDF/RDFS collection, like all other component resources in the system, is mapped to the functional data model, and a query translator is implemented that can translate queries originally expressed in Daplex (the query language associated with the functional data model) into SPARQL. SPARQL is the prominent query language for RDF and it is used here to bridge between the functional data model and the Semantic Web.
Preprint
In this paper we describe a new approach to data modelling called the concept-oriented model (CoM). This model is based on the formalism of nested ordered sets, which uses the inclusion relation to produce a hierarchical structure of sets and the ordering relation to produce a multi-dimensional structure among its elements. A nested ordered set is defined as an ordered set where each element can itself be an ordered set. The ordering relation in CoM is used to define data semantics and operations on data such as projection and de-projection. This data model can be applied to very different problems, and the paper describes some of its uses, such as grouping with aggregation and multi-dimensional analysis.
Preprint
In this paper we present a new approach to data modelling, called the concept-oriented model (CoM), and describe its main features and characteristics including data semantics and operations. The distinguishing feature of this model is that it is based on the formalism of nested ordered sets where any element participates in two structures simultaneously: hierarchical (nested) and multi-dimensional (ordered). An element of the model is postulated to consist of two parts, called identity and entity, and the whole approach can be naturally broken into two branches: identity modelling and entity modelling. We also propose a new query language with the main construct, called concept, defined as a pair of two classes: identity class and entity class. We describe how its operations of projection, de-projection and product can be used to solve typical data modelling tasks.