Article

Semantic Data Management for Experimental Manufacturing Technologies


Abstract

Experimental manufacturing technologies play a significant role in improving production processes. For example, the microwave-assisted manufacturing of composites can save energy and reduce turn-around times compared to traditional heating in ovens. However, since this technology is not yet well understood, it requires further research and development activities (e.g., simulation or production experiments) to enable stable and efficient production with controlled product quality. These activities span multiple divisions over (possibly) multiple organizations and require close cooperation and communication. In addition, this process proceeds in an iterative manner and produces a large amount of data and documents along the way. From our practical experience working in such a project, we identified knowledge gaps and communication challenges, many of which can be overcome with the support of IT knowledge management systems.


... Search results per source (initial vs. accepted papers):
Source | Initial | Accepted | Accepted papers
Scopus | 108 | 53 | [1]-[3], [5], [9], [10], [13]-[19], [23]-[29], [31]-[33], [37], [40], [45], [49], [50], [57], [60]-[66], [68], [70], [71], [73], [76]-[78], [81]-[84], [88], [90], [91], [93]-[95]
Springer | 222 | 20 | [4], [6], [12], [21], [30], [36], [38], [39], [41]-[43], [47], [51], [53], [69], [74], [79], [85], [86], [92]
Google Scholar | 197 | 6 | [8], [34], [56], [67], [80], [87]
Web of Science | 71 | 4 | [7], [44], [48], [...]
...
... Tableau is a software for interactive data visualization.
Tool | Papers | References
Tableau | 9 | [6], [9], [31], [47], [55], [68], [76], [82], [85]
Apache Flume | 7 | [1], [6], [27], [52], [61], [70], [...]
(tool name missing) | 5 | [6], [21], [40], [41], [45]
MongoDB | 6 | [16], [33], [35], [41], [43], [...]
...
... Another preventive solution for managing data access is to use high-security protocols and access management. For example, a data catalog, which provides a structured listing of data assets in the available database to facilitate accessibility and security [129], can serve as a suitable protocol and access management tool. A data catalog uses metadata to help organizations manage their data and perform data governance by organizing data based on their importance. ...
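To make the idea in the snippet above concrete, here is a minimal Python sketch of a data catalog that holds only metadata and performs access checks on it. All class names, sensitivity levels, and the example asset are hypothetical, not taken from any of the cited papers.

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    """One asset in a (hypothetical) data catalog: metadata only, no payload."""
    name: str
    owner: str
    sensitivity: str                      # e.g. "public", "internal", "restricted"
    tags: set = field(default_factory=set)

class DataCatalog:
    """Structured listing of data assets; access checks run purely on metadata."""
    def __init__(self):
        self._entries = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, tag: str) -> list:
        return [e.name for e in self._entries.values() if tag in e.tags]

    def may_read(self, asset: str, clearance: str) -> bool:
        order = ["public", "internal", "restricted"]
        entry = self._entries[asset]
        return order.index(clearance) >= order.index(entry.sensitivity)

catalog = DataCatalog()
catalog.register(CatalogEntry("oven_runs_2021", "composites-lab",
                              "internal", {"microwave", "experiment"}))
print(catalog.search("experiment"))                   # ['oven_runs_2021']
print(catalog.may_read("oven_runs_2021", "public"))   # False
```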
Article
Full-text available
Over the decades, Artificial Intelligence (AI) and machine learning have become a transformative solution in many sectors, services, and technology platforms in a wide range of applications, such as smart healthcare, financial, political, and surveillance systems. In such applications, a large amount of data is generated about diverse aspects of our lives. Although utilizing AI in real-world applications provides numerous opportunities for societies and industries, it raises concerns regarding data privacy. Data used in an AI system are cleaned, integrated, and processed throughout the AI life cycle. Each of these stages can introduce unique threats to individuals' privacy and have an impact on the ethical processing and protection of data. In this paper, we examine privacy risks in different phases of the AI life cycle and review the existing privacy-enhancing solutions. We introduce four categories of privacy risk: (i) risk of identification, (ii) risk of making an inaccurate decision, (iii) risk of non-transparency in AI systems, and (iv) risk of non-compliance with privacy regulations and best practices. We then examine the potential privacy risks in each AI life cycle phase, evaluate concerns, and review privacy-enhancing technologies, requirements, and process solutions to counter these risks. We also review some of the existing privacy protection policies and the need for compliance with available privacy regulations in AI-based systems. The main contribution of this survey is examining privacy challenges and solutions, including technology, process, and privacy legislation, across the entire AI life cycle. In each phase of the AI life cycle, open challenges have been identified.
... It limits the promotion and application of alumina ceramics. Therefore, solving the brittleness problem of alumina ceramic materials is one of the important ways to promote their application [19]. The alumina ceramic structure is shown in Figure 1. ...
Article
Full-text available
This paper builds a quality information management system based on the alumina ceramic manufacturing process and its quality, studying factors such as physical properties, chemical properties, and other related information in order to bring efficiency and assurance to quality management. The research performs data acquisition and analysis on two parts: the alumina ceramic tube and the quality management information system. One part analyzes the production and performance of alumina ceramic tubes and summarizes the quality problems that occur in this process. The paper constructs an information management system to count these issues and conducts research on information dominance and management; methods such as exponential smoothing, linear trends, and convolutional neural algorithms are applied in the analysis. The experimental results show that missing corners and uneven corners in alumina production were greatly reduced: the maximum probability of uneven corners fell from 9.27 to 5.27, the scrap rate for ceramic tube quality decreased from a maximum of 5.82 to 3.17, and the maximum rate of unqualified quality fell from 5.76 to 3.03.
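Since the abstract names exponential smoothing among its methods, here is a minimal sketch of simple exponential smoothing applied to a quality series. The series and smoothing factor are illustrative, not the paper's data.

```python
def exp_smooth(series, alpha=0.3):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    smoothed = [series[0]]                # initialize with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

# Illustrative scrap-rate series, not measurements from the paper.
rates = [5.82, 5.40, 4.90, 4.10, 3.60, 3.17]
print([round(s, 2) for s in exp_smooth(rates)])
```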
... In our vision, Digital Shadows also serve to semantically enrich process data to enable (automated) decision making in (domain-specific) real time. To this end, they must be semantically integrated with data and models engineered during design and simulation [80]. This demands the modeling of detailed aspects (from manufacturing system details to factory behavior, to strategic goals, to interface descriptions) in sufficiently formal languages [129,130]. ...
Article
Full-text available
The Internet of Things promises to bring significant improvements to manufacturing by facilitating the integration of manufacturing devices to collect sensor data and to control production processes. In contrast to previous industrial revolutions, today's change is driven by applied computer science technologies on several layers: improved interfaces for human interaction in conjunction with artificial intelligence will allow companies to easily utilize the information that is available through digital shadows, an appropriate reduction of reality, in an interconnected World Wide Lab (WWL) of previously isolated data sources from different stakeholders. We derive a systematized research roadmap to bring the envisioned advances quickly to reality in this conservative setting.
... The use of data catalogs in this context, however, is still rare. In [2], a data lake using semantic technologies is presented that can manage datasets produced by sensors or simulation programs in the manufacturing domain. It comprises a data catalog that provides inventory services and also implements security mechanisms. ...
Conference Paper
Full-text available
Data lake architectures enable the storage and retrieval of large amounts of data across an enterprise. At Robert Bosch GmbH, we have deployed a data lake for this express purpose, focused on managing automotive sensor data. Simply centralizing and storing data in a data lake, however, does not magically solve critical data management challenges such as data findability, accessibility, interoperability, and re-use. In this paper, we discuss how semantic technologies can help to resolve such challenges. More specifically, we demonstrate the use of ontologies and knowledge graphs to provide vital data lake functions including the cataloging of data, tracking provenance, access control, and of course semantic search. Of particular importance is the development of the DCPAC Ontology (Data Catalog, Provenance, and Access Control) along with its deployment and use within a large enterprise setting to manage the huge volume and variety of data generated by current and future vehicles.
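A minimal sketch, using Python and rdflib, of the kind of catalog, provenance, and access-control triples the abstract describes, queried with SPARQL. The DCPAC namespace, property names, and all IRIs here are assumptions for illustration; the paper's actual ontology terms are not given in the abstract.

```python
from rdflib import Graph, Literal, Namespace, RDF

# Hypothetical namespace; the real DCPAC IRIs are not stated in the abstract.
DCPAC = Namespace("http://example.org/dcpac#")

g = Graph()
g.bind("dcpac", DCPAC)

ds = DCPAC["dataset-vehicle-can-2020"]
g.add((ds, RDF.type, DCPAC.Dataset))                              # catalog entry
g.add((ds, DCPAC.derivedFrom, DCPAC["sensor-can-bus"]))           # provenance
g.add((ds, DCPAC.accessGroup, Literal("powertrain-engineers")))   # access control
g.add((ds, DCPAC.keyword, Literal("acceleration")))               # search term

# Semantic search: datasets on a topic that a given group may access.
results = g.query("""
    SELECT ?d WHERE {
        ?d a dcpac:Dataset ;
           dcpac:keyword "acceleration" ;
           dcpac:accessGroup "powertrain-engineers" .
    }""", initNs={"dcpac": DCPAC})
for row in results:
    print(row.d)
```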
... Accordingly, the concept of the Digital Twin currently focuses primarily on predictive maintenance based on the observation of process and machine data. However, the information collected in the preceding phases is also highly important for capturing, tracking, and communicating consolidated knowledge about machines, plants, or products [2]. Integrating and enriching all of these data creates the data basis for the Digital Twin, which can be used to derive new information through analyses and to make it available in real time [3]. ...
Article
Full-text available
Abstract This paper highlights the potential of the Digital Twin in value-creation networks. It is accompanied by the concept of the collaborative Digital Twin, Co-TWIN, which in the research project of the same name uses Digital Twin technology to foster collaboration within value-creation networks in machine and plant engineering. Extending most previous deployment scenarios, Co-TWIN will be applied across the entire life cycle of the machines.
... Other approaches, instead, focus more on helping the user get acquainted with the exploration interface, taking into account their interaction waiting tolerance [12] or implementing proper caching strategies to ensure high responsiveness [16]. The ultimate goal of Data Exploration is also to provide suitable instruments for obtaining actionable insights related to the observed data; in [10], for instance, current sensor data is compared against simulated data, computed from previously stored data, in order to predict future behaviours and trends. ...
Chapter
Recently, organisations operating in the context of Smart Cities have been spending time and resources on turning large amounts of data, collected from heterogeneous sources, into actionable insights, using indicators as powerful tools for meaningful data aggregation and exploration. Data lakes, which follow a schema-on-read approach, allow for storing both structured and unstructured data and have been proposed as flexible repositories for enabling data exploration and analysis over heterogeneous data sources, regardless of their structure. However, indicators are usually computed based on centralised data storage, according to a less flexible schema-on-write approach. Furthermore, domain experts, who know the data stored within the data lake, are usually distinct from data analysts, who define indicators, and users, who exploit indicators to explore data in a personalised way. In this paper, we propose a semantics-based approach for enabling personalised data lake exploration through the conceptualisation of proper indicators. In particular, the approach is structured as follows: (i) at the bottom, heterogeneous data sources within a data lake are enriched with Semantic Models, defined by domain experts using domain ontologies, to provide a semantic data lake representation; (ii) in the middle, a Multi-Dimensional Ontology is used by analysts to define indicators and analysis dimensions, in terms of concepts within Semantic Models and formulas to aggregate them; (iii) at the top, Personalised Exploration Graphs are generated for different categories of users, whose profiles are defined in terms of a set of constraints that limit the indicator instances on which the users may rely to explore data. Benefits and limitations of the approach are discussed through an application in the Smart City domain.
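A toy sketch of the layering this abstract describes: indicators defined as formulas over semantic-layer concepts, with user profiles constraining which indicators can be explored. All names, values, and the profile scheme are illustrative, not the paper's ontologies.

```python
# Bottom layer: concept -> values harvested from heterogeneous sources.
semantic_layer = {
    "EnergyReading": [12.0, 15.5, 11.2],
    "Population":    [120_000],
}

# Middle layer: indicator -> (concepts it uses, formula over their values).
indicators = {
    "energy_per_capita": (("EnergyReading", "Population"),
                          lambda e, p: sum(e) / sum(p)),
}

# Top layer: user profile -> indicators that profile may explore.
user_profiles = {
    "citizen": {"energy_per_capita"},
    "analyst": {"energy_per_capita"},
}

def explore(user: str, indicator: str) -> float:
    """Evaluate an indicator, enforcing the user's profile constraints."""
    if indicator not in user_profiles[user]:
        raise PermissionError(f"{user} may not use {indicator}")
    concepts, formula = indicators[indicator]
    return formula(*(semantic_layer[c] for c in concepts))

print(round(explore("citizen", "energy_per_capita"), 6))
```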
Chapter
In a manufacturing enterprise like Bosch, answering business questions regarding production lines involves different stakeholders. Production planning, product and production process development, quality management, and purchasing have different views on the same entity "production line". These different views are reflected in data residing in silos such as Manufacturing Execution Systems (MES), Enterprise Resource Planning (ERP) systems, and Master Data (MD) systems. To answer these questions, all data have to be integrated and semantically harmonized, reconciling the different views into a uniform understanding of the domain. To fulfill these requirements in this specific domain, we present the Line Information System (LIS). LIS is a Knowledge Graph (KG)-based ecosystem capable of semantically integrating data from MES, ERP, and MD. LIS enables a 360° view of manufacturing data for all stakeholders involved while resolving Semantic Interoperability Conflicts (SICs) in a scalable manner. Furthermore, as part of the LIS ecosystem, we developed the LIS ontology, mappings, and a procedure to ensure the quality of the data in the KG. The LIS application comprises many functionalities to answer business questions that could not be answered without LIS. LIS is currently in use in 12 Bosch plants, semantically integrating data of more than 1,100 production lines and 16,000 physical machines, as well as more than 400 manufacturing processes. After the rollout of LIS, we performed a study with 21 colleagues. Overall, the study showed that LIS in particular, and KG-based solutions in general, pave the way for exploiting the knowledge in manufacturing settings in a reusable and scalable way. Keywords: Knowledge Graph, Industry 4.0, Smart Manufacturing, Semantic Data Integration, Ontology
Article
This article presents two chairs of the University of Bamberg that conduct research and teaching in the fields of databases and information retrieval. It gives an overview of their environment, current research topics, and teaching activities.
Article
Full-text available
Ontology Based Data Access (OBDA) is a prominent approach to querying databases that uses an ontology to expose data in a conceptually clear manner by abstracting away from the technical schema-level details of the underlying data. The ontology is 'connected' to the data via mappings that allow queries posed over the ontology to be automatically translated into data-level queries executable by the underlying database management system. Despite a lot of attention from the research community, there are still few instances of real-world industrial use of OBDA systems. In this work we present data access challenges in the data-intensive petroleum company Statoil and our experience in addressing these challenges with OBDA technology. In particular, we have developed a deployment module to create ontologies and mappings from relational databases in a semi-automatic fashion; a query processing module to perform and optimise the translation of ontological queries into data queries and their execution over either a single DB or federated DBs; and a query formulation module to support query construction for engineers with a limited IT background. Our modules have been integrated in one OBDA system, deployed at Statoil, integrated with Statoil's infrastructure, and evaluated with Statoil's engineers and data.
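A deliberately tiny sketch of the OBDA idea the abstract describes: a mapping associates an ontology class with a SQL query, so a question posed at the conceptual level is answered by the underlying DBMS. Real deployments use R2RML-style mappings and full SPARQL-to-SQL rewriting; the table, class, and data here are hypothetical.

```python
import sqlite3

# Toy OBDA-style mapping: ontology class -> SQL producing its instances.
mappings = {
    "Wellbore": "SELECT id FROM wellbores WHERE status = 'active'",
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wellbores (id TEXT, status TEXT)")
conn.executemany("INSERT INTO wellbores VALUES (?, ?)",
                 [("wb-1", "active"), ("wb-2", "plugged")])

def instances_of(cls: str) -> list:
    """A 'query over the ontology', rewritten to SQL via the mapping."""
    return [row[0] for row in conn.execute(mappings[cls])]

print(instances_of("Wellbore"))   # ['wb-1']
```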
Article
Full-text available
Real-time analytics that requires integration and aggregation of heterogeneous and distributed streaming and static data is a typical task in many industrial scenarios, such as the diagnostics of turbines at Siemens. The OBDA approach has great potential to facilitate such tasks; however, it has a number of limitations in dealing with analytics that restrict its use in important industrial applications. Based on our experience with Siemens, we argue that in order to overcome those limitations OBDA should be extended to become analytics, source, and cost aware. In this work we propose such an extension. In particular, we propose an ontology, mapping, and query language for OBDA in which aggregate and other analytical functions are first-class citizens. Moreover, we develop query optimisation techniques that allow analytical tasks to be processed efficiently over static and streaming data. We implement our approach in a system and evaluate it with Siemens turbine data.
Conference Paper
Full-text available
In an experimental study, 120 participants randomly assigned to two groups were asked to rate the helpfulness of the Dublin Core element definitions and guidelines while creating metadata records. In contrast to previous studies, the findings reveal that participants had problems understanding the definitions for the whole element set specified by Dublin Core. The study also reveals that careful attention should be given to the clarity of guidelines as well, to ensure correct application of the Dublin Core elements.
Article
Full-text available
Ontology-based data access (OBDA) is a novel paradigm for accessing large data repositories through an ontology, that is, a formal description of a domain of interest. Supporting the management of OBDA applications poses new challenges, as it requires providing effective tools for (i) allowing both expert and non-expert users to analyze the OBDA specification, (ii) collaboratively documenting the ontology, (iii) exploiting OBDA services, such as query answering and automated reasoning over ontologies, e.g., to support data quality checks, and (iv) tuning the OBDA application towards optimized performance. To meet these challenges, we have built a novel system, called MASTRO STUDIO, based on a tool for automated reasoning over ontologies, enhanced with a suite of tools and optimization facilities for managing OBDA applications. To show the effectiveness of MASTRO STUDIO, we demonstrate its usage in one OBDA application developed in collaboration with the Italian Ministry of Economy and Finance.
Article
Full-text available
The article deals with and discusses two main approaches to building semantic structures for electrophysiological metadata: the use of conventional data structures, repositories, and programming languages on the one hand, and the use of formal representations of ontologies known from knowledge representation, such as description logics or Semantic Web languages, on the other hand. Although knowledge engineering offers languages supporting richer semantic means of expression and technologically advanced approaches, conventional data structures and repositories are still popular among developers, administrators, and users because of their simplicity, overall intelligibility, and lower demands on technical equipment. The choice of conventional data resources and repositories, however, raises the question of how and where to add semantics that cannot be naturally expressed using them. As one possible solution, this semantics can be added into the structures of the programming language that accesses and processes the underlying data. To support this idea we introduce a software prototype that enables its users to add semantically richer expressions into Java object-oriented code. This approach does not burden users with additional demands on the programming environment, since reflective Java annotations were used as an entry point for these expressions. Moreover, the additional semantics need not be written by the programmer directly in the code but can be collected from non-programmers using a graphical user interface. A mapping that allows the transformation of the semantically enriched Java code into the Semantic Web language OWL was proposed and implemented in a library named the Semantic Framework. This approach was validated by the integration of the Semantic Framework into the EEG/ERP Portal and by the subsequent registration of the EEG/ERP Portal in the Neuroscience Information Framework.
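The paper uses reflective Java annotations; the sketch below shows the analogous idea in Python, attaching an OWL class IRI to an ordinary class via a decorator and harvesting it into triples. The decorator, IRI, and triple format are illustrative, not the Semantic Framework's actual API.

```python
# Attach OWL-style semantics to an ordinary class via a decorator (a Python
# analogue of the paper's reflective Java annotations), then harvest triples.
def owl_class(iri: str):
    def wrap(cls):
        cls.__owl_iri__ = iri          # semantics carried by the class itself
        return cls
    return wrap

@owl_class("http://example.org/eeg#Experiment")   # illustrative IRI
class Experiment:
    def __init__(self, subject: str):
        self.subject = subject

def to_triples(obj) -> list:
    """Reflect over the annotated class to emit RDF-like triples."""
    iri = type(obj).__owl_iri__
    inst = f"{iri}/{id(obj)}"
    return [(inst, "rdf:type", iri)]

print(to_triples(Experiment("s01")))
```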
Article
Full-text available
The merging of semantic technologies with cloud computing is at the center of new web developments. Taking into account the importance of the web in today's society, it is worth investigating the relevant perspectives and insights. In this special issue, readers will find the foundations together with cutting-edge developments in the state of the art of semantic technologies on Linked Data over Grid and Cloud architectures.
Article
Full-text available
It has become almost axiomatic that business success depends upon expanding the global reach of an organization. Moreover, the adoption of the transnational organizational model for multinational enterprise is widely acknowledged as the preferred means of “going global.” Designing effective transnational organizations depends on the effective deployment of advanced information technologies. Because globalization requires employees and business partners to be geographically and temporally distant from one another, deploying information technologies within a “virtual organization” is an obvious choice for overcoming spatial and temporal boundaries. This article reviews three competitive requirements of the transnational enterprise: efficiency, responsiveness, and learning. We then describe the role of specific information technologies in meeting these requirements and offer practical guidelines for using these technologies to increase competitiveness in global markets.
Article
Full-text available
Many curated databases are constructed by scientists integrating various existing data sources. Most current approaches to provenance in databases are based on views and fail to take account of the added value of the work done by scientists in manually creating and modifying data. Capturing provenance in such an environment is a challenging problem, requiring changes in practice, changes to existing software, and, crucially, a good model of the process of curation.
Article
Full-text available
Data management has become a critical challenge faced by a wide array of scientific disciplines in which the provision of sound data management is pivotal to the achievements and impact of research projects. Massive and rapidly expanding amounts of experimental data combined with evolving domain models contribute to making data management an increasingly challenging task that warrants a rethinking of its design. In this paper we present PODD, an ontology-centric data management system architecture for scientific experimental data that is extensible and domain independent. In this architecture, the behaviors of domain concepts and objects are specified entirely by ontological entities, around which all data management tasks are carried out. The open and semantic nature of ontology languages also makes PODD amenable to greater data reuse and interoperability. To evaluate this architecture, we have developed a data management system and applied it to the challenge of managing phenomics data.
Article
Full-text available
Monitoring and control of the greenhouse environment play a decisive role in greenhouse production processes. Assurance of optimal climate conditions has a direct influence on crop growth performance, but it usually increases the required equipment cost. Traditionally, greenhouse installations have required a great effort to connect and distribute all the sensors and data acquisition systems. These installations need many data and power wires to be distributed along the greenhouses, making the system complex and expensive. For this reason, and others such as the unavailability of distributed actuators, individual sensors are usually located only at a fixed point that is selected as representative of the overall greenhouse dynamics. On the other hand, the actuation system in greenhouses is usually composed of mechanical devices controlled by relays, and it is desirable to reduce the number of commutations of the control signals from security and economic points of view. Therefore, in order to address these drawbacks, this paper describes how greenhouse climate control can be represented as an event-based system in combination with wireless sensor networks, where low-frequency dynamic variables have to be controlled and control actions are mainly calculated in response to events produced by external disturbances. The proposed control system saves costs related to wear by minimizing actuations and prolonging actuator life, while keeping promising performance results. Analysis and conclusions are given by means of simulation results.
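A minimal sketch of the event-based control idea: the actuator switches only when the measurement leaves a deadband around the setpoint, which reduces the number of commutations compared to acting on every sample. The setpoint, deadband, and readings are illustrative.

```python
def event_based_control(readings, setpoint=24.0, deadband=1.0):
    """Switch the heater only on events (temperature leaving the deadband)."""
    heater_on = False
    switchings = 0
    for t, temp in enumerate(readings):
        error = setpoint - temp
        if abs(error) > deadband:          # event: measurement left the deadband
            want_on = error > 0            # too cold -> heat; too hot -> stop
            if want_on != heater_on:
                heater_on = want_on
                switchings += 1
                print(f"t={t}: heater {'ON' if heater_on else 'OFF'}")
    return switchings

readings = [22.0, 22.5, 23.8, 25.6, 24.4, 22.4]   # illustrative temperatures (°C)
print("commutations:", event_based_control(readings))
```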
Conference Paper
Full-text available
Curated databases in bioinformatics and other disciplines are the result of a great deal of manual annotation, correction, and transfer of data from other sources. Provenance information concerning the creation, attribution, or version history of such data is crucial for assessing its integrity and scientific value. General-purpose database systems provide little support for tracking provenance, especially when data moves among databases. This paper investigates general-purpose techniques for recording provenance for data that is copied among databases. We describe an approach in which we track the user's actions while browsing source databases and copying data into a curated database, in order to record the user's actions in a convenient, queryable form. We present an implementation of this technique and use it to evaluate the feasibility of database support for provenance management. Our experiments show that although the overhead of a naïve approach is fairly high, it can be decreased to an acceptable level using simple optimizations.
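A minimal sketch of recording copy actions in a queryable form, as the abstract describes. The log schema, database names, and helper function are hypothetical, not the paper's implementation.

```python
import time

# In-memory provenance log; a real system would persist this in a database.
provenance_log = []

def record_copy(user: str, source_db: str, source_key: str,
                target_db: str, target_key: str) -> None:
    """Append one copy action to the provenance log."""
    provenance_log.append({
        "op": "copy", "user": user, "when": time.time(),
        "from": f"{source_db}/{source_key}", "to": f"{target_db}/{target_key}",
    })

record_copy("curator1", "genbank", "AB123", "curated-db", "gene/42")

# Query the log: where did curated-db/gene/42 come from?
origins = [e["from"] for e in provenance_log
           if e["op"] == "copy" and e["to"] == "curated-db/gene/42"]
print(origins)   # ['genbank/AB123']
```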
Conference Paper
Full-text available
Data management has become a critical challenge faced by a wide array of scientific disciplines in which the provision of sound data management is pivotal to the achievements and impact of research projects. Massive and rapidly expanding amounts of data combined with data models that evolve over time contribute to making data management an increasingly challenging task that warrants a rethinking of its design. In this paper we present PODD, an ontology-centric architecture for data management systems that is extensible and domain independent. In this architecture, the behaviors of domain concepts and objects are captured entirely by ontological entities, around which all data management tasks are carried out. The open and semantic nature of ontology languages also makes PODD amenable to greater data reuse and interoperability. To evaluate the PODD architecture, we have applied it to the challenge of managing phenomics data.
Article
Full-text available
Many organizations nowadays face the problem of accessing existing data sources by means of flexible mechanisms that are both powerful and efficient. Ontologies are widely considered a suitable formal tool for sophisticated data access. The ontology expresses the domain of interest of the information system at a high level of abstraction, and the relationship between data at the sources and instances of concepts and roles in the ontology is expressed by means of mappings. In this paper we present a solution to the problem of designing effective systems for ontology-based data access. Our solution is based on three main ingredients. First, we present a new ontology language, based on Description Logics, that is particularly suited to reasoning with large amounts of instances. The second ingredient is a novel mapping language that is able to deal with the so-called impedance mismatch problem, i.e., the problem arising from the difference
Article
Full-text available
In a study of thirty-one knowledge management projects in twenty-four companies, the authors examine the differences and similarities of the projects, from which they develop a typology. All the projects had someone responsible for the initiative, a commitment of human and capital resources, and four similar kinds of objectives: (1) they created repositories by storing knowledge and making it easily available to users; (2) they provided access to knowledge and facilitated its transfer; (3) they established an environment that encourages the creation, transfer, and use of knowledge; and (4) they managed knowledge as an asset on the balance sheet. The authors identify eight factors that seem to characterize a successful project:
1. The project involves money saved or earned, such as the Dow Chemical project that better managed company patents.
2. The project uses a broad infrastructure of both technology and organization. A technology infrastructure includes common technologies for desktop computing and communications. An organizational infrastructure establishes roles for people and groups to serve as resources for particular projects.
3. The project has a balanced structure that, while flexible and evolutionary, still makes knowledge easy to access.
4. Within the organization, people are positive about creating, using, and sharing knowledge.
5. The purpose of the project is clear, and the language that knowledge managers use in describing it is framed in terms common to the company's culture.
6. The project motivates people to create, share, and use knowledge (for example, giving awards to the top "knowledge sharers").
7. There are many ways to transfer knowledge, such as the Internet, Lotus Notes, and global communications systems, but also including face-to-face communication.
8. The project has senior managers' support and commitment.
An organization's knowledge-oriented culture, senior managers committed to the "knowledge business," a sense of how the customer will use the knowledge, and the human factors involved in creating knowledge are most important to effective knowledge management.
Article
Full-text available
Knowledge is a broad and abstract notion that has defined epistemological debate in western philosophy since the classical Greek era. In the past few years, however, there has been a growing interest in treating knowledge as a significant organizational resource. Consistent with the interest in organizational knowledge and knowledge management (KM), IS researchers have begun promoting a class of information systems, referred to as knowledge management systems (KMS). The objective of KMS is to support the creation, transfer, and application of knowledge in organizations. Knowledge and knowledge management are complex and multi-faceted concepts. Thus, effective development and implementation of KMS requires a foundation in several rich literatures. To be credible, KMS research and development should preserve and build upon the significant literature that exists in different but related fields. This paper provides a review and interpretation of knowledge management literatures in different fields with an eye toward identifying the important areas for research. We present a detailed process view of organizational knowledge management with a focus on the potential role of information technology in this process. Drawing upon the literature review and analysis of knowledge management processes, we discuss several important research issues surrounding the knowledge management processes and the role of IT in support of these processes.
Article
Full-text available
This is a thought piece on data-intensive science requirements for databases and science centers. It argues that peta-scale datasets will be housed by science centers that provide substantial storage and processing for scientists who access the data via smart notebooks. Next-generation science instruments and simulations will generate these peta-scale datasets. The need to publish and share data and the need for generic analysis and visualization tools will finally create a convergence on common metadata standards. Database systems will be judged by their support of these metadata standards and by their ability to manage and access peta-scale datasets. The procedural stream-of-bytes-file-centric approach to data analysis is both too cumbersome and too serial for such large datasets. Non-procedural query and analysis of schematized self-describing data is both easier to use and allows much more parallelism.
Article
In the domain of so-called smart cities, ICT technologies play a vital role in improving quality of life and resource efficiency in future cities. Many smart city applications depend on sensor data, whether live data for real-time reaction or stored data for analysis and optimization, to measure city-wide processes like mobility, energy consumption and production, or environmental factors like city climate or air quality. To realize such future smart city applications, we face many challenges, including data integration from heterogeneous sensor systems, poor or unknown sensor data quality, effective and efficient management of big sensor data volumes, as well as supporting the development of mobile and/or analytical applications that use that data. The Bamberg Smart City Lab will provide an Open Data testbed for research and evaluation of sensor-based applications in smart cities. In this paper, we focus particularly on the challenge of privacy: how can we set up long-running city-wide sensor campaigns and share the data without compromising the citizens' privacy? A fundamental aspect of our current approach is to understand and incorporate the Privacy by Design (PbD) guidelines. We apply them to our specific requirements and develop a privacy-preserving architecture. To do this we evaluate and incorporate the integration of different state-of-the-art privacy methods to reduce the risk of leaks to a minimum, especially in the field of online publication of data sets.
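One simple safeguard of the kind such a privacy-preserving architecture might combine before online publication: releasing only aggregates supported by at least k individuals, so small groups cannot be singled out. The threshold, field names, and data below are illustrative; the paper evaluates a broader set of privacy methods.

```python
from collections import Counter

K = 5   # illustrative minimum group size for publication

def publishable_counts(records: list, by: str) -> dict:
    """Count records per group; suppress groups smaller than K before release."""
    counts = Counter(r[by] for r in records)
    return {group: n for group, n in counts.items() if n >= K}

records = [{"zone": "A"}] * 7 + [{"zone": "B"}] * 2
print(publishable_counts(records, by="zone"))   # {'A': 7}; zone B is suppressed
```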
Article
We illustrate the usefulness of an Ontology-Based Data Management (OBDM) approach for developing an open information system, allowing for a deep level of interoperability among different databases, and accounting for additional dimensions of data quality compared to the standard dimensions of the OECD (Quality framework and guidelines for OECD statistical activities, OECD Publishing, Paris, 2011) Quality Framework. Recent engineering advances in computer science provide promising tools for solving some of the crucial issues in data integration for Research and Innovation.
Conference Paper
Until now, the SPARQL query language was restricted to simple entailment. Now SPARQL is being extended with more expressive entailment regimes, which allow querying over inferred, implicit knowledge. However, in this case the SPARQL endpoint provider decides which inference rules are used for its entailment regimes. In this paper, we propose an extension to the SPARQL query language to support remote reasoning, in which the data consumer can define the inference rules. It supplements the supported entailment regimes of the SPARQL endpoint provider with an additional reasoning step using the inference rules defined by the data consumer. At the same time, this solution offers possibilities to solve interoperability issues when querying remote SPARQL endpoints, which can support federated querying frameworks. These frameworks can then be extended to provide distributed, remote reasoning.
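A sketch of the idea with Python and rdflib: the data consumer supplies an inference rule that is applied before the SPARQL query runs, standing in (naively, as forward chaining in application code) for the proposed remote-reasoning extension. The namespace, rule, and data are illustrative.

```python
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/ex#")
g = Graph()
g.bind("ex", EX)
g.add((EX.rex, RDF.type, EX.Dog))
g.add((EX.Dog, EX.subClassOf, EX.Animal))   # illustrative subclass property

def apply_subclass_rule(graph: Graph) -> None:
    """Consumer-defined rule: if x a C and C subClassOf D, then x a D."""
    inferred = set()
    for inst, _, cls in graph.triples((None, RDF.type, None)):
        for _, _, sup in graph.triples((cls, EX.subClassOf, None)):
            inferred.add((inst, RDF.type, sup))
    for triple in inferred:                  # add after iterating, not during
        graph.add(triple)

apply_subclass_rule(g)                       # reasoning step before the query
for row in g.query("SELECT ?x WHERE { ?x a ex:Animal }", initNs={"ex": EX}):
    print(row.x)                             # http://example.org/ex#rex
```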
Article
In an increasing number of scientific disciplines, large data collections are emerging as important community resources. In this paper, we introduce design principles for a data management architecture called the data grid. We describe two basic services that we believe are fundamental to the design of a data grid, namely storage systems and metadata management. Next, we explain how these services can be used to develop higher-level services for replica management and replica selection. We conclude by describing our initial implementation of data grid functionality.
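A sketch of replica management in miniature: a metadata service maps a logical file name to physical replicas, and replica selection picks one by a simple cost measure. All endpoints, names, and latencies are hypothetical.

```python
# Metadata service in miniature: logical name -> physical replica descriptors.
replica_catalog = {
    "climate/run-042.nc": [
        {"url": "gridftp://site-a/run-042.nc", "latency_ms": 80},
        {"url": "gridftp://site-b/run-042.nc", "latency_ms": 35},
    ],
}

def select_replica(logical_name: str) -> str:
    """Replica selection: choose the replica with the lowest measured latency."""
    replicas = replica_catalog[logical_name]
    return min(replicas, key=lambda r: r["latency_ms"])["url"]

print(select_replica("climate/run-042.nc"))   # gridftp://site-b/run-042.nc
```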
Article
The borderless global economy has accentuated the importance of knowledge as the most critical source of competitive advantage. Thus, knowledge management (KM) has become a strategic mandate for most world-class organizations. A key enabler for implementing an effective KM system is advanced information technology (IT). Strategies for developing an enterprise-wide KM system infrastructure with embedded IT are discussed. In particular, this paper discusses the concept of a KM life cycle (knowledge capture, knowledge development, knowledge sharing, and knowledge utilization) and suggests how applications of new IT can support each step of KM practice within and between organizations.