Article

Unified Information Access


Abstract

Access to structured information in database management systems (DBMS), which is usually provided through Business Intelligence tools, and access to unstructured information in document and content management systems are going to be unified. The first steps have already been taken, and more are to come soon.


... This started with employee profiles that allowed experts to be found quickly, especially with the help of faceted search. Later on, it was expanded to include product, customer and other structured data, so that the search engine became the single point of access to information that portals had claimed to be before (Peinl, 2011b). ...
Chapter
IT support for collaboration in knowledge-intensive processes has gone through quite some change since the beginning of the century and provides more and more support for knowledge workers. Although individual systems are getting easier to use, they often do not replace former systems but accompany them, which makes the overall system landscape harder for knowledge workers to oversee. Future information systems should therefore combine the existing building blocks under a consistent user interface and assist users in storing information in the right place by providing access to it directly from a process-specific user interface; these processes can be business or knowledge processes. This chapter discusses the development of digital collaboration solutions and shows how social software and machine-understandability have changed them to better support knowledge processes. It then discusses how the Social Collaboration Hub, a BMBF-funded project, fulfills the requirements for future information systems.
... A metadata schema in RDFS or OWL provides background knowledge about known types. Additionally, NER tools usually use gazetteers for disambiguation of entities [28]. Protégé is a popular ontology editor for OWL, developed by Stanford University using Java [29]. ...
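To make the role of such background knowledge concrete, the following is a minimal sketch (assuming Python with rdflib installed and a hypothetical local schema.ttl file) of how an NER component could load an RDFS/OWL metadata schema and turn its labelled classes into a simple gazetteer for entity disambiguation:

```python
# Minimal sketch: load an RDFS/OWL metadata schema as background knowledge
# and build a gazetteer of labelled classes for entity disambiguation.
# Assumptions: rdflib is installed; schema.ttl is a hypothetical local file.
from rdflib import Graph, RDF, RDFS, OWL

g = Graph()
g.parse("schema.ttl", format="turtle")  # hypothetical metadata schema

# Collect all declared classes as the set of "known types".
known_types = set(g.subjects(RDF.type, OWL.Class)) | set(g.subjects(RDF.type, RDFS.Class))

# Map lower-cased labels to class IRIs, usable for lookup during NER.
gazetteer = {}
for cls in known_types:
    for label in g.objects(cls, RDFS.label):
        gazetteer[str(label).lower()] = cls

print(f"{len(known_types)} known types, {len(gazetteer)} gazetteer entries")
```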
Article
Full-text available
The Semantic Web has matured from a vision and research area of a few AI specialists to an important technology used in a large number of research projects and a few practice projects. Most building blocks of the Semantic Web stack are filled with concrete technologies and W3C standards, but there are still enough areas for research. However, even with existing technologies, the potential of semantic applications within corporations is not yet fully harnessed, as the adoption of Semantic Web technologies lags behind other technologies like NoSQL databases or Web 2.0 technologies. This paper reviews the state of the art of Semantic Web technologies, discusses important terms and developments as well as currently active research streams. It further analyses available tools and applications with a focus on corporate scenarios and open source software, and concludes with the suggestion of an architecture for a corporate semantic intranet.
... On the other hand, contents also get connected, by using metadata to create cross connections or by generating system-spanning structures using ontologies. Especially the integration of structured and unstructured data is an emerging research topic, which is discussed under the headline "unified information access" (Peinl 2011b). ...
Chapter
Full-text available
KM tools, especially ICT-based ones, are often introduced in companies without analyzing the problem and requirements in detail, paying attention to accompanying organizational measures, or even having a proper KM strategy. In this book chapter, a framework for analyzing KM requirements and choosing useful tools accordingly is presented, which helps support both personalization and codification strategies. The knowledge maturing model developed in the MATURE project funded by the European Union is used here. The KM tools discussed range from classical document and enterprise content management tools through groupware and unified communication tools to social media, personal productivity tools and business intelligence. Furthermore, KM instruments consisting of ICT, organizational and personal KM measures are presented that are coordinated and complement each other to achieve the best results in a KM project. An example of such a KM instrument is team experience management: the systematic collection, assessment and application of experiences gained in projects or process cycles. Experiences are distilled in project or case debriefings, recorded as lessons learned and further advanced with the goal of arriving at good or best practices. Further measures include regular process reviews and time measurement, establishing obligatory rules for applying experiences and using case-based reasoning tools to find experiences from similar projects. Finally, the added value of integrating several KM tools into an Enterprise Knowledge Infrastructure (EKI) that supports knowledge work in a comprehensive manner is highlighted.
Conference Paper
Full-text available
Project Halo has the long-term objective of developing a digital Aristotle, i.e. a knowledge system that is able to answer questions in a particular domain and give explanations for its answers. In this paper we report on the Ontoprise contribution to the Halo Pilot Project, in which various competing ontology engineering methodologies and knowledge system capabilities have been investigated. Concerning the former, we describe how we engineered a significant set of laws from chemistry that interact at different levels of generality and in varying orders. With regard to the latter, we report on the ability of our system to produce coherent and concise explanations of its reasoning. The importance of these two aspects can hardly be overestimated in the Semantic Web, as with future growth the interaction of large sets of laws will require dedicated management as well as the ability to let the user explore the trustworthiness of the ontology and the underlying data sources.
Conference Paper
Full-text available
A question-based knowledge management system is introduced that is capable of integrating heterogeneous sources of information and knowledge and nonetheless acts like a single coherent system with only one user interface. This interface in particular, together with easy access to different information resources, makes it comfortable even for users with little IT knowledge to find their way through complex and scattered information landscapes. This paper particularly describes the integrative effects and the aspects of creating the knowledge base: users pose questions, and answers are created by the system by drawing on the internal knowledge base and external information systems, and eventually also by involving the system's user base. This mechanism is assisted by a meaningful scoring system. Finally, the external resource interface is explained, as well as the time- and cost-saving factors.
Conference Paper
Full-text available
Recent progress in information extraction has shown how to automatically build large ontologies from high-quality sources like Wikipedia. But knowledge evolves over time; facts have associated validity intervals. Therefore, ontologies should include time as a first-class dimension. In this paper, we introduce Timely YAGO, which extends our previously built knowledge base YAGO with temporal aspects. This prototype system extracts temporal facts from Wikipedia infoboxes, categories, and lists in articles, and integrates these into the Timely YAGO knowledge base. We also support querying temporal facts by temporal predicates in a SPARQL-style language. Visualization of query results is provided in order to better understand the dynamic nature of knowledge.
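As an illustration of querying facts together with their validity intervals, the following Python/rdflib sketch uses an assumed vocabulary (a reified fact structure with ex:validFrom and ex:validUntil); these names are illustrative only and are not the actual Timely YAGO predicates or its SPARQL-style extension:

```python
# Illustrative only: the reified fact structure and ex:validFrom/ex:validUntil
# are assumptions for this sketch, not the actual Timely YAGO vocabulary.
from rdflib import Graph

g = Graph()
g.parse("timely_yago_extract.ttl", format="turtle")  # hypothetical local extract

query = """
PREFIX ex:  <http://example.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?player ?team ?start ?end WHERE {
  ?fact ex:subject    ?player ;
        ex:predicate  ex:playsFor ;
        ex:object     ?team ;
        ex:validFrom  ?start ;
        ex:validUntil ?end .
  FILTER (?start >= "2008-01-01"^^xsd:date)
}
"""
for row in g.query(query):
    print(row.player, row.team, row.start, row.end)
```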
Conference Paper
Full-text available
Spatial and temporal data is plentiful on the Web, and Semantic Web technologies have the potential to make this data more accessible and more useful. Semantic Web researchers have consequently made progress towards better handling of spatial and temporal data. SPARQL, the W3C-recommended query language for RDF, does not adequately support complex spatial and temporal queries. In this work, we present the SPARQL-ST query language. SPARQL-ST is an extension of SPARQL for complex spatiotemporal queries. We present a formal syntax and semantics for SPARQL-ST. In addition, we describe a prototype implementation of SPARQL-ST and demonstrate the scalability of this implementation with a performance study using large real-world and synthetic RDF datasets.
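For contrast, the sketch below (Python with rdflib, over a hypothetical observations.ttl file with assumed ex:lat, ex:long and ex:observedAt properties) shows the kind of simple bounding-box and date filtering that plain SPARQL can already express; SPARQL-ST adds dedicated spatiotemporal constructs beyond this, whose syntax is not reproduced here:

```python
# Plain SPARQL baseline: numeric bounding-box and date filtering over assumed
# ex:lat / ex:long / ex:observedAt properties; SPARQL-ST's dedicated
# spatiotemporal constructs are not reproduced here.
from rdflib import Graph

g = Graph()
g.parse("observations.ttl", format="turtle")  # hypothetical spatiotemporal data

query = """
PREFIX ex:  <http://example.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?obs ?lat ?long ?time WHERE {
  ?obs ex:lat ?lat ; ex:long ?long ; ex:observedAt ?time .
  FILTER (?lat > 48.0 && ?lat < 51.0 &&
          ?long > 8.0 && ?long < 12.0 &&
          ?time > "2010-01-01T00:00:00"^^xsd:dateTime)
}
"""
for row in g.query(query):
    print(row.obs, row.lat, row.long, row.time)
```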
Article
Full-text available
IBM Research undertook a challenge to build a computer system that could compete at the human champion level in real time on the American TV quiz show, Jeopardy. The extent of the challenge includes fielding a real-time automatic contestant on the show, not merely a laboratory exercise. The Jeopardy Challenge helped us address requirements that led to the design of the DeepQA architecture and the implementation of Watson. After three years of intense research and development by a core team of about 20 researchers, Watson is performing at human expert levels in terms of precision, confidence, and speed at the Jeopardy quiz show. Our results strongly suggest that DeepQA is an effective and extensible architecture that can be used as a foundation for combining, deploying, evaluating, and advancing a wide range of algorithmic techniques to rapidly advance the field of question answering (QA).
Chapter
Spatial and temporal data is plentiful on the Web, and Semantic Web technologies have the potential to make this data more accessible and more useful. Semantic Web researchers have consequently made progress towards better handling of spatial and temporal data. SPARQL, the W3C-recommended query language for RDF, does not adequately support complex spatial and temporal queries. In this work, we present the SPARQL-ST query language. SPARQL-ST is an extension of SPARQL for complex spatiotemporal queries. We present a formal syntax and semantics for SPARQL-ST. In addition, we describe a prototype implementation of SPARQL-ST and demonstrate the scalability of this implementation with a performance study using large real-world and synthetic RDF datasets.
Article
Developing intelligent tools for the integration of information extracted from multiple heterogeneous sources is a challenging issue in effectively exploiting the numerous sources available on-line in global information systems. In this paper, we propose intelligent, tool-supported techniques for information extraction and integration from both structured and semistructured data sources. An object-oriented language, with an underlying Description Logic, called ODLI3, derived from the standard ODMG, is introduced for information extraction. ODLI3 descriptions of the source schemas are exploited first to set up a Common Thesaurus for the sources. Information integration is then performed in a semiautomatic way by exploiting the knowledge in the Common Thesaurus and the ODLI3 descriptions of source schemas with a combination of clustering techniques and Description Logics. This integration process gives rise to a virtual integrated view of the underlying sources, for which mapping rules and integrity constraints are specified to handle heterogeneity. The integration techniques described in the paper are provided in the framework of the MOMIS system, based on a conventional wrapper/mediator architecture.
Conference Paper
There are major trends to advance the functionality of search engines to a more expressive semantic level. This is enabled by the advent of knowledge-sharing communities such as Wikipedia and the progress in automatically extracting entities and relationships from semistructured as well as natural-language Web sources. Recent endeavors of this kind include DBpedia, EntityCube, KnowItAll, ReadTheWeb, and our own YAGO-NAGA project (and others). The goal is to automatically construct and maintain a comprehensive knowledge base of facts about named entities, their semantic classes, and their mutual relations as well as temporal contexts, with high precision and high recall. This tutorial discusses state-of-the-art methods, research opportunities, and open challenges along this avenue of knowledge harvesting.
Conference Paper
Management decision making depends on highly integrated information from different sources and of different granularity: quantitative information, which is mainly analysed by data warehouse and OLAP systems, is needed as well as qualitative information, which can be administered by content management systems. In parallel, the sources of this information can be located within or outside the enterprise. So far, most approaches concentrate on integration at the user interface level. But the diversity of information types and sources requires integration beyond this level. Therefore, new architectures for information systems to support management decisions are needed. Starting from an analysis of the state of the art, a new metadata-based concept for system integration is presented and developed in three steps. A short description of an imaginary EIP shows the intended opportunities such systems will provide. Metadata management is used to integrate operational systems into a single source of information and consolidates metadata of data warehouse systems and content management systems within a new metadata integration model. Using the resulting integrated source of information, new information system architectures are developed to build Enterprise Information Portals for data warehouse systems and content management systems.
Conference Paper
Market liberalization forced energy service companies to introduce new information systems to support energy traders in analytical tasks. In addition to the classical approach of providing time-related insights into market activity by means of a data warehouse, it is crucial to also make external information from the Internet available. Weather information, political news or market rumors are required to correctly interpret the variables of the volatile energy market. Starting from a multidimensional data model and recorded market transactions, a database is built that supports energy traders analytically. In addition, external information sources have to be found and their information, after a filtering process, captured in the data warehouse. This qualified information, linked to market data along the time axis, is presented in a central user interface.
Conference Paper
Creating mappings between a database schema and a Web ontology is a prerequisite for generating ontological annotations for dynamic Web page contents extracted from the database. In this paper, a practical approach to creating mappings between a relational database schema and an OWL ontology is presented. The approach can automatically construct the mappings by following a set of predefined heuristic rules based on the conceptual correspondences between the schema and the ontology. This automatic mapping is implemented as the core functionality in a prototype tool, D2OMapper, which also has assistant functions to help the user manually create and maintain the mappings. Case studies show that the proposed approach is effective and that the produced mappings can be applied to the semantic annotation of database-backed, dynamic Web pages.
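The following Python/rdflib sketch illustrates the general kind of heuristic rules such a mapper applies (table to class, column to datatype property, foreign key to object property); the rules and the tiny schema description are illustrative assumptions, not D2OMapper's actual rule set:

```python
# Illustrative heuristic mapping rules: table -> OWL class, column -> datatype
# property, foreign key -> object property. The rules and the tiny schema
# description are assumptions, not D2OMapper's actual rule set.
from rdflib import Graph, Namespace, RDF, RDFS, OWL

EX = Namespace("http://example.org/ontology#")

# Hypothetical relational schema description.
tables = {
    "Customer": {"columns": ["name", "email"], "foreign_keys": {}},
    "Order": {"columns": ["total"], "foreign_keys": {"customer_id": "Customer"}},
}

g = Graph()
g.bind("ex", EX)

for table, meta in tables.items():
    cls = EX[table]
    g.add((cls, RDF.type, OWL.Class))                   # rule 1: table -> class
    for col in meta["columns"]:
        prop = EX[f"{table.lower()}_{col}"]
        g.add((prop, RDF.type, OWL.DatatypeProperty))   # rule 2: column -> datatype property
        g.add((prop, RDFS.domain, cls))
    for fk, target in meta["foreign_keys"].items():
        prop = EX[f"{table.lower()}_has_{target.lower()}"]
        g.add((prop, RDF.type, OWL.ObjectProperty))     # rule 3: foreign key -> object property
        g.add((prop, RDFS.domain, cls))
        g.add((prop, RDFS.range, EX[target]))

print(g.serialize(format="turtle"))
```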
Conference Paper
Many applications operate on time-sensitive data. Some of these data are only valid for certain intervals (e.g., job assignments, versions of software code), others describe temporal events that happened at certain points in time (e.g., a person's birthday). Until recently, the only way to incorporate time into Semantic Web models was as a datatype property. Temporal RDF, however, considers time as an additional dimension in data, preserving the semantics of time. In this paper we present a syntax and storage format based on named graphs to express temporal RDF. Given the restriction to preexisting RDF syntax, our approach can perform any temporal query using standard SPARQL syntax only. For convenience, we introduce a shorthand format called t-SPARQL for temporal queries and show how t-SPARQL queries can be translated to standard SPARQL. Additionally, we show that, depending on the nature of the underlying data, the temporal RDF approach vastly reduces the number of triples by eliminating redundancies, resulting in increased performance for processing and querying. Last but not least, we introduce a new indexing method that can significantly reduce the time needed to execute time point queries (e.g., what happened on January 1st).
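A small Python/rdflib sketch of the named-graph idea follows, using an assumed vocabulary (ex:validFrom, ex:validUntil) rather than the paper's exact terms: triples sharing a validity interval are placed in one named graph, the interval is attached to that graph's identifier, and a standard SPARQL query recovers the temporal dimension:

```python
# Named-graph sketch under an assumed vocabulary (ex:validFrom, ex:validUntil):
# triples valid in the same interval share one named graph, and the interval is
# attached to that graph's identifier in the default graph.
from rdflib import Dataset, Literal, Namespace, URIRef, XSD

EX = Namespace("http://example.org/")
ds = Dataset()

g1 = ds.graph(URIRef("http://example.org/graph/2010-2012"))
g1.add((EX.alice, EX.worksFor, EX.acme))
ds.add((g1.identifier, EX.validFrom, Literal("2010-01-01", datatype=XSD.date)))
ds.add((g1.identifier, EX.validUntil, Literal("2012-12-31", datatype=XSD.date)))

# Standard SPARQL recovers the fact together with its validity interval.
query = """
PREFIX ex: <http://example.org/>
SELECT ?person ?org ?start ?end WHERE {
  GRAPH ?g { ?person ex:worksFor ?org }
  ?g ex:validFrom ?start ; ex:validUntil ?end .
}
"""
for row in ds.query(query):
    print(row.person, row.org, row.start, row.end)
```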
Article
Named Entity Recognition (NER) systems need to integrate a wide variety of information for optimal performance. This paper demonstrates that a maximum entropy tagger can effectively encode such information and identify named entities with very high accuracy. The tagger uses features which can be obtained for a variety of languages and works effectively not only for English, but also for other languages such as German and Dutch.
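To make the maximum entropy idea concrete, here is a toy Python sketch using multinomial logistic regression (the scikit-learn equivalent of a maximum entropy classifier) over sparse token features; the feature set and the tiny training data are illustrative assumptions, not the paper's actual feature templates:

```python
# Toy maximum entropy (multinomial logistic regression) NER sketch; the feature
# templates and the tiny training data are illustrative assumptions only.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def features(tokens, i):
    w = tokens[i]
    return {
        "word.lower": w.lower(),
        "word.istitle": w.istitle(),
        "word.isupper": w.isupper(),
        "prev.lower": tokens[i - 1].lower() if i > 0 else "<s>",
        "next.lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
        "suffix3": w[-3:].lower(),
    }

# Tiny labelled example: tokens with BIO-style entity tags.
sentence = ["Angela", "Merkel", "visited", "Paris", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]

X = [features(sentence, i) for i in range(len(sentence))]
vec = DictVectorizer()
clf = LogisticRegression(max_iter=1000)
clf.fit(vec.fit_transform(X), tags)

test = ["Merkel", "met", "Obama"]
pred = clf.predict(vec.transform([features(test, i) for i in range(len(test))]))
print(list(zip(test, pred)))
```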
Haak: Integration eines Data Warehouse mit einem Wissensmanagementsystem am Beispiel des SAP BW und dem Knowledge Café. In: Reimer U, Abecker A, Staab S, Stumme (Hrsg) Professionelles Wissensmanagement – Erfahrungen und Visionen, 2.–4.
Jansson K (2010) A Processing Pipeline for Solr. Apache Lucene EuroCon 2010, Prague, CZ
Olofson CW, Boggs R, Feldman S, Vesset D (2006) Unified Access to Content and Data: Delivering a 360-Degree View of the Enterprise. IDC Research
Evelson B, Brown M (2008) Search + BI = Unified Information Access: Combining Unstructured And Structured Info Delivers Business Insight. Forrester Research