Lisa Ehrlinger

Software Competence Center Hagenberg | SCCH

Dr. techn.

About

39 Publications · 40,525 Reads · 834 Citations
Additional affiliations
September 2014 - July 2022: Johannes Kepler University Linz, Senior Researcher

Publications (39)
Conference Paper
Full-text available
Most existing methodologies agree that the assessment of data quality (DQ) is a cyclic process, which has to be carried out continuously. Nevertheless, the majority of DQ tools allow the evaluation of data sources only at specific points in time, and automation and scheduling are therefore the responsibility of the user. In contrast, automate...
Conference Paper
Full-text available
Recently, the term knowledge graph has been used frequently in research and business, usually in close association with Semantic Web technologies, linked data, large-scale data analytics and cloud computing. Its popularity is clearly influenced by the introduction of Google's Knowledge Graph in 2012, and since then the term has been widely used wit...
Conference Paper
Full-text available
Data is central to decision-making in enterprises and organizations (e.g., smart factories and predictive maintenance), as well as in private life (e.g., booking platforms). Especially in artificial intelligence applications, like self-driving cars, trust in data-driven decisions depends directly on the quality of the underlying data. Therefore, it...
Chapter
Training machine learning models, especially in manufacturing enterprises with numerous information systems and heterogeneous data structures, requires efficient data access. Hence, standardized descriptions of data sources and their data structures are a fundamental requirement. We therefore introduce version 4.0 of the Data Source Description Vocabula...
Chapter
Data catalogs automatically collect metadata from distributed data sources and provide a unified and easily accessible view of the data. Many existing data catalog tools focus on the automatic collection of technical metadata (e.g., from a data dictionary) into a central repository. The functionality of annotating data with semantics (i.e., its mea...
Chapter
Data quality is of central importance for the qualitative evaluation of decisions taken by AI-based applications. In practice, data from several heterogeneous data sources is integrated, but complete, global domain knowledge is often not available. In such heterogeneous scenarios, it is particularly difficult to monitor data quality (e.g., complete...
Chapter
Technical standards help software architects to identify relevant requirements and to facilitate system certification, i.e., to systematically assess whether a system meets critical requirements in fields like security, safety, or interoperability. Despite their usefulness, standards typically remain vague on how requirements should be addressed vi...
Article
Full-text available
In the last two decades, computing and storage technologies have experienced enormous advances. Leveraging these recent advances, Artificial Intelligence (AI) is making the leap from traditional classification use cases to automation of complex systems through advanced machine learning and reasoning algorithms. While the literature on AI algorithms...
Conference Paper
Full-text available
In the last two decades, computing and storage technologies have experienced enormous advances. Leveraging these recent advances, Artificial Intelligence (AI) is making the leap from traditional classification use cases to automation of complex systems through advanced machine learning and reasoning algorithms. While the literature on AI algorithms...
Article
Full-text available
High-quality data is key to interpretable and trustworthy data analytics and the basis for meaningful data-driven decisions. In practical scenarios, data quality is typically associated with data preprocessing, profiling, and cleansing for subsequent tasks like data integration or data analytics. However, from a scientific perspective, a lot of res...
Chapter
Temporal knowledge graphs allow process data to be stored in a natural way since they also model the time aspect. An example of such data is registration processes in the area of intellectual property protection. A common question in such settings is to predict the future behavior of a (yet unfinished) process. However, traditional process mining tec...
Chapter
Data integration, data management, and data quality assurance are essential tasks in any data science project. However, these tasks are often not treated with the same priority as core data analytics tasks, such as the training of statistical models. One reason is that data analytics generate directly reportable results and data management is only...
Article
Full-text available
Data management approaches have changed drastically in the past few years due to improved data availability and increasing interest in data analysis (e.g., artificial intelligence). The volume, velocity, and variety of data require novel and automated ways to "operate" this data. In accordance with software development, where DevOps is the de-fact...
Chapter
In enterprises, data is usually distributed across multiple data sources and stored in heterogeneous formats. The harmonization and integration of data is a prerequisite to leverage it for AI initiatives. Recently, data catalogs have emerged as a promising solution to semantically classify and organize data sources across different environments and to enrich...
Poster
Full-text available
Newsadoo is a media startup that provides news articles from different sources on a single platform. Users can create individual timelines, where they follow the latest developments on a specific topic. To support the topic creation process, we developed an algorithm that automatically suggests related tags to a set of given reference tags. In this...
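
The excerpt above does not include the suggestion algorithm itself; as a purely illustrative sketch (all names are hypothetical and this is not Newsadoo's implementation), related tags could be ranked by how often they co-occur with the reference tags in already-tagged articles:

    from collections import Counter

    def suggest_tags(reference_tags, tagged_articles, top_n=5):
        # tagged_articles: one set of tags per article
        reference = set(reference_tags)
        scores = Counter()
        for tags in tagged_articles:
            if reference & tags:                 # article shares a reference tag
                for tag in tags - reference:     # credit the other tags it carries
                    scores[tag] += 1
        return [tag for tag, _ in scores.most_common(top_n)]

    articles = [{"climate", "energy", "policy"},
                {"climate", "energy", "solar"},
                {"sports", "football"}]
    print(suggest_tags({"climate"}, articles))   # e.g. ['energy', 'policy', 'solar']
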
Conference Paper
Full-text available
Data quality assessment is a challenging but necessary task to ensure that business decisions that are derived from data can be trusted. A number of data quality metrics have been developed to measure dimensions like accuracy, completeness, and timeliness. The tool QuaIIe (developed as part of our previous research) facilitates the calculation of d...
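
For context, the completeness dimension mentioned above is commonly formalized as the share of non-missing values per attribute; the following is only a minimal textbook-style sketch, not QuaIIe's actual implementation:

    def completeness(records, attribute):
        # share of records in which the attribute is neither None nor empty
        values = [r.get(attribute) for r in records]
        non_missing = sum(1 for v in values if v not in (None, ""))
        return non_missing / len(values) if values else 0.0

    customers = [{"name": "A", "email": "a@example.org"},
                 {"name": "B", "email": None},
                 {"name": "C", "email": ""}]
    print(completeness(customers, "email"))      # 0.333... -> 1 of 3 e-mails present
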
Article
Full-text available
Knowledge graphs in manufacturing and production aim to make production lines more efficient and flexible with higher quality output. This makes knowledge graphs attractive for companies to reach Industry 4.0 goals. However, existing research in the field is quite preliminary, and more research effort on analyzing how knowledge graphs can be applie...
Article
Full-text available
In order to make good decisions, the data used for decision-making needs to be of high quality. As the volume of data continually increases, ensuring high data quality is a major challenge that calls for automated tool support. The goal of the Data Quality Library (DaQL) is to provide a tool to continuously ensure and measure data quality as...
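
The excerpt stresses continuous rather than one-time measurement; below is a minimal sketch of such a scheduled, repeated check (the names are hypothetical and do not reflect the DaQL API). Any quality indicator, e.g. the completeness ratio sketched earlier, could be passed in as the metric argument:

    import time

    def monitor(load_records, metric, interval_seconds=3600, threshold=0.95):
        # re-evaluate the chosen quality metric on every cycle instead of only once
        while True:
            score = metric(load_records())
            if score < threshold:
                print(f"data quality alert: score dropped to {score:.2f}")
            time.sleep(interval_seconds)
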
Conference Paper
Full-text available
High data quality (e.g., completeness, accuracy, non-redundancy) is essential to ensure the trustworthiness of AI applications. In such applications, huge amounts of data are integrated from heterogeneous sources, and complete, global domain knowledge is often not available. This scenario has a number of negative effects, in particular, it...
Article
Full-text available
The main challenges are discussed together with the lessons learned from past and ongoing research along the development cycle of machine learning systems. This will be done by taking into account intrinsic properties of today's deep learning models, data and software quality issues, and human-centered artificial intelligence (AI) postulates, inclu...
Preprint
Full-text available
Knowledge graphs in manufacturing and production aim to make production lines more efficient and flexible with higher quality output. This makes knowledge graphs attractive for companies to reach Industry 4.0 goals. However, existing research in the field is quite preliminary, and more research effort on analyzing how knowledge graphs can be applie...
Conference Paper
Full-text available
Industrial production processes generate huge amounts of streaming data, usually collected by the deployed machines. To allow the analysis of this data (e.g., for process stability monitoring or predictive maintenance), it is necessary that the data streams are of high quality and comparable between machines. A common problem in such scenarios is s...
Chapter
The main challenges along with lessons learned from ongoing research in the application of machine learning systems in practice are discussed, taking into account aspects of theoretical foundations, systems engineering, and human-centered AI postulates. The analysis outlines a fundamental theory-practice gap which superimposes the challenges of AI...
Chapter
Machine learning models can only be as good as the data used to train them. Despite this obvious correlation, there is little research about data quality measurement to ensure the reliability and trustworthiness of machine learning models. Especially in industrial settings, where sensors produce large amounts of highly volatile data, a one-time mea...
Preprint
High-quality data is key to interpretable and trustworthy data analytics and the basis for meaningful data-driven decisions. In practical scenarios, data quality is typically associated with data preprocessing, profiling, and cleansing for subsequent tasks like data integration or data analytics. However, from a scientific perspective, a lot of res...
Conference Paper
Full-text available
The reliability and trustworthiness of machine learning models depends directly on the data used to train them. Knowledge about data defects that affect machine learning models is most often considered implicitly by data analysts, but usually no centralized data defect management exists. Knowledge graphs are a powerful tool to capture, structure, e...
Conference Paper
Full-text available
Data quality measurement is a critical success factor to estimate the explanatory power of data-driven decisions. Several data quality dimensions, such as completeness, accuracy, and timeliness, have been investigated so far and metrics for their measurement have been proposed. While most research into those dimensions refers to the data values, sc...
Chapter
The development of well-founded metrics to measure data quality is essential to estimate the significance of data-driven decisions, which are, among others, the basis for artificial intelligence applications. While the majority of research into data quality refers to the data values of an information system, less research is concerned with schema...
Chapter
Assessing the quality of information system schemas is crucial, because an unoptimized or erroneous schema design has a strong impact on the quality of the stored data, e.g., it may lead to inconsistencies and anomalies at the data-level. Even if the initial schema had an ideal design, changes during the life cycle can negatively affect the schema...
Article
Full-text available
Data quality measurement is essential to gain knowledge about data used for decision-making and to evaluate the trustworthiness of those decisions. Example applications, which are based on automated decision-making, are self-driving cars, smart factories, and weather forecasting. One-time data quality measurement is an important starting point for any...
Conference Paper
With the advent of Industry 4.0, many companies aim at analyzing historically collected or operative transaction data. Despite the availability of large amounts of data, missing values in particular can introduce bias or preclude the use of specific data analytics methods. Historically, a lot of research into missing data comes from the social science...
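
To make the bias argument concrete (a made-up numeric illustration, not data from the paper): if larger measurements are more likely to be missing, the mean over the observed values underestimates the true mean.

    import random

    random.seed(0)
    true_values = [random.gauss(100, 15) for _ in range(10_000)]
    # simulate values that are "missing not at random": large values vanish more often
    observed = [v for v in true_values if random.random() > min(0.9, (v - 80) / 60)]

    true_mean = sum(true_values) / len(true_values)
    observed_mean = sum(observed) / len(observed)
    print(f"true mean: {true_mean:.1f}, mean of observed values: {observed_mean:.1f}")
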
