Article PDF Available

The Impact of Poor Data Quality on the Typical Enterprise.

Authors: Thomas C. Redman

Abstract

Poor data quality has far-reaching effects and consequences. The article aims to increase awareness by providing a summary of the impacts of poor data quality on a typical enterprise. These impacts include customer dissatisfaction, increased operational cost, less effective decision-making, and a reduced ability to make and execute strategy. More subtly perhaps, poor data quality hurts employee morale, breeds organizational mistrust, and makes it more difficult to align the enterprise. Creating awareness of a problem and its impact is a critical first step towards resolution of the problem. The needed awareness of poor data quality, while growing, has not yet been achieved in many enterprises. After all, the typical executive is already besieged by too many problems: low customer satisfaction, high costs, a late data warehouse project, and so forth. Creating awareness of data accuracy issues and their impacts within the enterprise is the first obstacle that practitioners must overcome when implementing data quality programs.
... Likewise, central banks have to collect high-quality data from their supervised institutions to ensure effective supervision and financial stability (Motjolopane and Lutu, 2012). However, poor data quality can negatively affect the organisation's bottom line, resulting in ineffective strategies, higher operational costs, compliance risks, and loss of reputation (Najjar and Bishu, 2005; Redman, 1998; Wende, 2007). According to recent research by Gartner, poor data quality is estimated to cost organisations an average of 15 million US dollars per year (Moore, 2018). ...
... However, banking data are outdated, incomplete, inconsistent, irrelevant or stored in fragmented databases, making it difficult for data stakeholders to trust the data source (Redman, 1998; Najjar and Bishu, 2005). Moreover, core banking systems such as credit facilities and deposits have increased in size and complexity, leading to many silo databases (Dadashzade, 2018). ...
... In the era of big data analytics, data quality (DQ) has gained much attention from academia and industry (Cai and Zhu, 2015; Juddoo, 2016; Redman, 1998). The term data quality has been widely defined as "data fitness", which implies that data are suitable for use to satisfy their intended purpose in a specific context (Data, 1998; Liu, 2013; Strong et al., 1997). ...
Conference Paper
In the era of big data analytics, data is widely recognised as a valuable asset that can enable organisations to achieve their strategic objectives. Despite that, banks are still struggling to maintain high-quality data. Prior studies show that a data governance programme can play a critical role in improving data quality. It can provide data quality professionals with a holistic approach to formally define the policies, procedures and decision rights required for managing data quality in a more systematic manner. However, few empirical studies have been conducted in this area. Therefore, the present paper aims to close this gap by investigating the data quality problem in the Omani banking industry to understand how various data governance mechanisms can address this issue. The study adopted a qualitative case study approach, with semi-structured interviews and document reviews used to collect data. A theoretical framework by Abraham et al. (2019) was adopted to guide the collection and analysis of the data, and thematic analysis (TA) following Braun and Clarke was used for data analysis. Findings of the study suggest that the data governance mechanisms ‘performance measurement’, ‘compliance monitoring’ and ‘training’ have positively contributed to mitigating data quality issues in the Omani banking sector.
... True to the motto "garbage in, garbage out", even a sophisticated, complex algorithm is useless if the quality of the data is poor. Even though a project may fail for various reasons, its success often depends on the quality of the available data (Redman, 1998). ...
Article
Full-text available
Consolidation of research information improves the quality of data integration, reducing duplicates between systems and enabling the required flexibility and scalability when processing various data sources. We assume that the combination of a data lake as a data repository and a data wrangling process should allow low-quality or “bad” data to be identified and eliminated, leaving only high-quality data, referred to as “research information” in the Research Information System (RIS) domain, so that the most accurate insights can be gained from them. This, in turn, would increase the value of both the data themselves and the data-driven actions built on them, contributing to more accurate and better-informed decision-making. The cleansed research information is then entered into the appropriate target Current Research Information System (CRIS) so that it can be used for further data processing steps. In order to minimize the effort required for the analysis, proliferation and enrichment of large amounts of data and metadata, and to achieve far-reaching added value in information retrieval for CRIS employees, developers and end users, this paper outlines the concept of a curated data lake with a data wrangling process, showing how it can be used in a CRIS to clean up data from heterogeneous data sources during their collection and integration.
... The weight vector for each criterion is calculated by finding the minimum degree of possibility of the superiority of each criterion over another with Equation (22). Fuzzy weight vectors calculated for each industry are then used for defuzzification to calculate the normalized weight vector by using Equation (23). Weight vectors and normalized final weights for manufacturing and service industries are given in Tables 11 and 12, respectively. ...
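The snippet above describes the weight-derivation step without reproducing Equations (22) and (23). Assuming they follow Chang's extent analysis, the usual formulation behind the "minimum degree of possibility" step in fuzzy AHP, a minimal Python sketch of that step might look as follows; the function names and the example extents are illustrative assumptions, not taken from the paper.

```python
# Sketch of weight derivation via minimum degree of possibility (Chang's
# extent analysis), assumed here as the formulation behind Eqs. (22)-(23).
# Triangular fuzzy numbers are represented as (l, m, u).

def possibility_degree(m2, m1):
    """V(M2 >= M1): degree of possibility that fuzzy number M2 exceeds M1."""
    l1, mid1, u1 = m1
    l2, mid2, u2 = m2
    if mid2 >= mid1:
        return 1.0
    if l1 >= u2:
        return 0.0
    return (l1 - u2) / ((mid2 - u2) - (mid1 - l1))

def normalized_weights(synthetic_extents):
    """Take the minimum possibility degree per criterion, then normalize."""
    d = []
    for i, si in enumerate(synthetic_extents):
        others = [possibility_degree(si, sk)
                  for k, sk in enumerate(synthetic_extents) if k != i]
        d.append(min(others))
    total = sum(d)
    return [w / total for w in d]

# Example: fuzzy synthetic extents for three hypothetical criteria.
extents = [(0.2, 0.3, 0.5), (0.1, 0.25, 0.4), (0.15, 0.35, 0.55)]
print(normalized_weights(extents))
```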
Article
Full-text available
Considering constantly increasing global competition in the market and developing technologies, information systems (ISs) have become an important component of the business world and a vital element of intelligent systems. An IS supports planning, controlling, and analysis activities and assists executives in their decisions by managing data throughout the organization. The main function of an IS is to collect data spread across various parts of the organization and its business partners and to process these collected data into the reliable information required for decision making. Another critical function of an IS is to transfer the necessary information to the point of need in a timely manner. ISs assist in the conversion of data and information into meaningful outcomes. An IS is a combination of software, data storage hardware, related infrastructure, and the people in the organization who use the system. Many business organizations rely on management information systems (MISs) and conduct their critical operations based on these systems. The existence of an efficient MIS is a requirement for the sustainability of any business. However, an MIS's efficiency depends on the business's requirements and nature, and the compatibility of the MIS with the company's business is vital for successful implementation. The current study analyzes differences in the expectations that manufacturing and service industries have of MISs. For this aim, a fuzzy multi-criteria group decision-making (F-MCGDM) model is proposed to determine the differentiating success factors of MIS in both manufacturing and service industries. Findings indicate that there are considerable differences in the needs of the two industries from MIS.
... Low data quality has a high impact, ranging from increased difficulty in setting strategies derived from data analysis to reduced customer satisfaction [7]. The use of low-quality data leads to unsustainable decisions and unsuccessful strategies and induces inefficient decision-making. ...
Article
Full-text available
Internet of Things (IoT) technologies play a key role in the Fourth Industrial Revolution (Industry 4.0). This implies the digitisation of industry and its services to improve productivity. To obtain the necessary information throughout the different processes, useful data streams are collected to feed Artificial Intelligence and Big Data algorithms. However, strategic decision-making based on these algorithms may not be successful if they have been developed from inadequate, low-quality data. This research work proposes a set of metrics to measure Data Quality (DQ) in streaming time series, and implements and validates a set of techniques and tools that allow the quality of the information to be monitored and improved. These techniques allow the early detection of problems that arise in relation to the quality of the collected data and, in addition, provide mechanisms to solve them. As part of the work, a use case from the industrial field is presented, where these techniques and tools have been deployed into a data management, monitoring and data analysis platform. This integration provides additional functionality to the platform: a Decision Support System (DSS) named DQ-REMAIN (Data Quality REport MAnagement and ImprovemeNt) for decision-making regarding the quality of data obtained from streaming time series.
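The abstract does not spell out the proposed metrics, but a minimal sketch helps illustrate what window-based data quality metrics for a streaming time series can look like. The class name, window size, gap threshold and the two metrics (completeness and timeliness) below are illustrative assumptions, not the DQ-REMAIN implementation.

```python
# Minimal sketch (not the DQ-REMAIN implementation) of two illustrative data
# quality metrics for a streaming time series, computed over a sliding window.
from collections import deque

class WindowedDQMonitor:
    def __init__(self, window_size=100, max_gap_seconds=5.0):
        self.window = deque(maxlen=window_size)   # (timestamp, value) pairs
        self.max_gap = max_gap_seconds            # expected max arrival gap

    def add(self, timestamp, value):
        self.window.append((timestamp, value))

    def completeness(self):
        """Share of non-missing values in the current window."""
        if not self.window:
            return 0.0
        present = sum(1 for _, v in self.window if v is not None)
        return present / len(self.window)

    def timeliness(self):
        """Share of consecutive samples arriving within the expected gap."""
        items = list(self.window)
        if len(items) < 2:
            return 1.0
        gaps = [t2 - t1 for (t1, _), (t2, _) in zip(items, items[1:])]
        on_time = sum(1 for g in gaps if g <= self.max_gap)
        return on_time / len(gaps)

# Example usage with synthetic sensor readings (every 4th value missing):
monitor = WindowedDQMonitor(window_size=10, max_gap_seconds=1.0)
for i in range(10):
    monitor.add(timestamp=i * 0.8, value=None if i % 4 == 0 else 20.0 + i)
print(monitor.completeness(), monitor.timeliness())
```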
... When ex ante assessing the value base of compensation for newly discovered options, the managers suffer from some level of noise (see Eq. 8) following a Gaussian distribution with mean 0 and a standard deviation of 0.05. This parameterization is intended to reflect empirical evidence indicating that error levels around 10 percent could be a realistic estimation (Tee et al. 2007; Redman 1998). The aspiration levels for performance enhancements start at zero to capture the desire to avoid, at least, situations of not sustaining an already achieved performance level. ...
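As a worked illustration of the parameterization described in this snippet, the following sketch perturbs a "true" option value with Gaussian noise of mean 0 and standard deviation 0.05 and applies an aspiration level of zero. The function names and the acceptance rule are assumptions made for illustration, not the authors' simulation code.

```python
# Illustrative sketch of noisy ex ante evaluation: the true performance
# contribution of a newly discovered option is distorted by N(0, 0.05) noise
# before being compared against the current level (aspiration level zero).
import random

NOISE_STD = 0.05  # parameterization reported in the snippet

def perceived_value(true_value, rng=random):
    """Ex ante estimate: true value plus Gaussian evaluation noise."""
    return true_value + rng.gauss(0.0, NOISE_STD)

def accepts_option(current_value, candidate_true_value, aspiration=0.0, rng=random):
    """Accept a newly discovered option if the perceived improvement meets the
    aspiration level (zero means: at least no perceived deterioration)."""
    improvement = perceived_value(candidate_true_value, rng) - current_value
    return improvement >= aspiration

# Example: with noise, a slightly inferior option may still be (wrongly) accepted.
random.seed(42)
print(accepts_option(current_value=0.60, candidate_true_value=0.58))
```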
Article
Full-text available
Incentive contracts often do not govern all task elements for which an employee is responsible. Prior research, particularly in the tradition of principal-agent theory, has studied incomplete incentive contracts as multi-task problems focusing on how to motivate the employee to incur effort for a not-contracted task element. Thus, emphasis is on the “vertical” relation between superior and subordinate, where both are modeled as gifted economic actors. This paper takes another perspective, focusing on the “horizontal” interferences of contracted and not-contracted task elements across various employees in an organization and, hence, on the complexity of an organization’s task environment. In order to disentangle the interactions among tasks from agents’ behavior, the paper pursues a minimal intelligence approach. An agent-based simulation model based on the framework of NK fitness landscapes is employed. In the simulation experiments, artificial organizations search for superior performance, and the experiments control for the complexity of the task environment and the level of contractual incompleteness. The results suggest that the complexity of the task environment in terms of interactions among task elements may considerably shape the effects of incomplete incentive contracts. In particular, the results indicate that moderate incompleteness of incentive contracts may be beneficial with respect to organizational performance when intra-organizational complexity is high. This is caused by the stabilization of search resulting from incomplete contracts. Moreover, interactions may cause the not-contracted task elements to serve as means objectives, i.e., to contribute to achieving the contracted task elements.
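For readers unfamiliar with the NK framework mentioned above, the following is a minimal sketch of a standard NK fitness landscape, in which each of N binary task elements contributes a fitness value that depends on its own state and on the states of K interacting elements. This is a generic textbook-style construction assumed for illustration, not the authors' simulation model.

```python
# Minimal sketch of a standard NK fitness landscape: N binary task elements,
# each interacting with K randomly chosen other elements; higher K means a
# more complex (more rugged) task environment.
import itertools
import random

def build_nk_landscape(N, K, seed=0):
    rng = random.Random(seed)
    # For each element, pick K other elements it interacts with.
    neighbors = [rng.sample([j for j in range(N) if j != i], K) for i in range(N)]
    # Random fitness contribution for every combination of own + neighbor states.
    contrib = [
        {bits: rng.random() for bits in itertools.product((0, 1), repeat=K + 1)}
        for _ in range(N)
    ]
    def fitness(config):
        total = 0.0
        for i in range(N):
            key = (config[i],) + tuple(config[j] for j in neighbors[i])
            total += contrib[i][key]
        return total / N  # average contribution, as usual in NK models
    return fitness

# Example: evaluate one configuration of six task elements with K = 2.
f = build_nk_landscape(N=6, K=2)
print(f((1, 0, 1, 1, 0, 0)))
```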
Chapter
The main characteristic of what is called Digital Agriculture, or Agriculture 4.0, is the intensive use of data. It can be said that Digital Agriculture is data-driven. In other words, data, which are becoming increasingly available with spatial and temporal attributes, at high frequencies and on an unprecedented scale, have become essential inputs for the processes that culminate in decision-making.
Article
Full-text available
This is a report on how much the lack or neglect of requirements formulation affects data quality in information systems. The issue of data quality in information systems is centred around their conceptual data models, as these capture all the data perspectives that the systems manage. Case studies were investigated, by way of triangulation, and revealed over 30 percent (nearly 40 percent) unnecessary data replication or redundancy when requirements elicitation is poor, which is serious enough to determine the success or failure of information systems.
Chapter
Advances in digitalization present new and emerging Supply Chain (SC) Information Architectures that rely on data and information as vital resources. While the importance of data and information in SCs has long been understood, there is a dearth of research or understanding about the effective governance, control, or management of data ecosystems at the SC level. This chapter examines data architectures through a navigation of the background of database management and data quality research of previous decades. The chapter unfolds the critical architectural elements around data and information sharing in the SC regarding the context, systems, and infrastructure. A review of various frameworks and conceptual models is presented on data and information in SCs, as well as access control policies. The critical importance of data quality and the management of data in the cyber-physical systems are highlighted. Policies for data sharing agreements (DSAs) and access control are discussed and the importance of effective governance in the distributed environments of digitally enabled SCs is emphasized. We extend the concept of data sharing agreements to capture the interplay between the various SC stakeholders around data use. Research gaps and needs relevant to new and emerging SC data and information ecosystems are highlighted.
Chapter
The “curse of dimensionality” is a term that describes the many difficulties that arise in machine learning tasks as the number of features in the dataset increases. One way to address this problem is to reduce the number of features provided to the model during the learning phase. This reduction in the number of dimensions can be done in two ways: either by merging dimensions together or by selecting a subset of dimensions. There are many methods to select the dimensions to be kept. One technique is to use a genetic algorithm to find a subset of dimensions that maximizes the accuracy of the classifier. A genetic algorithm specially created for this purpose is the genetic algorithm with aggressive mutation. This very efficient algorithm has several particularities compared to classical genetic algorithms, the main one being that its population is composed of a small number of individuals that are aggressively mutated. Our contribution is a modification of this algorithm: we propose a version in which the number of mutated individuals is reduced in favor of a larger population. We compared our method to the original one on 17 datasets, which allowed us to conclude that our method provides better results than the original algorithm while reducing the computation time.
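As a rough illustration of the idea described above (a small population whose members are aggressively mutated, with feature subsets scored by classifier accuracy), a simplified sketch follows. It is an approximation for illustration only, not the published algorithm or the modified version proposed in the chapter; the fitness function, mutation count and selection rule are assumptions.

```python
# Simplified sketch of feature selection with a small, aggressively mutated
# population: each parent spawns many single-gene mutants, and the best
# subsets (by a user-supplied accuracy score) survive to the next generation.
import random

def evaluate(subset, score_fn):
    """score_fn maps a sorted tuple of feature indices to an accuracy score."""
    return score_fn(tuple(sorted(subset)))

def aggressive_mutation_ga(n_features, subset_size, score_fn,
                           pop_size=5, mutants_per_parent=10,
                           generations=30, seed=0):
    rng = random.Random(seed)
    population = [rng.sample(range(n_features), subset_size) for _ in range(pop_size)]
    for _ in range(generations):
        candidates = list(population)
        for parent in population:
            for _ in range(mutants_per_parent):
                child = list(parent)
                # Aggressive mutation: replace a random gene with an unused feature.
                pos = rng.randrange(subset_size)
                child[pos] = rng.choice([f for f in range(n_features) if f not in child])
                candidates.append(child)
        # Keep only the best pop_size subsets.
        candidates.sort(key=lambda s: evaluate(s, score_fn), reverse=True)
        population = candidates[:pop_size]
    return population[0]

# Toy example: pretend features 2, 5 and 7 are the informative ones.
def toy_score(subset):
    return len(set(subset) & {2, 5, 7})

print(aggressive_mutation_ga(n_features=10, subset_size=3, score_fn=toy_score))
```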
Article
Full-text available
Summary: Information technology was supposed to stimulate information flow and eliminate hierarchy. It has had just the opposite effect, argue the authors. As information has become the key organizational "currency", it has become too valuable for most managers to just give away. In order to make information-based organizations successful, companies need to harness the power of politics - that is, allow people to negotiate the use and definition of information, just as we negotiate the exchange of other currencies. The authors describe five models of information politics and discuss how companies can move from the less effective models, like feudalism and technocratic utopianism, toward the more effective ones, like monarchy and federalism.
Article
Poor data quality (DQ) can have substantial social and economic impacts. Although firms are improving data quality with practical approaches and tools, their improvement efforts tend to focus narrowly on accuracy. We believe that data consumers have a much broader data quality conceptualization than IS professionals realize. The purpose of this paper is to develop a framework that captures the aspects of data quality that are important to data consumers. A two-stage survey and a two-phase sorting study were conducted to develop a hierarchical framework for organizing data quality dimensions. This framework captures dimensions of data quality that are important to data consumers. Intrinsic DQ denotes that data have quality in their own right. Contextual DQ highlights the requirement that data quality must be considered within the context of the task at hand. Representational DQ and accessibility DQ emphasize the importance of the role of systems. These findings are consistent with our understanding that high-quality data should be intrinsically good, contextually appropriate for the task, clearly represented, and accessible to the data consumer. Our framework has been used effectively in industry and government. Using this framework, IS managers were able to better understand and meet their data consumers' data quality needs. The salient feature of this research study is that quality attributes of data are collected from data consumers instead of being defined theoretically or based on researchers' experience. Although exploratory, this research provides a basis for future studies that measure data quality along the dimensions of this framework.
Article
Summary: Errors in data can cost a company millions of dollars, alienate customers, and make implementing new strategies difficult or impossible. The author describes a process AT&T uses to recognize poor data and improve their quality. He proposes a three-step method for identifying data-quality problems, treating data as an asset, and applying quality systems to the processes that create data.