Article
Abstract

This chapter is about data quality, which is a plague on data scientists and chief analytics officers (CAOs). Dealing with it takes up to 80% of data scientists' time, and it is the problem they complain about most. Without delving too deeply into details, to be judged of high quality, data must meet three distinct criteria. It must be “right”: correct, properly labeled, de-duplicated, and so forth. It must be “the right data”: unbiased, comprehensive, and relevant to the task at hand. And it must be “(re)presented in the right way.” Data quality is probably the toughest issue data scientists face. Worse, it affects the entire organization. Thus, the real work of data scientists involves stepping up to the near-term issues and addressing them in a coordinated, professional manner. The real work of CAOs involves clarifying the larger issues for the rest of the company and helping start programs that get to the root causes of these issues.
