A novel data quality metric for minimality

L. Ehrlinger, W. Wöß. A novel data quality metric for minimality. Volume 11235, pages 1-15, DOI https://doi.org/10.1007/978-3-030-19143-6_1, April 2019.

Authors
  • Lisa Ehrlinger
  • Wolfram Wöß
Editors
  • H. Hacid
  • Q.Z. Sheng
  • T. Yoshida
  • A. Sarkheyli
  • R. Zhou
Book: Data Quality and Trust in Big Data – QUAT 2018 in conjunction with WISE 2018, Revised Selected Papers
Type: In book
Publisher: Springer
Series: Lecture Notes in Computer Science
Volume: 11235
DOI: https://doi.org/10.1007/978-3-030-19143-6_1
ISBN: 978-3-030-19142-9
Month: April
Year: 2019
Pages: 1-15
Abstract

The development of well-founded metrics to measure data quality is essential for estimating the significance of data-driven decisions, which are, among other things, the basis for artificial intelligence applications. While the majority of research into data quality addresses the data values of an information system, less research is concerned with schema quality. However, a poorly designed schema negatively impacts the quality of the data; for example, redundancies at the schema level lead to inconsistencies and anomalies at the data level. In this paper, we propose a new metric to measure the minimality of a schema, which is an important indicator for detecting redundancies. We compare it to other minimality metrics and show that it is the only one that fulfills all requirements for a sound data quality metric. In our ongoing research, we are evaluating the benefits of the metric in more detail and investigating its applicability to redundancy detection in data values.