A novel data quality metric for minimality

Authors Lisa Ehrlinger
Wolfram Wöß
Editors H. Hacid
Q.Z. Sheng
T. Yoshida
A. Sarkheyli
R. Zhou
Title A novel data quality metric for minimality
Booktitle Data Quality and Trust in Big Data – QUAT 2018 in conjunction with WISE 2018, Revised Selected Papers
Type in book
Publisher Springer
Series Lecture Notes in Computer Science
Volume 11235
ISBN 978-3-030-19142-9
DOI 10.1007/978-3-030-19143-6_1
Month April
Year 2019
Pages 1-15
SCCH ID# 18052

The development of well-founded metrics to measure data quality is essential for estimating the significance of data-driven decisions, which are, among other things, the basis for artificial intelligence applications. While the majority of research into data quality refers to the data values of an information system, less research is concerned with schema quality. However, a poorly designed schema negatively impacts the quality of the data; for example, redundancies at the schema level lead to inconsistencies and anomalies at the data level. In this paper, we propose a new metric to measure the minimality of a schema, which is an important indicator for detecting redundancies. We compare it to other minimality metrics and show that it is the only one that fulfills all requirements for a sound data quality metric. In our ongoing research, we are evaluating the benefits of the metric in more detail and investigating its applicability to redundancy detection in data values.