A novel data quality metric for minimality
|L. Ehrlinger, W. Wöß. A novel data quality metric for minimality. volume 11235, pages 1-15, DOI https://doi.org/10.1007/978-3-030-19143-6_1, 4, 2019.|
|Book||Data Quality and Trust in Big Data – QUAT 2018 in conjunction with WISE 2018, Revised Selected Papers|
|Series||Lecture Notes in Computer Science|
The development of well-founded metrics to measure data quality is essential for estimating the significance of data-driven decisions, which are, among other things, the basis for artificial intelligence applications. While the majority of research into data quality addresses the data values of an information system, less research is concerned with schema quality. However, a poorly designed schema negatively impacts the quality of the data; for example, redundancies at the schema level lead to inconsistencies and anomalies at the data level. In this paper, we propose a new metric to measure the minimality of a schema, which is an important indicator for detecting redundancies. We compare it to other minimality metrics and show that it is the only one that fulfills all requirements for a sound data quality metric. In our ongoing research, we are evaluating the benefits of the metric in more detail and investigating its applicability to redundancy detection in data values.