COMET project daSEM (2015-2018)
Automated (Big) Data Engineering, Processing and Semantic Models
Aims / Research Topics
The daSEM project focuses on the design and development of computational techniques and tools for deriving implicit knowledge from information through automated (big) data engineering, semi-automatic data integration, big data processing technologies, and semantic knowledge modeling. With respect to the project objectives, state-of-the-art big data management and processing approaches and conceptualizations are evaluated, including prototypical implementations of automated data and knowledge processing workflows in the context of data warehousing and business intelligence. In addition, patterns of missing data are investigated in order to apply machine learning models and data quality management methods, e.g., to impute missing values based on the characteristics of downstream data analysis methods.
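To make the idea of imputing missing values concrete, the following is a minimal sketch of single mean imputation in plain Python. The function name `impute_mean` and the use of `None` as the missing-value marker are illustrative assumptions, not part of the daSEM toolchain; in practice the project considers (multiple) imputation methods chosen with the downstream analysis in mind.

```python
from statistics import mean

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values.

    A deliberately simple single-imputation strategy for illustration;
    hypothetical helper, not part of the project's software.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Example: a measurement column with two missing readings
print(impute_mean([2.0, None, 4.0, None]))  # → [2.0, 3.0, 4.0, 3.0]
```

Mean imputation preserves the column mean but shrinks its variance, which is exactly why the choice of imputation method should depend on the downstream analysis, as the paragraph above notes.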
Methods / Software / Proof of Concepts
- Evaluation of state-of-the-art big data management and processing approaches: Hadoop ecosystem, Apache Spark, NoSQL databases
- Identification of automatable data and knowledge processing tasks based on available production data of company partners
- Investigation of data mining workflows and tool stacks, with prototypical test implementations for company partners
- Evaluation of patterns and relations of missing data and (multiple) imputation methods
- Automated Data Quality Monitoring as the basis for meaningful data integration [EW17a, EW17b]
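The last item, automated data quality monitoring, can be sketched as a simple per-column missingness check that flags columns whose missing-value rate exceeds a threshold. The function name, the report format, and the 20% threshold are illustrative assumptions under this sketch, not the monitoring approach described in [EW17a, EW17b].

```python
def missingness_report(rows, threshold=0.2):
    """Compute the fraction of missing (None) values per column and flag
    columns whose missingness exceeds the threshold.

    `rows` is a list of dicts sharing the same keys; the threshold and
    report structure are illustrative assumptions.
    """
    if not rows:
        return {}
    report = {}
    for col in rows[0].keys():
        missing = sum(1 for r in rows if r.get(col) is None)
        rate = missing / len(rows)
        report[col] = {"missing_rate": rate, "flagged": rate > threshold}
    return report

# Example: 'temp' is missing in half the rows and gets flagged
rows = [
    {"id": 1, "temp": 21.5},
    {"id": 2, "temp": None},
    {"id": 3, "temp": 22.1},
    {"id": 4, "temp": None},
]
print(missingness_report(rows))
```

A check like this can run before data integration, so that columns with excessive missingness are surfaced early rather than silently degrading integrated datasets.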
This project is subsidized within the framework of COMET – Competence Centers for Excellent Technologies – by BMVIT, BMDW, the State of Upper Austria and its scientific partners. The COMET program is handled by FFG.