Software design patterns for machine learning workflows

L. Hofmann-Wellenhof. Software design patterns for machine learning workflows. 1, 2020.

  • Lorenz Hofmann-Wellenhof

It is more important than ever to build machine learning systems that not only work as designed but are also robust. Machine learning systems increasingly interact with humans on a daily basis (e.g., self-driving cars, predictive maintenance). We therefore need to ask not only whether a system works properly, but also whether it still works properly when it is under attack.

This thesis investigates whether a machine learning system can be made more robust by generating additional artificial data that provides useful information during training. More specifically, our methodology introduces a self-labeling, data-generating algorithm based on generative models. The generated data is used to extend the training set of a separate machine learning system. We train that system once on the unmodified data set and once on the extended data set. Results show that the system trained on the extended data set performs slightly better. However, there is no substantial increase in robustness against adversarial attacks. Based on the research conducted in this thesis, we believe the machine learning community should focus on a dynamic modeling approach, in which the decision boundaries of the model are continuously updated even in the production stage.
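
The idea of extending a training set with self-labeled generated data can be illustrated with a minimal sketch. The abstract does not say which generative models the thesis uses, so this example substitutes a deliberately simple stand-in: one independent Gaussian per class and feature. The function names (`fit_class_gaussians`, `generate_labeled`, `extend_training_set`) and the toy data are hypothetical and chosen only for illustration.

```python
import random
import statistics


def fit_class_gaussians(features, labels):
    """Fit a toy generative model: per class, one independent Gaussian per feature.

    Assumption: this stands in for the thesis's generative models,
    which are not specified in the abstract.
    """
    rows_by_class = {}
    for row, label in zip(features, labels):
        rows_by_class.setdefault(label, []).append(row)

    models = {}
    for label, rows in rows_by_class.items():
        columns = list(zip(*rows))  # transpose rows into feature columns
        means = [statistics.fmean(col) for col in columns]
        stds = [statistics.pstdev(col) for col in columns]
        models[label] = (means, stds)
    return models


def generate_labeled(models, n_per_class, rng):
    """Sample synthetic points; each inherits the label of the model that produced it."""
    synth_x, synth_y = [], []
    for label, (means, stds) in models.items():
        for _ in range(n_per_class):
            synth_x.append([rng.gauss(m, s) for m, s in zip(means, stds)])
            synth_y.append(label)
    return synth_x, synth_y


def extend_training_set(features, labels, n_per_class, seed=0):
    """Return the original data followed by self-labeled synthetic data."""
    rng = random.Random(seed)
    models = fit_class_gaussians(features, labels)
    synth_x, synth_y = generate_labeled(models, n_per_class, rng)
    return features + synth_x, labels + synth_y


# Hypothetical toy data: two well-separated classes in two dimensions.
X = [[0.0, 0.1], [0.2, -0.1], [5.0, 5.1], [4.8, 4.9]]
y = [0, 0, 1, 1]
X_ext, y_ext = extend_training_set(X, y, n_per_class=3)
```

A separate classifier would then be trained once on `(X, y)` and once on `(X_ext, y_ext)`, and the two models compared on clean and adversarially perturbed test data. The sketch covers only the data-extension step, not that comparison.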