Adversarial robustness in data augmentation
Michal Mariusz Lewandowski
|Title||Adversarial robustness in data augmentation|
|Booktitle||Published as a workshop presentation at ICLR 2020 Workshop Towards Trustworthy ML: Rethinking Security and Privacy for ML, Addis Ababa, Ethiopia|
Data augmentation has become a standard technique in deep learning, as it has been shown to greatly improve the generalisation abilities of models. In addition to human-designed augmentation operations such as geometric transformations (e.g., on images), methods have recently been proposed that generate new samples from the training data (e.g., using Mixup or GANs). In this paper, we empirically assess the effect of these kinds of data augmentation on both classification accuracy and adversarial vulnerability. We find that ‘classical’ augmentation improves performance and robustness the most. However, we also find that while GAN-based augmentation and Mixup can improve prediction, they cause significant adversarial vulnerabilities when applied alone. Analyzing the smoothness of the models’ decision boundaries, we can relate smoothness to robustness, and find that classical augmentation results in smoother boundaries than Mixup and GAN augmentation. Finally, using influence functions, we show that, when asked to predict on adversarial test examples, vulnerable models rely more on augmented samples than on real ones. Taken together, our results suggest that general-purpose data augmentations that do not take into account the characteristics of the data and the task must be applied with care.
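For readers unfamiliar with Mixup, the following is a minimal sketch of the general idea, not the implementation used in this work: two training samples and their one-hot labels are blended with a weight drawn from a Beta distribution. The function name `mixup_pair`, the default `alpha=0.2`, and the list-based representation of samples are assumptions chosen for illustration.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, alpha=0.2, rng=None):
    """Blend two samples and their one-hot labels (illustrative sketch).

    x_i, x_j: flat lists of feature values of equal length.
    y_i, y_j: one-hot label lists of equal length.
    alpha:    Beta distribution parameter (hypothetical default).
    """
    rng = rng or random.Random()
    # Mixing weight lam lies in [0, 1]; small alpha pushes it towards 0 or 1.
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the inputs and of the labels with the same weight.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y, lam
```

Because the new sample lies on the line segment between two real samples, it need not resemble any real input, which is one sense in which such augmentation ignores the characteristics of the data.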