Moment-based domain adaptation: Learning bounds and algorithms

W. Zellinger. Moment-based domain adaptation: Learning bounds and algorithms. 2, 2020.

  • Werner Zellinger

This thesis contributes to the mathematical foundation of domain adaptation as an emerging field in machine learning. In contrast to classical statistical learning, the framework of domain adaptation takes into account deviations between the probability distributions in the training and application settings. Domain adaptation applies to a wider range of applications, since future samples often follow a distribution that differs from that of the training samples. A decisive point is the generality of the assumptions about the similarity of the distributions. Therefore, in this thesis we study domain adaptation problems under similarity assumptions that are as weak as can be modelled by finitely many moments.

In the first part, we examine the generalization ability of discriminative models trained under this relaxed assumption and establish a framework for bounding the misclassification risk based on finitely many moments and additional smoothness conditions. Our results show that a low misclassification risk of the discriminative models can be expected if a) the misclassification risk on the training sample is small, b) the sample size is large enough, and c) the samples’ distributions meet an additional entropy condition.

In the second part, we apply our theoretical framework to the design of machine learning algorithms for domain adaptation. We propose a new moment distance for metric-based regularization of neural networks. Our methods aim at finding new data representations such that our weak assumptions on the similarity of the distributions are satisfied. In this context, various relations of the new moment distance to other probability metrics are proven. Further, a bound on the misclassification risk of our method is derived. To underpin the relevance of our theoretical framework, we perform empirical experiments on several large-scale benchmark datasets. The results show that our method, though based on weaker assumptions, often outperforms related alternatives based on stronger assumptions on the similarity of distributions.
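To make the idea of a moment distance concrete, the following is a minimal sketch rather than the definition used in the thesis. It assumes that two samples are compared through their coordinate-wise mean and central moments up to a fixed order, and that the per-order differences are summed without weighting. The names `central_moments`, `moment_distance`, and `max_order` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Sample = Sequence[Sequence[float]]  # list of points, each a list of coordinates


def central_moments(sample: Sample, max_order: int) -> List[List[float]]:
    """Return per-coordinate moments: the mean, then central moments of order 2..max_order."""
    n = len(sample)
    dim = len(sample[0])
    mean = [sum(x[j] for x in sample) / n for j in range(dim)]
    moments = [mean]
    for k in range(2, max_order + 1):
        moments.append(
            [sum((x[j] - mean[j]) ** k for x in sample) / n for j in range(dim)]
        )
    return moments


def moment_distance(source: Sample, target: Sample, max_order: int = 3) -> float:
    """Sum, over moment orders, of the Euclidean distance between moment vectors.

    Illustrative assumption: unweighted sum over coordinate-wise moments only.
    """
    total = 0.0
    source_moments = central_moments(source, max_order)
    target_moments = central_moments(target, max_order)
    for m_s, m_t in zip(source_moments, target_moments):
        total += sum((a - b) ** 2 for a, b in zip(m_s, m_t)) ** 0.5
    return total


# Two one-dimensional samples that differ only by a shift of the mean.
source = [[0.0], [1.0], [2.0]]
target = [[1.0], [2.0], [3.0]]
shift_distance = moment_distance(source, target)
```

In a regularization setting, a term of this kind would typically be computed on the hidden representations of source and target batches and added to the training loss, so that the learned representation has similar low-order moments in both domains.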

In the third part, we apply our framework to two industrial regression problems. The first problem arises in the area of industrial manufacturing. We propose a new algorithm that is based on the similarity of the first moments of multiple different distributions. Our algorithm enables the modeling of time series from previously unseen distributions and outperforms several standard regression algorithms on real-world data. The second problem stems from the area of analytical chemistry. We propose a new moment-based domain adaptation algorithm for the calibration of chemical measurement systems. In contrast to standard approaches, our algorithm is based only on unlabeled data from the application system. We discuss theoretical properties of the proposed algorithm and show that it empirically outperforms standard alternatives on two real-world datasets.