Efficient and robust median-of-means algorithms for location and regression
|Title||Efficient and robust median-of-means algorithms for location and regression|
|Booktitle||Proceedings of the 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2016)|
We consider the computational problem to learn models from data that are possibly contaminated with outliers. We design and analyze algorithms for robust location and robust linear regression. Such algorithms are essential for solving central problems of robust statistics and outlier detection. We show that our algorithms, which are based on a novel extension of the Median-of-Means method by employing the discrete geometric median, are accurate, efficient and robust against many outliers in the data. The discrete geometric median has many desirable characteristics such as it works for general metric spaces and preserves combinatorial and statistical properties. Furthermore, there is an exact and efficient algorithm to compute it, and an even faster approximation algorithm. We present theoretical and experimental results. In particular, we emphasize the generality of Median-of-Means and its ability to speedup and parallelize algorithms which additionally are accurate and robust against many outliers in the data.