Geometric Feature Extraction
for Distributions of Statistical Data
For example, in meteorological applications, instead of recording the whole time series of a temperature measurement, it may be much more effective to record only the extrema of this series over one-hour intervals. Similarly, when describing the underlying probability distributions of multivariate training data sets for predictors of production processes, it may be much more valuable to feed key geometric qualities of the underlying probability density of the data into the training.
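The meteorological example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (pairs of seconds since the start of the measurement and a temperature value), the function name `hourly_extrema`, and the sample readings are all hypothetical.

```python
from collections import defaultdict


def hourly_extrema(samples):
    """Reduce (seconds_since_start, value) pairs to per-hour (min, max).

    The input format is an assumption made for this sketch.
    """
    buckets = defaultdict(list)
    for t, value in samples:
        # Integer hour index: 0 for [0 s, 3600 s), 1 for [3600 s, 7200 s), ...
        buckets[int(t // 3600)].append(value)
    return {hour: (min(vals), max(vals)) for hour, vals in sorted(buckets.items())}


# Hypothetical readings: three in the first hour, two in the second.
readings = [(0, 12.1), (900, 12.8), (1800, 13.4), (3600, 13.0), (5400, 11.9)]
summary = hourly_extrema(readings)
```

Here five readings shrink to two (min, max) pairs, one per hour, which is the kind of reduction the paragraph above describes.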
Geometric features of probability densities that are of interest, apart from the standard location (mean) and variability (variance) parameters, include values describing the decay (towards remote parts of the support), the curvature (reflecting the correlation structure) and the modality (number of local extrema). To obtain them before feeding them into a deep learning network, for example, a data preprocessing step has to be carried out, which often involves a sophisticated analysis of the data. The reason for ‘stripping’ the input data for predictive learning algorithms down to its essentials is the resulting increase in the efficiency of the training and cross-validation process.
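As a rough illustration of some of the features named above, the following Python sketch computes the location and variability parameters of a one-dimensional sample and a crude modality estimate obtained by counting strict local maxima of a histogram. The helper names `histogram` and `count_modes`, the bin settings, and the sample values are assumptions made for this illustration; decay and curvature call for more careful tools and are not covered here.

```python
import statistics


def histogram(data, bins, lo, hi):
    """Count how many values fall into each of `bins` equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in data:
        if lo <= x <= hi:
            # Clamp so that x == hi lands in the last bin.
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
    return counts


def count_modes(counts):
    """Count strict local maxima of the histogram (flat plateaus are ignored)."""
    padded = [0] + list(counts) + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )


# Hypothetical sample with two clusters, around 1 and around 5.
data = [1.0, 1.1, 1.2, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0, 5.2]

mean = statistics.fmean(data)          # location
variance = statistics.variance(data)   # variability (sample variance)
counts = histogram(data, bins=6, lo=0.0, hi=6.0)
modes = count_modes(counts)            # crude modality estimate
```

The mode count depends strongly on the chosen bin width, so in practice a smoothed density estimate would likely be preferable to a raw histogram.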
This thesis will primarily consist of the following steps:
- Find a suitable tool for approximating the decay type of density functions for given statistical samples.
- Apply this tool to a given data set from one of SCCH’s projects.
- If there is further time: think of other geometric features that can be extracted from relevant project-related statistical data, and how to assess them.
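For the first step, one candidate tool for densities with polynomially decaying tails is the Hill estimator of the tail index. The Python sketch below is an assumption about what such a tool might look like, not the method the thesis will necessarily adopt; the function name `hill_tail_index`, the choice of `k`, and the synthetic Pareto sample are purely illustrative. With the order statistics sorted in decreasing order, X_(1) >= ... >= X_(n), the estimator averages log(X_(i) / X_(k+1)) over the k largest values and inverts the result.

```python
import math
import random


def hill_tail_index(sample, k):
    """Hill estimate of the tail index alpha from the k largest values.

    Assumes a positive upper tail that decays roughly like x**(-alpha).
    """
    if not 0 < k < len(sample):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    ordered = sorted(sample, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the upper tail must consist of positive values")
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


# Synthetic check on data with known decay: a Pareto sample with alpha = 3.
rng = random.Random(0)
alpha_true = 3.0
pareto_sample = [rng.paretovariate(alpha_true) for _ in range(20000)]
alpha_hat = hill_tail_index(pareto_sample, k=2000)
```

The estimate is sensitive to the number `k` of upper order statistics used, and the estimator only addresses power-law decay; exponential or lighter tails would need a different approach.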