ABSTRACT
The performance of time-series classification of electroencephalographic data varies strongly across experimental paradigms and study participants. Among the reasons are task-dependent differences in neuronal processing and seemingly random variations between subjects. The extent to which data pre-processing techniques can ameliorate these challenges has received relatively little study. Here, the influence of spatial filter optimization methods and non-linear data transformation on time-series classification performance is analyzed using the example of high-frequency somatosensory evoked responses. This is a model paradigm for the analysis of high-frequency electroencephalography data at a very low signal-to-noise ratio, which accentuates the differences between the explored methods. For the data used, the individual signal-to-noise ratio explained up to 74% of the performance differences between subjects. While data pre-processing increased average time-series classification performance, it could not fully compensate for the signal-to-noise ratio differences between subjects. This study proposes an algorithm to prototype and benchmark pre-processing pipelines for a given paradigm and data set. Extreme learning machines, Random Forest, and Logistic Regression can be used to quickly compare a set of potentially suitable pipelines. For subsequent classification, however, machine learning models were shown to provide better accuracy.