The curse of dimensionality in data mining and time series prediction

(2005) 8th International Work-Conference on Artificial Neural Networks (IWANN 2005), Barcelona, Spain, 8 June 2005

Files

C25-Thecurseofdimensionalityindataminingandtimeseriesprediction.pdf
  • Restricted Access
  • Adobe PDF
  • 1.36 MB

Details

Abstract
Modern data analysis tools have to work on high-dimensional data whose components are not independently distributed. High-dimensional spaces show surprising, counter-intuitive geometrical properties that strongly influence the performance of data analysis tools. Among these properties, the concentration of the norm phenomenon makes Euclidean norms and Gaussian kernels, both commonly used in models, inappropriate in high-dimensional spaces. This paper presents alternative distance measures and kernels, together with geometrical methods to decrease the dimension of the space. The methodology is applied to a typical time series prediction example.
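
The concentration of the norm mentioned in the abstract can be observed numerically. The following is a minimal sketch, not taken from the paper: it assumes points drawn independently and uniformly in the unit hypercube (a simplification, since the abstract stresses that real data components are not independent), and the function name `relative_norm_spread` and its parameters are hypothetical. It estimates the ratio of the standard deviation to the mean of the Euclidean norm, which under these assumptions shrinks as the dimension grows.

```python
import math
import random
import statistics


def relative_norm_spread(dim, n_points=2000, seed=0):
    """Return std/mean of the Euclidean norms of random points.

    Assumption for illustration only: points are sampled i.i.d.
    uniformly in [0, 1]^dim.
    """
    rng = random.Random(seed)
    norms = [
        math.sqrt(sum(rng.random() ** 2 for _ in range(dim)))
        for _ in range(n_points)
    ]
    # Population standard deviation divided by the mean norm.
    return statistics.pstdev(norms) / statistics.mean(norms)


if __name__ == "__main__":
    # The relative spread should decrease as the dimension increases.
    for dim in (1, 10, 100, 1000):
        print(dim, round(relative_norm_spread(dim), 4))
```

As the ratio decreases, all points lie at nearly the same distance from the origin in relative terms, which is one way to see why norm-based comparisons lose discriminative power in high dimension.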

Citations

Verleysen, M., & François, D. (2005). The curse of dimensionality in data mining and time series prediction. Lecture Notes in Computer Science, 3512, 758-770. https://hdl.handle.net/2078.5/140307