Sketching Data Sets for Large-Scale Learning: Keeping only what you need

Gribonval, Remi; Chatalic, Antoine; Keriven, Nicolas; Schellekens, Vincent; Schniter, Philip; et al.
IEEE Signal Processing Magazine, Vol. 38, no. 5, p. 12-36 (2021)

Files

200801839.pdf
  • Open Access
  • Adobe PDF
  • 8.47 MB

Details

Authors
  • Gribonval, Remi
    Author
  • Chatalic, Antoine
    Author
  • Keriven, Nicolas
    Author
  • Schellekens, Vincent (UCLouvain)
    Author
  • Jacques, L.
    Author
  • Schniter, Philip
    Author
Abstract
This article considers "compressive learning," an approach to large-scale machine learning where datasets are massively compressed before learning (e.g., clustering, classification, or regression) is performed. In particular, a "sketch" is first constructed by computing carefully chosen nonlinear random features (e.g., random Fourier features) and averaging them over the whole dataset. Parameters are then learned from the sketch, without access to the original dataset. This article surveys the current state of the art in compressive learning, including the main concepts and algorithms, their connections with established signal-processing methods, existing theoretical guarantees (on both information preservation and privacy preservation), and important open problems.
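As a rough illustration of the sketching step described in the abstract, the following Python/NumPy snippet averages random Fourier features over a dataset. It is a minimal sketch under assumptions not taken from the article: the function name compute_sketch, the Gaussian frequency distribution, and the sizes n, d, m are hypothetical choices made for the example. The learning step that recovers parameters from the sketch is not shown.

```python
import numpy as np


def compute_sketch(X, Omega):
    """Average the random Fourier features exp(i * Omega^T x) over all rows of X.

    X     : (n, d) array, one data point per row
    Omega : (d, m) array of random frequencies (hypothetical choice of distribution)
    Returns a complex vector of length m.
    """
    projections = X @ Omega                       # (n, m) inner products
    return np.exp(1j * projections).mean(axis=0)  # (m,) dataset-wide average


# Illustrative sizes only; none of these values come from the article.
rng = np.random.default_rng(0)
n, d, m = 10_000, 2, 50
X = rng.normal(size=(n, d))        # synthetic dataset
Omega = rng.normal(size=(d, m))    # assumed Gaussian frequencies

z = compute_sketch(X, Omega)       # m complex numbers summarize n points

# Because the sketch is a plain average, it can also be accumulated
# chunk by chunk without holding the whole dataset in memory.
chunks = np.array_split(X, 4)
z_chunked = sum(len(c) * compute_sketch(c, Omega) for c in chunks) / n
```

Here the sketch has a fixed size m regardless of n, and the chunked computation agrees with the one-pass result up to floating-point error, since averaging is linear.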
Citations

Gribonval, R., Chatalic, A., Keriven, N., Schellekens, V., Jacques, L., & Schniter, P. (2021). Sketching Data Sets for Large-Scale Learning: Keeping only what you need. IEEE Signal Processing Magazine, 38(5), 12-36. https://doi.org/10.1109/msp.2021.3092574 (Original work published 2021)