Contrastive Learning can Identify the Underlying Generative Factors of the Data

Published in NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice, 2020

Zimmermann, R. S., Schneider, S., Sharma, Y., Bethge, M. and Brendel, W., Contrastive Learning can Identify the Underlying Generative Factors of the Data.

Contrastive learning has recently seen tremendous success in unsupervised learning, but our understanding of why it generalizes so effectively to a large variety of downstream tasks remains limited. We rigorously show that feedforward models trained with a common contrastive loss can implicitly invert the underlying generative model of the observed data up to affine transformations. While we detail the set of assumptions that need to be met to prove this result, our empirical results suggest our findings are robust to considerable model mismatch. We demonstrate that contrastive learning performs comparably to the state of the art in disentanglement on benchmark datasets, a notable observation given that it lacks an explicit generative objective. This highlights a deep connection between contrastive learning, generative modeling, and nonlinear independent component analysis, providing a theoretical foundation for deriving more effective contrastive losses while simultaneously furthering our understanding of the learned representations.
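The "common contrastive loss" referred to in the abstract is typically of the InfoNCE family, which scores an anchor embedding against one positive and several negatives. A minimal NumPy sketch of this loss (illustrative only; the function name, arguments, and temperature are assumptions, not the paper's implementation):

```python
import numpy as np

def info_nce(z_anchor, z_pos, z_neg, tau=1.0):
    """InfoNCE-style contrastive loss for a single anchor (sketch).

    z_anchor: (d,)   embedding of the reference sample
    z_pos:    (d,)   embedding of a positive sample (shares latent factors)
    z_neg:    (M, d) embeddings of negative samples
    tau:      temperature scaling the similarities
    """
    # Dot-product similarities, scaled by the temperature
    sim_pos = z_anchor @ z_pos / tau          # scalar
    sim_neg = z_neg @ z_anchor / tau          # (M,)
    # Cross-entropy of the positive against positive + negatives:
    # -log( exp(sim_pos) / sum_j exp(sim_j) )
    logits = np.concatenate([[sim_pos], sim_neg])
    return -sim_pos + np.log(np.sum(np.exp(logits)))
```

Minimizing this loss pulls the anchor toward the positive and pushes it away from the negatives; under the assumptions detailed in the paper, the encoder producing these embeddings then recovers the generative factors up to an affine transformation.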

Project website    Workshop paper

@inproceedings{zimmermann2021contrastive,
  author    = {Zimmermann, Roland S. and
               Sharma, Yash and
               Schneider, Steffen and
               Bethge, Matthias and
               Brendel, Wieland},
  title     = {Contrastive Learning Inverts the Data Generating Process},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning,
               {ICML} 2021, 18-24 July 2021, Virtual Event},
  series    = {Proceedings of Machine Learning Research},
  volume    = {139},
  pages     = {12979--12990},
  publisher = {{PMLR}},
  year      = {2021},
  url       = {},
}