
Learning One-hidden-layer ReLU Networks via Gradient Descent

conference contribution
posted on 2024-10-23, 06:59 authored by Xiao Zhang, Yaodong Yu, Lingxiao Wang, Quanquan Gu
We study the problem of learning one-hidden-layer neural networks with the Rectified Linear Unit (ReLU) activation function, where the inputs are sampled from the standard Gaussian distribution and the outputs are generated by a noisy teacher network. We analyze the performance of gradient descent for training such networks via empirical risk minimization, and provide algorithm-dependent guarantees. In particular, we prove that tensor initialization followed by gradient descent converges to the ground-truth parameters at a linear rate, up to a statistical error. To the best of our knowledge, this is the first work characterizing the recovery guarantee for practical learning of one-hidden-layer ReLU networks with multiple neurons. Numerical experiments verify our theoretical findings.
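The setting described in the abstract can be illustrated with a minimal sketch: Gaussian inputs, a noisy one-hidden-layer ReLU teacher, and gradient descent on the empirical squared risk. This is not the authors' implementation; in particular, it assumes the output-layer weights are fixed to ones and replaces the paper's tensor initialization with random Gaussian initialization, and all dimensions, step sizes, and noise levels below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, n = 10, 5, 5000          # input dim, hidden neurons, samples (illustrative)
sigma_noise = 0.1              # std of additive label noise (assumed)

# Teacher (ground-truth) network: y = sum_j relu(w_j^* . x) + noise
W_star = rng.normal(size=(k, d)) / np.sqrt(d)

X = rng.normal(size=(n, d))    # inputs drawn from the standard Gaussian
y = np.maximum(X @ W_star.T, 0).sum(axis=1) + sigma_noise * rng.normal(size=n)

# Student network with the same architecture; random init stands in for
# the paper's tensor initialization.
W = rng.normal(size=(k, d)) / np.sqrt(d)

lr = 0.05
for t in range(500):
    pre = X @ W.T                              # (n, k) pre-activations
    pred = np.maximum(pre, 0).sum(axis=1)      # network outputs
    resid = pred - y                           # residuals (n,)
    # Gradient of the empirical squared risk w.r.t. each hidden weight vector:
    # dL/dw_j = mean_i resid_i * 1{w_j . x_i > 0} * x_i
    grad = ((resid[:, None] * (pre > 0)).T @ X) / n
    W -= lr * grad
    if t % 100 == 0:
        print(f"iter {t:4d}  empirical risk {0.5 * np.mean(resid ** 2):.4f}")
```

With a good enough initialization, the empirical risk decreases geometrically until it plateaus at a level set by the label noise, which mirrors the linear convergence up to statistical error stated in the abstract.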

History

Primary Research Area

  • Trustworthy Information Processing

Name of Conference

International Conference on Artificial Intelligence and Statistics (AISTATS)

CISPA Affiliation

  • No

BibTeX

@conference{Zhang:Yu:Wang:Gu:2019,
  title     = "Learning One-hidden-layer ReLU Networks via Gradient Descent",
  author    = "Zhang, Xiao and Yu, Yaodong and Wang, Lingxiao and Gu, Quanquan",
  booktitle = "International Conference on Artificial Intelligence and Statistics (AISTATS)",
  year      = 2019,
  month     = 4
}
