Generalization in Deep Learning

Kawaguchi, Kenji; Kaelbling, Leslie Pack; Bengio, Yoshua

Download

gedl.pdf (355.7Kb)

Other Contributors

Learning and Intelligent Systems

Advisor

Leslie Kaelbling

Terms of use

Creative Commons Attribution 4.0 International http://creativecommons.org/licenses/by/4.0/

Abstract

With a direct analysis of neural networks, this paper presents a mathematically tight generalization theory to partially address an open problem regarding the generalization of deep learning. Unlike previous bound-based theory, our main theory is quantitatively as tight as possible for every dataset individually, while producing competitive qualitative insights. Our results give insight into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, answering an open question in the literature. We also discuss limitations of our results and propose additional open problems.

Date issued

2018-05-01

URI

http://hdl.handle.net/1721.1/115274

Series/Report no.

MIT-CSAIL-TR-2018-014

Keywords

neural network, learning theory

Collections

CSAIL Technical Reports (July 1, 2003 - present)