Gradient Descent

Can Theoretical Algorithms Efficiently Escape Saddle Points in Deep Learning?

A review of optimization algorithms designed to escape saddle points in deep learning, with some experimental results