Continuous optimization plays a central role in many fields of mathematics, and it is also central to solving problems arising in real-world applications. In continuous optimization, we optimize functions whose variables range over continuous sets such as the real numbers, which is why the theory relies on techniques from calculus. The course focuses on optimization methods that can (1) handle nonlinear, and possibly nonconvex, loss functions, (2) scale to large datasets, and (3) train models with a large number of parameters. These methods are variations on gradient descent. In the course, we will analyze the properties of continuous optimization problems (involving convex functions and first-order differential calculus), the gradient descent method, its stochastic version, and their application to nonlinear regression and the training of neural networks.
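
To make the central idea concrete, here is a minimal sketch (not part of the course materials) contrasting full-batch gradient descent with its stochastic variant on a simple least-squares problem; the model, data, step sizes, and iteration counts are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def grad(w, Xb, yb):
    # Gradient of the mean squared error 0.5 * ||Xb w - yb||^2 / len(yb).
    return Xb.T @ (Xb @ w - yb) / len(yb)

# Full-batch gradient descent: one gradient over the whole dataset per step.
w_gd = np.zeros(d)
for _ in range(500):
    w_gd -= 0.1 * grad(w_gd, X, y)

# Stochastic gradient descent: one randomly sampled example per step,
# which is what allows the method to scale to large datasets.
w_sgd = np.zeros(d)
for _ in range(5000):
    i = rng.integers(n)
    w_sgd -= 0.01 * grad(w_sgd, X[i:i+1], y[i:i+1])

print("GD  distance to true weights:", np.linalg.norm(w_gd - w_true))
print("SGD distance to true weights:", np.linalg.norm(w_sgd - w_true))
```

The contrast illustrates the trade-off the course examines: the full-batch method uses exact gradients but touches every data point at each step, while the stochastic version uses cheap, noisy gradient estimates and therefore requires a smaller step size.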