How I think about neural networks (outline)
Reference the 3Blue1Brown (3b1b) linear algebra series?
One - the critical linear algebra
- Vectors - geometric (an arrow in space) and abstract (a list of numbers describing an object) interpretations
- Dot product - geometric (angle) and abstract (similarity) interpretations
- Matrix multiplication - geometric (transformations of various types) and abstract (a collection of comparisons, one dot product per row) interpretations (sketch after this list)
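A minimal NumPy sketch of the two abstract readings above - the dot product as a similarity score, and a matrix as a stack of reference vectors, so matrix-vector multiplication runs a whole collection of comparisons at once. All the feature numbers are made up for illustration.

```python
import numpy as np

# Made-up 3-feature descriptors, purely illustrative.
cat = np.array([0.9, 0.2, 0.7])
dog = np.array([0.8, 0.3, 0.6])

# Dot product: geometrically |a||b|cos(theta); abstractly, a similarity score.
print(cat @ dog)

# A matrix as a stack of reference vectors: multiplying by it runs
# every comparison at once, one dot product per row.
W = np.array([[0.9, 0.2, 0.7],    # "cat-like" reference row
              [0.1, 0.8, 0.3]])   # "dog-like" reference row
print(W @ dog)                    # two similarity scores for the dog vector
```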
Two - a basic neural network
- Network parameters store the transformations and reference objects; they start out randomly initialized
- Derivative of the loss - a “what if I nudged this?” question asked for every parameter in the network (backprop is just an efficient way to compute it, not necessary for understanding); toy sketch after this list
- Optimization - step each parameter a little in the direction that reduces the loss
- Layers, hierarchy, levels of abstraction
- Activations - the nonlinearities between layers
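A toy numeric sketch of this whole section, under illustrative choices of my own (3 inputs, 4 hidden units, 1 output, ReLU activation, squared-error loss). The finite-difference nudge below stands in for the derivative that backprop would compute for every parameter at once, far more cheaply.

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly initialized parameters: each layer stores a transformation.
W1, b1 = 0.1 * rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = 0.1 * rng.normal(size=(1, 4)), np.zeros(1)

def forward(x):
    h = np.maximum(0.0, W1 @ x + b1)        # layer 1 + ReLU activation
    return W2 @ h + b2                      # layer 2 (output)

def loss(x, y):
    return float(((forward(x) - y) ** 2).sum())

x, y = np.array([0.5, -1.0, 2.0]), 3.0      # made-up training example

# "What if" derivative for one parameter: nudge it, see how the loss moves.
eps = 1e-5
before = loss(x, y)
W1[0, 0] += eps
slope = (loss(x, y) - before) / eps         # finite-difference estimate
W1[0, 0] -= eps

# Optimization: step that parameter a little against its slope.
lr = 0.1
W1[0, 0] -= lr * slope
print(before, slope, loss(x, y))            # compare loss before and after the step
```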
Three - real neural networks
- Tricks to make optimization work better - minibatches, momentum, batch norm, residual connections (minimal sketch after this list)
- Architectures - CNN, transformer
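A minimal sketch of two of those tricks - minibatches and momentum - on a made-up linear-regression problem; batch norm and residual connections aren't shown here. The data and hyperparameters are illustrative, not a recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: recover w_true from (X, y), just to show the update rules.
X = rng.normal(size=(256, 5))
w_true = rng.normal(size=5)
y = X @ w_true

w = np.zeros(5)
velocity = np.zeros(5)
lr, beta, batch_size = 0.1, 0.9, 32

for step in range(200):
    idx = rng.choice(len(X), size=batch_size, replace=False)  # minibatch: a random subset
    xb, yb = X[idx], y[idx]
    grad = 2 * xb.T @ (xb @ w - yb) / batch_size              # gradient on the minibatch only
    velocity = beta * velocity + grad                         # momentum: smoothed gradient
    w -= lr * velocity

print(np.abs(w - w_true).max())                               # should be near zero
```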
Limitations - Curse of dimensionality
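One concrete way to picture the curse of dimensionality (a toy demo, not a full account): pairwise distances between random points concentrate as the dimension grows, so similarity comparisons lose contrast.

```python
import numpy as np

rng = np.random.default_rng(0)

def pairwise_dists(pts):
    # Euclidean distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = (pts ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * pts @ pts.T
    return np.sqrt(np.maximum(d2, 0.0))

for dim in (2, 10, 100, 1000):
    pts = rng.uniform(size=(500, dim))
    d = pairwise_dists(pts)[np.triu_indices(500, k=1)]  # distinct pairs only
    print(dim, round(float(d.min() / d.max()), 3))      # ratio creeps toward 1
```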