I broke up with mathematics, then I got disgusted by mathematics, then we reconciled and now I consider mathematics one of my best companions.
Broke up. In high school, I competed in the IMO, which gave me a strong background in maths and a good mindset for it, and which made me think that I wanted to become a mathematician. I started college taking linear algebra, multivariable calculus, and basic topology, getting ready for a maths degree.
It turned out that I loathed all of these topics, not because they are bad, but because of the way they were taught. Back in 2011, at Stanford University, where I received my undergraduate degree, the normal first course for math majors used a textbook called An Introduction to Multivariable Mathematics. This book, in my opinion, is unrivaled in the competition to savage anyone’s passion for maths.
The book obscured all the math concepts and slew the last piece of love that I had for maths. I couldn’t understand what a matrix’s rank is, what a tangent space is, or what a sub-manifold is. And if you asked me to invert a matrix by reversing the Gaussian elimination, then on a good day, you’d get a “F*ck you”. I gave up the dream of being a mathematician.
Disgust. Some of us at Stanford made up this guide:
- If you go to Stanford and you don’t know what to major in, then you major in Computer Science.
- If you are a Computer Science major at Stanford, and you don’t know what to study, then you study Machine Learning.
- If you study Machine Learning at Stanford and you cannot make anything work, then you go make a Deep Learning startup. It has a better chance of working than your assignment on Gaussian processes.
Luckily, I only fell as far as step 2. The curriculum for the CS major at Stanford has a semi-required class, CS 109: Probability for Computer Scientists. I took the class with a very popular professor, who likes to dress like a Jedi master on Halloween. He taught me a very important lesson: never take a class whose name ends with “for Computer Scientists” or “for Engineers”, as these classes make me throw up out of disgust.
I couldn’t stand counting the number of poker hands, computing what percentage of Google employees know C++, or looking up the Normal Distribution Table, all without knowing what probability is. I threw up at how vague and fake the concepts of expected value, variance, density, distribution, etc. were as they were introduced to me. I puked at the way Frequentist vs. Bayesian was discussed. I had never found maths so revolting in my life.
Reconciliation. After graduating, I was very fortunate to join Google Brain, where I picked up Reinforcement Learning.
Once, I read the paper Trust Region Policy Optimization by John Schulman et al. I realized, to my horror, that their equations made no sense to me. College education had turned me from a math enthusiast into a pitiful being who freaks out whenever he sees [math]\mathbb{E}_{a \sim p(a; \theta)}[R_t(a) \cdot \nabla_\theta \log{p(a; \theta)}][/math]. That’s when I decided that I needed some serious changes.
I asked my mentors, friends, and anyone else I trusted for their opinions about which maths to learn. I ended up reading several books, tutorials, and notes on linear algebra, optimization, numerical methods, and measure theory. The ones that I remember most are:
- Linear Algebra Done Right. That’s when I learned that linear algebra should be taught starting from the concepts of vector space and linear transformation. A vector space should be a set of abstract objects that can be added together and multiplied by scalars, not just [math]\mathbb{R}^{n}[/math]. Linear transformations should be maps between vector spaces, not perfunctory matrix multiplications. I can go on for a day.
- An Introduction to Measure Theory. That’s when I realized that probabilities are measures and that integrals are suprema of finite sums (the integrals of simple functions). Also, both Bayesian and Frequentist people have their points, enough for Michael Jordan to say that sometimes he is a frequentist, and sometimes he is a Bayesian.
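To spell out the measure-theory point in standard notation (the symbols below are illustrative choices, not quoted from the book): a probability is a measure [math]P[/math] on a sample space [math]\Omega[/math] with [math]P(\Omega) = 1[/math], and the integral of a nonnegative measurable function [math]f[/math] against a measure [math]\mu[/math] is a supremum over the simple functions lying below [math]f[/math], each of which integrates to a finite sum:

[math]\int f \, d\mu = \sup\left\{ \sum_{i=1}^{n} c_i \, \mu(A_i) \;:\; 0 \le \sum_{i=1}^{n} c_i \mathbf{1}_{A_i} \le f \right\}[/math]

Here the [math]c_i \ge 0[/math] are constants and the [math]A_i[/math] are measurable sets. Under this reading, the expected value of a nonnegative random variable [math]X[/math] is just [math]\mathbb{E}[X] = \int X \, dP[/math], which is one way to make the “expected value” of a first probability course precise.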
Those reads salvaged my view of maths, sparking enlightening visions and questions in me. Now I understand that while I don’t need to prove the Riemann Hypothesis (that is for more passionate and brilliant minds), I can still appreciate other fairly deep concepts of maths and apply them in my research in machine learning.