I’ve been making my way through the recently released Deep Learning textbook (which is absolutely excellent), and I came upon the section on Universal Approximation Properties. The Universal Approximation Theorem (UAT) essentially proves that a neural network with even a single hidden layer is capable of approximating any continuous function to arbitrary accuracy (subject to some constraints, and with no guarantee on how large the network must be or how much compute it takes to train).
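To make that concrete, here is a toy sketch of the theorem in action (my own example, not from the textbook): a single hidden layer of tanh units with random input weights, where only the output weights are fit, is already enough to approximate a smooth function like sin on an interval.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: a continuous function on a compact interval.
x = np.linspace(-np.pi, np.pi, 500).reshape(-1, 1)
y = np.sin(x)

# One hidden layer of 50 tanh units with random input weights and biases;
# only the output weights are fit (by least squares), which is enough for
# a rough demonstration of the approximation property.
n_hidden = 50
W = rng.normal(scale=2.0, size=(1, n_hidden))
b = rng.normal(scale=2.0, size=n_hidden)
H = np.tanh(x @ W + b)                        # hidden activations, shape (500, 50)
w_out, *_ = np.linalg.lstsq(H, y, rcond=None)

y_hat = H @ w_out
print("max |error| over the interval:", np.max(np.abs(y_hat - y)))
```

Of course, the theorem only says a good approximation exists; it says nothing about whether training will actually find it.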
Meanwhile, I have been thinking about the modern successes of deep learning and how many computer vision researchers resisted the movement away from hand-defined features towards deep, uninterpretable neural networks. By no means is computer vision the first field to experience such existential angst. Coming from a physics background, I recall many areas that slowly moved away from expert knowledge towards less-understood, numerical methods. I wonder how physicists dealt with such Sartrean dilemmas?
I think deep learning might be different, though.
To illustrate my point, it is helpful to think of the various methods for solving scientific and mathematical problems as existing on a spectrum. On one side is the closed-form, analytical solution. We express a scientific model with mathematical equations and then solve the problem analytically. If those solutions resemble reality, then we have simultaneously solved the problem and helped confirm our scientific understanding. Pretty much any introductory physics problem, like those solved with the kinematic equations, falls on this side of the spectrum.
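As a throwaway example of that analytical end (my own numbers, nothing special about them): drop a ball from rest at height h, and the kinematic equation h = ½gt² can simply be solved for the impact time, with no iteration or approximation required.

```python
import math

# Closed-form kinematics: a ball dropped from rest at height h.
# Solving h = 0.5 * g * t**2 for t gives the impact time directly.
g = 9.81    # gravitational acceleration, m/s^2
h = 20.0    # drop height in metres (arbitrary example value)

t_impact = math.sqrt(2 * h / g)   # t = sqrt(2h / g)
v_impact = g * t_impact           # v = g * t

print(f"time to impact: {t_impact:.2f} s, speed at impact: {v_impact:.1f} m/s")
```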
On the other side of the spectrum, we have a complete black box solution. We put our inputs into the box, and we get some outputs back that fit our function or solve our problem. We know nothing about what is going on inside.
What falls towards the middle of the spectrum? Many areas of science and applied math live here, often in the form of models expressed as differential equations with no analytical solution. Lacking an analytical solution, one must resort to numerical methods to solve these problems. Examples here abound: Density Functional Theory, Finite Element Analysis, Reflection Seismology, etc.
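A minimal sketch of what living in this middle ground looks like (a toy stand-in for the heavy machinery of DFT or FEA): the nonlinear pendulum is a model we understand perfectly well, but θ″ = −(g/L) sin θ has no elementary closed-form solution, so we hand it to a numerical integrator.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Nonlinear pendulum: theta'' = -(g / L) * sin(theta).
# The model is fully understood, but we still need a numerical solver.
g, L = 9.81, 1.0

def pendulum(t, state):
    theta, omega = state
    return [omega, -(g / L) * np.sin(theta)]

# Start at a large angle (2.5 rad), at rest, and integrate for 10 seconds.
sol = solve_ivp(pendulum, (0.0, 10.0), [2.5, 0.0], max_step=0.01)
print("theta(t=10 s) =", sol.y[0, -1])
```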
What is interesting about deep learning is that it is now being used to tackle these middle-ground, numerical problems. Hell, there is a paper on the arXiv using deep learning to solve the Schrödinger equation!
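I won't pretend this captures what that paper actually does, but the flavor of these approaches is easy to sketch. Here is a toy, heavily simplified version of the idea: train a small network so that its output satisfies a differential equation directly, in this case y′ = −y with y(0) = 1, whose exact solution e^(−x) lets us sanity-check the result.

```python
import torch

torch.manual_seed(0)

# Small network mapping x -> y(x).
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(5000):
    # Random collocation points in [0, 5] at which to enforce the equation.
    x = (5.0 * torch.rand(128, 1)).requires_grad_(True)
    y = net(x)
    dy_dx = torch.autograd.grad(y, x, torch.ones_like(y), create_graph=True)[0]

    residual = dy_dx + y                         # want y' = -y, i.e. y' + y = 0
    boundary = net(torch.zeros(1, 1)) - 1.0      # want y(0) = 1
    loss = (residual ** 2).mean() + (boundary ** 2).mean()

    opt.zero_grad()
    loss.backward()
    opt.step()

# Compare against the exact solution exp(-x) at a test point.
x_test = torch.tensor([[1.0]])
print("network:", net(x_test).item(), "exact:", torch.exp(torch.tensor(-1.0)).item())
```

Notice that the training loop never sees a solved solution; it only ever sees the equation itself.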
At least with the original numerical methods for solving differential equations, one had to do a decent job of modeling (and presumably understanding) the system before using a computer to solve the problem. What remains to be seen is whether or not this will still be true with deep learning. If not, then it feels like we are entering true black box territory. What I wonder is whether the analytical side of the spectrum will even be necessary. Are intuition and domain knowledge necessary for innovation and for pushing the boundaries of problem solving? Or can this Universal Approximator break away from the pack and go on crushing records indefinitely?
Ideally, I would like to believe that we need both. Insights from domain knowledge help drive new approaches (e.g. neural networks were originally inspired by the brain), and breakthrough, black box results help us to understand the domain (not as many examples of these, yet…).
In the meantime, while everybody grapples with this new technology and publishes marginal increases in MNIST accuracy, it seems like the old, differential equation-heavy fields are up for grabs for anybody who can suitably aim the deep learning hammer.