Sadly, no. The OP's writeup is just about the simplest exposition that doesn't miss the point, and even it misses the major transition that happens when you stack layers with nonlinearities between them.
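If it helps, here's a minimal numpy sketch of that transition: stacked linear layers collapse into a single linear map until you put a nonlinearity between them (the 4x4 shapes here are just illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two weight matrices with no nonlinearity between them...
W1 = rng.normal(size=(4, 4))
W2 = rng.normal(size=(4, 4))
x = rng.normal(size=4)

# ...collapse into a single linear map, W2 @ W1:
deep = W2 @ (W1 @ x)
shallow = (W2 @ W1) @ x
print(np.allclose(deep, shallow))   # True: two linear layers = one

# Put a ReLU between the layers and the equivalence breaks:
relu = lambda z: np.maximum(z, 0.0)
deep_nonlinear = W2 @ relu(W1 @ x)
print(np.allclose(deep_nonlinear, shallow))  # almost surely False
```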
I think there might be a fun idea in that. I wonder what kind of learning materials we could generate by holding competitions on who can explain a topic most simply.
> The OP's writeup is just about the simplest exposition that doesn't miss the point, and even it misses the major transition that happens when you stack layers with nonlinearities between them.
Well, the activation function appears to be linear, and thus close to ReLU (which is linear for positive inputs), so (maybe?) there wouldn't be any issue with the so-called "vanishing gradients" problem, since nothing saturates? But I'm just a hobbyist who dabbles in this topic, so my opinion doesn't carry much weight...
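For what it's worth, here's a rough numpy sketch of the intuition, assuming plain backprop through a chain of scalar units (the depth of 30 is arbitrary): saturating activations like sigmoid shrink the gradient at every layer, while ReLU passes it through unchanged on active units:

```python
import numpy as np

rng = np.random.default_rng(0)
depth = 30
z = rng.normal(size=depth)        # one pre-activation per layer

# Backprop multiplies one local derivative per layer.
# A sigmoid's derivative is at most 0.25, so the product
# shrinks geometrically with depth:
s = 1.0 / (1.0 + np.exp(-z))
sigmoid_chain = np.prod(s * (1.0 - s))
print(sigmoid_chain)              # tiny: the gradient has vanished

# ReLU's derivative is exactly 1 on every active unit, so a
# path of active units passes the gradient through unchanged:
relu_chain = np.prod(np.ones(depth))
print(relu_chain)                 # 1.0
```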
There is no royal road to mathematics.