Well, cached recursion is a bit too generic. Almost anything can be done with both caches and recursion.
We had a different characterisation in one of our mathematical optimisation classes: dynamic programming is basically every problem that is isomorphic to finding the longest path in a graph, with suitable 'overloading' of 'maximum' (for picking among multiple alternative paths) and 'addition' (for combining multiple sub-paths).
First, to illustrate what I mean by this overloading: matrix multiplication usually has multiplication (×) and addition (+). In the min-plus algebra, however, you overload (+) with minimum and (×) with plus, and then multiplying two adjacency matrices becomes equivalent to combining two hops along paths in your graph. (Sorry if this is a bit confusing.) Specifically, taking an adjacency matrix A and calculating A* := 1 + A + AA + AAA + ... is equivalent to taking the transitive closure of edge hopping in your graph, i.e. the shortest paths between all pairs of vertices.
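To make that concrete, here is a minimal Python sketch of min-plus matrix 'multiplication' and repeated squaring of a weight matrix; the toy graph, the INF convention and the function names are just my own illustrative choices, not anything canonical:

    import math

    def min_plus_mul(A, B):
        """Matrix 'multiplication' in the min-plus semiring:
        (+) is overloaded with min, (×) with ordinary addition."""
        n = len(A)
        return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    def all_pairs_shortest_paths(W):
        """Repeated min-plus squaring of the weight matrix W (math.inf = no edge,
        0 on the diagonal) converges to the all-pairs shortest-path matrix,
        i.e. the closure A* mentioned above."""
        n = len(W)
        D = W
        hops = 1
        while hops < n - 1:
            D = min_plus_mul(D, D)   # now covers paths of up to 2*hops edges
            hops *= 2
        return D

    INF = math.inf
    # toy graph: 0 -> 1 (weight 3), 1 -> 2 (weight 4), 0 -> 2 (weight 10)
    W = [[0, 3, 10],
         [INF, 0, 4],
         [INF, INF, 0]]
    print(all_pairs_shortest_paths(W))   # entry [0][2] becomes 7 via 0 -> 1 -> 2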
If you overload (+) with maximum and (×) with plus instead, you can find longest paths. For that to make sense, you should not have cycles in your graph.
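In code the only change is swapping min for max and using -inf for 'no edge'; again just a sketch following the same conventions as above, and only meaningful on an acyclic graph:

    def max_plus_mul(A, B):
        """Max-plus variant: (+) is overloaded with max, (×) with ordinary addition.
        With -math.inf for 'no edge' and a DAG, repeated squaring yields
        longest paths instead of shortest ones."""
        n = len(A)
        return [[max(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]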
Now let's go back to dynamic programming: the optimal substructure property of dynamic programming is exactly the same structure you have when finding longest paths; the longest path to a vertex is built from the longest paths to its predecessors, just as an optimal DP solution is built from optimal solutions to its sub-problems.
The structure might sound overly restrictive, but you'll find that it works for all (most?) examples of dynamic programming.
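As one concrete instance, longest increasing subsequence is literally a longest-path problem in an implicit DAG (node i is "a subsequence ending at a[i]", with an edge i -> j of weight 1 whenever i < j and a[i] < a[j]), and the cached recursion that solves it computes exactly that longest path. The names and the example input below are just mine, a rough sketch rather than the standard presentation:

    from functools import lru_cache

    def longest_increasing_subsequence(a):
        """LIS as the longest path in the DAG described above,
        computed by memoised ('cached') recursion."""
        @lru_cache(maxsize=None)
        def longest_from(i):
            # longest path starting at node i = 1 + max over its outgoing edges
            return 1 + max((longest_from(j) for j in range(i + 1, len(a))
                            if a[j] > a[i]), default=0)
        return max((longest_from(i) for i in range(len(a))), default=0)

    print(longest_increasing_subsequence([3, 1, 4, 1, 5, 9, 2, 6]))  # 4, e.g. 1, 4, 5, 9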
P.S. My recollection was a bit vague, so I asked ChatGPT for some help, and apparently I wasn't completely wrong: https://chatgpt.com/share/687de24c-c674-8009-9984-0fda56d1c1...
P.P.S. Despite everything I said above, I agree that 'cached recursion' is still a better name than dynamic programming.