I don't think it's analogous. I don't think we see a cat and have our brain adjust its synaptic weights (or whatever brains do) frame by frame. The whole premise of natural brains learning from static images or disjointed modalities is a very clunky, reductionist, engineered approach we have taken.
> I don't think we see a cat and have our brain adjust its synaptic weights (or whatever brains do) frame by frame
I think that "whatever we do" is doing a lot of heavy lifting here. Some of those "whatevers" will be isomorphic to a frame-level analysis that pulls out structural commonalities, or close enough that it's not a clunky reductionist analogy.
When we see what we think is a cat, something we have already categorised as a cat, I don't think we are examining it from every angle and going "cat, cat, cat".
I think something like the 'free-energy principle' is needed to trigger a re-assessment. So while visually we may receive ~20fps of cat images, most of that stream is discarded unless there is some novelty that challenges expectation.
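Roughly what I mean, as a toy sketch in code (everything here, the linear model, the threshold, the names, is made up for illustration and isn't a claim about real brains or any particular predictive-coding implementation):

```python
# Surprise-gated learning, loosely in the spirit of predictive-coding /
# free-energy accounts: frames only drive a weight update when the
# prediction error ("surprise") exceeds a novelty threshold.
import numpy as np

rng = np.random.default_rng(0)

def predict(weights, frame):
    # Trivial linear "generative model": predicted percept from current weights.
    return weights @ frame

def surprise(prediction, target):
    # Prediction-error magnitude, a crude stand-in for "free energy" / novelty.
    return float(np.abs(prediction - target))

weights = rng.normal(size=8) * 0.01
THRESHOLD = 0.5       # arbitrary novelty threshold for this sketch
LEARNING_RATE = 0.1

for step in range(200):             # ~10 seconds of "frames" at 20fps
    frame = rng.normal(size=8)      # incoming visual frame (toy features)
    target = 1.0                    # "it's still a cat"
    pred = predict(weights, frame)
    if surprise(pred, target) > THRESHOLD:
        # Only expectation-violating frames update the model;
        # the rest of the 20fps stream is effectively discarded.
        weights += LEARNING_RATE * (target - pred) * frame
```

The point of the gate is that learning cost is paid only on the handful of frames that violate the current model, not on every frame of an already-categorised cat.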