Google seems to be doing it all with transformers. It's not open source, though:

> Here we highlight some results of ViT-22B. Note that in the paper we also explore several other problem domains, like video classification, depth estimation, and semantic segmentation.

https://ai.googleblog.com/2023/03/scaling-vision-transformer...