Mark gets a lot of flak for the metaverse, but I can imagine a world-design experience where you start in a blank room, describe what you want around you, and it appears. Like the Matrix's white loading room. And with voice recognition and eye tracking (and brain scans), how close are we to “you have to use your hands? It’s like a baby’s toy.”
This was always the coolest capability of Star Trek's holodecks, not the projector technology. Super cool that we might just get there within the lifetime of someone who watched the show in the 80s.
My favorite Star Trek episode has always been "Identity Crisis". Not because it's one of the good ones; it's pretty clunky. But it contains a fantastic 5-7 minute montage of Geordi La Forge interacting with computers (by touch, by voice, and on the holodeck) to solve a murder mystery, analyzing and live-manipulating 3D "holo footage" to discover a vital clue. Whoever imagined that sequence is the hero of my childhood and perhaps the reason I became a software engineer, doing an oddball mix of HMI and systems engineering.
There's so much in that sequence. The free mixing of different input modes, the complementary collaboration between a human and an AI system, carrying state with you from room to room. Analyzing and generating. Following instructions and making suggestions. Powerful inference, precision of control.
YES!!! That's exactly what's coming. I'm getting a head start by envisioning a rich world and creating pictorial references for it along with text descriptions. But it's a tiny head start: once the "holodeck" technology arrives, we'll all be able to create anything with just natural language, correcting anything the model gets wrong as we go. ITERATIVE CHISELING YO!!!
By the way, I'm still using the SD 1.5 inpainting.ckpt. Thank you, Runway, for releasing it; it's perfect for my needs and abilities. I never even tried SD 2 or the later ones. I heard they work completely differently, and I'm too busy creating to spend time relearning.
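For anyone curious what that "iterative chiseling" loop looks like in practice, here's a minimal sketch using the diffusers library and the runwayml/stable-diffusion-inpainting weights that Runway published on Hugging Face. The file names and prompt are placeholders, and your setup may differ:

```python
# One "chisel" pass: repaint only a masked region of an image and keep
# everything else. Assumes the diffusers library and the SD 1.5
# inpainting weights Runway released on Hugging Face; file paths and
# the prompt below are placeholders.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("scene.png").convert("RGB").resize((512, 512))
# White pixels in the mask mark the region to regenerate; black is kept.
mask = Image.open("mask.png").convert("RGB").resize((512, 512))

# Describe what should replace the masked area, inspect the result,
# redraw the mask, and repeat until it's right.
result = pipe(
    prompt="an ornate brass door handle, detailed, photorealistic",
    image=image,
    mask_image=mask,
).images[0]
result.save("scene_v2.png")
```

Each pass only touches the masked pixels, which is what makes the chiseling metaphor apt: you carve at one spot at a time until the picture matches the one in your head.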