Google launches AI that generates realistic worlds, calling it a step toward superintelligence
Google has unveiled Genie 3, a new world model that simulates realistic visual environments, hailing it as a step toward superintelligence. Developed by DeepMind, the model can generate interactive simulations lasting several minutes, which can be used to train general-purpose agents. The company says these environments stay consistent over time because Genie 3 remembers what it has previously generated.
According to a DeepMind blog post, Genie 3 is a hybrid of its predecessor and Veo 3, a model for generating video clips from text. Unlike Genie 2, which generated interactive scenarios lasting a few seconds, the new AI generates simulations lasting several minutes at 720p resolution. Users can navigate through the environments using the keyboard or directional pads.
One of Genie 3's most notable features is its reliance on auto-regressive generation, a technique in which the model builds the world frame by frame, conditioning each new frame on what it has already generated. This helps it maintain physical consistency, so that users who return to a previously seen location find it as they left it. Google notes that auto-regressive generation can accumulate errors over time; even so, the environments remain consistent, with visual memory extending back as far as one minute.
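To make the frame-by-frame idea concrete, here is a minimal toy sketch in Python. It is not Genie 3's implementation, which Google has not published as code: the `next_frame` and `generate` functions, the integer "positions", and the `memory_frames` parameter are all hypothetical stand-ins. It only illustrates the general pattern of producing each frame from the latest user action plus a bounded window of earlier frames.

```python
from collections import deque


def next_frame(history, action):
    """Toy stand-in for a learned model (hypothetical, for illustration only).

    A "frame" is a (position, remembered) pair: the new position is the last
    position plus the user's action, and `remembered` is True when that
    position already appears in the frames still held in memory.
    """
    last_position = history[-1][0] if history else 0
    position = last_position + action
    remembered = any(past_position == position for past_position, _ in history)
    return (position, remembered)


def generate(actions, memory_frames):
    """Generate one frame per action, conditioning each step on a bounded
    window of previously generated frames."""
    history = deque(maxlen=memory_frames)  # oldest frames fall out of memory
    frames = []
    for action in actions:
        frame = next_frame(history, action)
        history.append(frame)
        frames.append(frame)
    return frames


# Step forward twice, step back once, then jump ahead by two.
print(generate([1, 1, -1, 2], memory_frames=3))
# [(1, False), (2, False), (1, True), (3, False)]
```

With `memory_frames=3`, the third frame recognizes that position 1 was seen before. With `memory_frames=1`, the same revisit is no longer recognized, which mirrors in miniature why the length of the model's visual memory matters for consistency.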
Genie 3 can also produce scenes with complex physics. Examples include a waterslide moving across a lake in the middle of the night, a walk in the woods, and a skydiver jumping off a cliff.
The generated videos include navigation controls for moving the camera or moving through the environment, and interactions can be programmed. This is similar to Black Mirror: Bandersnatch, where the viewer chooses what happens next. Events can be triggered through text prompts that alter elements of the virtual world.
While Genie 2 was conceived as an alternative to designing video game worlds by hand, its successor goes further. Beyond entertainment, Genie 3 can be used to train AI agents in a variety of simulated environments. Google said it used the new model to have its SIMA agent perform various tasks in virtual scenarios.