I built marshmallow castles in Google’s new AI world generator


Google DeepMind is opening up access to Project Genie, its AI tool for creating interactive game worlds from text prompts or images.

Starting Thursday, Google AI Ultra subscribers in the U.S. can play around with the experimental research prototype, which is powered by a combination of Google’s latest world model Genie 3, its image generation model Nano Banana Pro, and Gemini.

Coming five months after Genie 3’s research preview, the move is part of a broader push to gather user feedback and training data as DeepMind races to develop more capable world models.

World models are AI systems that generate an internal representation of an environment, and can be used to predict future outcomes and plan actions. Many AI leaders, including those at DeepMind, believe world models are a crucial step to achieving artificial general intelligence (AGI). But in the nearer term, labs like DeepMind envision a go-to-market plan that starts with video games and other forms of entertainment and branches out into training embodied agents (aka robots) in simulation.

DeepMind’s release of Project Genie comes as the world model race is beginning to heat up. Fei-Fei Li’s World Labs released its first commercial product, Marble, late last year. Runway, the AI video generation startup, recently launched a world model of its own. And former Meta chief scientist Yann LeCun’s startup AMI Labs plans to focus on developing world models as well.

“I think it’s exciting to be in a place where we can have more people access it and give us feedback,” Shlomi Fruchter, a research director at DeepMind, told TechCrunch via video interview, smiling ear-to-ear in clear excitement over Project Genie’s release.

DeepMind researchers that TechCrunch spoke to were upfront about the tool’s experimental nature. It can be inconsistent, sometimes impressively generating playable worlds, other times producing baffling results that miss the mark. Here’s how it works.


A claymation-style castle in the sky made of marshmallows and candy. Image Credits: TechCrunch

You start with a “world sketch” by providing text prompts for both the environment and a main character, whom you will later be able to maneuver through the world in either first- or third-person view. Nano Banana Pro creates an image based on the prompts that you can, in theory, modify before Genie uses the image as a jumping-off point for an interactive world. The modifications mostly worked, but the model occasionally stumbled, giving you purple hair when you asked for green.

You can also use real-life photos as a baseline for the model to build a world on, which, again, was hit or miss. (More on that later.)

Once you’re satisfied with the image, it takes a few seconds for Project Genie to create an explorable world. You can also remix existing worlds into new interpretations by building on top of their prompts, or explore curated worlds in the gallery or via the randomizer tool for inspiration. You can then download videos of the world you just explored.

DeepMind is only granting 60 seconds of world generation and navigation at the moment, in part due to budget and compute constraints. Because Genie 3 is an auto-regressive model, it takes a lot of dedicated compute, which puts a tight ceiling on how much DeepMind is able to provide to users.

“The reason we limit it to 60 seconds is because we wanted to bring it to more users,” Fruchter said. “Basically when you’re using it, there’s a chip somewhere that’s only yours and it’s being dedicated to your session.”

He added that extending it beyond 60 seconds would diminish the incremental value of the testing.

“The environments are interesting, but at some point, the level of interaction and the dynamism of the environment is somewhat limited. Still, we see that as a limitation we hope to improve on.”

Whimsy works, realism doesn’t

Google received a cease-and-desist from Disney last year, so the tool won’t generate Disney-related worlds. Image Credits: TechCrunch

When I used the model, the safety guardrails were already up and running. I couldn’t generate anything resembling nudity, nor could I generate worlds that even remotely hinted at Disney or other copyrighted material. (In December, Disney hit Google with a cease-and-desist, accusing the firm’s AI models of copyright infringement by training on Disney’s characters and IP and generating unauthorized content, among other things.) I couldn’t even get Genie to generate worlds of mermaids exploring underwater fantasy lands or ice queens in their wintery castles.

Still, the demo was deeply impressive. The first world I built was an attempt to live out a small childhood fantasy, in which I could explore a castle in the clouds made of marshmallows, with a chocolate sauce river and trees made of candy. (Yes, I was a chubby kid.) I asked the model to do it in claymation style, and it delivered a whimsical world that childhood me would have eaten up, the castle’s pastel-and-white spires and turrets looking puffy and tasty enough to rip off a chunk and dunk it into the chocolate moat. (Video above.)

A “Game of Thrones”-inspired world that failed to generate as photorealistically as I wanted. Image Credits: TechCrunch

That said, Project Genie still has some kinks to work out.

The model excelled at creating worlds based on artistic prompts, like watercolor, anime, or classic cartoon aesthetics. But it tended to fail when it came to photorealistic or cinematic worlds, which often came out looking like a video game rather than real people in a real setting.

It also didn’t always respond well when given real photos to work with. When I gave it a photo of my office and asked it to create a world based on the photo exactly as it was, it gave me a world that had some of the same furnishings as my office – a wooden desk, plants, a grey couch – laid out differently. And it looked sterile and digital, not lifelike.

When I fed it a photo of my desk with a stuffed toy, Project Genie animated the toy navigating the space, and even had other objects occasionally react as it moved past them.

That interactivity is something DeepMind is working on improving. There were several occasions when my characters walked right through walls or other solid objects.

I asked Project Genie to animate a stuffed toy (Bingo Bronson) so it could explore my desk. Image Credits: TechCrunch

When DeepMind first released Genie 3, researchers highlighted how the model’s auto-regressive architecture meant it could remember what it had generated. I wanted to test that by returning to parts of an environment it had already generated to see if they stayed the same. For the most part, the model succeeded. In one case, I generated a cat exploring yet another desk, and only once, when I turned back to the right side of the desk, did the model generate a second mug.

The part I found most frustrating was navigating the space: the arrow keys to look around, the spacebar to jump or ascend, and the W-A-S-D keys to move. I’m not a gamer, so this didn’t come naturally to me, but the keys were often unresponsive, or they sent me in the wrong direction. Trying to walk from one side of the room to a doorway on the other side often became a chaotic zigzagging exercise, like trying to steer a shopping cart with a broken wheel.

Fruchter assured me that his team was aware of these shortcomings, reminding me again that Project Genie is an experimental prototype. In the future, he said, the team hopes to enhance the realism and improve interaction capabilities, including giving users more control over actions and environments.

“We don’t think about [Project Genie] as an end-to-end product that people can go back to everyday, but we think there is already a glimpse of something that’s interesting and unique and can’t be done in another way,” he said.
