Luma AI is opening a physical AI lab to outside roboticists, turning its video model infrastructure into a bet on machines that can understand and act in the real world.
Luma AI has spent the past two years trying to make generated video look less like software and more like reality. Now the Palo Alto company wants to push that same work into robotics. On June 1, Luma announced an open physical AI lab, inviting robotics teams and researchers to bring systems into its environment and test whether world models trained on video, images, and 3D data can help machines generalize beyond narrow tasks.
The announcement matters because it is not just another AI lab adding robotics to a pitch deck. Luma's core business has been building models for Dream Machine and the Ray video family, where success depends on understanding how scenes change over time. A model that can keep motion, lighting, object relationships, and physical continuity intact in video is working close to the problem that has slowed robotics for decades: predicting what happens next in a messy world.
As Luma described in its own announcement, the company sees physical AI as stuck on a data problem. Robots can be trained to perform specific tasks in specific settings, but they often fail when conditions shift. That is why the open lab is framed around generalization, not simply demonstration. The goal is to use Luma's infrastructure for processing multimodal data to help robotic systems learn from richer representations of the physical world.
The open science framing is also strategic. CEO Amit Jain has argued that the software foundations for robotics should not be controlled by one or two companies. By opening the lab to outside roboticists, Luma is betting that a broader ecosystem can generate more useful learning signals than a closed system built around one company's hardware stack. That is a serious claim, and it is also a competitive position.
Why Video Models Point Toward Robots
The connection between generative video and physical AI can sound abstract, but the logic is fairly direct. Luma's Ray models are designed to synthesize realistic motion from text, image, and video inputs. To do that well, the system has to infer depth, cause and effect, persistence, and how objects behave through time. Those are not cosmetic details. They are the same kinds of signals a robot needs when it picks up an object, moves through a room, or adapts when a human changes the scene.
Jain's background makes that focus easier to understand. Before Luma, he worked at Apple on computer vision and spatial computing, including Vision Pro passthrough and iPhone LiDAR work. Luma itself began with 3D capture before moving deeper into generative video. That history gives the company a more credible path into physical AI than a startup arriving from pure text models and trying to bolt on robotics later.
There is still a large gap between realistic video and reliable machines. A model can simulate a hand moving across a table without understanding force, grip, friction, or safety. But the market is moving toward the idea that robotics will need foundation models, not just hand-coded controls and teleoperated data. Luma is trying to place itself at the representation layer, where perception, simulation, and planning start to meet.
The Compute Behind The Bet
Luma is not approaching this from a small research budget. In November 2025, the company raised a $900 million Series C led by HUMAIN, Saudi Arabia's state-backed AI company, with participation from AMD Ventures, Andreessen Horowitz, Amplify Partners, and Matrix Partners. The round valued Luma above $4 billion and came with access to Project Halo, a planned 2-gigawatt AI supercluster in Saudi Arabia.
That funding and compute commitment change the scale of the robotics conversation. Physical AI is expensive because training these systems requires more than scraping text or labeling images. It needs video, 3D data, simulation, real-world interaction, and repeated testing across varied environments. Companies such as Figure AI, Physical Intelligence, and Apptronik are already pulling capital and talent into the category. Luma's advantage is that it can bring a large video-model infrastructure stack into the race instead of starting from robot hardware alone.
The risk is that openness sounds better in an announcement than it works in practice. If outside labs use Luma's infrastructure but the most valuable data and feedback remain fragmented, the platform may not compound as quickly as the company hopes. If it works, however, Luma could become a shared layer for physical AI research at a moment when the field is still deciding what its foundations should look like.
What comes next is the proof stage. Luma has capital, compute, and a plausible technical bridge from video generation to world modeling. The question is whether an open lab can turn those assets into systems that behave well outside the controlled examples where today's robots still look most comfortable.