Human Archive is turning everyday service work in India into training data for physical AI. The bet is simple: robots will need the real world, and India has millions of people already working inside it.
The next big AI dataset may not come from the open web. It may come from a cleaner wearing a camera cap, a restaurant worker moving through a kitchen, or a technician handling tools while sensors record what their hands, wrists and body are doing.
That is the pitch behind Human Archive, a Silicon Valley startup founded by Raj Patel, Rushil Agarwal, Samay Maini and Shloke Patel, with roots at Stanford and Berkeley. As TechCrunch reported on May 26, the company has raised $8.2 million from Wing Venture Capital, NVP Capital, Y Combinator and angels connected to OpenAI, Nvidia, Google, Meta, Mercor, BAIR and SAIL. The money matters, but the model matters more.
Human Archive is trying to build what robotics companies badly need: large volumes of real-world, multimodal data showing humans doing physical tasks. Not just video. Video plus depth, motion, tactile force and body positioning, synchronized well enough that AI labs can use it to train machines that eventually act in homes, factories, restaurants and warehouses.
India has long been a hub for outsourced software services, customer support, content moderation and data annotation. Human Archive extends that logic into the physical world. The work is no longer just labeling images or checking text. It is capturing how people wash dishes, sort objects, clean rooms, cook food or move through service environments.
That is where the country has an obvious advantage. India has a large pool of gig and informal workers, fast-growing home services platforms, English-language familiarity and lower labor costs than the U.S. or Europe. Startups such as Urban Company, Snabbit and Pronto have already helped organize parts of the fragmented domestic services market. Human Archive is looking at that labor network and seeing a data layer for robotics.
The company says it is working with businesses in home services, hostels and restaurants, though it has not named all of its partners. It also says it has more than 1,000 active headsets deployed across multiple locations. Its own website describes a broader network of more than 100,000 contributors and 500-plus industry partners across homes, hotels, restaurants, agriculture, industrial, construction and retail settings.
The attraction for robotics labs is clear. Simulation can help, but robots still struggle when the real world refuses to behave like a clean virtual environment. A kitchen drawer sticks. A cup is slippery. A cloth folds unpredictably. These details are hard to model from scratch, which is why companies working on physical AI are hunting for human demonstrations at scale.
The data is useful because it is uncomfortable
Human Archive is not just asking people to record generic footage. It has moved from iPhones and off-the-shelf rigs to custom caps, wrist cameras, tactile gloves and full-body motion capture equipment. The company says it has more than 50 devices deployed and more than seven hardware products used across different data types.
That makes the dataset more valuable, but also more sensitive. A camera worn inside a home or workplace captures more than hand motion. It can capture faces, layouts, personal belongings, habits and conversations. Human Archive says its contracts comply with India’s Digital Personal Data Protection Act, that it displays privacy notices and consent information, and that faces are blurred from recordings.
The question is whether that will be enough. Moneycontrol reported last week that India’s Ministry of Electronics and Information Technology is examining consent mechanisms and data collection practices around egocentric data gathered through home service workers. That is exactly the kind of scrutiny this market should expect as physical AI moves from research labs into private spaces.
There is also a labor question sitting underneath the technology. TechCrunch reported that Human Archive pays participating workers a base rate of $1 per hour, while some other companies in the market pay roughly Rs 250 to Rs 400 per hour for similar collection work, which is several times as much in dollar terms. That gap is the business model and the controversy in one line.
Supporters will argue that this creates a new income stream for workers and gives India a role in the global AI supply chain beyond call centers and software outsourcing. Critics will say the value capture is lopsided, with low-paid workers producing the raw material for frontier robotics companies and venture-backed startups.
A familiar outsourcing pattern moves into AI infrastructure
The debate around Human Archive mirrors earlier fights over the hidden labor behind artificial intelligence. Large language models depended on armies of annotators, moderators and evaluators. Social platforms relied on outsourced content moderation. Self-driving car companies needed endless labeled road footage. In each case, the polished product depended on people doing repetitive, messy and often under-recognized work.
Physical AI raises the stakes because the data is not abstract. It comes from bodies moving through real environments. If this works, robotics companies may be able to accelerate training without building every dataset themselves. If it fails, it will likely fail around trust, consent, worker pay or the difficulty of producing data that is consistent enough for serious model training.
The market signal is already clear. Venture capital is moving beyond chatbots and software copilots toward the infrastructure needed for robots, agents and world models. That includes simulation startups, synthetic data companies, sensor platforms and human data collection networks. Human Archive sits right in the middle of that shift.
Whether robots need this data is no longer in doubt. The harder questions are who gets paid, who gives meaningful consent, and who controls the datasets once the work has been captured. India may become a key node in the physical AI supply chain, but the companies building that market will have to prove it is more than labor arbitrage with cameras attached.