OpenAI's chief scientist says rapid breakthroughs in coding and math reasoning are pushing the company toward its goal of an AI that functions like a human research intern within a year.
Jakub Pachocki, the chief scientist at OpenAI, is seeing concrete signals that artificial intelligence is approaching a critical capability threshold: performing sustained, multi-step technical work with minimal human oversight. Speaking on the "Unsupervised Learning" podcast, he pointed to recent advances in mathematical research, physics problem-solving, and software engineering as evidence that models are moving beyond simple prompt-response interactions toward genuinely autonomous workflows.
As Business Insider reported, Pachocki outlined specific internal milestones during an October company meeting, targeting September 2026 for an AI research intern and March 2028 for a fully autonomous researcher. The key distinction between the two, he explained, comes down to task duration. An intern-level system can handle a defined technical problem for a limited period, while a fully autonomous researcher would independently manage complex, open-ended projects spanning days or weeks. Right now, the focus is on stretching the length of time a model can work productively without human intervention, a metric Pachocki considers the most meaningful measure of actual progress.
The practical implications are already visible inside OpenAI's own operations. Pachocki noted that coding agents like Codex are now handling a significant portion of the company's internal programming work. He described "explosive growth" in coding tool adoption that has fundamentally altered how developers approach their daily work. Mathematics benchmarks serve as what he called a "north star" for improving model reasoning: because mathematical proofs and solutions are inherently verifiable, they provide a clean feedback loop that helps researchers measure genuine cognitive capability rather than clever pattern matching.
The trajectory OpenAI describes has immediate implications for how technology companies structure their research and development teams. If an AI system can reliably function as a research intern by late 2026, the economics of early-stage technical exploration shift considerably. A single senior researcher could theoretically direct multiple AI assistants simultaneously, each running parallel experiments, testing hypotheses, and generating preliminary analysis at a fraction of the cost and time required by human teams. This does not eliminate the need for human researchers, but it fundamentally changes the ratio of senior oversight to junior execution. Companies building AI-powered research tools, from Anthropic to Google DeepMind, are racing toward similar capabilities, which means the competitive pressure to integrate autonomous agents into technical workflows will intensify across the sector.
The roadblocks still standing
Despite the optimistic timeline, Pachocki was candid about current limitations. He explicitly stated that he does not expect systems capable of independently improving their own architecture or solving complex alignment problems within this calendar year. The gap between handling specific, well-defined technical tasks and navigating the ambiguous, creative problem-solving required of a full researcher remains substantial. OpenAI CEO Sam Altman acknowledged this uncertainty in a post on X, writing that the company "may totally fail" at hitting its milestones but emphasizing the importance of transparency given the potential societal impact. That caveat is worth keeping in mind when evaluating the commercial readiness of autonomous research agents over the next 18 months.
The real takeaway here is not a specific delivery date but the accelerating pace at which AI systems are expanding their autonomous working horizons. For startups and investors, the practical question is how quickly these capabilities trickle down from internal research labs into commercially available products. The companies that figure out how to reliably deploy AI agents capable of sustained, multi-step technical work will have a significant structural advantage in industries where research velocity directly determines market position. Watch coding agent benchmarks and autonomous task completion metrics as the most reliable leading indicators of when this transition actually materializes.