Technology · March 18, 2025

When you might start speaking to robots

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Last Wednesday, Google made a somewhat surprising announcement. It launched a version of its AI model, Gemini, that can do things not just in the digital realm of chatbots and internet search but out here in the physical world, via robots. 

Gemini Robotics fuses the power of large language models with spatial reasoning, allowing you to tell a robotic arm to do something like “put the grapes in the clear glass bowl.” The LLM interprets these commands, identifying the intention behind what you’re saying and breaking it down into instructions the robot can carry out. For more details about how it all works, read the full story from my colleague Scott Mulligan.
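To make that pipeline a bit more concrete, here is a minimal sketch of what an LLM-in-the-loop controller could look like. Everything in it (the `Command` schema, `parse_instruction`, `RobotArm`) is a hypothetical illustration of the general idea, not Gemini Robotics’ actual API, and the “LLM” step is stubbed out with hard-coded logic.

```python
from dataclasses import dataclass

@dataclass
class Command:
    action: str                      # e.g. "pick" or "place"
    target: str                      # object to act on
    destination: str | None = None   # where to put it, for "place"

def parse_instruction(instruction: str) -> list[Command]:
    """Hypothetical stand-in for the LLM step: turn a natural-language
    request into a sequence of structured robot commands. A real system
    would call a vision-language model that grounds object names in the
    camera feed; here the logic is hard-coded for one example phrase."""
    if "grapes" in instruction and "bowl" in instruction:
        return [
            Command(action="pick", target="grapes"),
            Command(action="place", target="grapes",
                    destination="clear glass bowl"),
        ]
    return []

class RobotArm:
    """Hypothetical low-level controller that executes one command at a time."""
    def execute(self, cmd: Command) -> None:
        suffix = f" -> {cmd.destination}" if cmd.destination else ""
        print(f"{cmd.action} {cmd.target}{suffix}")

arm = RobotArm()
for cmd in parse_instruction("put the grapes in the clear glass bowl"):
    arm.execute(cmd)
```

The design point these systems share is the split between a language layer that produces structured intermediate commands and a control layer that executes them.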

You might be wondering whether this means your home or workplace will one day be filled with robots you can bark orders at. More on that soon.

But first, where did this come from? Google has not made big waves in the world of robotics so far. Alphabet acquired some robotics startups over the past decade, but in 2023 it shut down a unit working on robots to solve practical tasks like cleaning up trash. 

Despite that, the company’s move to bring AI into the physical world via robots is following the exact precedent set by other companies in the past two years (something that, I must humbly point out, MIT Technology Review has long seen coming). 

In short, two trends are converging from opposite directions: Robotics companies are increasingly leveraging AI, and AI giants are now building robots. OpenAI, for example, which shuttered its robotics team in 2021, started a new effort to build humanoid robots this year. In October, the chip giant Nvidia declared the next wave of artificial intelligence to be “physical AI.”

There are lots of ways to incorporate AI into robots, starting with improving how they are trained to do tasks. But using large language models to give instructions, as Google has done, is particularly interesting. 

Google isn’t the first to try this. The robotics startup Figure went viral a year ago for a video in which humans gave instructions to a humanoid robot on how to put dishes away. Around the same time, Covariant, a startup spun off from OpenAI, built something similar for robotic arms in warehouses. I saw a demo where you could give the robot instructions via images, text, or video to do things like “move the tennis balls from this bin to that one.” Covariant was acquired by Amazon just five months later.

When you see such demos, you can’t help but wonder: When are these robots going to come to our workplaces? What about our homes?

If Figure’s plans offer a clue, the answer to the first question is soon. The company announced on Saturday that it is building a high-volume manufacturing facility slated to produce 12,000 humanoid robots per year. But training and testing robots, especially to ensure they’re safe in places where they work near humans, still takes a long time.

For example, Figure’s rival Agility Robotics claims it’s the only company in the US with paying customers for its humanoids. But industry safety standards for humanoids working alongside people aren’t fully formed yet, so the company’s robots have to work in separate areas.

This is why, despite recent progress, our homes will be the last frontier. Compared with factory floors, our homes are chaotic and unpredictable. Everyone’s crammed into relatively close quarters. Even impressive AI models like Gemini Robotics will still need to go through lots of tests both in the real world and in simulation, just like self-driving cars. This testing might happen in warehouses, hotels, and hospitals, where the robots may still receive help from remote human operators. It will take a long time before they’re given the privilege of putting away our dishes.  


Now read the rest of The Algorithm

Deeper Learning

Everyone in AI is talking about Manus. We put it to the test.

The launch of Manus, the newest AI agent from China, has been accompanied by huge amounts of hype. It’s a general AI agent that uses multiple models to act autonomously on a wide range of tasks. (This distinguishes it from AI chatbots, which are based on a single large language model family and designed primarily for conversations.)

The wait list for testing out Manus is incredibly long, but our reporter Caiwei Chen was able to give it a try. Despite some glitches, she found it highly intuitive and says it shows real promise for the future of AI helpers.

Why it matters: As we covered last week, discussions about artificial “superintelligence” and the potential impact of AI on human labor have recently reached a new frenzy. Influential people think powerful AI systems that can outperform humans on many cognitive tasks are imminent, and they say governments need to do more to prepare. That’s still highly contested among AI researchers. In the meantime, Manus offers a good glimpse into how capable these models are right now.

Bits and Bytes

Waabi says its virtual robotrucks are realistic enough to prove the real ones are safe

While companies like Waymo pursue autonomous vehicles for use as taxis in cities, others are building big-rig trucks that don’t need a driver. Waabi, one of the leading companies in this space, has been testing trucks on roads in Texas since 2023, always with a human driver in the cab for backup. But now it says its simulation models are good enough to help persuade regulators to let them operate human-free later this year. (MIT Technology Review)

OpenAI has labeled DeepSeek “state-controlled” and called for it to be banned

The launch of DeepSeek earlier this year convinced lots of people that capable AI models didn’t have to be so expensive. That’s hugely threatening to OpenAI’s business model. So perhaps it should come as no surprise that OpenAI has submitted a policy proposal document to the Trump administration claiming that DeepSeek’s R1 reasoning model is at least partially a product of the Chinese government. The company’s attempt to influence the highest levels of the US government aligns with the fact that it increased its lobbying expenditure nearly sevenfold last year. (TechCrunch)

These new AI benchmarks could help make models less biased

It’s long been known that AI models retain the biases in their training data, including racial and gender biases. Though many model makers work to reduce these biases, it’s hard to test how well those efforts are working. A new set of benchmarks from a team based at Stanford aims to help. (MIT Technology Review)

Inside Alexandr Wang’s role in the Trump administration 

Wang is the founder and CEO of Scale AI, a massive data-labeling company that has received many Pentagon contracts. He recently also coauthored a paper arguing that AI superintelligence is imminent, and that the government should treat certain types of AI like nuclear weapons. He’s now trying to position himself as a loyal partner to the Trump administration. (The Information)

How your kid might be using AI to cheat

Since the launch of ChatGPT, students have been using it for shortcuts. In lots of ways, the story is more complicated than that; as educational-tech companies love to highlight, AI can help create customized learning plans and act as a helpful tutor for many students. But there’s no doubt that AI is unlike any other tool students have had at their disposal before. And if they can hide the fact that they’re using it, it drastically cuts the amount of work kids need to do to get by. (Wall Street Journal)
