This story originally appeared in The Algorithm, our weekly newsletter on AI. It’s been just over two weeks since OpenAI reached a controversial agreement to allow the Pentagon to use its AI in classified environments. There are still pressing questions about what exactly OpenAI’s…
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.
The way we measure progress in AI is terrible
Every time a new AI model is released, it’s typically touted as acing its performance against a series of benchmarks. OpenAI’s GPT-4o, for example,…
Generative AI models have become remarkably good at conversing with us, and creating images, videos, and music for us, but they’re not all that good at doing things for us. AI agents promise to change that. Think of them as AI models with a script and a purpose. They tend to come in one of…
Every time a new AI model is released, it’s typically touted as acing its performance against a series of benchmarks. OpenAI’s GPT-4o, for example, was launched in May with a compilation of results that showed its performance topping every other AI company’s latest model in several tests. The problem is that these benchmarks are poorly…