Sunday, 14 December 2025

OpenAI’s o3 Model Scores Lower Than Expected on Independent Benchmark

OpenAI's o3 model, which the company initially said could solve more than 25% of problems on the challenging FrontierMath benchmark, has underperformed in independent evaluations. Epoch AI, the institute behind FrontierMath, reported that o3 achieved around 10% accuracy, significantly lower than OpenAI's earlier claim. The discrepancy is attributed to differences in testing conditions: OpenAI's internal assessments possibly used more computational resources and a different subset of problems.
