The FrontierMath benchmark from Epoch AI tests generative models on difficult math problems. Find out how OpenAI’s o3 and other AI models performed. FrontierMath accuracy for OpenAI’s o3 and o4-mini ...