“I was curious to establish a baseline for when LLMs are effectively able to solve open math problems compared to where they ...
Tech Xplore
New AI model accurately grades messy handwritten math answers and explains student errors
A research team affiliated with UNIST has unveiled a novel AI system capable of grading and providing detailed feedback on even the most untidy handwritten math answers—much like a human instructor.
Companies like OpenAI continue to push the boundaries with large language models (LLMs) in their pursuit of the holy grail of artificial general intelligence (AGI). Meanwhile, Microsoft is taking a ...
On Friday, research organization Epoch AI released FrontierMath, a new mathematics benchmark that has been turning heads in the AI world because it contains hundreds of expert-level problems that ...
This study introduces MathEval, a comprehensive benchmarking framework designed to systematically evaluate the mathematical reasoning capabilities of large language models (LLMs). Addressing key ...