o4-mini


OpenAI o4-mini is the newest lightweight model in the o-series, engineered for efficient and capable reasoning across text and visual tasks. Optimized for speed and performance, it excels in code generation and image-based understanding, while maintaining a balance between latency and reasoning depth. The model supports a 200,000-token context window with up to 100,000 output tokens, making it suitable for extended, high-volume interactions. It handles both text and image inputs, producing textual outputs with advanced reasoning capabilities. With its compact architecture and versatile performance, o4-mini is ideal for a wide array of real-world applications demanding fast, cost-effective intelligence.
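
As a rough illustration of the limits quoted above, the sketch below checks whether a request fits within the 200,000-token context window and the 100,000-token output cap. The function name is hypothetical, and it assumes that prompt and output tokens share the same context window, which the description does not state explicitly.

```python
# Limits quoted in the model description.
CONTEXT_WINDOW_TOKENS = 200_000
MAX_OUTPUT_TOKENS = 100_000


def fits_in_context(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if a request respects both quoted limits.

    Assumption (not confirmed by the page): prompt tokens and output
    tokens count against the same 200,000-token window.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW_TOKENS
```

Under that assumption, a 150,000-token prompt leaves room for at most 50,000 output tokens, even though the output cap alone would allow more.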


Position in the overall ranking as of June 2026
16
User rating
https://compare-ai.foundtt.com
4.1

Model Overview

Provider
The entity that provides this model.
OpenAI
Release Date
When the model was first released.
Apr 16, 2025
Modalities
Types of data this model can process
text
images
API Providers
The providers that offer this model. (This is not an exhaustive list.)
OpenAI API
Knowledge Cut-off Date
When the model's knowledge was last updated.
-
Open Source
Whether the model's code is available for public use.
No
Pricing Input
Cost for processing tokens in your prompts
$1.10 per million tokens
Pricing Output
Cost for tokens generated by the model
$4.40 per million tokens
MMLU
Massive Multitask Language Understanding - Tests knowledge across 57 subjects including mathematics, history, law, and more
fort
MMLU-Pro
A more robust MMLU benchmark with harder, reasoning-focused questions, a larger choice set, and reduced prompt sensitivity
-
MMMU
Massive Multi-discipline Multimodal Understanding - Tests college-level understanding and reasoning over combined text and image inputs across many disciplines
81.6%
HellaSwag
A challenging sentence completion benchmark
-
HumanEval
Evaluates code generation and problem-solving capabilities
14.28%
MATH
Tests mathematical problem-solving abilities across various difficulty levels
-
GPQA
Tests PhD-level knowledge in chemistry, biology, and physics through multiple choice questions that require deep domain expertise
81.4%
IFEval
Tests a model's ability to follow explicit formatting instructions, generate appropriate outputs, and adhere to instructions consistently across different tasks
-
SimpleQA
Assesses factual accuracy on short, fact-seeking questions
-
AIME 2024
93.4%
AIME 2025
92.7%
Aider Polyglot
Code-editing benchmark spanning multiple programming languages.
-
LiveCodeBench v5
Coding benchmark built from continuously collected, recent competitive-programming problems to limit training-data contamination
-
Global MMLU (Lite)
A lighter subset of Global MMLU, the multilingual version of MMLU, for assessing how well models perform across languages.
-
MathVista
Evaluates the mathematical reasoning abilities of AI models within visual contexts
-
Mobile Application
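
The listed prices ($1.10 per million input tokens, $4.40 per million output tokens) can be turned into a per-request estimate. The following is a minimal sketch; the function name is hypothetical, and it assumes simple linear billing with no caching discounts, batch pricing, or other adjustments, none of which the page describes.

```python
# Prices quoted in the overview, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = 1.10
OUTPUT_PRICE_PER_MILLION = 4.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request under plain linear pricing.

    Assumption: every token is billed at the listed rate, with no
    discounts or surcharges applied.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    )
    return total / 1_000_000
```

For example, a request with 10,000 input tokens and 2,000 output tokens would come to roughly $0.0198 under these assumptions, and one million tokens each way to about $5.50.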

MathArena

Avg. Score
87%
AIME 2025
A test based on problems from the American Invitational Mathematics Examination, designed to assess the mathematical skills of models.
92%
HMMT February 2025
A test based on problems from the Harvard-MIT Mathematics Tournament, February 2025, designed to assess the mathematical skills of models.
83%
BRUMO 2025
87%
SMT 2025
A test based on problems from the Stanford Math Tournament, 2025, designed to assess the mathematical skills of models.
89%
CMIMC 2025
A test based on problems from the Carnegie Mellon Informatics and Mathematics Competition, 2025, designed to assess the mathematical skills of models.
84%
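
The page does not say how the MathArena average is computed. As a quick sanity check, the sketch below assumes it is an unweighted mean of the five competition scores listed above; under that assumption the result matches the reported 87%.

```python
from statistics import mean

# Scores as listed in the MathArena section, in percent.
scores = {
    "AIME 2025": 92,
    "HMMT February 2025": 83,
    "BRUMO 2025": 87,
    "SMT 2025": 89,
    "CMIMC 2025": 84,
}

# Assumption: the site's average is a plain, unweighted mean.
average = mean(scores.values())
print(f"{average:.0f}%")  # prints "87%"
```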
