Grok 3 Beta

Grok 3 is xAI's most advanced model, trained on the Colossus supercluster with 10 times the compute of previous state-of-the-art models. It offers a 1M-token context window and advanced reasoning capabilities refined through large-scale reinforcement learning, which lets it think for anywhere from seconds to minutes when solving complex problems. The model achieves top-tier performance across academic benchmarks and real-world user evaluations, earning an Elo score of 1402 in the Chatbot Arena. It was released alongside Grok 3 Mini, a cost-efficient variant optimized for streamlined reasoning.

Llama 4 Maverick

Llama 4 Maverick is a multimodal model with 17 billion active parameters in a Mixture-of-Experts architecture of 128 experts, totaling 400 billion parameters. It outperforms models such as GPT-4o and Gemini 2.0 Flash across a wide range of benchmarks, and it matches DeepSeek V3 in reasoning and coding tasks while using less than half the active parameters. Designed for efficiency and scalability, Maverick delivers a best-in-class performance-to-cost ratio, with an experimental chat variant achieving an Elo score of 1417 on LMArena. Despite its scale, it runs on a single NVIDIA H100 host, which keeps deployment simple and practical.
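To put the two arena ratings quoted above (1402 and 1417) in perspective, here is a minimal sketch that converts an Elo gap into an expected head-to-head preference rate. It assumes the standard Elo logistic formula on a 400-point scale. The leaderboard may compute its ratings differently, the two scores may come from different leaderboard snapshots, and the 1417 figure belongs to an experimental chat variant, so treat the result as a rough illustration only. The function and variable names are made up for this example.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic model.

    Assumption: the conventional 400-point scale, where a 400-point lead
    corresponds to 10:1 odds.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings as quoted in the descriptions above.
grok_3_elo = 1402
maverick_chat_elo = 1417

# Expected share of pairwise votes for the higher-rated model.
p = elo_expected_score(maverick_chat_elo, grok_3_elo)
print(f"Expected preference rate for the higher-rated model: {p:.1%}")
```

Under these assumptions, a 15-point gap corresponds to the higher-rated model being preferred in roughly 52% of pairwise comparisons, which is close to a coin flip.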

| | Grok 3 Beta | Llama 4 Maverick |
|---|---|---|
| Web Site | | |
| Provider | | |
| Chat | | |
| Release Date | | |
| Modalities (text, images, video) | | |
| API Providers | xAI | Meta AI, Hugging Face, Fireworks, Together, DeepInfra |
| Knowledge Cut-off Date | 2025-01 | 2024-08 |
| Open Source | No | Yes |
| Pricing Input | Not available | Not available |
| Pricing Output | Not available | Not available |
| MMLU | Not available | Not available |
| MMLU-Pro | 79.9% (Base model) | 80.5% |
| MMMU | 78% (With Think mode) | 73.4% |
| HellaSwag | Not available | Not available |
| HumanEval | Not available | Not available |
| MATH | Not available | Not available |
| GPQA | 84.6% (With Think mode, Diamond) | 69.8% (Diamond) |
| IFEval | Not available | Not available |
| SimpleQA | - | - |
| AIME 2024 | - | - |
| AIME 2025 | - | - |
| Aider Polyglot | - | - |
| LiveCodeBench v5 | - | - |
| Global MMLU (Lite) | - | - |
| MathVista | - | - |
| Mobile Application | - | |

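VideoGameBench

| Game | Grok 3 Beta | Llama 4 Maverick |
|---|---|---|
| Total score | - | 0% |
| Doom II | - | 0% |
| Dream DX | - | 0% |
| Awakening DX | - | 0% |
| Civilization I | - | 0% |
| Pokemon Crystal | - | 0% |
| The Need for Speed | - | 0% |
| The Incredible Machine | - | 0% |
| Secret Game 1 | - | 0% |
| Secret Game 2 | - | 0% |
| Secret Game 3 | - | 0% |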
