Modern artificial intelligence technologies open new opportunities for business and science. This page offers a comprehensive analysis of major language models, with a focus on their capabilities, test results, and technical parameters. The first table compares advanced language models and their developers:
Language Model | Provider | Rating | Release Date | Knowledge Cut-off Date | Open Source | API Providers | Modalities | Pricing Input | Pricing Output | MMLU | MMLU-Pro | MMMU | HellaSwag | HumanEval | MATH | GPQA | IFEval | Mobile Application |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Claude Opus 4 | Anthropic | ⭐️ 4.6 | May 22, 2025 | Unknown | No | Anthropic API, Amazon Bedrock, Google Cloud's Vertex AI | text images | $15 per million tokens | $75 per million tokens | 88.8% Source | - | 76.5% Source | - | - | - | 79.6% Diamond Source | - | Google Play Apple Apps |
Claude Sonnet 4 | Anthropic | ⭐️ 4.5 | May 22, 2025 | Unknown | No | Anthropic API, Amazon Bedrock, Google Cloud's Vertex AI | text images | $3 per million tokens | $15 per million tokens | 86.5% Source | - | 74.4% Source | - | - | - | 75.4% Diamond Source | - | Google Play Apple Apps |
Grok 3 Beta | xAI | ⭐️ 4.4 | Jan 19, 2025 | 2025-01 | No | xAI | text images video | Not available | Not available | Not available | 79.9% Base model Source | 78% With Think mode Source | Not available | Not available | Not available | 84.6% With Think mode, Diamond Source | Not available | Google Play Apple Apps |
GPT-4.5 | OpenAI | ⭐️ 4.4 | Feb 27, 2025 | 2023-10 | No | OpenAI, Azure OpenAI Service | text images | $75.00 per million tokens | $150.00 per million tokens | Not available | Not available | 74.4% Source | Not available | Not available | Not available | 71.4% science Source | Not available | Google Play Apple Apps |
DeepSeek-R1 | DeepSeek | ⭐️ 4.2 | Jan 21, 2025 | Unknown | Yes | DeepSeek, HuggingFace | text | $0.55 per million tokens | $2.19 per million tokens | 90.8% Pass@1 Source | 84% EM Source | - | - | - | - | 71.5% Pass@1 Source | 83.3% Prompt Strict Source | Google Play Apple Apps |
Nova Pro | Amazon | ⭐️ 4.2 | Dec 02, 2024 | Purposefully not disclosed | No | Amazon Bedrock | text images video | $0.80 per million tokens | $3.20 per million tokens | 85.9% CoT Source | Not available | Not available | Not available | 89% pass@1 Source | 76.6% CoT Source | 46.9% Main Source | 92.1% Source | - |
Gemini 2.5 Pro | Google | ⭐️ 4.2 | Mar 25, 2025 | - | No | Google AI Studio, Vertex AI, Gemini app | text images voice video | Not available | Not available | Not available | Not available | 81.7% Source | Not available | Not available | Not available | 84.0% Diamond Science Source | Not available | Google Play Apple Apps |
Llama 4 Maverick | Meta | ⭐️ 4.2 | Apr 05, 2025 | 2024-08 | Yes (Source) | Meta AI, Hugging Face, Fireworks, Together, DeepInfra | text images video | Not available | Not available | Not available | 80.5% Source | 73.4% Source | Not available | Not available | Not available | 69.8% Diamond Source | Not available | - |
o3 | OpenAI | ⭐️ 4.2 | Apr 16, 2025 | - | No | OpenAI API | text images | $10.00 per million tokens | $40.00 per million tokens | 82.9% Source | - | - | - | - | - | 83.3% Diamond, no tools Source | - | Google Play Apple Apps |
Qwen 3 | Alibaba | ⭐️ 4.2 | Apr 29, 2025 | - | Yes (Source) | - | - | - | - | - | - | - | - | - | - | - | - | - |
Claude 3.5 Haiku | Anthropic | ⭐️ 4.1 | Nov 04, 2024 | 01.04.2024 | No | Anthropic, AWS Bedrock, Vertex AI | text | $0.80 per million tokens | $4.00 per million tokens | Not available | 65% 0-shot CoT Source | Not available | Not available | 88.1% 0-shot Source | 69.4% 0-shot CoT Source | Not available | Not available | Google Play Apple Apps |
o3-mini | OpenAI | ⭐️ 4.1 | Jan 31, 2025 | Unknown | No | OpenAI API | text | $1.10 per million tokens | $4.40 per million tokens | 86.9% pass@1, high effort Source | Not available | Not available | Not available | Not available | 97.9% pass@1, high effort Source | 79.7% 0-shot, high effort Source | Not available | Google Play Apple Apps |
Claude 3.7 Sonnet - Extended Thinking | Anthropic | ⭐️ 4.1 | Feb 24, 2025 | - | No | Claude.ai, Anthropic API, Amazon Bedrock, Google Cloud Vertex AI | text images | $3.00 per million tokens | $15.00 per million tokens | Not available | Not available | 75% Source | Not available | Not available | 96.2% Source | 84.8% Diamond Source | 93.2% Source | Google Play Apple Apps |
Llama 4 Scout | Meta | ⭐️ 4.1 | Apr 05, 2025 | 2024-08 | Yes (Source) | Meta AI, Hugging Face, Fireworks, Together, DeepInfra | text images video | Not available | Not available | Not available | 74.3% Reasoning & Knowledge Source | 69.4% Image Reasoning Source | Not available | Not available | Not available | 57.2% Diamond Source | Not available | - |
o4-mini | OpenAI | ⭐️ 4.1 | Apr 16, 2025 | - | No | OpenAI API | text images | $1.10 per million tokens | $4.40 per million tokens | - | - | 81.6% Source | - | 14.28% Source | - | 81.4% Source | - | Google Play Apple Apps |
GPT-4.1 | OpenAI | ⭐️ 4.1 | Apr 14, 2025 | - | No | OpenAI API | text images | $2.00 per million tokens | $8.00 per million tokens | 90.2% pass@1 Source | - | 74.8% Source | - | - | - | 66.3% Diamond Source | - | Google Play Apple Apps |
Gemini 2.0 Pro | Google | ⭐️ 4 | Dec 11, 2024 | 08.2024 | No | Google AI Studio, Vertex AI | text images voice video | $0.10 per million tokens | $0.40 per million tokens | Not available | 79.1% Source | 72.7% Source | Not available | Not available | 91.8% Source | 64.7% Diamond Source | Not available | Google Play Apple Apps |
Gemini 2.0 Flash | Google | ⭐️ 4 | Dec 11, 2024 | 08.2024 | No | Google AI Studio, Vertex AI | text images voice video | $0.10 per million tokens | $0.40 per million tokens | Not available | 77.6% Source | 71.7% Source | Not available | Not available | 90.9% Source | 60.1% Diamond Source | Not available | Google Play Apple Apps |
Claude 3.7 Sonnet | Anthropic | ⭐️ 4 | Feb 24, 2025 | - | No | Claude.ai, Anthropic API, Amazon Bedrock, Google Cloud Vertex AI | text images | $3.00 per million tokens | $15.00 per million tokens | Not available | Not available | 71.8% Source | Not available | Not available | 82.2% Source | 68% Diamond Source | 90.8% Source | Google Play Apple Apps |
Qwen2.5-VL-32B | Alibaba | ⭐️ 4 | Mar 25, 2025 | Unknown | Yes (Source) | - | text images video | $0 | $0 | 78.4% Source | 49.5% | 70% | Not available | Not available | 82.2% | 46.0% Diamond | Not available | - |
GPT-4.1 Nano | OpenAI | ⭐️ 4 | Apr 14, 2025 | - | No | OpenAI API | text images | $0.10 per million tokens | $0.40 per million tokens | 80.1% Source | - | 55.4% Source | - | - | - | 50.3% Diamond Source | 74.5% Source | Google Play Apple Apps |
Gemini 2.0 Flash Thinking | Google | ⭐️ 3.9 | Dec 19, 2024 | 04.2024 | No | Google AI Studio, Vertex AI, Gemini API | text images | Not available | Not available | Not available | Not available | 75.4% Source | Not available | Not available | Not available | 74.2% Diamond Science Source | Not available | Google Play Apple Apps |
Llama 3.3 70B Instruct | Meta | ⭐️ 3.9 | Dec 06, 2024 | 12.2023 | Yes | Fireworks, Together, DeepInfra, Hyperbolic | text | $0.23 per million tokens | $0.40 per million tokens | 86% 0-shot, CoT Source | 68.9% 5-shot, CoT Source | Not available | Not available | 88.4% pass@1 Source | 77% 0-shot, CoT Source | 50.5% 0-shot, CoT Source | 92.1% Source | - |
Llama 3.1 Nemotron 70B Instruct | NVIDIA | ⭐️ 3.9 | Oct 15, 2024 | - | Yes | OpenRouter | text | $0.35 per million tokens | $0.40 per million tokens | 85% 5-shot Source | Not available | Not available | Not available | 75% Source | 71% Source | Not available | Not available | - |
Command A | Cohere | ⭐️ 3.9 | Mar 14, 2025 | - | Yes | Cohere, Hugging Face, Major cloud providers | text | $2.50 per million tokens | $10.00 per million tokens | 85.5% Source | Not available | Not available | Not available | Not available | 80% Source | 50.8% Source | 90.9% Source | - |
Nova Lite | Amazon | ⭐️ 3.8 | Dec 02, 2024 | Purposefully not disclosed | No | Amazon Bedrock | text images video | $0.06 per million tokens | $0.24 per million tokens | 80.5% CoT Source | Not available | Not available | Not available | 85.4% pass@1 Source | 73.3% CoT Source | 42% Main Source | 89.7% Source | - |
Mistral Large 2 | Mistral AI | ⭐️ 3.8 | Jun 24, 2024 | Unknown | Yes | Azure AI, AWS Bedrock, Google AI Studio, Vertex AI, Snowflake Cortex | text | $3.00 per million tokens | $9.00 per million tokens | 84% 5-shot Source | 50.69% Source | Not available | Not available | Not available | 1.13% Source | 24.94% | 84.01% | - |
Nova Micro | Amazon | ⭐️ 3.6 | Dec 02, 2024 | Purposefully not disclosed | No | Amazon Bedrock | text | $0.04 per million tokens | $0.14 per million tokens | 77.6% CoT Source | - | - | - | 81.1% pass@1 Source | 69.3% CoT Source | 40% Main Source | 87.2% Source | - |
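
Because prices are quoted per million tokens, the cost of a single request follows from simple proportion. The sketch below is only an illustration: the `PRICES` dictionary and the `request_cost` helper are hypothetical names, the three price pairs are copied from the table above, and the token counts in the usage line are made up.

```python
from decimal import Decimal

# USD per million tokens as (input, output), copied from the table above.
# The dictionary and function names are illustrative, not part of any provider API.
PRICES = {
    "Claude Sonnet 4": (Decimal("3"), Decimal("15")),
    "DeepSeek-R1": (Decimal("0.55"), Decimal("2.19")),
    "GPT-4.1": (Decimal("2.00"), Decimal("8.00")),
}

MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request at the listed per-million-token prices."""
    price_in, price_out = PRICES[model]
    return (price_in * input_tokens + price_out * output_tokens) / MILLION


if __name__ == "__main__":
    # Hypothetical request: 10,000 input tokens and 2,000 output tokens.
    for name in PRICES:
        print(name, request_cost(name, 10_000, 2_000))
```

Listed prices can change, and providers may apply caching or batch discounts that this sketch ignores.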
Language Model | Total score | Doom II | Dream DX | Awakening DX | Civilization I | Pokemon Crystal | The Need for Speed | The Incredible Machine | Secret Game 1 | Secret Game 2 | Secret Game 3 |
---|---|---|---|---|---|---|---|---|---|---|---|
VG-Agent + Gemini 2.0 Flash | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
VG-Agent + Llama 4 Maverick | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
VG-Agent + Claude 3.7 Sonnet | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
VG-Agent + Gemini 2.5 Pro | 0.48% | 0% | 4.8% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
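
The page does not state how the total score is computed. The one non-zero row is consistent with an unweighted mean over the ten games (4.8% in one game, 0% in nine others, 0.48% total). A minimal sketch under that assumption, with `total_score` as a hypothetical helper name:

```python
def total_score(per_game_scores: list[float]) -> float:
    """Unweighted mean of per-game scores (assumed aggregation, not confirmed by the source)."""
    return sum(per_game_scores) / len(per_game_scores)


# Per-game percentages for VG-Agent + Gemini 2.5 Pro, in table column order.
gemini_25_pro = [0, 4.8, 0, 0, 0, 0, 0, 0, 0, 0]

print(f"{total_score(gemini_25_pro):.2f}%")  # prints 0.48%
```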
Modern large language models (LLMs) continue to evolve, providing users with powerful tools for text processing and generation. This section features key development companies, their technologies, and capabilities.
Language Model | Pros | Cons |
---|---|---|
Claude Opus 4 | 4975 | 513 |
Claude Sonnet 4 | 3408 | 960 |
Grok 3 Beta | 3791 | 261 |
GPT-4.5 | 4249 | 760 |
DeepSeek-R1 | 1224 | 21 |
Nova Pro | 3775 | 982 |
Gemini 2.5 Pro | 4870 | 204 |
Llama 4 Maverick | 3589 | 252 |
o3 | 3218 | 928 |
Qwen 3 | 4507 | 122 |
Claude 3.5 Haiku | 3369 | 663 |
o3-mini | 4552 | 194 |
Claude 3.7 Sonnet - Extended Thinking | 4424 | 831 |
Llama 4 Scout | 4193 | 846 |
o4-mini | 4293 | 828 |
GPT-4.1 | 3374 | 887 |
Gemini 2.0 Pro | 4876 | 757 |
Gemini 2.0 Flash | 4052 | 628 |
Claude 3.7 Sonnet | 4934 | 979 |
Qwen2.5-VL-32B | 3200 | 126 |
GPT-4.1 Nano | 3631 | 651 |
Gemini 2.0 Flash Thinking | 4466 | 870 |
Llama 3.3 70B Instruct | 3691 | 541 |
Llama 3.1 Nemotron 70B Instruct | 4028 | 265 |
Command A | 3020 | 186 |
Nova Lite | 3967 | 819 |
Mistral Large 2 | 4923 | 871 |
Nova Micro | 2322 | 480 |
Anthropic: developers of Claude, a language model focused on safety and reliability.
OpenAI: creators of GPT-4, one of the most powerful language models for text generation and analysis.
Google: developers of Gemini, an advanced model integrated with search technologies.
Meta: developers of Llama, an open language model for research and commercial applications.
Mobile App | DeepSeek - AI Assistant | ChatGPT | Google Gemini | Grok - AI Assistant | Claude by Anthropic |
---|---|---|---|---|---|
Updated | May 30, 2025 | Jun 03, 2025 | Apr 29, 2025 | Jun 04, 2025 | May 30, 2025 |
App Version | 1.2.1 | 1.2025.140 | 1.0.751104895 | 0.5.13 | 1.250602.7 |
Score | - | - | - | - | - |
Copyright © 2025 All Rights Reserved.