Google Gemini 3 quickly tops the LMArena leaderboard

2025-11-24 01:31:00+08

Mountain View, CA – Following its official launch, Google’s latest AI model, Gemini 3 Pro, has surged to the top of the LMArena public leaderboard with an unprecedented Elo rating of 1,501, surpassing competitors including GPT-5.1, Claude 4.5, and Grok-4.1 to become the highest-rated multimodal model in the platform’s history.

The model demonstrates exceptional performance across a range of rigorous benchmarks:

37.5% on Humanity’s Last Exam
91.9% on GPQA Diamond
81% on MMMU-Pro
87.6% on Video-MMMU

These results highlight Gemini 3 Pro’s leading capabilities in scientific reasoning, multimodal understanding, and video understanding. With its enhanced reasoning mode, Deep Think, the model raises its Humanity’s Last Exam score to 41% and achieves a record-breaking 45.1% on ARC-AGI-2, setting new high marks on benchmarks designed to measure general reasoning ability.

Industry leaders have taken swift notice. OpenAI CEO Sam Altman praised the release on X (formerly Twitter), writing, “Gemini 3 looks really impressive.” Google CEO Sundar Pichai responded with a humble “🙏.” Elon Musk also replied to LMArena’s official account, stating, “Certainly worthy of congratulations,” while hinting at the imminent launch of Grok 4.20.

Behind the scenes, the competitive pressure is mounting. In a recently leaked internal memo, Altman acknowledged that Google’s rapid advances could create “temporary economic headwinds” for OpenAI. He cautioned employees that outside assessments of the company in the coming period would be “extremely tough,” a clear signal that Gemini 3’s strong debut has intensified the AI race in Silicon Valley.

Google reaffirms its commitment to responsible innovation and continues to push the boundaries of what AI can achieve across text, image, audio, and video modalities.
