Force Arena Leaderboard

Hosted on MSN

Gemma 4’s 31B model ranks third among all open AI models on the Arena AI leaderboard

Google’s Gemma 4 family just posted a result that will get attention in the open-source AI community: its 31-billion-parameter dense model has climbed to third place among all open models on the Arena ...

Ars Technica

New study accuses LM Arena of gaming its popular AI benchmark

The rapid proliferation of AI chatbots has made it difficult to know which models are actually improving and which are falling behind. Traditional academic benchmarks only tell you so much, which has ...

TechCrunch

Study accuses LM Arena of helping top AI labs game its benchmark

A new paper from AI lab Cohere, Stanford, MIT, and Ai2 accuses LM Arena, the organization behind the popular crowdsourced AI benchmark Chatbot Arena, of helping a select group of AI companies achieve ...


