Powered by Code Arena

WebDev Leaderboard

Compare the performance of AI models on web development tasks built in Code Arena.

Last Updated: Dec 23, 2025
Total Votes: 75,257
Total Models: 33

Rank  Rank Spread  Score  CI       Votes  Organization  License
1     1◄─►1        1520   +12/-12  4,088  Anthropic     Proprietary
2     2◄─►5        1484   +17/-17  1,647  OpenAI        Proprietary
3     2◄─►5        1480   +12/-12  4,010  Anthropic     Proprietary
4     2◄─►5        1478   +10/-10  9,066  Google        Proprietary
5     2◄─►6        1465   +13/-13  2,233  Google        Proprietary
6     5◄─►6        1449   +15/-15  1,570  Z.ai          MIT
7     7◄─►13       1398   +12/-12  3,949  OpenAI        Proprietary
8     7◄─►14       1398   +15/-15  1,641  OpenAI        Proprietary
9     7◄─►13       1393   +9/-9    8,150  Anthropic     Proprietary
10    7◄─►14       1392   +10/-10  5,191  OpenAI        Proprietary
11    7◄─►14       1388   +9/-9    7,786  Anthropic     Proprietary
12    7◄─►14       1387   +9/-9    9,174  Anthropic     Proprietary
13    7◄─►16       1381   +14/-14  1,883  Google        Proprietary
14    13◄─►16      1367   +9/-9    7,489  Z.ai          MIT
15    9◄─►18       1366   +16/-16  1,404  DeepSeek AI   MIT
16    13◄─►17      1360   +9/-9    7,108  OpenAI        Proprietary
17    16◄─►19      1341   +9/-9    6,882  Moonshot      Modified MIT
18    15◄─►20      1337   +18/-18  1,039  Xiaomi        MIT
19    17◄─►20      1335   +10/-10  5,287  OpenAI        Proprietary
20    18◄─►20      1316   +9/-9    7,592  MiniMax       Apache 2.0
21    21◄─►24      1293   +10/-10  5,161  DeepSeek AI   MIT
22    21◄─►24      1290   +9/-9    7,857  Anthropic     Proprietary
23    21◄─►24      1289   +9/-9    7,756  Alibaba       Apache 2.0
24    21◄─►26      1281   +15/-15  1,707  DeepSeek AI   MIT
25    24◄─►26      1263   +15/-15  1,946  KwaiKAT       Proprietary
26    24◄─►28      1251   +17/-17  1,565  OpenAI        Proprietary
27    26◄─►30      1226   +13/-13  3,720  xAI           Proprietary
28    26◄─►30      1225   +20/-20  1,027  Mistral       Apache 2.0
29    27◄─►30      1212   +13/-13  3,505  Google        Proprietary
30    27◄─►30      1205   +19/-19  1,262  xAI           Proprietary
31    31◄─►32      1152   +23/-23  945    xAI           Proprietary
32    31◄─►33      1142   +21/-21  1,014  xAI           Proprietary
33    32◄─►33      1102   +22/-22  1,033  Mistral       Proprietary
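
The page does not say how the rank spread is derived. One plausible reading, offered here only as an assumption, is that it comes from overlap between confidence intervals: a model's best possible rank counts only the models whose entire interval lies above its own, and its worst possible rank counts every model whose interval reaches above its lower bound. The sketch below applies that assumed rule to the first six leaderboard rows. The function name `rank_spread` and the tuple layout are illustrative, not part of the site.

```python
# First six leaderboard rows as (rank, score, ci_plus, ci_minus).
rows = [
    (1, 1520, 12, 12),
    (2, 1484, 17, 17),
    (3, 1480, 12, 12),
    (4, 1478, 10, 10),
    (5, 1465, 13, 13),
    (6, 1449, 15, 15),
]


def rank_spread(rows):
    """Return {rank: (best, worst)} under an ASSUMED interval-overlap rule."""
    # Lower and upper interval bound for each model, keyed by listed rank.
    bounds = {r: (score - minus, score + plus) for r, score, plus, minus in rows}
    spread = {}
    for r, (lo, hi) in bounds.items():
        others = [b for o, b in bounds.items() if o != r]
        # Best case: only models whose whole interval sits above ours outrank us.
        best = 1 + sum(1 for other_lo, _ in others if other_lo > hi)
        # Worst case: any model whose interval extends above our lower bound outranks us.
        worst = 1 + sum(1 for _, other_hi in others if other_hi > lo)
        spread[r] = (best, worst)
    return spread


spread = rank_spread(rows)
for rank, (best, worst) in spread.items():
    print(f"{rank}: {best} to {worst}")
```

For these six rows the assumed rule gives 1 to 1, 2 to 5, 2 to 5, 2 to 5, 2 to 6, and 5 to 6, which matches the listed spreads. Applied to the rounded values further down the leaderboard it does not reproduce every row exactly (row 15, for example), so treat it as an approximation rather than the site's method.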

Leaderboard Plots

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles

Confidence Intervals on Model Strength (via Bootstrapping)

Battle Count for Each Combination of Models (without Ties)
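
As a rough illustration of what three of these plots summarize, the sketch below computes non-tied battle counts, the fraction of wins for each ordered pair, and an average win rate from a small battle log. The log, the model names `m1` to `m3`, and the record layout are all hypothetical. "Uniform sampling" is interpreted here, as an assumption, to mean that each opponent is weighted equally regardless of how many battles were played against it.

```python
from collections import defaultdict

# Hypothetical battle log: (model_a, model_b, winner), winner in {"a", "b", "tie"}.
battles = [
    ("m1", "m2", "a"),
    ("m1", "m2", "b"),
    ("m1", "m2", "a"),
    ("m2", "m1", "a"),
    ("m1", "m3", "a"),
    ("m3", "m1", "tie"),
    ("m2", "m3", "a"),
    ("m3", "m2", "a"),
]


def pairwise_stats(battles):
    """Return (counts, frac) over non-tied battles, keyed by ordered model pair."""
    wins = defaultdict(int)    # wins[(x, y)]: number of times x beat y
    counts = defaultdict(int)  # counts[(x, y)]: non-tied battles between x and y
    for a, b, winner in battles:
        if winner == "tie":
            continue  # ties are excluded, as in the plot titles
        victor, loser = (a, b) if winner == "a" else (b, a)
        wins[(victor, loser)] += 1
        counts[(a, b)] += 1
        counts[(b, a)] += 1
    # Fraction of non-tied battles between x and y that x won.
    frac = {pair: wins[pair] / n for pair, n in counts.items()}
    return dict(counts), frac


def average_win_rate(frac):
    """Unweighted mean of a model's win fraction against each opponent it has faced."""
    per_model = defaultdict(list)
    for (x, _), f in frac.items():
        per_model[x].append(f)
    return {m: sum(fs) / len(fs) for m, fs in per_model.items()}


counts, frac = pairwise_stats(battles)
avg = average_win_rate(frac)
print(counts[("m1", "m2")])  # non-tied battles between m1 and m2
print(frac[("m1", "m3")])    # fraction of those battles that m1 won against m3
print(avg)
```

Averaging per-opponent fractions, rather than pooling all battles, keeps a heavily sampled pairing from dominating a model's average.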