o3 Pro on ARC-AGI Semi-Private Eval Results

ARC-AGI-1:
* Low: 44%, $1.64/task
* Medium: 57%, $3.18/task
* High: 59%, $4.16/task

ARC-AGI-2:
* All reasoning efforts: <5%, $4-7/task

Takeaways:
* o3-pro performance is in line with o3
* o3's new price sets the ARC-AGI-1 frontier
To note, o3 Pro is *not* the same model we tested in Dec '24 (o3-preview). OpenAI has explicitly confirmed this. See the referenced post below for more information.
ARC Prize, 17 April 2025:
Clarifying o3's ARC-AGI Performance

OpenAI has confirmed:
* The released o3 is a different model from what we tested in December 2024
* All released o3 compute tiers are smaller than the version we tested
* The released o3 was not trained on ARC-AGI data, not even the train set
* The released o3 is tuned for chat/product use, which introduces both strengths and weaknesses on ARC-AGI

What ARC Prize will do:
* We will re-test the released o3 (all compute tiers) and publish updated results. Prior scores will be labeled "preview"
* We will test and release o4-mini results as soon as possible
* We will test o3-pro once available
o3 results have been updated to reflect OpenAI's 80% price reduction.
New to the chart are data points for o3 (high reasoning) and o4-mini (high reasoning). They were previously excluded because of model timeouts; OpenAI's new background mode has enabled us to run these models at high compute settings.
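For context, here is a minimal sketch of how a long-running request can be submitted via background mode and polled for completion rather than held open on a single connection (which is where timeouts bite on high reasoning effort). It assumes the OpenAI Responses API's `background=True` flag and `reasoning` effort parameter; the model name, prompt, and poll interval are illustrative, not our harness.

```python
# Sketch: submit a high-effort reasoning request in background mode,
# then poll for the result instead of keeping a request open.
import time
from openai import OpenAI

client = OpenAI()

# Submit the task; background=True returns immediately with a queued response.
response = client.responses.create(
    model="o3-pro",                          # illustrative model name
    input="Solve this ARC-AGI task: ...",    # placeholder prompt
    reasoning={"effort": "high"},
    background=True,
)

# Poll until the response finishes (poll interval is arbitrary here).
while response.status in ("queued", "in_progress"):
    time.sleep(30)
    response = client.responses.retrieve(response.id)

print(response.status)
print(response.output_text)
```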
See leaderboard: Reproduce results: