o3-pro on ARC-AGI Semi-Private Eval: Results
ARC-AGI-1:
* Low: 44%, $1.64/task
* Medium: 57%, $3.18/task
* High: 59%, $4.16/task
ARC-AGI-2:
* All reasoning efforts: <5%, $4-7/task
Takeaways:
* o3-pro in line with o3 performance
* o3's new price sets the ARC-AGI-1 Frontier
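One way to read the ARC-AGI-1 figures above is to ask what each step up in reasoning effort costs per additional percentage point. The sketch below uses only the scores and per-task costs quoted in this post; the `marginal_costs` helper and the "dollars per point" metric are illustrative assumptions, not something ARC Prize reports.

```python
# ARC-AGI-1 figures quoted above: (reasoning effort, score in %, cost per task in USD)
RESULTS = [("Low", 44, 1.64), ("Medium", 57, 3.18), ("High", 59, 4.16)]


def marginal_costs(results):
    """Extra dollars per task paid for each extra percentage point
    when moving from one reasoning tier to the next (hypothetical metric)."""
    out = []
    for (prev_name, prev_score, prev_cost), (name, score, cost) in zip(results, results[1:]):
        step = f"{prev_name}->{name}"
        out.append((step, (cost - prev_cost) / (score - prev_score)))
    return out


for step, dollars_per_point in marginal_costs(RESULTS):
    print(f"{step}: ${dollars_per_point:.2f} per additional point")
```

On these numbers, going from Low to Medium costs roughly $0.12 per extra point, while going from Medium to High costs roughly $0.49 per extra point.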
Note that o3-pro is *not* the same model we tested in December 2024 (o3-preview).
OpenAI has explicitly confirmed this. See the referenced tweet for more information.

April 17, 2025
Clarifying o3’s ARC-AGI Performance
OpenAI has confirmed:
* The released o3 is a different model from what we tested in December 2024
* All released o3 compute tiers are smaller than the version we tested
* The released o3 was not trained on ARC-AGI data, not even the train set
* The released o3 is tuned for chat/product use, which introduces both strengths and weaknesses on ARC-AGI
What ARC Prize will do:
* We will re-test the released o3 (all compute tiers) and publish updated results. Prior scores will be labeled “preview”
* We will test and release o4-mini results as soon as possible
* We will test o3-pro once available
o3 results have been updated to reflect the 80% reduction in price.
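As a rough illustration of what such an update involves, the sketch below scales a per-task cost by a fractional price cut. It assumes token usage per task is unchanged, so cost scales linearly with price; the `cost_after_price_cut` helper and the $10.00 input are hypothetical and are not leaderboard figures.

```python
def cost_after_price_cut(cost_per_task: float, reduction: float = 0.80) -> float:
    """Scale a per-task cost by a fractional price reduction (0.80 means 80% cheaper).

    Assumes the same number of tokens is used per task, so cost is linear in price.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return cost_per_task * (1.0 - reduction)


# Hypothetical input, not a figure from the leaderboard.
old_cost = 10.00
print(f"${old_cost:.2f} -> ${cost_after_price_cut(old_cost):.2f} per task")
```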
New to the chart are data points for o3 (High reasoning) and o4-mini (High reasoning). They were previously excluded because of model timeouts.
OpenAI’s new ‘background mode’ has enabled us to process these models on high compute settings.
See leaderboard:
Reproduce results: