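Grok 4 now almost never gets math/physics exam questions wrong, unless they are cleverly adversarial.
It can identify errors or ambiguities in a question, then either fix the errors or answer each variant of the ambiguous question.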

July 10, 14:07
Insane that Elon Musk has pulled it off again, absolutely crushing the AI wars with Grok 4.
Summarizing the core announcements:
— Post-training RL spend == pretraining spend
— $3/M input toks, $15/M output toks, 256k context, price 2x beyond 128k
— #1 on Humanity’s Last Exam (general hard problems) 44.4%, #2 is 26.9%
— #1 on GPQA (hard graduate problems) 88.9%, #2 is 86.4%
— #1 on AIME 2025 (Math) 100%, #2 is 98.4%
— #1 on Harvard MIT Math 96.7%, #2 is 82.5%
— #1 on USAMO25 (Math) 61.9%, #2 is 49.4%
— #1 on ARC-AGI-2 (easy for humans, hard for AI) 15.9%, #2 is 8.6%
— #1 on LiveCodeBench (Jan-May) 79.4%, #2 is 75.8%
Grok 4 is “potentially better than PhD level in every subject no exception”, and it’s pretty cheap. Massive moment in the AI wars and Elon has come to play.
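
For a rough sense of what the quoted pricing means per request, here is a minimal Python sketch. The $3/M input and $15/M output rates and the 128k threshold come from the post. How the "2x beyond 128k" rule is applied is not stated there, so the sketch assumes the whole request is billed at double rate once input plus output tokens exceed 128k. The function name `estimate_cost_usd` is hypothetical.

```python
def estimate_cost_usd(input_toks: int, output_toks: int) -> float:
    """Rough cost estimate in USD using the rates quoted in the post."""
    input_rate = 3.0     # USD per million input tokens (from the post)
    output_rate = 15.0   # USD per million output tokens (from the post)
    long_context_threshold = 128_000  # tokens (from the post)

    # ASSUMPTION: the 2x price applies to the entire request once the
    # combined token count exceeds 128k. The post does not say whether the
    # surcharge covers the whole request or only the tokens past 128k.
    total_toks = input_toks + output_toks
    multiplier = 2.0 if total_toks > long_context_threshold else 1.0

    base = input_toks * input_rate + output_toks * output_rate
    return multiplier * base / 1_000_000


if __name__ == "__main__":
    print(estimate_cost_usd(100_000, 10_000))  # below the threshold
    print(estimate_cost_usd(200_000, 10_000))  # above the threshold
```

Under this assumption, a 100k-token prompt with a 10k-token answer stays at the base rate, while a 200k-token prompt is billed at double rate for every token in the request.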
