fun: 3-4 months ago I ran o3 for some academics on a private test set of AIME-like problems. It has taken them so long to write a summary of the results (96%) that Alex solved the IMO in the meantime.