we have an unverified SOTA result on KernelBench with o3-mini and an evolutionary examples tape: 208/250 claimed speedups, including 3 for Level 4 (prev untouched). would be grateful for any help reviewing the optimized KernelBench kernels at . thank you to @anneouyang and Stanford’s @ScalingIntelLab for agreeing to review them.
wordgrammer · 30.4.2025
The good GPT wrappers have already been built, and ChatGPT struggled to write raw CUDA. Now is the time for monsters.