Some really great work from @carlobaronio @pmmarsella @ybenpan! Still a long horizon ahead for multi-turn agents :)
Cognition · 7.5.2025
Our research interns present: Kevin-32B = K(ernel D)evin. It's the first open model trained using RL for writing CUDA kernels. We implemented multi-turn RL using GRPO (based on QwQ-32B) on the KernelBench dataset. It outperforms top reasoning models (o3 & o4-mini)! 🧵