Smaller specialized models can match or beat frontier generalists on the tasks they're trained for. Working with Applied Compute, we RL-trained SWE-check, a bug-detection model that matches Opus 4.6 on our internal evals while running ~10x faster.