Anthropic. Their whole brand is safety-first, so they’re the most likely to slam the brakes before anyone else risks looking weak or losing the race.
Yeah, and it's not just branding — their interpretability research is basically built around catching problems early, so they have the technical receipts to back up pulling the plug.