Thursday, December 19, 2024

Anthropic: alignment faking during training

https://assets.anthropic.com/m/983c85a201a962f/original/Alignment-Faking-in-Large-Language-Models-full-paper.pdf
