News

inspect-petri added to PyPI

An auditing agent that enables automated monitoring and interaction with language models to detect potential alignment issues, reward hacking, and other concerning behaviors.

Welcome to Inspect Petri, an auditing agent that enables automated monitoring and interaction with language models to detect potential alignment issues, reward hacking, and other concerning behaviors…