Defining Monitorable and Useful Goals
Published on July 15, 2025 11:06 PM GMT

In my most recent post, I introduced a corrigibility transformation that could take an arbitrary goal over external environments and define a corrigible goal with no hit to performance. That post focused on corrigibili…