MLSN: #10 Adversarial Attacks Against Language and Vision Models, Improving LLM Honesty, and Tracing the Influence of LLM Training Data
Published on September 13, 2023 6:03 PM GMT
Welcome to the 10th issue of the ML Safety Newsletter by the Center for AI Safety. In this edition, we cover:

- Adversarial attacks against GPT-4, PaLM-2, Claude, and Llama 2
- Robustness against unforeseen…