Researchers at ETH Zurich created a jailbreak attack that bypasses AI guardrails
Submitted by Anonymous on Mon, 11/27/2023 - 22:40
Artificial intelligence models that rely on human feedback to ensure their outputs are harmless and helpful may be universally vulnerable to so-called 'poisoning' attacks.