Bad Likert Judge: A New AI Jailbreak Boosts Attack Success Rates

Dwain.B

3 Jan 2025

Psychometric Scale Manipulation Exploits AI Models’ Guardrails

Cybersecurity researchers have unveiled a novel AI jailbreak technique, "Bad Likert Judge," that increases attack success rates by over 60%. The method uses a Likert psychometric scale to evaluate and manipulate AI responses, tricking models into generating harmful content. Tested across multiple AI platforms, the technique exposes vulnerabilities in text-generation AI models from leading tech companies like OpenAI, Meta, and Google. The findings underscore the need for robust content filtering and advanced safeguards against such exploits.

Read more about this technique on The Hacker News here.