Microsoft, OpenAI, Google, and Meta genAI models could be convinced to ditch their guardrails, opening the door to chatbots giving unfettered answers on building bombs, creating malware, and much more.
None of this is news; this jailbreak has been around forever.
It’s literally just a spoof of authority.
Thing is, GPT still sucks ass at coding. I don’t think that’s changing any time soon. These models get their power from what’s done most commonly, but, as we know, what’s done commonly can be vulnerable, can change when a new update drops, etc.
Maybe don't give your LLMs access to compromising data such as emails? Then jailbreaking will most likely remain a way to circumvent limitations for porn roleplay, or maybe to get hallucinated manuals for building a nuclear bomb or whatever.
I wonder how many people will get fired over a keyword-based alarm for the words "kill" and "child" in the same sentence in an LLM chat. It's probably not going to be 0...
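For anyone who missed the joke: sysadmins ask about killing child processes all the time. A minimal sketch in Python (names hypothetical, not any real monitoring product) of the kind of naive keyword alarm being mocked, and how it trips on an innocent Unix question:

```python
import re

# Naive keyword alarm: flags any sentence containing both "kill" and "child".
ALARM_WORDS = {"kill", "child"}

def trips_alarm(sentence: str) -> bool:
    # Tokenize to lowercase words and check that every alarm keyword appears.
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return ALARM_WORDS <= words

# A perfectly ordinary sysadmin question trips the alarm:
print(trips_alarm("How do I kill a child process my daemon left behind?"))  # True
```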
Turns out you can lie to AI because it isn't intelligent. Predictive text is fascinating and has many R&D benefits, but people (usually product people) who talk about it like it's a thinking thing are just off the rails.