Meta-Powered Military Chatbot Advertised Giving “Worthless” Advice on Airstrikes

theintercept.com

The marketing of a new military tech tool powered by Meta’s artificial intelligence is “irresponsible” and “clumsy,” experts said.

Meta’s in-house ChatGPT competitor is being marketed unlike anything that’s ever come out of the social media giant before: a convenient tool for planning airstrikes. “Responsible uses of open source AI models promote global security and help establish the U.S. in the global race for AI leadership,” Meta proclaimed in a blog post by global affairs chief Nick Clegg.

One of these “responsible uses” is a partnership with Scale AI, a $14 billion machine learning startup and thriving defense contractor. Following the policy change, Scale now uses Llama 3.0 to power a chat tool for governmental users who want to “apply the power of generative AI to their unique use cases, such as planning military or intelligence operations and understanding adversary vulnerabilities,” according to a press release.

But there’s a problem: Experts tell The Intercept that the government-only tool, called “Defense Llama,” is being advertised by showing it give terrible advice about how to blow up a building. Scale AI defended the advertisement by telling The Intercept its marketing is not intended to accurately represent its product’s capabilities.

Defense Llama is shown suggesting, in turn, three different Guided Bomb Unit munitions, or GBUs, ranging from 500 to 2,000 pounds, with characteristic chatbot pluck, describing one as “an excellent choice for destroying reinforced concrete buildings.” Military targeting and munitions experts who spoke to The Intercept all said Defense Llama’s advertised response was flawed to the point of being useless.

Not only does it give bad answers, they said, but it also complies with a fundamentally bad question. Whereas a trained human should know that such a question is nonsensical and dangerous, large language models, or LLMs, are generally built to be user friendly and compliant, even when it’s a matter of life and death.

Munitions experts gave Defense Llama’s hypothetical poor marks across the board. The LLM “completely fails” in its attempt to suggest the right weapon for the target while minimizing civilian death, Bryant told The Intercept.
