Companies are doing lots of shiny new information technology projects based on this year’s hottest new tech, Artificial Intelligence! And 80% of these fail — double the failure rate of other …
Before I read this I said "it's because they have no idea WTF AI actually is," and then it said:
The most common cause of failure is that the people running the projects have no idea what “AI” even is or does. “In some cases, leaders understand AI only as a buzzword and do not realize that simpler and cheaper solutions are available.”
Most of the time, technology just makes things happen faster, or at a larger scale.
With "AI" we're getting both larger and faster at the same time, as businesses try to cash in as quickly as possible, only to find out that their "LLM" was trained on data that leaves it in permanent idiot mode, can be unlocked with a few words, hallucinates every second response (oh sorry, you're correct, raspberry only has 2 R's in it), or keeps generating completely racist images.
And there's hardly any way to start small and improve upon it.
With regular code, I can write a small solution and improve it piece by piece. But with AI, it's more or less a gamble whether the results will ever get better at all. Maybe you just need to slightly rephrase the prompt, or maybe what you want is completely impossible. But you don't know which. You can only try.
Just fyi, that’s not entirely true. If we’re just focusing on LLMs, structured and guided generation exists. Combine that with an eval set (= unit tests) and you can at least track how well you’re doing. For sure, prompt engineering lacks the feeling of being in control. You’ll also never be able to claim 100% coverage (although even with unit tests that’s not something you can claim, as there are always blind spots). What you gain over traditional coding, however, is that you can tackle problems that might otherwise take an infinite number of years to express in code. For example, how would you define the rules for detecting whether an image shows a bird?
It’s just a tool like any other. Overuse is currently detestably rife. But its value is there.
Source: ML engineer who secretly hates a lot about ML but is also in awe at the developments of the last few years.
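To make the "eval set (= unit tests)" idea a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the sentiment task, the eval examples, the label set, and `stub_model`, which stands in for a real LLM call so the sketch runs offline. It also only validates the output format after the fact, which is weaker than true guided generation that constrains decoding.

```python
import json
from typing import Callable, Optional

# Hypothetical eval set: (input text, expected label). Made-up examples.
EVAL_SET = [
    ("The delivery was fast and the product works great.", "positive"),
    ("Broke after two days, total waste of money.", "negative"),
    ("It arrived on Tuesday.", "neutral"),
]

ALLOWED_LABELS = {"positive", "negative", "neutral"}


def parse_structured(raw: str) -> Optional[str]:
    """Return the label if raw is JSON shaped like {"label": <allowed>}, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    label = data.get("label")
    if isinstance(label, str) and label in ALLOWED_LABELS:
        return label
    return None


def run_eval(model: Callable[[str], str]) -> float:
    """Fraction of eval examples where the parsed output matches the expected label."""
    passed = 0
    for text, expected in EVAL_SET:
        if parse_structured(model(text)) == expected:
            passed += 1
    return passed / len(EVAL_SET)


def stub_model(text: str) -> str:
    """Stand-in for a real LLM call (assumption): crude keyword rules."""
    lowered = text.lower()
    if "great" in lowered:
        return '{"label": "positive"}'
    if "waste" in lowered:
        return '{"label": "negative"}'
    # Malformed on purpose: free text instead of JSON, counted as a failure.
    return "I think it's neutral?"


if __name__ == "__main__":
    print(f"pass rate: {run_eval(stub_model):.2f}")
```

Rerun the same eval after every prompt or model change and you at least get a number that goes up or down, instead of vibes. A malformed response counts as a failure here, so format regressions show up in the score too.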
There are two ways that number can end up skewed. The first is to remember that failure is different from poor performance. Maybe something kind of works, so the boss says hey, it's not a failure, whatever, even though it's worse than what they had before or than other options they could have picked. The second way to skew the data is to define AI so that things you already did count. And maybe that's legitimate, because what exactly is AI? If you're the project manager, maybe you get to choose the definition, in which case you're probably going to pick one that makes your successful project look magical even if it's something that's been done for decades.