Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.
No shit, that's how LLMs work.
This gets me often: I keep finding papers and studies "proving" things I thought were already well understood, which mostly ends up revealing corporate hype that had passed me by.
So it turns out that letting an LLM self-prompt for a while before responding makes it a bit sharper in some ways but doesn't make it self-aware, huh? All I've learned is that this was apparently something people were unclear about, and nothing else.