Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

AbuTahir@lemm.ee · edited 1 month ago

sp3ctr4l@lemmy.dbzer0.com · edited 1 month ago

This has been known for years; it is the default assumption of how these models work.

You would have to prove that some kind of actual reasoning capacity has arisen as… some kind of emergent complexity phenomenon… not the other way around.

Corpos have just marketed/gaslit us/themselves so hard that they apparently forgot this.

Riskable@programming.dev · 1 month ago

Define “reasoning”. For decades software developers have been writing code with conditionals. That’s “reasoning.”

LLMs are “reasoning”… They’re just not doing human-like reasoning.

sp3ctr4l@lemmy.dbzer0.com · edited 1 month ago

How about, uh…

The ability to take a previously given set of knowledge, experiences and concepts, and combine or synthesize them in a consistent, non-contradictory manner to generate hitherto unrealized knowledge or concepts, and then also to verify that the new knowledge and concepts are actually new and actually valid, or at least to propose how one could test whether or not they are valid.

Arguably this is or involves meta-cognition, but that is what I would say… is the difference between what we typically think of as ‘machine reasoning’, and ‘human reasoning’.

Now I will grant you that a large number of humans essentially cannot do this: they suck at introspecting and maintaining logical consistency, they are just told ‘this is how things work’, and they never question that until decades later, when their lives force them to address or dismiss their own internally inconsistent beliefs.

But I would also say that this means they are bad at ‘human reasoning’.

Basically, my definition of ‘human reasoning’ is perhaps more accurately described as ‘critical thinking’.

technocrit@lemmy.dbzer0.com · edited 1 month ago

Peak pseudo-science. The burden of evidence is on the grifters who claim “reason”. But neither side has any objective definition of what “reason” means. It’s pseudo-science against pseudo-science in a fierce battle.

minoscopede@lemmy.world · edited 1 month ago

I see a lot of misunderstandings in the comments 🫤

This is a pretty important finding for researchers, and it’s not obvious by any means. This finding is not showing a problem with LLMs’ abilities in general. The issue they discovered is specifically for so-called “reasoning models” that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that’s a flaw that needs to be corrected before models can actually reason.
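
To make the distinction concrete, here is a toy sketch of "reward the final answer only" versus "also reward the steps". Everything in it is a made-up illustration (the `Step` format, both reward functions, and the 50/50 weighting are assumptions, not how any real model is trained):

```python
from typing import List, Tuple

# Hypothetical trace format: each step claims that a + b equals c.
Step = Tuple[int, int, int]


def outcome_reward(steps: List[Step], final_answer: int, target: int) -> float:
    """Outcome-only reward: the steps are ignored, only the final answer counts."""
    return 1.0 if final_answer == target else 0.0


def process_reward(steps: List[Step], final_answer: int, target: int) -> float:
    """Toy process-aware reward: half for valid steps, half for the final answer."""
    if not steps:
        return outcome_reward(steps, final_answer, target)
    valid = sum(1 for a, b, c in steps if a + b == c)
    step_score = valid / len(steps)
    return 0.5 * step_score + 0.5 * outcome_reward(steps, final_answer, target)


# Task: compute 2 + 3 + 4 (target 9).
lucky = [(2, 3, 6), (6, 4, 9)]  # both steps are wrong, but the errors cancel out
sound = [(2, 3, 5), (5, 4, 9)]  # both steps are right

print(outcome_reward(lucky, 9, 9), outcome_reward(sound, 9, 9))  # 1.0 1.0
print(process_reward(lucky, 9, 9), process_reward(sound, 9, 9))  # 0.5 1.0
```

Under the outcome-only reward the two traces are indistinguishable, which is the kind of gap the comment is pointing at.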

Tobberone@lemm.ee · 1 month ago

What statistical method do you base that claim on? The results presented match expectations given that Markov chains are still the basis of inference. What magic juice is added to “reasoning models” that allows them to break free of the inherent boundaries of the statistical methods they are based on?

minoscopede@lemmy.world · edited 1 month ago

I’d encourage you to read up more on this space.

As it is, the statement “Markov chains are still the basis of inference” doesn’t make sense, because Markov chains are a separate thing. You might be thinking of Markov decision processes, which are used in training RL agents, but that’s also unrelated because these models are not RL agents, they’re supervised learning agents. And even if they were RL agents, the MDP describes the training environment, not the model itself, so it’s not really used for inference.

I mean this just as an invitation to learn more, and not pushback for raising concerns. Many in the research community would be more than happy to welcome you into it. The world needs more people who are skeptical of AI doing research in this field.
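
For anyone following along, here is a minimal sketch of what a first-order Markov chain over words looks like, assuming the usual textbook definition: the next word is drawn using only the current word. The corpus and the function names `build_chain` and `generate` are made up for illustration, and nothing here is meant to describe how any particular LLM does inference.

```python
import random
from collections import defaultdict
from typing import Dict, List


def build_chain(words: List[str]) -> Dict[str, List[str]]:
    """Map each word to the list of words that followed it in the corpus."""
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain: Dict[str, List[str]], start: str, length: int, seed: int = 0) -> List[str]:
    """Sample up to `length` words; each choice looks only at the previous word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:  # dead end: this word was never followed by anything
            break
        out.append(rng.choice(followers))
    return out


corpus = "the cat sat on the mat and the cat ran".split()
chain = build_chain(corpus)
print(chain["the"])  # ['cat', 'mat', 'cat']
print(" ".join(generate(chain, "the", 6)))
```

The whole "model" is a lookup table keyed on the single previous word, which is the property under dispute in this exchange.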

Tobberone@lemm.ee · 1 month ago

Which method, then, is the inference built upon, if not the embeddings? And the question still stands, how does “AI” escape the inherent limits of statistical inference?

Zacryon@feddit.org · 1 month ago

Some AI researchers found it obvious as well, in that they suspected it and had some indications. But it’s good to see more data affirming this assessment.

jj4211@lemmy.world · 1 month ago

Particularly to counter some more baseless marketing assertions about the nature of the technology.

kreskin@lemmy.world · edited 1 month ago

Lots of us who did some time in search and relevancy early on knew ML was always largely breathless, overhyped marketing. It was endless buzzwords and misframing from the start, but it raised our salaries. Anything that an exec doesn’t understand is profitable and worth doing.

Zacryon@feddit.org · 1 month ago

Ragebait?

I’m in robotics and find plenty of use for ML methods. Think of image classifiers: how do you want to approach that without oversimplified problem settings?
Or even in control or coordination problems, which can sometimes become NP-hard. Even though not optimal, ML methods are quite solid at learning patterns in high-dimensional NP-hard problem settings. They often outperform hand-crafted conventional suboptimal solvers when you weigh computational effort against solution quality, and they especially outperform (asymptotically) optimal solvers time-wise, even though the solutions are not optimal (but “good enough” nevertheless). (OK, to be fair, suboptimal solvers do that as well, but since ML methods can outperform these, I see it as an attractive middle ground.)

wetbeardhairs@lemmy.dbzer0.com · edited 1 month ago

Machine learning based pattern matching is indeed very useful and profitable when applied correctly. It identifies (with confidence levels) features in data that would otherwise take an extremely well-trained person to find. And even then it’s just for the cursory search that takes the longest before presenting the highest confidence candidate results to a person for evaluation. Think: scanning medical data for indicators of cancer, reading live data from machines to predict failure, etc.

And what we call “AI” right now is just a much much more user friendly version of pattern matching - the primary feature of LLMs is that they natively interact with plain language prompts.

AbuTahir@lemm.ee · 1 month ago

Cognitive scientist Douglas Hofstadter (1979) showed reasoning emerges from pattern recognition and analogy-making - abilities that modern AI demonstrably possesses. The question isn’t if AI can reason, but how its reasoning differs from ours.

Knock_Knock_Lemmy_In@lemmy.world · 1 month ago

When given explicit instructions to follow, the models failed because they had not seen similar instructions before.

This paper shows that there is no reasoning in LLMs at all, just extended pattern matching.

MangoCats@feddit.it · 1 month ago

I’m not trained or paid to reason, I am trained and paid to follow established corporate procedures. On rare occasions my input is sought to improve those procedures, but the vast majority of my time is spent executing tasks governed by a body of (not quite complete, sometimes conflicting) procedural instructions.

If AI can execute those procedures as well as, or better than, human employees, I doubt employers will care if it is reasoning or not.

Knock_Knock_Lemmy_In@lemmy.world · 1 month ago

Sure. We weren’t discussing if AI creates value or not. If you ask a different question then you get a different answer.

MangoCats@feddit.it · 1 month ago

Well - if you want to devolve into argument, you can argue all day long about “what is reasoning?”

technocrit@lemmy.dbzer0.com · edited 1 month ago

This would be a much better paper if it addressed that question in an honest way.

Instead they just parrot the misleading terminology that they’re supposedly debunking.

How dat collegial boys club undermines science…

Knock_Knock_Lemmy_In@lemmy.world · edited 1 month ago

You were starting a new argument. Let’s stay on topic.

The paper implies “Reasoning” is application of logic. It shows that LRMs are great at copying logic but can’t follow simple instructions that haven’t been seen before.

theherk@lemmy.world · 1 month ago

Yeah these comments have the three hallmarks of Lemmy:

AI is just autocomplete mantras.
Apple is always synonymous with bad and dumb.
Rare pockets of really thoughtful comments.

Thanks for being at least the latter.

technocrit@lemmy.dbzer0.com · edited 1 month ago

There’s probably a lot of misunderstanding because these grifters intentionally use misleading language: AI, reasoning, etc.

If they stuck to scientifically descriptive terms, it would be much more clear and much less sensational.

REDACTED@infosec.pub · edited 1 month ago

What confuses me is that we seemingly keep pushing away what counts as reasoning. Not too long ago, some smart algorithms or a bunch of instructions for software (if/then) were officially, by definition, software/computer reasoning. Logically, CPUs do it all the time. Suddenly, when AI is doing that with pattern recognition, memory and even more advanced algorithms, it’s no longer reasoning? I feel like at this point a more relevant question is “What exactly is reasoning?”. Before you answer, understand that most humans seemingly live by pattern recognition, not reasoning.

https://en.wikipedia.org/wiki/Reasoning_system

stickly@lemmy.world · 1 month ago

If you want to boil down human reasoning to pattern recognition, the sheer amount of stimuli and associations built off of that input absolutely dwarfs anything an LLM will ever be able to handle. It’s like comparing PhD reasoning to a dog’s reasoning.

While a dog can learn some interesting tricks and the smartest dogs can solve simple novel problems, there are hard limits. They simply lack a strong metacognition and the ability to make simple logical inferences (eg: why they fail at the shell game).

Now we make that chasm even larger by cutting the stimuli to a fixed token limit. An LLM can do some clever tricks within that limit, but it’s designed to do exactly those tricks and nothing more. To get anything resembling human ability you would have to design something to match human complexity, and we don’t have the tech to make a synthetic human.

MangoCats@feddit.it · 1 month ago

I think as we approach the uncanny valley of machine intelligence, it’s no longer a cute cartoon but a menacing creepy not-quite imitation of ourselves.

technocrit@lemmy.dbzer0.com · 1 month ago

It’s just the internet plus some weighted dice. Nothing to be afraid of.

technocrit@lemmy.dbzer0.com · 1 month ago

Sure, these grifters are shady AF about their wacky definition of “reason”… But that’s just a continuation of the entire “AI” grift.

Jhex@lemmy.world · 1 month ago

this is so Apple, claiming to invent or discover something “first” 3 years later than the rest of the market

postmateDumbass@lemmy.world · 1 month ago

Trust Apple. Everyone else who was in the space first is lying.

Harbinger01173430@lemmy.world · 1 month ago

XD so, like a regular school/university student that just wants to get passing grades?

vala@lemmy.world · 1 month ago

No shit

Grizzlyboy@lemmy.zip · 1 month ago

What a dumb title. I proved it by asking a series of questions. It’s not AI, stop calling it AI, it’s a dumb af language model. Can you get a ton of help from it, as a tool? Yes! Can it reason? NO! It never could and for the foreseeable future, it will not.

It’s phenomenal at patterns, much much better than us meat peeps. That’s why they’re accurate as hell when it comes to analyzing medical scans.

GaMEChld@lemmy.world · 1 month ago

Most humans don’t reason. They just parrot shit too. The design is very human.

El Barto@lemmy.world · 1 month ago

LLMs deal with tokens. Essentially, predicting a series of bytes.

Humans do much, much, much, much, much, much, much more than that.

Zexks@lemmy.world · 1 month ago

No. They don’t. We just call them proteins.

El Barto@lemmy.world · 1 month ago

“They”.

What are you?

stickly@lemmy.world · 1 month ago

You are either vastly overestimating the Language part of an LLM or simplifying human physiology back to the Greeks’ Four Humours theory.

Zexks@lemmy.world · 29 days ago

No. I’m not. You’re nothing more than a protein based machine on a slow burn. You don’t even have control over your own decisions. This is a proven fact. You’re just an ad hoc justification machine.

stickly@lemmy.world · 29 days ago

How many trillions of neuron firings and chemical reactions are taking place for my machine to produce an output? Where are these taking place and how do these regions interact? What are the rules for storing and reshaping memory in response to stimulus? How many bytes of information would it take to describe and simulate all of these systems together?

The human brain alone has the capacity for about 2.5PB of data. Our sensory systems feed data at a rate of about 10⁹ bits/s. The entire English language, compressed, is about 30MB. I can download and run an LLM with just a few GB. Even the largest context windows are still well under 1GB of data.

Just because two things both find and reproduce patterns does not mean they are equivalent. Saying language and biological organisms both use “bytes” is just about as useful as saying the entire universe is “bytes”; it doesn’t really mean anything.
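
As a rough back-of-the-envelope check, here is the arithmetic on the figures quoted above, taken at face value and not verified. The 4 GB model size is an assumption standing in for "just a few GB", and decimal units (1 PB = 10^15 bytes) are assumed throughout.

```python
# Decimal storage units, in bytes.
MB = 10**6
GB = 10**9
PB = 10**15

brain_bytes = 2.5 * PB           # figure quoted in the comment
sensory_bits_per_s = 10**9       # figure quoted in the comment
english_bytes = 30 * MB          # figure quoted in the comment
model_bytes = 4 * GB             # assumption: one reading of "a few GB"

# How many such models would fit in the quoted brain capacity?
brain_to_model = brain_bytes / model_bytes

# How long would the quoted sensory rate take to fill that capacity?
seconds_to_fill = brain_bytes * 8 / sensory_bits_per_s
days_to_fill = seconds_to_fill / 86_400

# How much bigger is the assumed model than the compressed language itself?
model_to_english = model_bytes / english_bytes

print(f"brain / model: {brain_to_model:,.0f}x")
print(f"days of sensory input to fill the brain figure: {days_to_fill:.0f}")
print(f"model / compressed English: {model_to_english:.0f}x")
```

With those inputs the quoted capacity is 625,000 times the assumed model size, and the quoted sensory rate would fill it in roughly 231 days.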

skisnow@lemmy.ca · 1 month ago

I hate this analogy. As a throwaway whimsical quip it’d be fine, but it’s specious enough that I keep seeing it used earnestly by people who think that LLMs are in any way sentient or conscious, so it’s lowered my tolerance for it as a topic even if you did intend it flippantly.

GaMEChld@lemmy.world · 1 month ago

I don’t mean it to extol LLMs but rather to denigrate humans. How many of us are self-imprisoned in echo chambers so we can have our feelings validated to avoid the uncomfortable feeling of thinking critically and perhaps changing viewpoints?

Humans have the ability to actually think, unlike LLMs. But it’s frightening how far we’ll go to make sure we don’t.

joel_feila@lemmy.world · 1 month ago

That’s why CEOs love them. When your job is 90% spewing BS, a machine that does that is impressive.

SpaceCowboy@lemmy.ca · 1 month ago

Yeah I’ve always said the flaw in Turing’s Imitation Game concept is that if an AI was indistinguishable from a human it wouldn’t prove it’s intelligent. Because humans are dumb as shit. Dumb enough to force one of the smartest people in the world to take a ton of drugs which eventually killed him simply because he was gay.

Zenith@lemm.ee · 1 month ago

Yeah, we’re so stupid we’ve figured out advanced maths and physics, and built incredible skyscrapers and the LHC. As individuals we may be more or less intelligent, but humans as a whole are incredibly intelligent.

crunchy@lemmy.dbzer0.com · 1 month ago

I’ve heard something along the lines of, “it’s not when computers can pass the Turing Test, it’s when they start failing it on purpose that’s the real problem.”

jnod4@lemmy.ca · 1 month ago

I think that person had to choose between the drugs and the hardcore prison of 1950s England, where being a bit odd was enough to guarantee an incredibly difficult time, as they say in England. I would’ve chosen the drugs as well, hoping they would fix me. Too bad that without testosterone you’re going to be suicidal and depressed. I’d rather choose to keep my hair than to be horny all the time.

crystalmerchant@lemmy.world · 1 month ago

I mean… Is that not reasoning, I guess? It’s what my brain does: recognizes patterns and makes split-second decisions.

mavu@discuss.tchncs.de · 1 month ago

Yes, this comment seems to indicate that your brain does work that way.

Aatube@kbin.melroy.org · 1 month ago

What’s the news? I don’t trust this guy if he thought it wasn’t known that AI is overdriven pattern matching.

burgerpocalyse@lemmy.world · 1 month ago

hey I cant recognize patterns so theyre smarter than me at least

ZILtoid1991@lemmy.world · 1 month ago

Thank you Captain Obvious! Only those who think LLMs are like “little people in the computer” didn’t know this already.

TheFriar@lemm.ee · 1 month ago

Yeah, well there are a ton of people literally falling into psychosis, led by LLMs. So it’s unfortunately not that many people that already knew it.

joel_feila@lemmy.world · 1 month ago

Dude they made ChatGPT a little more boot-licky and now many people are convinced they are literal messiahs. All it took for them was a chatbot and a few hours of talk.

Pulptastic@midwest.social · 1 month ago

SplashJackson@lemmy.ca · 1 month ago

Just like me

alexdeathway@programming.dev · 1 month ago

python code for reversing the linked list.
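
Taking the request literally, here is a minimal sketch of one common way to do it: the iterative, in-place reversal of a singly linked list. The `Node` class and the helpers `from_list` and `to_list` are made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None


def from_list(values: List[int]) -> Optional[Node]:
    """Build a linked list from a Python list (helper for the demo)."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def to_list(head: Optional[Node]) -> List[int]:
    """Collect the node values into a Python list (helper for the demo)."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


def reverse(head: Optional[Node]) -> Optional[Node]:
    """Reverse the list in place by flipping each `next` pointer; O(n) time, O(1) space."""
    prev = None
    while head is not None:
        nxt = head.next   # remember the rest of the list
        head.next = prev  # point the current node backwards
        prev = head       # the current node becomes the new front
        head = nxt        # move on to the remembered rest
    return prev


print(to_list(reverse(from_list([1, 2, 3, 4]))))  # [4, 3, 2, 1]
```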

flandish@lemmy.world · 1 month ago

stochastic parrots. all of them. just upgraded “soundex” models.

this should be no surprise, of course!

NostraDavid@programming.dev · 1 month ago

OK, and? A car doesn’t run like a horse either, yet they are still very useful.

I’m fine with the distinction between human reasoning and LLM “reasoning”.

Brutticus@midwest.social · 1 month ago

Then use a different word. “AI” and “reasoning” make people think of Skynet, which is what the weird tech bros want the lay person to think of. LLMs do not “think”, but that’s not to say I might not be persuaded of their utility. But that’s not the way they are being marketed.

fishy@lemmy.today · 1 month ago

The guy selling the car doesn’t tell you it runs like a horse; the guy selling you AI is telling you it has reasoning skills. AI absolutely has utility, but the guys making it are saying its utility is nearly limitless because Tesla has demonstrated there’s no actual penalty for lying to investors.

technocrit@lemmy.dbzer0.com · 1 month ago

Cars are horses. How do you feel about that statement?

archive.is