Apple published a paper critical of the reasoning and formal-logic capabilities of Large Language Models (LLMs). The paper builds on earlier arguments by Gary Marcus and Subbarao Kambhampati about LLMs' limitations in generalizing beyond their training distribution.
The authors demonstrated that even the latest "reasoning models" fail on classic problems like the Tower of Hanoi: they cannot solve it reliably even when the solution algorithm is given to them.
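For context, the "solution algorithm" here is the standard recursive procedure for the Tower of Hanoi, which moves n disks in 2^n − 1 steps. The sketch below is just that textbook recursion in Python; the function and peg names are illustrative, not the paper's prompt or evaluation code:

```python
def hanoi(n, source, target, spare):
    """Print the optimal sequence of 2**n - 1 moves for n disks."""
    if n == 0:
        return
    # Move the n-1 smaller disks out of the way onto the spare peg.
    hanoi(n - 1, source, spare, target)
    # Move the largest remaining disk to its destination.
    print(f"move disk {n} from {source} to {target}")
    # Stack the n-1 smaller disks back on top of it.
    hanoi(n - 1, spare, target, source)

if __name__ == "__main__":
    hanoi(3, "A", "C", "B")  # 7 moves for 3 disks
```

The paper's point is that models handed a procedure equivalent to this still failed to execute it reliably.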
The paper argues that LLMs are no substitute for well-specified conventional algorithms, and their limitations are becoming clearer: they are not a direct route to AGI, and while the field of neural networks is not dead, the current approach has clear limits.
The paper highlights the importance of combining human adaptiveness with computational brute force and reliability in AI development.