
The Original Intent of AI: A Misunderstanding?
When discussing artificial intelligence, most people today think of chatty ChatGPT, artistic Midjourney, or self-driving Teslas. But are these truly the most suitable applications for AI?
Let’s return to the beginning and ask a simple question: why did people initially create AI? This question lacks a standard answer.
When Turing asked in 1950, “Can machines think?” he was exploring logical reasoning in tasks like chess and puzzles. At the 1956 Dartmouth Conference, the core idea behind the term “artificial intelligence” was to give machines human-like symbolic reasoning abilities. Early AI applications focused on proving mathematical theorems and processing natural-language information.
In other words, the original design logic of AI centers on processing symbols, organizing information, and solving abstract problems. Its initial purpose was not to manipulate objects or perform complex physical tasks, but rather to serve as an auxiliary tool in abstract scenarios.
What Happened Later?
A turning point occurred as robotics technology matured. The precision of robotic arms improved, sensor costs decreased, and autonomous driving technology found a practical opportunity. Naturally, people began to envision combining AI’s logical capabilities with physical hardware, allowing machines to accomplish more real-world tasks.
Thus, a large-scale exploration of capability “transfer” began. AI, originally focused on text and symbols, was equipped with cameras, microphones, mobile bases, and mechanical execution structures. People hoped it could engage in smooth conversations, recognize environments, and perform physical tasks like pouring water or moving objects.
But can a system trained primarily in abstract logic truly understand the operational logic of the physical world? From the current outcomes, the answer is not optimistic.
AI can accurately state, “A cup is a container for holding water,” but it cannot perceive the smoothness or weight of the cup’s walls. It can recite the textual definition of “danger” but struggles to anticipate sudden real-world risks. It can perform complex mathematical calculations but struggles to connect abstract numbers to real-world situations.
It mimics human understanding logic rather than forming a reality-based cognition.
Imitation Doesn’t Equal Understanding
This is akin to someone who has never been to Paris but has read every book and historical source about it. When discussing the local culture and landmarks, they can speak clearly and in rich detail, yet they have never experienced the city firsthand and do not truly understand it.
Today’s AI resembles this “well-read tourist.” Its knowledge system derives entirely from second-hand information like text, images, and videos, without direct interaction with the real world. Its understanding of “water” is based on the probabilistic associations between words, not the tactile sensation of flowing liquid or the attribute of quenching thirst. Its recognition of a “cup” is a matching relationship between pixels and labels, not the fragile nature or the feel of holding it.
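To make the phrase “probabilistic associations between words” concrete, here is a deliberately toy sketch (the corpus and all names are invented for illustration): the model’s entire “concept” of water reduces to a table of which words tend to follow it in text.

```python
from collections import Counter

# Toy corpus: everything this "model" knows about water is which words
# appear next to it in text -- no sensation, no physics, no thirst.
corpus = (
    "water fills the cup . the cup holds water . "
    "water flows . water quenches thirst"
).split()

# Count which tokens follow "water". This co-occurrence table is the
# whole "understanding" such a system has of the word.
following = Counter(
    corpus[i + 1] for i, w in enumerate(corpus[:-1]) if w == "water"
)
total = sum(following.values())
probs = {w: c / total for w, c in following.items()}
print(probs)  # {'fills': 0.25, '.': 0.25, 'flows': 0.25, 'quenches': 0.25}
```

Real language models are vastly more sophisticated than this bigram counter, but the point stands: the representation is statistical association over symbols, not contact with liquid.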
If AI remains confined to abstract scenarios like information processing and content creation, this cognitive model may not present significant issues. However, when deployed in physical environments where it must autonomously complete tasks and make environmental decisions, the shortcomings of lacking real interaction experience become apparent. A cognition built on second-hand information struggles to adapt to the ever-changing real world.
Potential Cognitive Bias
This does not deny AI’s practical value. On the contrary, in many symbolic information-processing tasks AI already outperforms humans, efficiently handling copywriting, information retrieval, language translation, and code writing.
What needs reevaluation is the approach of directly transferring AI’s capabilities from abstract domains to physical world applications. Real-world operational tasks require not only theoretical knowledge but also scene judgment, error tolerance, and dynamic adaptability formed through direct experience. These abilities often need to be accumulated gradually through trial and error, perception, and feedback.
The core issue may not be the insufficiency of computing power, model scale, or training data, but rather the compatibility between the underlying design logic and application scenarios. Using a framework designed for abstract symbols to tackle the complex and variable physical world inherently presents limitations.
This is akin to how birds evolved powerful flight yet cannot simply adapt to life underwater: the two environments demand fundamentally different physiologies.
Is There a More Suitable Path?
If the goal is to design intelligent systems that adapt to the real physical world, their underlying logic may need complete reconstruction.
Their cognitive starting point should be direct interaction with the real world: touching to perceive softness and hardness, colliding to understand weight, tipping containers to recognize flow, and trial and error to avoid risk.
Their knowledge system should not rely solely on fitting vast amounts of static data but should grow from continuous scene interactions and causal feedback.
Their decision logic should not be limited to probabilistic matching but should rely on contextual constraints and scene associations to form unique and reasonable judgments.
This logic closely mirrors how human children grow and learn. A child does not need to browse millions of images; by touching, playing with, or dropping a cup just once, they form a grounded understanding of it.
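The child’s one-shot cup lesson can be caricatured in a few lines. This is a minimal sketch with entirely hypothetical names, not a proposal for a real architecture: the agent’s beliefs come from acting on objects and recording the causal outcome, rather than from fitting static data.

```python
# Hypothetical sketch: cognition built from interaction episodes.
class Agent:
    def __init__(self):
        self.beliefs = {}  # object name -> set of (action, outcome) pairs

    def interact(self, obj, action, outcome):
        # A single causal episode is enough to register a property.
        self.beliefs.setdefault(obj, set()).add((action, outcome))

    def predict(self, obj, action):
        # Predictions rest on remembered cause-and-effect,
        # not on probabilistic label matching.
        for act, outcome in self.beliefs.get(obj, set()):
            if act == action:
                return outcome
        return "unknown"

agent = Agent()
agent.interact("cup", "drop", "shatters")  # one trial-and-error episode
print(agent.predict("cup", "drop"))        # shatters
print(agent.predict("cup", "squeeze"))     # unknown -- no direct experience yet
```

The contrast is the point: one grounded interaction yields a usable causal prediction, and where there has been no interaction, the agent knows that it does not know.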
Real understanding that aligns with reality often does not require massive data accumulation; the core lies in direct scene interaction and logical convergence. This is also a direction worth deep contemplation in the current development of AI.
Returning to the Initial Question
What is the original intent of artificial intelligence?
Perhaps we will never provide a unified answer, but one fact is clear: if the intent was to process symbolic information, assist in abstract work, and expand human logical reasoning capabilities, then current AI has delivered an excellent performance.
However, if we expect machines to deeply understand the real world, autonomously complete complex physical operations, and adapt to dynamic environments, then the current development path indeed needs reevaluation and optimization.
The key issue lies not in the speed of technological iteration but in the compatibility of the underlying architecture with application goals. A system designed for abstract symbols, even equipped with numerous sensory hardware, will struggle to naturally form a sensory understanding of the real world.
This challenge is not only technical but also a fundamental philosophical question about “cognition and understanding.”