The Terminator.
Photo credit: Insomnia Cured Here/Flickr

Researchers have proposed a new structural framework for Embodied AI, aiming to bridge the gap between current artificial intelligence capabilities and the general-purpose intelligent robots envisioned in science fiction.

While recent advancements in AI have shown remarkable capabilities in language, vision and speech processing, these technologies remain largely “disembodied,” according to a new survey published in the journal SmartBot.

The authors argue that this kind of passive, disembodied analysis is insufficient for creating truly intelligent agents that can interact with the physical world.

They illustrate this limitation using the instruction “clean the room.” A classic, disembodied AI can process parts of this task — interpreting the command and detecting objects in a static image — but it cannot perform the physical actions required to complete it. An embodied agent, by contrast, must solve the entire problem through active interaction with its environment.

A robot roadmap

To guide future research in bridging this gap, the comprehensive new survey, “Embodied AI: A Survey on the Evolution from Perceptive to Behavioral Intelligence,” provides a systematic roadmap for the field.

The authors categorise the process of achieving intelligent behaviour into three distinct modules: Embodied Perception, Embodied Decision-Making and Embodied Execution.

Embodied Perception focuses on perception in service of action, such as sensing object properties for manipulation or building maps for mobility. It also includes “behaviour for perception,” where the robot uses its own movements to actively obtain information about its surroundings.

The second module, Embodied Decision-Making, addresses how the agent generates a sequence of behaviours to complete a human instruction. The paper illustrates the complexity involved with a striking example: a “terminator robot” that must search for a target person, walk to their position and execute a final skill, a sequence that demands the kind of sophisticated, integrated planning current systems lack.

The final module, Embodied Execution, translates these decisions into physical action. The survey reviews primary algorithmic approaches for training the policy that maps skill descriptions and observations to concrete actions, including imitation learning and reinforcement learning.
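In code terms, the execution policy the survey describes can be caricatured as a mapping from a skill description and an observation to a concrete action. The sketch below is a deliberately toy illustration of the imitation-learning route, assuming a tabular setting where the policy simply memorises the most frequent expert action for each situation; real systems use neural networks and continuous observations, and all names here are hypothetical.

```python
from collections import defaultdict

class TabularImitationPolicy:
    """Toy execution policy: maps (skill description, observation) to an action,
    trained by imitation learning (behavioural cloning) on expert demonstrations."""

    def __init__(self):
        # For each (skill, observation) pair, count how often the expert
        # chose each action.
        self.counts = defaultdict(lambda: defaultdict(int))

    def fit(self, demonstrations):
        """demonstrations: iterable of (skill, observation, expert_action)."""
        for skill, obs, action in demonstrations:
            self.counts[(skill, obs)][action] += 1

    def act(self, skill, obs):
        """Return the most frequent expert action for this situation,
        or None if no demonstration covers it."""
        options = self.counts.get((skill, obs))
        if not options:
            return None
        return max(options, key=options.get)

# Hypothetical demonstrations for a single manipulation skill.
demos = [
    ("pick up cup", "cup on table", "close gripper"),
    ("pick up cup", "cup on table", "close gripper"),
    ("pick up cup", "cup in gripper", "lift arm"),
]
policy = TabularImitationPolicy()
policy.fit(demos)
print(policy.act("pick up cup", "cup on table"))  # -> close gripper
```

A reinforcement-learning approach would replace the demonstration counts with action values learned from reward, but the interface, observation in, action out, stays the same, which is what makes the two approaches interchangeable behind a single execution module.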

The authors highlight a critical trend toward developing General-Purpose Execution Models capable of handling multiple skills within a single system, rather than training isolated, single-skill models.

By providing this comprehensive framework, the survey aims to structure the research landscape and offer a clear roadmap for developing the next generation of general-purpose intelligent agents.
