What Constitutes the “Brain” of an Embodied Intelligence System?
-
Abstract
Embodied intelligence refers to intelligence that emerges from the close coupling between an agent's physical body and its environment. Because it is shaped by bodily interaction, it is inherently characterized by embodiment, situatedness, proactivity, and interactivity, and it is considered essential for applying artificial intelligence effectively in the physical world. The silicon-based “brain” of an embodied intelligence system is expected not only to control the physical body but also to perceive the environment, memorize contextual information, and plan actions. To support these capabilities, the “brain” should integrate visual observations, historical context, and prior knowledge; it should also be able to envision future scenarios and adapt to its environment. These requirements align closely with current technologies such as embodied foundation models and world models. In this paper, we first outline the definition and characteristics of embodied intelligence, then analyze the framework of embodied intelligence systems and review their functional modules and the relations among them. We then discuss related technologies, including embodied foundation models, world models, embodied memory, action planning, and humanoid learning. Finally, we explore future trends in technological development in this field.