What Constitutes the “Brain” of an Embodied Intelligence System?

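  • Summary: Embodied intelligence is intelligence realized through interaction between a body and its environment. It is characterized by embodiment, situatedness, proactivity, and interactivity, and it is central to bringing artificial intelligence into the physical world. Embodied intelligence involves diverse relationships among body, environment, and intelligence, and the real physical world is rich and varied. The silicon-based brain of an embodied intelligence system is not only the commander of the mechanical body but also a fuser of environmental perception, a memory for situational knowledge, and a predictor of scene behavior. It must take environmental, historical, and knowledge information into account, and it also involves behavior planning, prediction of the future, and autonomous learning, all of which are closely related to embodied large models, embodied world models, and embodied memory. This paper first introduces the definition, connotation, and scope of embodied intelligence, analyzes the structure of embodied intelligence systems, and sorts out the functional modules of the system's “brain” and their interrelations. It then discusses the roles, applicable scenarios, and research status of technologies such as embodied large models, world models, embodied memory, behavior prediction, and autonomous learning in embodied intelligence systems, and closes with an outlook on future technological trends.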

     

    Abstract: Embodied intelligence refers to intelligence that emerges from the close coupling between an agent's physical body and its environment. Because it is shaped by bodily interaction, it is characterized by embodiment, situatedness, proactivity, and interactivity, and it is considered essential for applying artificial intelligence effectively in the physical world. The silicon-based “brain” of an embodied intelligence system is expected not only to control the physical body but also to perceive the environment, memorize contextual information, and plan actions. To support these capabilities, the “brain” should integrate visual observations, historical context, and prior knowledge. It should also be capable of envisioning future scenarios and adapting to its environment. These capabilities align closely with current technologies such as embodied foundation models and world models. In this paper, we first outline the definition and characteristics of embodied intelligence, then analyze the framework of embodied intelligence systems and review their functional modules and the relations among them. We then discuss related technologies, including embodied foundation models, world models, embodied memory, action planning, and humanoid learning. Finally, we explore future trends in technological development in this field.

     
