Abstract:
The artificial intelligence (AI) computing architecture serves as the foundational infrastructure for AI systems, encompassing key layers such as operator and communication libraries, domain-specific programming languages, and hardware-specific programming languages. Internationally, the mainstream AI computing architecture is dominated by NVIDIA. Although domestic manufacturers have developed their own AI computing architectures, significant differences among them prevent interoperability across chips, resulting in a fragmented ecosystem and a lag in the overall development of China's AI computing capabilities. To address this challenge, Qiyuan Lab has led the development of the Jiuyuan Unified AI Computing Architecture for domestic AI chips. Through software optimization and standardization, the architecture enables efficient collaboration across the entire AI computing software and hardware ecosystem, ensuring high availability of both individual chips and the broader AI computing infrastructure. To date, the Jiuyuan Unified AI Computing Architecture has been preliminarily adapted to NVIDIA chips, multiple domestic chips, different central processing units (CPUs), and edge devices. It supports typical AI model inference tasks, such as large language models and image processing, achieving performance comparable to that of native chip-specific architectures.