Abstract:
The artificial intelligence (AI) computing architecture serves as the foundational infrastructure for AI systems, encompassing key layers such as operator and communication libraries, domain-specific programming languages, and hardware-specific programming languages. Internationally, the mainstream AI computing architecture is dominated by NVIDIA. Although domestic manufacturers have developed their own AI computing architectures, significant differences among them prevent interoperability across chips, resulting in a fragmented ecosystem and a lag in the overall development of China's AI computing capabilities. To address this challenge, Qiyuan Lab has led the development of the Jiuyuan Unified AI Computing Architecture for domestic AI chips. Through software optimization and standardization, the architecture enables efficient collaboration across the entire AI computing software and hardware ecosystem, ensuring high availability of both individual chips and the broader AI computing infrastructure. To date, the Jiuyuan Unified AI Computing Architecture has been preliminarily adapted to NVIDIA chips, multiple domestic chips, different central processing units (CPUs), and edge devices. It supports typical AI model inference tasks, such as large language models and image processing, achieving performance comparable to that of native chip-specific architectures.