Meeting the Artificial Intelligence Challenge: Building the Next-Generation DBMS

Abstract: With the explosive development of generative artificial intelligence (AI), data management faces new challenges, including explosive data growth, real-time dynamic updates, semantic fragmentation across modalities, and privacy and security risks, which traditional database systems struggle to meet. This article analyzes the key technical problems facing database management systems in the AI era: processing ultra-large-scale unstructured data, fusing the semantics of multi-modal data, sustaining efficient real-time updates, and protecting privacy. Against this backdrop, the Shenzhen Institute of Computing Science has adopted a “theory + engineering” dual-track model, drawing on the R&D paradigm of Bell Labs, to build the fully self-developed Yashan Database System (YashanDB). Its key technologies include an original runtime scheduling method based on transaction cost, fine-grained multi-version concurrency control, a theory of computation under bounded resources, a shared cluster architecture (with aggregated memory and decentralized transaction management, among other features), multi-modal semantic connection, and lightweight privacy enhancement. With these, the system achieves high-concurrency transaction processing, scale-independent querying, real-time computation over big data, unified management of multi-modal data, and data security and privacy protection, addressing the data management challenges of the AI era. YashanDB has been deployed in core systems in key sectors such as finance, energy, and government affairs, providing an independently controllable core data foundation for the AI era.

     
