Generative Visual Media in the Era of Foundation Models: Opportunities and Challenges - Insights from the 23rd CCF Beautiful Lake Seminar

杨鑫; 王贝贝; 过洁; 夏佳志; 王莉莉; 吕琳; 刘利斌; 陈雪锦; 高林; 赫然; 徐凯; 周昆

doi:10.11991/cccf.202510009

Abstract

Generative visual media, which spans images, video, 3D geometry, virtual reality (VR), and augmented reality (AR), is becoming a transformative force in the digital economy. Driven by diffusion models, Transformer architectures, and large-scale multimodal pretraining, it is reshaping traditional content creation paradigms and enabling more realistic, controllable, and high-fidelity content generation. This report summarizes the key insights of the 23rd CCF Beautiful Lake Seminar, which brought together domain experts from across China to discuss the theoretical foundations, computational architectures, and interdisciplinary applications of generative visual media. The discussions focused on key technical challenges, including integrated geometry-physics representation, native 3D feature extraction, multimodal control and alignment, integration with computer-aided design (CAD) and computer-aided engineering (CAE) systems, and multisensory VR content generation. The seminar identified the technical bottlenecks, core challenges, and development pathways of generative visual media and reached a consensus on future actions. This report distills twelve key scientific and technological problems, outlines five categories of application scenarios, and describes the role of generative visual media in advancing China’s technological breakthroughs and industrial deployment, providing directional guidance and technical pathways for its deployment in key areas such as intelligent manufacturing, cultural industries, and national security.
