Haoyi Zhu, Tong He. Aether: Geometric-Aware Unified World Modeling[J]. Computing Magazine of the CCF, 2026, 2(5): 66–70. DOI: 10.11991/cccf.202605009

Aether: Geometric-Aware Unified World Modeling

  • Integrating geometric reconstruction with generative modeling remains a critical challenge in building AI systems capable of human-like spatial reasoning. This article proposes Aether, a unified framework that enables geometry-aware reasoning in world models by jointly optimizing three core capabilities: 4D dynamic reconstruction, action-conditioned video prediction, and goal-conditioned visual planning. Through task-interleaved feature learning, Aether shares knowledge synergistically across the reconstruction, prediction, and planning objectives. Built on video generation models, the framework generalizes zero-shot from synthetic to real data despite never observing real-world data during training. Thanks to its intrinsic geometric modeling, this zero-shot generalization extends to both action following and reconstruction tasks. Notably, even without real-world training data, its reconstruction performance is comparable to or better than that of domain-specific models. Aether also uses camera trajectories as a geometry-informed action space, enabling effective action-conditioned prediction and visual planning. We hope this work inspires the community to explore new frontiers in physically reasonable world modeling and its applications.