ZHOU Hongyi, HUANG Shaomang, PAN Jianfeng, et al. A compact large language model with collaboration of experts[J]. Computing Magazine of the CCF, 2025, 1(2): 40−48. DOI: 10.11991/cccf.202506007

A compact large language model with collaboration of experts

  • Large language models (LLMs) have achieved remarkable success in natural language processing and other domains. However, limited GPU memory in industrial settings forces a trade-off between resource efficiency and model performance when LLMs are extended to multiple downstream tasks, which limits their broader deployment. To address this, we propose Compact LLM with Collaboration of Experts (CCoE), a modular and compact multi-expert architecture. CCoE efficiently and flexibly integrates multiple domain-specific experts into a unified LLM, significantly reducing memory overhead. It further employs a rule-based gating mechanism and an expert planning module to enable precise task assignment and effective expert collaboration, thereby supporting complex reasoning. Experiments on five distinct datasets show that CCoE matches the performance of domain-specific LLMs while reducing GPU memory usage by 61.3% compared with existing model ensemble methods and improving inference throughput by 76.4% over parameter-efficient multi-expert integration approaches.
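The abstract describes a rule-based gating mechanism that assigns each incoming task to one domain-specific expert. The paper's actual gating rules and expert interfaces are not reproduced here, so the following is a minimal Python sketch under assumed names (ExpertLLM, EXPERTS, route are all hypothetical): keyword rules select one expert per query, with a general expert as a fallback.

```python
from dataclasses import dataclass, field

@dataclass
class ExpertLLM:
    """Hypothetical stand-in for a domain expert attached to a shared LLM backbone."""
    domain: str
    keywords: set = field(default_factory=set)

    def matches(self, query: str) -> bool:
        # Rule: the expert fires if any of its trigger keywords appear in the query.
        tokens = set(query.lower().split())
        return bool(self.keywords & tokens)

# Illustrative expert pool; the real system's domains and rules may differ.
EXPERTS = [
    ExpertLLM("code", {"python", "function", "bug"}),
    ExpertLLM("math", {"prove", "integral", "equation"}),
    ExpertLLM("medical", {"symptom", "diagnosis", "drug"}),
]

def route(query: str) -> ExpertLLM:
    """Rule-based gate: return the first expert whose rules match,
    falling back to a general expert when no rule fires."""
    for expert in EXPERTS:
        if expert.matches(query):
            return expert
    return ExpertLLM("general")

print(route("find the bug in this python function").domain)  # -> "code"
```

Because the gate is rule-based rather than learned, routing adds negligible compute, and the expert pool can be extended without retraining the gate, which is consistent with the modularity the abstract emphasizes.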
