Alibaba’s Tongyi Qianwen team wins NeurIPS 2025 Best Paper Award

2025-11-29 12:04:00+08

At tonight's NeurIPS 2025 conference, the world's premier AI event, Alibaba's Tongyi Qianwen team received the Best Paper Award for their work "Attention Gating Makes Better Foundation Models." It is the only Chinese-led paper among the four winners, selected from a record 20,000 submissions (acceptance rate: 25%), making this the most competitive year in the conference's history.

The paper introduces Attention Gating, a lightweight "sliding door" mechanism that adds a learnable gate after standard attention to dynamically filter which attention heads and tokens proceed to downstream computation. In tests, models trained on 3.5 trillion tokens, including a 1.7B dense model and a 15B MoE variant, showed consistent gains: 0.2 lower perplexity, +2 points on MMLU, and improved performance across all subdomains of The Pile, with just a 1% parameter increase.
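The released paper and code are the authoritative reference for the exact design; as a rough illustration of the idea, the sketch below applies a learnable per-head, per-token sigmoid gate to the output of a standard PyTorch multi-head attention layer before it flows onward. The module name GatedAttention, the gate's granularity, and its exact placement are illustrative assumptions, not the team's released implementation.

```python
import torch
import torch.nn as nn


class GatedAttention(nn.Module):
    """Multi-head attention followed by a learnable sigmoid gate.

    The gate computes one score per head per token from the layer input and
    scales the attention output with it, so heads/tokens judged irrelevant are
    attenuated before reaching the feedforward block. Placement and granularity
    here are assumptions for illustration only.
    """

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        # Lightweight gate projection: d_model * n_heads extra weights,
        # small relative to the rest of the block.
        self.gate_proj = nn.Linear(d_model, n_heads)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        attn_out, _ = self.attn(x, x, x, need_weights=False)
        # Per-head, per-token gate values in (0, 1).
        gate = torch.sigmoid(self.gate_proj(x))  # (batch, seq_len, n_heads)
        b, s, _ = attn_out.shape
        # Gate head-sized slices of the attention output. A full implementation
        # would typically gate per-head outputs before the output projection.
        attn_out = attn_out.view(b, s, self.n_heads, self.head_dim)
        attn_out = attn_out * gate.unsqueeze(-1)
        return attn_out.reshape(b, s, -1)


if __name__ == "__main__":
    layer = GatedAttention(d_model=64, n_heads=4)
    out = layer(torch.randn(2, 10, 64))
    print(out.shape)  # torch.Size([2, 10, 64])
```

Because the gate adds only a small projection per layer, its parameter overhead stays on the order of the roughly 1% increase the paper reports.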

The team likens the gate to a "security checkpoint" that blocks irrelevant information before it reaches the feedforward network, boosting both efficiency and model robustness.

The technique has already been integrated into the upcoming Qwen3-Next model. Alibaba has open-sourced the code and the 1.7B experimental model on GitHub to encourage community validation. Looking ahead, Tongyi Qianwen plans to extend this gating approach to multimodal and long-context scenarios, aiming to make "self-filtering attention" a standard feature in next-generation large models.
