Alibaba’s Tongyi Qianwen team wins NeurIPS 2025 Best Paper Award

Source: Saiyp | Date: 2025-11-29 20:04:00

At tonight’s NeurIPS 2025 conference, the world’s premier AI event, Alibaba’s Tongyi Qianwen team received the Best Paper Award for its work “Attention Gating Makes Better Foundation Models.” It is the only Chinese-led paper among the four winners, which were selected from a record 20,000 submissions (acceptance rate: 25%) in the most competitive year in the conference’s history.

The paper introduces Attention Gating, a lightweight “sliding door” mechanism that adds a learnable gate after standard attention to dynamically filter which attention heads and tokens proceed to downstream computation. In tests, models trained on 3.5 trillion tokens, including a 1.7B dense model and a 15B MoE variant, showed consistent gains: perplexity lower by 0.2, a 2-point improvement on MMLU, and better performance across all subdomains of The Pile, at the cost of only a 1% increase in parameters.

The team likens the gate to a “security checkpoint” that blocks irrelevant information before it reaches the feedforward network, boosting both efficiency and model robustness.
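
The idea of a learnable gate placed after attention can be illustrated with a minimal NumPy sketch. This is not the team's released code. The article does not specify the gate's exact form, so the sketch assumes a sigmoid gate computed from the layer input and multiplied elementwise into each head's output before the output projection. The function name gated_attention, the weight names (Wq, Wk, Wv, Wg, Wo), and the toy dimensions are all hypothetical, and the full-size gate projection used here makes no attempt to match the 1% parameter budget reported above.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_attention(x, params, n_heads):
    """Causal multi-head self-attention with an assumed sigmoid output gate.

    x:      (seq_len, d_model) hidden states for one sequence
    params: dict of (d_model, d_model) matrices Wq, Wk, Wv, Wg, Wo
    """
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    def split(t):
        # (seq_len, d_model) -> (n_heads, seq_len, d_head)
        return t.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = (split(x @ params[name]) for name in ("Wq", "Wk", "Wv"))

    # Standard scaled dot-product attention with a causal mask.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    attn_out = softmax(scores) @ v  # (n_heads, seq_len, d_head)

    # Assumed gate: one value in (0, 1) per head, token, and channel,
    # computed from the layer input rather than from the attention output.
    gate = sigmoid(split(x @ params["Wg"]))
    gated = gate * attn_out  # the "checkpoint": scale down what passes through

    merged = gated.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ params["Wo"], gate

# Toy usage with random weights (dimensions chosen only for illustration).
rng = np.random.default_rng(0)
d_model, n_heads, seq_len = 16, 4, 5
params = {
    name: rng.normal(scale=0.1, size=(d_model, d_model))
    for name in ("Wq", "Wk", "Wv", "Wg", "Wo")
}
x = rng.normal(size=(seq_len, d_model))
out, gate = gated_attention(x, params, n_heads)
print(out.shape, gate.shape)
```

In this sketch, gate values near 0 suppress a head's contribution for a given token, while values near 1 let it through unchanged, which is one plausible reading of the “filtering” behavior described above.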

The technique has already been integrated into the upcoming Qwen3-Next model. Alibaba has open-sourced the code and the 1.7B experimental model on GitHub to encourage community validation. Looking ahead, Tongyi Qianwen plans to extend this gating approach to multimodal and long-context scenarios, aiming to make “self-filtering attention” a standard feature in next-generation large models.
