🚀 Shipping · score 100.1 · May 26, 2026 · 2605.27358 · cs.LG · cs.AI · cs.CL

MobileMoE: Scaling On-Device Mixture of Experts

Yanbei Chen, Hanxian Huang, Ernie Chang, Jacob Szwejbka, Digant Desai, Zechun Liu, Vikas Chandra, Raghuraman Krishnamoorthi

Abstract

Mixture-of-Experts (MoE) has become the de facto architecture for hundred-billion-parameter language models, yet its advantages at sub-billion scales for on-device deployment remain largely unexplored. To close this gap, we present MobileMoE, a family of on-device MoE language models with sub-billion active parameters (0.3-0.9B active and 1.3-5.3B total) that establish a new Pareto frontier for on-device LLMs. We first formulate an on-device MoE scaling law that jointly optimizes MoE architecture under mobile memory and compute constraints, identifying an on-device sweet spot - moderate sparsity with fine-grained and shared experts - that is simultaneously memory- and compute-optimal. Building on the derived architectures, we train MobileMoE with a four-stage recipe covering pre-training, mid-training, instruction fine-tuning, and quantization-aware training, all on open-source datasets. Across 14 benchmarks, MobileMoE matches or exceeds leading on-device dense LLMs with $2$-$4\times$ fewer inference FLOPs, and matches or surpasses the state-of-the-art MoE OLMoE-1B-7B with up to 60% fewer parameters. To bridge the last mile to mobile deployment, we provide the first efficient MoE inference on commodity smartphones with comprehensive on-device profiling. At comparable INT4 weight memory, MobileMoE-S delivers $1.8$-$3.8\times$ faster prefill and $2.2$-$3.4\times$ faster decode than the dense baseline MobileLLM-Pro.
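The active/total parameter split and the INT4 memory comparison in the abstract can be made concrete with some back-of-envelope arithmetic. The sketch below is illustrative only and is not taken from the paper. It assumes exactly 4 bits per weight with no scale or zero-point overhead, uses the common rule of thumb of about 2 FLOPs per active parameter per token, and assumes the endpoints of the quoted ranges pair up (0.3B active with 1.3B total, 0.9B active with 5.3B total). The function names are hypothetical.

```python
# Back-of-envelope sizing from the figures quoted in the abstract.
# Assumptions (NOT from the paper):
#   - every weight is stored at exactly 4 bits, no quantization metadata
#   - forward-pass cost is roughly 2 FLOPs per active parameter per token
#   - the range endpoints pair up as (0.3B active, 1.3B total) and
#     (0.9B active, 5.3B total)


def int4_weight_gb(total_params: float, bits_per_weight: float = 4.0) -> float:
    """Approximate weight storage in decimal gigabytes."""
    return total_params * bits_per_weight / 8 / 1e9


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of all parameters that participate in one token's forward pass."""
    return active_params / total_params


def approx_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward FLOPs per token, ignoring attention over context."""
    return 2.0 * active_params


# Hypothetical pairing of the range endpoints given in the abstract.
ASSUMED_CONFIGS = [
    ("smallest", 0.3e9, 1.3e9),
    ("largest", 0.9e9, 5.3e9),
]

if __name__ == "__main__":
    for label, active, total in ASSUMED_CONFIGS:
        print(
            f"{label}: ~{int4_weight_gb(total):.2f} GB INT4 weights, "
            f"{active_fraction(active, total):.0%} of parameters active, "
            f"~{approx_flops_per_token(active) / 1e9:.1f} GFLOPs per token"
        )
```

Under these assumptions, storage grows with total parameters while per-token compute grows with active parameters, which is why an MoE can sit at a weight memory comparable to a dense model while doing less work per token.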

Signal

Stars: 121
Repos: 40
Citations: 0
Velocity: 0.00/d

GitHub repos (20)

CSQianDong/Awesome-arXiv-Daily-Reporter⭐ 48
“{'arxiv_id': 'arXiv:2605.26133', 'title': 'Pretraining Data Exposure in Large Language Models: A Survey of Membership Inference, Data Contamination, and Security Implications', 'authors': 'Ziyi Tong, Feifei Sun, Le Minh Nguyen', 'link': 'https://arxiv.org/abs/2605.26133', 'abstra”
flyryan/ai-news-aggregator⭐ 15
“ <entry> <id>urn:ainews:2026-05-27:research:d1ee40c89462</id> <title>MobileMoE: Scaling On-Device Mixture of Experts</title> <link href="http://arxiv.org/abs/2605.27358" rel="alternate" type="text/html"/> <link href="https://news.aatf.ai/?date=2026-05-27&cate”
ZenAlexa/agi-brief-history⭐ 11
“- **Summary**: Large language model (LLM) agents rely on reusable skills to solve complex tasks. However, existing skill creation approaches treat skills as isolated and static artifacts, limiting their reusability, reliability, and long-term improvement. We propose MUSE-Autoskil”
komiyamma/site_mkdoc_ai_news⭐ 9
“| Title | Summary | Reference URL | |----------|--------------|---------| | DiscoverPhysics: Benchmarking LLMs for Out-of-the-Box Scientific Thinking | A new paper that benchmarks the scientific thinking ability of LLMs. It evaluates scientific reasoning beyond existing knowledge. | [arXiv](https://arxiv.org/abs/2605.26087) | | MobileMoE: Scaling On-Device Mixture of Experts”
lonePatient/lonePatient.github.io⭐ 9
“{% hideToggle Click to view abstract %} {% note blue no-icon %} ID-7-MobileMoE: Scaling On-Device Mixture of Experts {% endnote %} **Link**: https://arxiv.org/abs/2605.27358 **Authors**: Yanbei Chen,Hanxian Huang,Ernie Chang,Jacob Szwejbka,Digant Desai,Zechun Liu,Vikas Chandra,Raghuraman Krishnamoorth”
sifted-network/sifted-awesome-ai-agents⭐ 7
“ arXiv:2605.27358v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) has become the de facto architecture for hundred-billion-parameter language models, yet its advantages at sub-billion scales for on-device deployment remain largely unexplored. To close this gap, we present”
2shin0/arxiv-ai-mailing⭐ 7
“ ## 41. MobileMoE: Scaling On-Device Mixture of Experts - **Authors**: Yanbei Chen , Hanxian Huang , Ernie Chang , Jacob Szwejbka , Digant Desai , Zechun Liu , Vikas Chandra , Raghuraman Krishnamoorthi - **URL**: [https://arxiv.org/abs/2605.27358](https://arxiv.org/abs/2605.27358”
ttmens/ai-radar-wiki⭐ 6
“No Chinese abstract available yet ## Links - 📄 arXiv: http://arxiv.org/abs/2605.27358v1 ## PM perspective > To be added after Stage 2 LLM analysis”
bailynlove/Rookie-s-Newsletters⭐ 3
“ <p><span class="priority-badge p2">P2</span> <span class="stars">★★★</span> (Score: 6.5/10)</p> <p><strong>Source:</strong> <a href="https://arxiv.org/abs/2605.27358">arXiv:2605.27358</a></p> <p><strong>Authors:</strong> Yanbei Chen et al.</p>”
Kiraaa1/ArXic-AI-Paper-Digest-Agent⭐ 1
“ ### 6. MobileMoE: Scaling On-Device Mixture of Experts **Authors:** Yanbei Chen, Hanxian Huang, Ernie Chang, Jacob Szwejbka, Digant Desai, Zechun Liu, Vikas Chandra, Raghuraman Krishnamoorthi **Link:** https://arxiv.org/abs/2605.27358v1 **Summary:** The paper addresses the chall”
osa-mayor/DailyUpdate⭐ 1
“ ## 23. [MobileMoE: Scaling On-Device Mixture of Experts](https://huggingface.co/papers/2605.27358) **Upvotes**: 5 | **Adoption difficulty**: Medium | **Confidence**: High **arXiv**: https://arxiv.org/abs/2605.27358 **Tags**: On-Device AI, Mixture-of-Experts, Model Compression, Mobile Optimization, Benchmark”
aparasion/turingwire⭐ 1
“source_publisher: "arXiv cs.AI" source_url: "https://arxiv.org/abs/2605.27358v1" arxiv_id: "2605.27358"”
amor-mio-de-mi-vida/PaperDigest⭐ 1
“ 1. [MobileMoE: Scaling On-Device Mixture of Experts](https://arxiv.org/abs/2605.27358v1) - Category: On-Device MoE / Efficient LLM”
meekotharaccoon-cell/meeko-nerve-center⭐ 1
“ <span class="year">2026-05</span> </div> <h3><a href="http://arxiv.org/abs/2605.27358v1" target="_blank" rel="noopener">MobileMoE: Scaling On-Device Mixture of Experts</a></h3> <p class="authors">Yanbei Chen, Hanxian Huang, Ernie Chang</p> ”
shubhamshardul-work/Projects⭐ 1
“ "title": "[ArXiv] MobileMoE: Scaling On-Device Mixture of Experts", "url": "http://arxiv.org/abs/2605.27358v1", "summary": "Mixture-of-Experts (MoE) has become the de facto architecture for hundred-billion-parameter language models, yet its advantages at sub-bil”
randomrisk/Awesome-DigitalTwin-WorldModels⭐ 0
“| 2026-05-26 | GENESIS: Harnessing AI Agents for Autonomous 6G RAN Synthesis, Research, and Testing | Tamerlan Aghayev, Maxime Elkael, Michele Polese, Minh Dat Nguyen, Gabriele Gemmi, Andrea Lacava, Ali Saeizadeh, Reshma Prasad, Paolo Testolina, Angelo Feraudo, Soumendra Nanda, P”
Saurav-Kalaskar/AI-News-Daily⭐ 0
“ <b>📦 Open-Source Models</b> • <a href="https://arxiv.org/abs/2605.27366v1">MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation</a> • <a href="https://arxiv.org/abs/2605.27358v1">MobileMoE: Scaling On-Device Mixture of Experts</a> • <a ”
sirichen2/sirichen2.github.io⭐ 0
“ <div class="paper-head"> <div> <h2><a href="https://arxiv.org/abs/2605.27358v1" target="_blank" rel="noopener noreferrer">MobileMoE: Scaling On-Device Mixture of Experts</a></h2> </div> ”
spencerbk/daily-ai-news-site⭐ 0
“ </item> <item> <title>MobileMoE pushes mixture-of-experts models onto smartphones</title> <link>https://arxiv.org/abs/2605.27358</link> <description>A new arXiv paper introduces MobileMoE, a family of sub-billion-active-parameter MoE language models opti”
Time-has-wings/MLSys-Papers⭐ 0
“ *Subjects: Machine Learning (cs.LG)* ### [MobileMoE: Scaling On-Device Mixture of Experts](https://arxiv.org/abs/2605.27358) **MobileMoE: Scaling MoE Language Models for On-Device Deployment** *Yanbei Chen, Hanxian Huang, Ernie Chang, Jacob Szwejbka, Digant Desai, Zechun Liu, Vikas Chandra, Raghuraman Krishnamoort”