Local 35B Model Replaces Claude for Everyday Coding

#ARTICLE HackerNews 2026.06.16

Recommendation score 60.0 · NO. 012 · 2026.06.16

Published 2026/06/15 · Score 413 · Comments 231

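Why it's worth reading

An HN user ran a quantized Qwen3.6 35B (only 3B active parameters) on a Mac Studio/MacBook and completed a Django + Wagtail site refactor fully offline. The case suggests that consumer hardware plus a quantized model can now support a real development workflow. It is a useful reference for privacy-sensitive users and offline scenarios.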

Editor's take

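The key point in this case is not raw model capability but the parameter activation strategy: of 35B total parameters, only about 3B are active for each token. The MoE architecture trades sparsity for speed, and that is the core precondition for making local deployment usable. When people tried local models before and found them slow, they were often running dense models in which every parameter takes part in every token, rather than going down this path.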
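
The trade-off described above can be put into rough numbers. The following is a back-of-the-envelope sketch, not a measurement: the 4-bit quantization level is an assumption (the post only says "quantized"), the "2 FLOPs per active parameter per token" figure is a common rule of thumb, and KV cache and runtime overhead are ignored.

```python
def weight_memory_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone (no KV cache, no runtime overhead)."""
    return total_params * bits_per_weight / 8 / 1e9


def flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per active parameter."""
    return 2 * active_params


TOTAL_PARAMS = 35e9   # from the post: 35B total parameters
ACTIVE_PARAMS = 3e9   # from the post: 3B active parameters per token
BITS = 4              # ASSUMPTION: the post does not state the quantization level

# All experts still have to sit in memory, so memory scales with TOTAL_PARAMS.
memory_gb = weight_memory_gb(TOTAL_PARAMS, BITS)

# Per-token compute scales with ACTIVE_PARAMS, which is where the speed comes from.
compute_ratio = flops_per_token(TOTAL_PARAMS) / flops_per_token(ACTIVE_PARAMS)

print(f"weights: ~{memory_gb:.1f} GB at {BITS}-bit")
print(f"a dense 35B model needs ~{compute_ratio:.1f}x the compute per token")
```

Under these assumptions the sparsity saves compute, not memory: the machine still has to hold every expert, but each token touches only a small share of the weights.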

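The Wagtail detail deserves more attention: a niche framework has little training data, yet the model still completed the task. This suggests the generalization ability of current code models is underestimated, and that they are not limited to writing React and Python standard-library code. If you are evaluating whether a local model can cover your internal tech stack, this signal is more informative than a benchmark.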

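The real barriers are context length and tool-calling stability, which the post does not cover. Anyone trying to reproduce the result should first measure the success rate of long-file edits (>2000 lines) and of consecutive multi-turn agent runs. That is the difference between everyday coding and a demo.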
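
One way to run the suggested check is a small harness like the sketch below. Everything in it is hypothetical scaffolding: `run_edit` stands in for whatever function sends the file and the instruction to your local model and returns the edited file, and `perfect_editor` is a stub used only to show the harness working. The multi-turn estimate assumes each step succeeds independently, which real agent runs may not satisfy.

```python
from typing import Callable, Sequence


def make_long_file(n_lines: int = 2500) -> str:
    """Build a synthetic source file longer than the 2000-line threshold."""
    return "\n".join(f"def func_{i}(): return {i}" for i in range(n_lines))


def edit_succeeded(original: str, edited: str, target: int, expected: str) -> bool:
    """Pass only if the target line matches and every other line is untouched."""
    before, after = original.split("\n"), edited.split("\n")
    if len(before) != len(after):
        return False
    return all(
        new == (expected if i == target else old)
        for i, (old, new) in enumerate(zip(before, after))
    )


def success_rate(
    run_edit: Callable[[str, int, str], str],  # hypothetical wrapper around the model
    targets: Sequence[int],
    n_lines: int = 2500,
) -> float:
    """Fraction of single-line edit tasks the editor completes cleanly."""
    source = make_long_file(n_lines)
    passed = 0
    for t in targets:
        expected = f"def func_{t}(): return {t * 2}"
        passed += edit_succeeded(source, run_edit(source, t, expected), t, expected)
    return passed / len(targets)


def chain_success(per_step: float, steps: int) -> float:
    """Chance that every step of an agent run succeeds, ASSUMING independent steps."""
    return per_step ** steps


def perfect_editor(source: str, target: int, expected: str) -> str:
    """Stub editor for demonstration; replace with a call to the model under test."""
    lines = source.split("\n")
    lines[target] = expected
    return "\n".join(lines)


print(success_rate(perfect_editor, [0, 1200, 2499]))
print(round(chain_success(0.95, 20), 3))
```

The second figure shows why multi-turn stability matters: under the independence assumption, a step that succeeds 95% of the time completes a 20-step run only about 36% of the time.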

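Community feedback

Opinions divided · 216 comments

Core debate: whether local quantized models can truly replace cloud-hosted large models for everyday coding. Speed, cost, and stability remain contested.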

tumetab1

Not yet, tried Gemma 4 on an Apple M4 but the tok/s is significant lower than the cloud offering. Also, the lack of enterprise tooling to help selected an appropriate model and tooling to run a local LLM does not help.

arjie

Not “local” and not interactive coding but sharing since it might be helpful. I have 2x RTX Pro 6000 Blackwell running DeepSeek V4 Flash. I get 160 tok/s raw but it’s a reasoning model. For my use case, I have it auto-write code and another system auto-review the code. I occasionally use it wit

leptons

Have you measured your electricity consumption for this rig? I have to wonder how much it would cost you per month.

Alternatives: Claude, Claude Code, Codex, Cursor, Gemma 4, DeepSeek V4 Flash, Kimi 2.6, GLM 5.1, OpenRouter, Fireworks, Qwen 3.6 27b dense, Claude Haiku 4.5, Claude Sonnet