Local 35B model replaces Claude for everyday coding
Recommendation score 60.0 · NO. 012 · 2026.06.16
Published 2026/06/15 · Score 413 · Comments 231
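Why it's worth reading
An HN user ran a quantized Qwen3.6 35B (only 3B active parameters) on a Mac Studio/MacBook and completed a Django + Wagtail site refactor entirely offline. It shows that consumer hardware plus a quantized model can now support a real development workflow, which makes it a useful reference for privacy-sensitive users and offline scenarios.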
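Editor's take
The key to this case is not raw model capability but the parameter-activation strategy: of 35B total parameters, only about 3B are active per token, so the sparsity of the MoE architecture is traded for speed. That is the core prerequisite for local deployment being usable. People who tried local models before and found them slow were often running dense models, where every parameter is used for every token, rather than taking this path.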
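A rough back-of-envelope sketch of why the 35B-total / 3B-active split matters. The post does not state the quantization level, so the 4 bits per weight below is an assumption, as is the common rule of thumb of roughly 2 FLOPs per active parameter per generated token. The estimate ignores KV cache, activations, and quantization overhead, so treat the numbers as orders of magnitude only.

```python
def weight_memory_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Memory is driven by TOTAL parameters: an MoE model still has to
    keep every expert resident, even if few are used per token.
    """
    return total_params_b * 1e9 * bits_per_weight / 8 / 1e9


def flops_per_token(active_params_b: float) -> float:
    """Rule-of-thumb compute per generated token (~2 FLOPs per active parameter)."""
    return 2 * active_params_b * 1e9


TOTAL_B = 35.0   # total parameters in billions (from the post)
ACTIVE_B = 3.0   # active parameters per token in billions (from the post)
BITS = 4.0       # ASSUMPTION: 4-bit quantization; the post does not specify

mem_gb = weight_memory_gb(TOTAL_B, BITS)
compute_ratio = flops_per_token(TOTAL_B) / flops_per_token(ACTIVE_B)

print(f"approx. weight memory: {mem_gb:.1f} GB")
print(f"dense 35B vs 3B-active compute per token: {compute_ratio:.1f}x")
```

Under these assumptions, memory is sized by the full 35B while per-token compute is sized by the 3B that are active, which is why the model can feel fast on hardware that merely has enough unified memory to hold it.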
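The Wagtail detail deserves more attention. It is a niche framework without a large body of training data, yet the model still completed the task, which suggests the generalization ability of current code models is underestimated: they are not limited to React and the Python standard library. If you are evaluating whether a local model can cover your internal tech stack, this signal is more informative than a benchmark.
The real barrier is context length and tool-calling stability, which the post does not cover. Anyone trying to reproduce this should first test long-file editing (>2000 lines) and the success rate of sustained multi-turn agent runs; that is what separates everyday coding from a demo.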
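One way to make that test concrete is a tiny harness that repeats a long-file edit and reports the pass rate. This is only a sketch: `fake_model_edit` is a hypothetical stand-in that must be replaced with a call to your own local model, and the pass criterion (exactly one target line changed, everything else untouched) is an assumption about what counts as a successful edit.

```python
from typing import Callable


def success_rate(task: Callable[[int], bool], trials: int) -> float:
    """Run `task` for each trial index; an exception counts as a failure."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    passed = 0
    for i in range(trials):
        try:
            if task(i):
                passed += 1
        except Exception:
            pass  # crashes and malformed output are failures, not skips
    return passed / trials


def make_long_file(n_lines: int) -> str:
    """Synthetic long source file, one assignment per line."""
    return "\n".join(f"value_{i} = {i}" for i in range(n_lines))


def only_target_line_changed(original: str, edited: str, target: int) -> bool:
    """True if the target line changed and every other line is identical."""
    a, b = original.splitlines(), edited.splitlines()
    if len(a) != len(b):
        return False
    others_same = all(x == y for i, (x, y) in enumerate(zip(a, b)) if i != target)
    return others_same and a[target] != b[target]


def fake_model_edit(source: str, target: int) -> str:
    """HYPOTHETICAL placeholder: swap in a real call to your local model."""
    lines = source.splitlines()
    lines[target] += "  # edited"
    return "\n".join(lines)


def trial(i: int) -> bool:
    src = make_long_file(2500)        # above the >2000-line threshold
    target = (i * 397) % 2500         # vary the edit location per trial
    return only_target_line_changed(src, fake_model_edit(src, target), target)


rate = success_rate(trial, trials=5)
print(f"long-file edit success rate: {rate:.0%}")
```

The same `success_rate` wrapper can be pointed at a multi-turn agent task instead of a single edit; the point is to measure a rate over repeated runs rather than judge from one good demo.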
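Community feedback
Opinions divided · 216 comments
Core debate: whether local quantized models can truly replace large cloud models for everyday coding; speed, cost, and stability remain contested.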
Not yet, tried Gemma 4 on an Apple M4 but the tok/s is significantly lower than the cloud offering. Also, the lack of enterprise tooling to help select an appropriate model and tooling to run a local LLM does not help.
Not “local” and not interactive coding but sharing since it might be helpful. I have 2x RTX Pro 6000 Blackwell running DeepSeek V4 Flash. I get 160 tok/s raw but it’s a reasoning model. For my use case, I have it auto-write code and another system auto-review the code. I occasionally use it wit
Have you measured your electricity consumption for this rig? I have to wonder how much it would cost you per month.