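rseq: a new weapon for lock-free data structures on Linux
Why it's worth reading
Restartable sequences (rseq), introduced in Linux 4.18, make it possible to build high-performance, thread-safe data structures on multi-core CPUs without locks or atomic operations. For AI infrastructure that needs maximum performance (such as feature stores and parameter servers), this leaves significant room for optimization.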
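Editor's take
rseq's killer use case is per-CPU counters and data structures. Previously the options were atomic operations, which run into the cache-coherence wall, or RCU with its complex callback machinery. rseq pairs a kernel-assisted local fast path with a graceful fallback, and on a 128-core machine it can improve counter performance by an order of magnitude.
The main limitations today are that it is Linux-only and requires glibc 2.35+. In cloud-native deployments, container scheduling may also migrate a thread to a different CPU, which interrupts the restartable sequence and forces it to restart. Teams building high-performance inference services or real-time feature platforms whose bottleneck is CPU cache contention should evaluate whether rseq can replace their existing sharded atomic scheme.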
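For readers unfamiliar with the "sharded atomic" baseline mentioned above, here is a minimal sketch in portable C11. It is not taken from the article: `NUM_SHARDS`, `CACHE_LINE`, `current_shard`, and the `counter_*` functions are hypothetical names chosen for illustration, and shards are assigned per thread in round-robin order rather than per CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SHARDS 64   /* assumed shard count; ideally at least the core count */
#define CACHE_LINE 64   /* assumed cache-line size in bytes */

/* Each shard sits on its own cache line to avoid false sharing. */
struct shard {
    _Alignas(CACHE_LINE) atomic_uint_fast64_t value;
};

struct sharded_counter {
    struct shard shards[NUM_SHARDS];
};

static atomic_uint next_slot;            /* hands out shard indices round-robin */
static _Thread_local int my_slot = -1;   /* -1 means "not assigned yet" */

/* Pick a shard for the calling thread once, then reuse it. */
static unsigned current_shard(void) {
    if (my_slot < 0) {
        unsigned n = atomic_fetch_add_explicit(&next_slot, 1u, memory_order_relaxed);
        my_slot = (int)(n % NUM_SHARDS);
    }
    return (unsigned)my_slot;
}

static void counter_init(struct sharded_counter *c) {
    assert(c != NULL);
    for (size_t i = 0; i < NUM_SHARDS; i++)
        atomic_init(&c->shards[i].value, 0);
}

/* Still an atomic read-modify-write, but usually on an uncontended line. */
static void counter_add(struct sharded_counter *c, uint64_t n) {
    atomic_fetch_add_explicit(&c->shards[current_shard()].value, n,
                              memory_order_relaxed);
}

/* Reads sum all shards; the result is approximate while writers are active. */
static uint64_t counter_read(struct sharded_counter *c) {
    uint64_t sum = 0;
    for (size_t i = 0; i < NUM_SHARDS; i++)
        sum += atomic_load_explicit(&c->shards[i].value, memory_order_relaxed);
    return sum;
}
```

Even in this layout every increment is still an atomic instruction, and two threads that land on the same shard still contend for its cache line. A per-CPU design built on rseq aims to remove both costs, at the price of the platform constraints described above.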
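Justine Tunney's implementation walkthrough is more down-to-earth than the official documentation and includes arm64/x86 assembly templates that can be copied directly.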
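Community feedback
Opinions divided, 23 comments
Core debate: whether the "20k workstation" line at the top of the article is satire or showing off. The technical value of rseq as a kernel-level lock-free mechanism is acknowledged.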
Maybe I'm just getting old but the "if you don't spend $20,000 on a workstation you're going to be left behind like a dinosaur" at the top of this article is a huge turn off to reading any further. And I say that as someone who owns a workstation with more cores than the author's.
If we're being overly generous, they're saying you need at least a raspberry pi? You can see a 3x improvement there, which shows the pattern works, and that's good enough for a dinosaur (this interpretation is easier to justify if you just skim the article... Which I did the first time) But agreeing
Yeah, you can rent an equivalent workstation from AWS for under $10/hour (and that's the on demand price) so I don't think cost is a huge barrier to doing this sort of work. The language and listing the prices of the workstations down to the penny just strikes me as a rather unprofessional way