Topic
#inference infrastructure
2 articles exploring inference infrastructure. Expert insights and analysis from our editorial team.
Showing 1–2 of 2 articles
Articles
Infrastructure & Runtime
UCCL-Zip Adds Lossless Compression to NCCL Collectives: 47.5% Faster RL Weight Sync, No API Changes
UCCL-Zip fuses lossless compression into NCCL collectives at the kernel level, cutting cross-node wire bytes without accuracy tradeoffs or application changes.
Infrastructure & Runtime
Prefill-Decode Disaggregation: The Architecture Shift Redefining LLM Serving at Scale
Prefill-decode disaggregation separates compute-bound prefill from memory-bound decode onto dedicated hardware, eliminating phase interference.