Topic
#nccl
2 articles exploring NCCL. Expert insights and analysis from our editorial team.
Articles
Infrastructure & Runtime
UCCL-Zip Brings Lossless Compression to NCCL Collectives — 47.5% Faster RL Weight Sync and 10% Lower vLLM Latency
UCCL-Zip fuses lossless compression into NCCL and GPU P2P transfers, cutting RL weight sync by 47.5% and vLLM latency by 10% with no API changes and bit-identical outputs.
Infrastructure & Runtime
UCCL-Zip Brings Lossless Compression to NCCL Collectives — 47.5% Faster RL Weight Sync, No API Changes
UCCL-Zip fuses lossless compression into NCCL collectives at the kernel level, cutting cross-node wire bytes without accuracy tradeoffs or application changes. Peak gains: …