62
3
A running list of reasons to move to open source (whyopensource.ai)
1
MoE inference optimizations: 15% lower expert load by request reordering (doubleword.ai)
1
Tensor Network Attention (mainlymatmul.com)
5
Redundant Information in LLM Weights (fergusfinn.com)
1
Tans: Precomputing RANS (fergusfinn.com)
2
Also-RANS: Asymmetric Numeral Systems for Entropy Coding (fergusfinn.com)
4
70x faster cold(ish) starts for SGLang (fergusfinn.com)
1
QueueSpec – drafting speculation tokens while a request queues (doubleword.ai)
1
ZeroDP: Just-in-Time Weight Offloading over NVLink for Data Parallelism (mainlymatmul.com)
1
Parallel Primitives for Multi-Agent Workflows (fergusfinn.com)
2
New fastest AI Model Gateway – 450x less overhead than LiteLLM (github.com/doublewordai)
4
Should GPUs Make Free Trade Agreements? (doubleword.ai)
2
Controlled generation of OS LLMs – without impacting latency (youtube.com)
3
Takeoff Inference Server Is Now Open Source (github.com/titanml)
4