66
LLM Inference Handbook (bentoml.com)
14 hours ago | djhu9 | bentoml.com | best
2
The Shift to Distributed LLM Inference (bentoml.com)
4 weeks ago | djhu9 | bentoml.com | newest
2
Do Managers Need 1:1 Meetings with Every Team Member? (bloomberg.com)
a month ago | djhu9 | bloomberg.com | newest
2
Software engineer lost his $150K-a-year job to AI (msn.com)
a month ago | djhu9 | msn.com | newest
2
Xiaomi's in-house XRing 01 SoC (tomshardware.com)
a month ago | djhu9 | tomshardware.com | newest
1
Buster: An open-source platform for deploying AI data analysts (github.com/buster-so)
2 months ago | djhu9 | github.com | newest
2
vLLM is now a PyTorch Foundation-hosted project (pytorch.org)
2 months ago | djhu9 | pytorch.org | newest
1
Cold-Starting LLMs on Kubernetes in Under 30 Seconds (bentoml.com)
3 months ago | djhu9 | bentoml.com | newest
1
NanoService – Build lightweight, modular, and scalable back end applications (github.com/deskree-inc)
3 months ago | djhu9 | github.com | newest
1
Google Gemini is shaking up its AI leadership ranks (semafor.com)
3 months ago | djhu9 | semafor.com | newest
2
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning (arxiv.org)
4 months ago | djhu9 | arxiv.org | newest
2
Nvidia close to acquiring AI cloud provider Lepton AI in nine-figure deal (siliconangle.com)
4 months ago | djhu9 | siliconangle.com | newest
1
A Vision for Rebuilding TikTok in America (perplexity.ai)
4 months ago | djhu9 | perplexity.ai | newest
2
Six Infrastructure Pitfalls Slowing Down Your AI Progress (bentoml.com)
4 months ago | djhu9 | bentoml.com | newest
6
Scale AI is being investigated by the US Department of Labor (techcrunch.com)
4 months ago | djhu9 | techcrunch.com | frontpage
55
Microsoft's Relationship with OpenAI Is Not Looking Good (gizmodo.com)
4 months ago | djhu9 | gizmodo.com | frontpage
2
China's AI agent Manus gains traction amid growing demand for autonomous AI (technode.com)
4 months ago | djhu9 | technode.com | frontpage
1
LlamaIndex Secures $19M Series A (llamaindex.ai)
4 months ago | djhu9 | llamaindex.ai | newest
1
Secure and Private DeepSeek Deployment (bentoml.com)
5 months ago | djhu9 | bentoml.com | newest
4
Show HN: Apache Cloudberry – Open-source Massively Parallel Processing database (github.com/apache)
7 months ago | djhu9 | github.com | frontpage
1
Apache Pulsar 4.0: Towards an Open Data Streaming Architecture (streamnative.io)
9 months ago | djhu9 | streamnative.io | newest
9
Ollama can run any GGUF Model on Hugging Face Hub now (huggingface.co)
9 months ago | djhu9 | huggingface.co | frontpage
1
Optimizing and Characterizing High-Throughput Low-Latency LLM Inference (mlc.ai)
9 months ago | djhu9 | mlc.ai | newest
3
Assistant-UI: A set of React components for AI chat (github.com/yonom)
10 months ago | djhu9 | github.com | frontpage
1
Tuning TensorRT-LLM for Optimal Serving (bentoml.com)
10 months ago | djhu9 | bentoml.com | newest
3
OpenAI and Anthropic will share their models with the US government (cnbc.com)
11 months ago | djhu9 | cnbc.com | frontpage
2
Vector DB Comparison List (superlinked.com)
12 months ago | djhu9 | superlinked.com | frontpage
51
RouteLLM: A framework for serving and evaluating LLM routers (github.com/lm-sys)
a year ago | djhu9 | github.com | best