66
LLM Inference Handbook (bentoml.com)
14 hours ago |
djhu9
| bentoml.com
|
best
2
The Shift to Distributed LLM Inference (bentoml.com)
4 weeks ago |
djhu9
| bentoml.com
|
newest
2
Do Managers Need 1:1 Meetings with Every Team Member? (bloomberg.com)
a month ago |
djhu9
| bloomberg.com
|
newest
2
Software engineer lost his $150K-a-year job to AI (msn.com)
a month ago |
djhu9
| msn.com
|
newest
2
Xiaomi's in-house XRing 01 SoC (tomshardware.com)
a month ago |
djhu9
| tomshardware.com
|
newest
1
Buster: An open-source platform for deploying AI data analysts (github.com/buster-so)
2 months ago |
djhu9
| github.com
|
newest
2
VLLM is now a PyTorch Foundation-hosted project (pytorch.org)
2 months ago |
djhu9
| pytorch.org
|
newest
1
Cold-Starting LLMs on Kubernetes in Under 30 Seconds (bentoml.com)
3 months ago |
djhu9
| bentoml.com
|
newest
1
NanoService – Build lightweight, modular, and scalable back end applications (github.com/deskree-inc)
3 months ago |
djhu9
| github.com
|
newest
1
Google Gemini is shaking up its AI leadership ranks (semafor.com)
3 months ago |
djhu9
| semafor.com
|
newest
2
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning (arxiv.org)
4 months ago |
djhu9
| arxiv.org
|
newest
2
Nvidia close to acquiring AI cloud provider Lepton AI in nine-figure deal (siliconangle.com)
4 months ago |
djhu9
| siliconangle.com
|
newest
1
A Vision for Rebuilding TikTok in America (perplexity.ai)
4 months ago |
djhu9
| perplexity.ai
|
newest
2
Six Infrastructure Pitfalls Slowing Down Your AI Progress (bentoml.com)
4 months ago |
djhu9
| bentoml.com
|
newest
6
Scale AI is being investigated by the US Department of Labor (techcrunch.com)
4 months ago |
djhu9
| techcrunch.com
|
frontpage
55
Microsoft's Relationship with OpenAI Is Not Looking Good (gizmodo.com)
4 months ago |
djhu9
| gizmodo.com
|
frontpage
2
China's AI agent Manus gains traction amid growing demand for autonomous AI (technode.com)
4 months ago |
djhu9
| technode.com
|
frontpage
1
LlamaIndex Secures $19M Series A (llamaindex.ai)
4 months ago |
djhu9
| llamaindex.ai
|
newest
1
Secure and Private DeepSeek Deployment (bentoml.com)
5 months ago |
djhu9
| bentoml.com
|
newest
4
Show HN: Apache Cloudberry – Open-source Massively Parallel Processing database (github.com/apache)
7 months ago |
djhu9
| github.com
|
frontpage
1
Apache Pulsar 4.0: Towards an Open Data Streaming Architecture (streamnative.io)
9 months ago |
djhu9
| streamnative.io
|
newest
9
Ollama can run any GGUF Model on Hugging Face Hub now (huggingface.co)
9 months ago |
djhu9
| huggingface.co
|
frontpage
1
Optimizing and Characterizing High-Throughput Low-Latency LLM Inference (mlc.ai)
9 months ago |
djhu9
| mlc.ai
|
newest
3
Assistant-UI: A set of React components for AI chat (github.com/yonom)
10 months ago |
djhu9
| github.com
|
frontpage
1
Tuning TensorRT-LLM for Optimal Serving (bentoml.com)
10 months ago |
djhu9
| bentoml.com
|
newest
3
OpenAI and Anthropic will share their models with the US government (cnbc.com)
11 months ago |
djhu9
| cnbc.com
|
frontpage
2
Vector DB Comparison List (superlinked.com)
12 months ago |
djhu9
| superlinked.com
|
frontpage
51
RouteLLM: A framework for serving and evaluating LLM routers (github.com/lm-sys)
a year ago |
djhu9
| github.com
|
best