3 points | LISA: Layerwise Importance Sampling for Memory-Efficient LLM Fine-Tuning (arxiv.org) | 4 weeks ago | convexstrictly
2 points | NTIA AI Open Model Weights RFC (regulations.gov) | 4 weeks ago | convexstrictly
1 point | Mechanics of Next Token Prediction with Self-Attention (arxiv.org) | a month ago | convexstrictly
1 point | Dive Deeper into Yi-9B (huggingface.co) | a month ago | convexstrictly
3 points | You can now train a 70B language model at home (answer.ai) | a month ago | convexstrictly
1 point | Shape Suffixes – Good Coding Style (medium.com/noamshazeer) | a month ago | convexstrictly
3 points | Star Trek prompt optimal for grade school math on Llama-70B (twitter.com/emollick) | a month ago | convexstrictly
1 point | (US Dept of Commerce) NTIA Solicits Comments on Open-Weight AI Models (commerce.gov) | 2 months ago | convexstrictly
4 points | BitDelta: Your Fine-Tune May Only Be Worth One Bit (arxiv.org) | 2 months ago | convexstrictly
37 points | Time is encoded in the weights of finetuned language models (arxiv.org) | 4 months ago | convexstrictly
2 points | Zoology 1: Measuring and Improving Recall in Efficient Language Models (stanford.edu) | 4 months ago | convexstrictly
2 points | TinyGSM: Achieving >80% on GSM8k with small language models (arxiv.org) | 4 months ago | convexstrictly
2 points | Androids built to meet the labor demands (1x.tech) | 4 months ago | convexstrictly
6 points | Sam Altman will likely start another company with researchers leaving OpenAI (twitter.com/emilychangtv) | 5 months ago | convexstrictly
492 points | Three senior researchers have resigned from OpenAI (ycombinator.com) | 5 months ago | convexstrictly
1 point | Ron Conway disapproves of Sam Altman's firing (twitter.com/ronconway) | 5 months ago | convexstrictly
111 points | Sutskever: OpenAI board doing its mission to build AGI that benefits all (twitter.com/garymarcus) | 5 months ago | convexstrictly
3 points | Kara Swisher: OpenAI dev day and store were "pushing too fast" (twitter.com/karaswisher) | 5 months ago | convexstrictly
1 point | GPT4 coding regression claims misleading (twitter.com/si_boehm) | 9 months ago | convexstrictly
3 points | Model 4 bit inference 4.2x faster than 16 bit with full HF support (twitter.com/tim_dettmers) | 9 months ago | convexstrictly
3 points | SqueezeLLM: Dense-and-Sparse Quantization (arxiv.org) | 10 months ago | convexstrictly
2 points | Inference-Time Intervention: Eliciting Truthful Answers from a Language Model (arxiv.org) | 10 months ago | convexstrictly
2 points | Orca: Progressive Learning from Complex Explanation Traces of GPT-4 (arxiv.org) | 10 months ago | convexstrictly
4 points | SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression (arxiv.org) | 10 months ago | convexstrictly
2 points | Azure GPT 3.5 completion endpoint bumps HumanEval from <50% to 74% (twitter.com/amanrsanger) | 10 months ago | convexstrictly
76 points | Falcon 40B LLM (which beats Llama) now Apache 2.0 (twitter.com/thom_wolf) | 11 months ago | convexstrictly
3 points | Tim Dettmers: QLoRA finetunes a 65B model on a single 48 GB GPU (twitter.com/tim_dettmers) | 11 months ago | convexstrictly