Hacker News headlines

3

LISA: Layerwise Importance Sampling for Memory-Efficient LLM Fine-Tuning (arxiv.org)

4 weeks ago | convexstrictly | arxiv.org | newest

2

NTIA AI Open Model Weights RFC (regulations.gov)

4 weeks ago | convexstrictly | regulations.gov | newest

1

Mechanics of Next Token Prediction with Self-Attention (arxiv.org)

a month ago | convexstrictly | arxiv.org | newest

1

Dive Deeper into Yi-9B (huggingface.co)

a month ago | convexstrictly | huggingface.co | newest

3

You can now train a 70B language model at home (answer.ai)

a month ago | convexstrictly | answer.ai | newest

1

Shape Suffixes – Good Coding Style (medium.com/noamshazeer)

a month ago | convexstrictly | medium.com | newest

3

Star Trek prompt optimal for grade school math on Llama-70B (twitter.com/emollick)

a month ago | convexstrictly | twitter.com | newest

1

(US Dept of Commerce) NTIA Solicits Comments on Open-Weight AI Models (commerce.gov)

2 months ago | convexstrictly | commerce.gov | newest

4

BitDelta: Your Fine-Tune May Only Be Worth One Bit (arxiv.org)

2 months ago | convexstrictly | arxiv.org | newest

37

Time is encoded in the weights of finetuned language models (arxiv.org)

4 months ago | convexstrictly | arxiv.org | best

2

Zoology 1: Measuring and Improving Recall in Efficient Language Models (stanford.edu)

4 months ago | convexstrictly | stanford.edu | newest

2

TinyGSM: Achieving >80% on GSM8k with small language models (arxiv.org)

4 months ago | convexstrictly | arxiv.org | newest

2

Androids built to meet the labor demands (1x.tech)

4 months ago | convexstrictly | 1x.tech | newest

6

Sam Altman will likely start another company with researchers leaving OpenAI (twitter.com/emilychangtv)

5 months ago | convexstrictly | twitter.com | newest

492

Three senior researchers have resigned from OpenAI

5 months ago | convexstrictly | ycombinator.com | best

1

Ron Conway disapproves of Sam Altman's firing (twitter.com/ronconway)

5 months ago | convexstrictly | twitter.com | newest

111

Sutskever: OpenAI board doing its mission to build AGI that benefits all (twitter.com/garymarcus)

5 months ago | convexstrictly | twitter.com | best

3

Kara Swisher: OpenAI dev day and store were "pushing too fast" (twitter.com/karaswisher)

5 months ago | convexstrictly | twitter.com | newest

1

GPT4 coding regression claims misleading (twitter.com/si_boehm)

9 months ago | convexstrictly | twitter.com | frontpage

3

Model 4 bit inference 4.2x faster than 16 bit with full HF support (twitter.com/tim_dettmers)

9 months ago | convexstrictly | twitter.com | newest

3

SqueezeLLM: Dense-and-Sparse Quantization (arxiv.org)

10 months ago | convexstrictly | arxiv.org | newest

2

Inference-Time Intervention: Eliciting Truthful Answers from a Language Model (arxiv.org)

10 months ago | convexstrictly | arxiv.org | newest

2

Orca: Progressive Learning from Complex Explanation Traces of GPT-4 (arxiv.org)

10 months ago | convexstrictly | arxiv.org | newest

4

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression (arxiv.org)

10 months ago | convexstrictly | arxiv.org | newest

2

Azure GPT 3.5 completion endpoint bumps HumanEval from <50% to 74% (twitter.com/amanrsanger)

10 months ago | convexstrictly | twitter.com | newest

76

Falcon 40B LLM (which beats Llama) now Apache 2.0 (twitter.com/thom_wolf)

11 months ago | convexstrictly | twitter.com | best

3

Tim Dettmers: QLoRA finetunes a 65B model on a single 48 GB GPU (twitter.com/tim_dettmers)

11 months ago | convexstrictly | twitter.com | newest