50
2
Statistical Learning Theory and ChatGPT (kamalikachaudhuri.substack.com)
7
I ran out of money, spent my savings on a Hong Kong prostitute, & became a commie (docs.google.com)
5
Why are your models so big? (2023) (pawa.lt)
1
[dupe] The Decline of Deviance (experimental-history.com)
87
Sycophancy is the first LLM "dark pattern" (seangoedecke.com)
2
Contextualization Machines (stochasm.blog)
1
ChaCha has all the answers – unless I'm on the other end (2009) (archive.org)
8
What I don’t like about chains of thoughts (2023) (samsja.github.io)
1
Reframing Impact (turntrout.com)
1
Continuous Batching from First Principles (huggingface.co)
1
Solving Kilordle (hauntsaninja.github.io)
1
Kilordle (jonesnxt.github.io)
1
What makes good reasoning data (huggingface.co)
2
Evaluating the Effectiveness of LLM-Evaluators (a.k.a. LLM-as-Judge) (eugeneyan.com)
1
Compute Forecast (AI 2027) (ai-2027.com)
1
A Realistic AI Timeline (vintagedata.org)
2
Biotech companies I wish existed (eladgil.com)
1
A World of Verifiable Domains (seancai.com)
4
Learning to Model the World with Language (dynalang.github.io)
2
A hitchhiker's guide to CUDA programming (seanzhang.me)
46
Estimating the perceived 'claustrophobia' of New York City's streets (2024) (mfranchi.net)
166
Tinkering is a way to acquire good taste (seated.ro)
2
Modern LLM Training (A Summary) (lesswrong.com)
2
Yes it's just doing compression. No it's not the diss you think it is (blog.wtf.sg)
2
Good developer relations is about being a celebrity for dorks (pfiffer.org)
2
Prompt Baking (arxiv.org)
1
Offline "Studying" Shrinks the Cost of Contextually Aware AI (stanford.edu)
5
The State of Machine Learning Frameworks in 2019 (thegradient.pub)
1
Many AI Safety Orgs Have Tried to Criminalize Open-Source AI (2024) (1a3orn.com)
1
Neural Networks and Deep Learning (neuralnetworksanddeeplearning.com)
105
America's future could hinge on whether AI slightly disappoints (noahpinion.blog)
3
Self-Respect (By Joan Didion) (1961) (gatech.edu)
12
Read your way through Hà Nội (vietnamesetypography.com)
59
How hard do you have to hit a chicken to cook it? (2020) (james-simon.github.io)
2
Start a Blog (guzey.com)
1
Computable Babylonian Diaries Project (christopherwolfram.com)
1
Survival of the Best Fit (survivalofthebestfit.com)
3
Breath of the Wild Decompilation (botw.link)
2
The Politics of Contagion (emilybynight.com)
8
A PhD in Snapshots (rbharath.github.io)
31
Memory access is O(N^[1/3]) (vitalik.eth.limo)
1
Highrises (hythacg.com)
30
How does gradient descent work? (centralflows.github.io)
3
Small Products That Improved My Life (moultano.wordpress.com)
1
Whispers of A.I.'s Modular Future (2023) (newyorker.com)
1
An Age of AI Enlightenment (xiangfu.co)
1
A vision researcher's guide to some RL stuff: PPO and GRPO (yugeten.github.io)
3
LLMs are strangely-shaped tools (near.blog)
1
Learned Structures (nonint.com)
1
LoRA-XS: Low-Rank Adaptation with Small Number of Parameters (arxiv.org)
8
Evals in 2025: going beyond simple benchmarks to build models people can use (github.com/huggingface)
1
Dissecting Batching Effects in GPT Inference (qun.ch)
2
My (speculative) master plan for immortality (maxwellnye.com)
3
Richard Feynman and the Connection Machine (1989) (longnow.org)
98
Defeating Nondeterminism in LLM Inference (thinkingmachines.ai)
12
Perceived Age (2024) (sdan.io)
2
Don't Build an RL Environment Startup (benanderson.work)
1
Shifting Bits in Company History (williamyeny.github.io)
1
ML Systems: Motivating Dense Models (jacobkahn.me)
4
The "it" in AI models is the dataset (nonint.com)
1
The Paradigm (nonint.com)
1
Personalization, measuring with taste, and intrinsic interfaces (thesephist.com)
3
Long Term Memory in AI (Princeton CS 597A) (edoliberty.github.io)
1
Model Merging – A Biased Overview (crisostomi.github.io)
1
Adversarial Examples Are Not Bugs, They Are Superposition (livgorton.com)
1
Sequence Parallelism: Long Sequence Training from System Perspective (2021) (arxiv.org)
10
How many paths of length K are there between A and B? (2021) (horace.io)
2
How A Neuron Learns (rvns.moe)
1
GPT, Fast (pytorch.org)
1
GPT-Fast (github.com/meta-pytorch)
10
Exploring EXIF (2023) (hturan.com)
1
The Practitioner's Guide to the Maximal Update Parameterization (cerebras.ai)
1
The scientific method and its application to the science of deep learning (james-simon.github.io)
2
Solving Humanity's Last Exam Problems (youtube.com)
1
Why We Think (lilianweng.github.io)
5
Philosophical Thoughts on Kolmogorov-Arnold Networks (2024) (kindxiaoming.github.io)
1
Matmul() using PyTorch's MPS back end is faster than Apple's MLX (kevinmartinjose.com)
2
The Making of Gemini Plays Pokémon (jcz.dev)
6
Facebook is not worth $33B (2010) (signalvnoise.com)
3
Comefrom (wikipedia.org)
1
Diffusion Language Models Are Super Data Learners (jinjieni.notion.site)
2
How to build a router for MOE models (cerebras.ai)
2
The Eponymous Principles of Management – Coase's Ceiling and Floor (amvaishnav.wordpress.com)
70
[flagged] No One Is Working (humaninvariant.com)
3
No One Is Working (humaninvariant.com)
1
SFT Is Bad RL (justinchiu.netlify.app)
8
A Simple CPU on the Game of Life (2021) (carlini.com)
2
Trends in LLM-Generated Citations on ArXiv (spylab.ai)
3
'AI' just means LLMs now (jxmo.io)
2
Ada Lovelace and the Analytical Engine (ox.ac.uk)
40
How long before superintelligence? (1997) (nickbostrom.com)
65
Attention is your scarcest resource (2020) (benkuhn.net)
1
DeltaNet Explained (sustcsonglin.github.io)
84
All AI models might be the same (jxmo.io)
1
Life Update – On Health (jiha-kim.github.io)
1
Asymmetry of Verification and Verifier's Law (jasonwei.net)
7
Soviet College Admission – My Dad's Story (1970) (ilyavolodarsky.com)
2
H-Net – Inference (main-horse.github.io)
8