3
Scaling Laws vs. Model Architectures: How Does Inductive Bias Influence Scaling? (arxiv.org)
4 hours ago | georgehill | arxiv.org | newest
2
The "it" in AI models is the dataset (nonint.com)
22 hours ago | georgehill | nonint.com | newest
15
Snowflake Arctic: LLM for Enterprise AI — Efficiently Intelligent, Truly Open (snowflake.com)
a day ago | georgehill | snowflake.com | newest
2
OpenAI: Training LLMs to Prioritize Privileged Instructions (arxiv.org)
a day ago | georgehill | arxiv.org | newest
2
The Illusion of State in State-Space Models (arxiv.org)
3 days ago | georgehill | arxiv.org | newest
6
LLM Leaderboard with explanations of what each score means (stanford.edu)
4 days ago | georgehill | stanford.edu | frontpage
239
[flagged] Why you should not apply to YC (twitter.com/dvassallo)
3 days ago | georgehill | twitter.com | best
0
Naval: Attempting to Define AGI (air.chat)
6 days ago | georgehill | air.chat | newest
1
Aide: The Machine Learning CodeGen Agent (github.com/wecoai)
a week ago | georgehill | github.com | newest
2
Clear Eyes, Full Heart (2lr.substack.com)
a week ago | georgehill | substack.com | newest
2
Mechanics of Next Token Prediction with Self-Attention (arxiv.org)
a week ago | georgehill | arxiv.org | frontpage
2
JetMoE: Reaching Llama2 Performance with 0.1M Dollars (arxiv.org)
a week ago | georgehill | arxiv.org | newest
1
From Words to Numbers: Your Large Language Model Is a Capable Regressor (arxiv.org)
a week ago | georgehill | arxiv.org | newest
1
Language network as a natural kind within the landscape of the human brain (nature.com)
a week ago | georgehill | nature.com | newest
2
Liquid AI's Ramin Hasani on liquid neural networks [video] (youtube.com)
a week ago | georgehill | youtube.com | newest
2
The Brain Forming a Memory (twitter.com/brianroemmele)
2 weeks ago | georgehill | twitter.com | newest
1
An Open-Source AI-Powered Answer Engine with a Generative User Interface (morphic.sh)
2 weeks ago | georgehill | morphic.sh | newest
1
How to Fix Legal Immigration in America [video] (youtube.com)
2 weeks ago | georgehill | youtube.com | newest
2
Securing Canada's AI Advantage (pm.gc.ca)
2 weeks ago | georgehill | pm.gc.ca | newest
2
Google lost ground in the AI race (ft.com)
2 weeks ago | georgehill | ft.com | newest
2
OpenAI: Testing Usage-Based GPT Earnings with US Builders (twitter.com/openai)
4 weeks ago | georgehill | twitter.com | newest
2
Collaborate with point cloud for manufacturing sites (samp.ai)
a month ago | georgehill | samp.ai | newest
3
Revealing OpenAI's plan to create AGI by 2027 [PDF] (drive.google.com)
a month ago | georgehill | google.com | newest
1
Rename Your Images and Videos with AI (keepitshot.com)
2 months ago | georgehill | keepitshot.com | newest
2
Ask HN: How does a plagiarism checker work?
2 months ago | georgehill | ycombinator.com | newest
3
Software-Defined Tensor Streaming Multiprocessor for Large-Scale ML [pdf] (groq.com)
2 months ago | georgehill | groq.com | newest
27
Grand-Master Level Chess Without Search: Modeling Choices and Their Implications (gist.github.com)
2 months ago | georgehill | github.com | best
2
Carta launches a startup shutdown service (henrysward.medium.com)
2 months ago | georgehill | medium.com | frontpage
7
Convert 'Screenshot.png' to 'Appropriate name.png' Using GPT-4 Vision (keepitshot.com)
3 months ago | georgehill | keepitshot.com | newest
0
Training Neural Networks Is NP-Hard in Fixed Dimension (arxiv.org)
3 months ago | georgehill | arxiv.org | newest
1
Satoshi: Hello World (twitter.com/satoshi)
3 months ago | georgehill | twitter.com | newest
96
Some updates on our AI efforts (threads.net)
3 months ago | georgehill | threads.net | best
94
Bing Gained Less Than 1% Market Share Since Adding Bing Chat (seroundtable.com)
3 months ago | georgehill | seroundtable.com | best
1
Local LLM movement feels like early days of PCs vs. Mainframes (reddit.com)
3 months ago | georgehill | reddit.com | newest
25
Phi-2: Self-Extend Boosts Performance, Extends Context to 8k Without Training (reddit.com)
3 months ago | georgehill | reddit.com | best
1
Google confirms it just laid off around a thousand employees (theverge.com)
3 months ago | georgehill | theverge.com | newest
3
Serving Open Source Models 4x faster than vLLM by quantizing with ~no tradeoffs (fireworks.ai)
3 months ago | georgehill | fireworks.ai | newest
1
LNX: Using Tantivy to Build One of the Fastest Search Engines Around [video] (youtube.com)
3 months ago | georgehill | youtube.com | newest
1
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts (arxiv.org)
3 months ago | georgehill | arxiv.org | newest
3
Evaluating LLMs with CommonGen-Lite (github.com/allenai)
3 months ago | georgehill | github.com | newest
65
Turing Complete Transformers: Two Transformers Are More Powerful Than One (openreview.net)
3 months ago | georgehill | openreview.net | best
6
Human-Like Browser Automation with GPT-4 Vision and MeiliSearch (github.com/vignshwarar)
3 months ago | georgehill | github.com | frontpage
1
Create browser automation as if you were teaching a human using GPT-4 Vision (github.com/vignshwarar)
3 months ago | georgehill | github.com | newest
6
Ask HN: What's the name of the website where the structure changes as I scroll?
3 months ago | georgehill | ycombinator.com | newest
0
The essay "How to Start a Startup," which grew into Y Combinator (twitter.com/paulg)
4 months ago | georgehill | twitter.com | newest
2
Ask HN: Video Understanding
4 months ago | georgehill | ycombinator.com | frontpage
1
GPT-Vision first most reliable open-source browser automation (github.com/vignshwarar)
4 months ago | georgehill | github.com | newest
2
Prompting-Based Methods for Text Ranking Using Large Language Models (reachsumit.com)
4 months ago | georgehill | reachsumit.com | newest
3
Bard. Is. Insanely. Useful (twitter.com/cgarciae88)
4 months ago | georgehill | twitter.com | newest
4
Unfortunately, No AGI Yet (twitter.com/omarsar0)
4 months ago | georgehill | twitter.com | newest
1
DeWave: Discrete EEG Waves Encoding for Brain Dynamics to Text Translation (openreview.net)
4 months ago | georgehill | openreview.net | newest
31
PowerInfer: Fast Large Language Model Serving with a Consumer-Grade GPU [pdf] (sjtu.edu.cn)
4 months ago | georgehill | sjtu.edu.cn | best
1
Cargo make: Rust task runner and build tool (github.com/sagiegurari)
4 months ago | georgehill | github.com | newest
2
Emmett Shear on AI's culture wars (mercury.com)
4 months ago | georgehill | mercury.com | newest
2
xAI: A Spectral Condition for Feature Learning (arxiv.org)
4 months ago | georgehill | arxiv.org | newest
17
A Mathematical Perspective on Transformers (arxiv.org)
4 months ago | georgehill | arxiv.org | best
2
Expanded legal protections and improvements to our API (anthropic.com)
4 months ago | georgehill | anthropic.com | newest
1
Harvey Raises $80M Series B from Elad Gil, Kleiner Perkins, OpenAI and Sequoia (harvey.ai)
4 months ago | georgehill | harvey.ai | newest
47
Mapping the semantic void: Strange goings-on in GPT embedding spaces (lesswrong.com)
4 months ago | georgehill | lesswrong.com | best
166
Word2Vec received 'strong reject' four times at ICLR2013 (openreview.net)
4 months ago | georgehill | openreview.net | best
3
Point Transformer V3: Simpler, Faster, Stronger (arxiv.org)
4 months ago | georgehill | arxiv.org | newest
70
OpenAI employee: GPT-4.5 rumor was a hallucination (twitter.com/willdepue)
4 months ago | georgehill | twitter.com | best
1
Agent Attention: On the Integration of Softmax and Linear Attention (arxiv.org)
4 months ago | georgehill | arxiv.org | newest
2
Agent Attention: On the Integration of Softmax and Linear Attention (github.com/leaplabthu)
4 months ago | georgehill | github.com | frontpage
1
My First $1 Online (marclou.beehiiv.com)
4 months ago | georgehill | beehiiv.com | newest
1
NeurIPS Test of Time: Tomas Mikolov's Insights on Word2vec, GloVe, and AI (pastebin.com)
4 months ago | georgehill | pastebin.com | newest
1
Mistral: $0.00 / 1M output tokens (openrouter.ai)
4 months ago | georgehill | openrouter.ai | newest
0
Infinite Inference Power for AI [video] (youtube.com)
4 months ago | georgehill | youtube.com | newest
1
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention (arxiv.org)
4 months ago | georgehill | arxiv.org | newest
0
MLC LLM: Universal Language Model Deployment Across Diverse Hardware and Apps (mlc.ai)
4 months ago | georgehill | mlc.ai | newest
1
Advance State of the Art Model and Tooling Support in Azure AI Studio (microsoft.com)
4 months ago | georgehill | microsoft.com | newest
3
Solve Puzzles. Learn CUDA (github.com/srush)
4 months ago | georgehill | github.com | newest
2
StatQuest: Word Embedding with PyTorch and Lightning (lightning.ai)
4 months ago | georgehill | lightning.ai | newest
2
Universal Approximators: State-Space Models with Memory Decay (arxiv.org)
4 months ago | georgehill | arxiv.org | newest
1
Anyone hear of GPT4.5 drop today? (reddit.com)
4 months ago | georgehill | reddit.com | newest
3
Balaji Fund (angellist.com)
4 months ago | georgehill | angellist.com | newest
34
[flagged] Kanye West Launches Yews News Platform (yews.news)
4 months ago | georgehill | yews.news | newest
6
Elon Musk Is Planning a New University in Austin (bloomberg.com)
4 months ago | georgehill | bloomberg.com | newest
1
Mistral-Medium vs. GPT-4's code generation (twitter.com/deliprao)
4 months ago | georgehill | twitter.com | newest
1
Twitter XSS + CSRF Vulnerability Exposes Account Takeover Risk (twitter.com/shoucccc)
4 months ago | georgehill | twitter.com | newest
2
Our Latest Experimental Tools and Technology (labs.google)
4 months ago | georgehill | labs.google | newest
37
Claude for Google Sheets (anthropic.com)
4 months ago | georgehill | anthropic.com | frontpage
1
Launched OpenAI 8 years and 1 day ago (twitter.com/gdb)
4 months ago | georgehill | twitter.com | newest
2
A Neural Corpus Indexer for Document Retrieval (arxiv.org)
4 months ago | georgehill | arxiv.org | newest
2
LLM360: Towards Transparent Open-Source LLMs (arxiv.org)
4 months ago | georgehill | arxiv.org | newest
3
EdgeSAM: Prompt-in-the-Loop Distillation for On-Device Deployment of SAM (arxiv.org)
4 months ago | georgehill | arxiv.org | newest
2
Together Inference Engine – the fastest inference available (together.ai)
4 months ago | georgehill | together.ai | newest
1
Also how GPT-4 works (youtu.be)
4 months ago | georgehill | youtu.be | newest
134
GigaGPT: GPT-3 sized models in 565 lines of code (cerebras.net)
4 months ago | georgehill | cerebras.net | best
2
Official PR Reveals the Inference Code for Mixtral 8x7B (github.com/vllm-project)
4 months ago | georgehill | github.com | newest
149
Mistral: Our first AI endpoints are available in early access (mistral.ai)
4 months ago | georgehill | mistral.ai | best
239
Mixtral of experts (mistral.ai)
4 months ago | georgehill | mistral.ai | best
2
LLMs as Copilots for Theorem Proving in Lean (github.com/lean-dojo)
4 months ago | georgehill | github.com | newest
1
Combining Continuous-Time, Recurrent, and Convolutional Models (stanford.edu)
4 months ago | georgehill | stanford.edu | newest
1
Image to Text ROS2 (gist.github.com)
4 months ago | georgehill | github.com | newest
41
Mathematicians have found a new upper limit to the Ramsey number (quantamagazine.org)
4 months ago | georgehill | quantamagazine.org | best
1
Play with mixtral-8x7b (vercel.ai)
4 months ago | georgehill | vercel.ai | newest
2
Automating Continual Learning (arxiv.org)
4 months ago | georgehill | arxiv.org | frontpage
1
The EU AI Act negotiations ended (twitter.com/ylecun)
4 months ago | georgehill | twitter.com | newest
2
OpenAI's Altman Ouster Was Result of Drawn-Out Tensions (bloomberg.com)
4 months ago | georgehill | bloomberg.com | newest