2
3
CuTe Layout Representation and Algebra (arxiv.org)
1
Bespoke OLAP: Synthesizing Workload-Specific One-Size-Fits-One Database Engines (arxiv.org)
3
SkyDiscover: A Flexible Framework for AI-Driven Sci. and Algorithmic Discovery (skydiscover-ai.github.io)
4
Silent Backwards Compatibility Breaking Changes in PyTorch (ezyang.com)
1
Building an Open-Source Verilog Simulator with AI: 580K Lines in 43 Days (normalcomputing.com)
1
AgentCgroup: Understanding and Controlling OS Resources of AI Agents (github.com/eunomia-bpf)
1
Equality Saturation for Circuit Synthesis and Verification (imperial.ac.uk)
1
An Introduction to Folios (oracle.com)
2
Perplexity Cannot Always Tell Right from Wrong (ianbarber.blog)
1
Ganak: The Making of a Versatile, High Performance Model Counter (msoos.org)
12
TorchLean: Formalizing Neural Networks in Lean (leandojo.org)
1
Fast Autoscheduling for Sparse ML Frameworks (fredrikbk.com)
1
TENSURE: Fuzzing Sparse Tensor Compilers (Registered Report) (ndss-symposium.org)
1
A Reinforcement Learning Environment for Automatic Code Optimization in MLIR (arxiv.org)
2
Metamorphic Testing for Infrastructure-as-Code Engines [pdf] (programming-group.com)
2
K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model (arxiv.org)
1
Midtraining Bridges Pretraining and Posttraining Distributions (arxiv.org)
2
Testing "Raw" GPU Cache Latency (clamtech.org)
4
Hexagon-MLIR: An AI Compilation Stack for Qualcomm's NPUs (arxiv.org)
1
Analyzing Latency Hiding and Parallelism in an MLIR-Based AI Kernel Compiler (arxiv.org)
1
Argus: Automated Discovery of Test Oracles for DBMSs Using LLMs (joyemang33.github.io)
2
A Decade of Docker Containers (acm.org)
1
In Pursuit of High-Fidelity GPU Kernel Benchmarking (standardkernel.com)
2
From ASPLOS to Orbit: Unikernels Twelve Years Later (gazagnaire.org)
1
VeriSoftBench: Repository-Scale Formal Verification Benchmarks for Lean (utopia-group.github.io)
3
CSLib: The Lean Computer Science Library (arxiv.org)
3
Heliostat: Harnessing Ray Tracing Accelerators for Page Table Walks – ISCA 2025 [video] (youtube.com)
2
LDOS: Toward a Learning-Directed Operating System (sigops.org)
1
GenAI for Systems: Recurring Challenges&Design Principles from SW to Silicon (arxiv.org)
3
Precise exceptions in relaxed architectures [video] (youtube.com)
1
BitFields API: Type-Safe Bit Packing for Lock-Free Data Structures (rocksdb.org)
2
ThunderKittens 2.0: Even Faster Kernels for Your GPUs (stanford.edu)
1
Proof Assistants in the Age of AI (leodemoura.github.io)
1
Open Source Software Projects Are Brands (reidkleckner.dev)
1
Evaluating the Hardest CS Problems in the Age of LLMs (frontier-cs.org)
2
SE Radio 708: Jens Gustedt on C in 2026 (se-radio.net)
1
Spaghetti Bench: Evaluating AI Agents on Concurrency Bug Fixes (pastalab.org)
2
Computer Science as Infrastructure: The Spine of the Lean CSLib (arxiv.org)
2
Problems with a weak tryLock operation in C and C++ standards (swift.org)
1
Two mechanisms for dynamic type checks (wingolog.org)
1
Semantics, Operations, and Properties of P3109 Floating-Point Formats in Lean (github.com/rutgers-apl)
2
Oral History of Michael J. Flynn [video] (youtube.com)
2
Productively Programming Accelerated Computing Systems – Rohan Yadav (Stanford) [video] (youtube.com)
8
How to train your program verifier (risemsr.github.io)
3
Minimalist Design for Space Camera Flight Software (acm.org)
1
AMO-Lean: Towards Formally Verified Optimization via Equality Saturation in Lean (lambdaclass.com)
4
Fine-Tuning GPT-5 for GPU Kernel Generation (arxiv.org)
3
"Am I the only one still wondering what is the deal with linear types?" – Jon S (jonmsterling.com)
1
Running the "Reflections on Trusting Trust" Compiler: Revisiting the Backdoor (acm.org)
2
TileIR (ianbarber.blog)
1
Pushing Tensor Accelerators Beyond MatMul in a User-Schedulable Language (arxiv.org)
2
TLX: Triton-Like Simplicity, a Clear Path to Peak Performance [video] (youtube.com)
1
Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy (nvidia.com)
1
SMTLIB as a Compiler IR I (philipzucker.com)
1
Equality Saturation Meets ML: The Next Step for Smarter Optimizing Compilers [video] (youtube.com)
2
Silicon Photonics in the Data Center: What a CMOS Exec Needs to Know (semiengineering.com)
1
Verifying Distributed Protocols in Veil (proofsandintuitions.net)
1
Project Pterodactyl: Incremental Architecture (jonmsterling.com)
2
European Lisp Symposium 2025: Talks (youtube.com)
2
AutoOverlap: Enabling Fine-Grained Overlap of Computation and Communication (arxiv.org)
2
Axe: A Simple Unified Layout Abstraction for Machine Learning Compilers (arxiv.org)
2
35th ACM SIGPLAN International Conference on Compiler Construction (CC 2026) (acm.org)
1
VFlatten: Selective Value-Object Flattening Using Hybrid Static&Dynamic Analysis [pdf] (iitb.ac.in)
1
Agentic Proof-Oriented Programming (risemsr.github.io)
1
MLIR-Tutor: Exercises for Learning MLIR (Originally Written for PPoPP 2026) (github.com/groverkss)
1
Fast Autoscheduling for Sparse ML Frameworks (ajroot.pl)
1
Replicate Forwards, Partial Backwards (ezyang.com)
1
Frontier-CS 1.0 Release (frontier-cs.org)
2
uops-again.info: corner-case behaviours of port assignment on Intel processors (uops-again.info)
1
When magic meets multicore: OCaml and its elegant era of parallelism [video] (youtube.com)
51
FlashAttention-T: Towards Tensorized Attention (acm.org)
1
Scaling GPU-to-CPU Migration for Efficient Distributed Execution on CPU Clusters (acm.org)
1
MetaAttention: A Unified&Performant Attention Framework across Hardware Backends (acm.org)
6
Ga68, a GNU Algol 68 Compiler (fosdem.org)
1
LLMs versus the Halting Problem: Revisiting Program Termination Prediction (orensultan.com)
1
DTensor Erasure (ezyang.com)
1
Let the Barbarians In: How AI Can Accelerate Systems Performance Research (sigops.org)
1
Triton Bespoke Layouts (lei.chat)
6
AMD64 Bit Matrix Multiply and Bit Reversal Instructions (amd.com)
30
Demystifying ARM SME to Optimize General Matrix Multiplications (arxiv.org)
6
Evolving the OCaml programming language – CSE Bytes: K C Sivaramakrishnan [video] (youtube.com)
2
Magellan: Autonomous Discovery of Compiler Optimization Heuristics w/AlphaEvolve (arxiv.org)
1
Automatic Data Enumeration for Fast Collections (mcmichen.cc)
1
An MLIR Lowering Pipeline for Stencils at Wafer-Scale (arxiv.org)
1
The JAX sharding type system (ezyang.com)
1
AutoSP: Unlocking Long-Context LLM Training via Compiler-Based SP (ICLR 2026) (openreview.net)
1
Disentangling unification and implicit coercion (subtyping interaction problem) (jonmsterling.com)
1
Global vs. Local SPMD (ezyang.com)
3
ACM SIGPLAN Symposium on Principles of Programming Languages (POPL) 2026 talks (youtube.com)
1
Long branches in compilers, assemblers, and linkers (maskray.me)
1
Megatron via shard_map (ezyang.com)
1
Cloud-Hardware Co-Design for Memory Bandwidth-Bound HPC Workloads: Azure HBv5 (acm.org)
2
CuTile on Blackwell: NVIDIA's Compiler Moat Is Already Built (patricktoulme.substack.com)
1
Compiling Classical Sequent Calculus to Stock Hardware: Duality of Compilation [video] (youtube.com)
2
Computing Sharding with Einsum (ezyang.com)
2
What Is Control Flow Analysis for Lambda Calculus? [audio] (podcasts.apple.com)
1
Introduction to Coinduction in Agda Part 1: Coinductive Programming (jesper.cx)
1
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in CLIs (arxiv.org)
1