1
1
M^2RNN: Non-Linear RNNs with Matrix-Valued States for Scalable Language Modeling (arxiv.org)
1
Tools of the Trade: C2C Activation Offloading on Grace Blackwell (poolside.ai)
42
EsoLang-Bench: Evaluating Genuine Reasoning in LLMs via Esoteric Languages (esolang-bench.vercel.app)
1
Speed-Of-Light ExecBench: A benchmark of real-world DL kernel problems (github.com/nvidia)
2
Equality Saturation and Symbolic Regression (egraphs.org)
2
NCCL EP: Towards a Unified Expert Parallel Communication API for NCCL (arxiv.org)
2
Vectorization of Verilog Designs and its Effects on Verification and Synthesis (arxiv.org)
1
LATTE ’26: Workshop on Languages, Tools, and Techniques for Accelerator Design (cornell.edu)
1
Read Less, Steer More (ezyang.com)
1
The Data Structures of Roads (sandboxspirit.com)
1
Verifying Move Borrow Checker in Lean: An Experiment in AI-Assisted PL Metatheory (proofsandintuitions.net)
4
Real or Slop? – Programming Languages Papers Edition (zackg.me)
1
Mamba-3 (together.ai)
1
EvoX: Letting AI Evolve Its Own Evolution Process (skydiscover-ai.github.io)
1
Native DSLs Ops in PyTorch (ianbarber.blog)
1
Flash-KMeans: Fast and Memory-Efficient Exact K-Means (arxiv.org)
2
Gluon: Explicit Performance (lei.chat)
2
Block Number Formats are (Still!) Direction Preservers (constantinides.net)
3
cuTile Rust: a safe, tile-based kernel programming DSL for Rust (github.com/nvlabs)
1
KernelBlaster: A framework for in context learning for code optimization (github.com/nvlabs)
1
Demystifying and Improving Lazy Promotion in Cache Eviction [pdf] (vldb.org)
1
Journeying through Optimization with Heuristics [video] (youtube.com)
3
To Sparsify or to Quantize: A Hardware Architecture View (sigarch.org)
6
Efficient sparse computations using linear algebra aware compilers (2025) (osti.gov)
1
A Field Guide to Reward Hacking in AI Kernel Generation (wafer.ai)
1
AI and the Mixed-Consistency Future (jhellerstein.github.io)
1
FIDES: End-to-end Compartments for Mixed-language Systems [pdf] (kcsrk.info)
1
Practical Type Inference: High‑Throughput Recovery of Real‑World Types (arxiv.org)
1
Idempotent Slices with Applications to Code-Size Reduction (arxiv.org)
1
Designing AI Chip Hardware and Software (docs.google.com)
2
Refinement Modeling and Verification of RISC-V Assembly Using Knuckledragger (philipzucker.com)
2
Breaking Control Flow Integrity by Abusing Modern C++ (Coroutines) – BH USA 2025 [video] (youtube.com)
1
Programming the Loop (ianbarber.blog)
2
Scalable Training of Mixture-of-Experts Models with Megatron Core (arxiv.org)
3
PolyBlocks: A Compiler Infrastructure for AI Chips and Programming Frameworks (arxiv.org)
2
Formalizing Data Structures and Algorithms with Agents (risemsr.github.io)
2
Thinnings: Sublist Witnesses and de Bruijn Index Shift Clumping (philipzucker.com)
2
Advent of Computing: Dan Temkin – Forty-Four Esolangs (libsyn.com)
1
Checking Write Bandwidth on GPUs (clamtech.org)
1
Challenges in Decompilation and Reverse Engineering of CUDA-Based Kernels [pdf] (nicolo.dev)
2
Block Number Formats Are Direction Preservers (constantinides.net)
2
Cutie Fly: CuTe Layout Representation and Algebra, CuTeDSL, FlyDSL (ianbarber.blog)
2
Converting Binary Floating-Point Numbers to Shortest Decimal Strings (wiley.com)
2
Controlling Floating-Point Determinism in NVIDIA CCCL (nvidia.com)
2
Bootstrapping Fuzzers for Compilers of Low-Resource Language Dialects Using LLMs (arxiv.org)
2
Custom Data Structures in E-Graphs (uwplse.org)
2
Formal Verification in the Age of AI (verse.systems)
3
CuTe Layout Representation and Algebra (arxiv.org)
1
Bespoke OLAP: Synthesizing Workload-Specific One-Size-Fits-One Database Engines (arxiv.org)
3
SkyDiscover: A Flexible Framework for AI-Driven Scientific and Algorithmic Discovery (skydiscover-ai.github.io)
4
Silent Backwards Compatibility Breaking Changes in PyTorch (ezyang.com)
1
Building an Open-Source Verilog Simulator with AI: 580K Lines in 43 Days (normalcomputing.com)
1
AgentCgroup: Understanding and Controlling OS Resources of AI Agents (github.com/eunomia-bpf)
1
Equality Saturation for Circuit Synthesis and Verification (imperial.ac.uk)
1
An Introduction to Folios (oracle.com)
2
Perplexity Cannot Always Tell Right from Wrong (ianbarber.blog)
1
Ganak: The Making of a Versatile, High Performance Model Counter (msoos.org)
12
TorchLean: Formalizing Neural Networks in Lean (leandojo.org)
1
Fast Autoscheduling for Sparse ML Frameworks (fredrikbk.com)
1
TENSURE: Fuzzing Sparse Tensor Compilers (Registered Report) (ndss-symposium.org)
1
A Reinforcement Learning Environment for Automatic Code Optimization in MLIR (arxiv.org)
2
Metamorphic Testing for Infrastructure-as-Code Engines [pdf] (programming-group.com)
2
K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model (arxiv.org)
1
Midtraining Bridges Pretraining and Posttraining Distributions (arxiv.org)
2
Testing "Raw" GPU Cache Latency (clamtech.org)
4
Hexagon-MLIR: An AI Compilation Stack for Qualcomm's NPUs (arxiv.org)
1
Analyzing Latency Hiding and Parallelism in an MLIR-Based AI Kernel Compiler (arxiv.org)
1
Argus: Automated Discovery of Test Oracles for DBMSs Using LLMs (joyemang33.github.io)
2
A Decade of Docker Containers (acm.org)
1
In Pursuit of High-Fidelity GPU Kernel Benchmarking (standardkernel.com)
2
From ASPLOS to Orbit: Unikernels Twelve Years Later (gazagnaire.org)
1
VeriSoftBench: Repository-Scale Formal Verification Benchmarks for Lean (utopia-group.github.io)
3
CSLib: The Lean Computer Science Library (arxiv.org)
3
Heliostat: Harnessing Ray Tracing Accelerators for Page Table Walks – ISCA 2025 [video] (youtube.com)
2
LDOS: Toward a Learning-Directed Operating System (sigops.org)
1
GenAI for Systems: Recurring Challenges & Design Principles from SW to Silicon (arxiv.org)
3
Precise exceptions in relaxed architectures [video] (youtube.com)
1
BitFields API: Type-Safe Bit Packing for Lock-Free Data Structures (rocksdb.org)
2
ThunderKittens 2.0: Even Faster Kernels for Your GPUs (stanford.edu)
1
Proof Assistants in the Age of AI (leodemoura.github.io)
1
Open Source Software Projects Are Brands (reidkleckner.dev)
1
Evaluating the Hardest CS Problems in the Age of LLMs (frontier-cs.org)
2
SE Radio 708: Jens Gustedt on C in 2026 (se-radio.net)
1
Spaghetti Bench: Evaluating AI Agents on Concurrency Bug Fixes (pastalab.org)
2
Computer Science as Infrastructure: The Spine of the Lean CSLib (arxiv.org)
2
Problems with a weak tryLock operation in C and C++ standards (swift.org)
1
Two mechanisms for dynamic type checks (wingolog.org)
1
Semantics, Operations, and Properties of P3109 Floating-Point Formats in Lean (github.com/rutgers-apl)
2
Oral History of Michael J. Flynn [video] (youtube.com)
2
Productively Programming Accelerated Computing Systems – Rohan Yadav (Stanford) [video] (youtube.com)
8
How to train your program verifier (risemsr.github.io)
3
Minimalist Design for Space Camera Flight Software (acm.org)
1
AMO-Lean: Towards Formally Verified Optimization via Equality Saturation in Lean (lambdaclass.com)
4
Fine-Tuning GPT-5 for GPU Kernel Generation (arxiv.org)
3
"Am I the only one still wondering what is the deal with linear types?" – Jon S (jonmsterling.com)
1
Running the "Reflections on Trusting Trust" Compiler: Revisiting the Backdoor (acm.org)
2
TileIR (ianbarber.blog)
1
Pushing Tensor Accelerators Beyond MatMul in a User-Schedulable Language (arxiv.org)
2