Articles by ashvardanian
46

Full Unicode Search at 50× ICU Speed with AVX‑512 (ashvardanian.com)

2

Emulating AVX-512 intrinsics in Miri (trifectatech.org)

98

Why we built Lightpanda in Zig (lightpanda.io)

3

Nvidia cuTile: Python DSL and a new IR for tile-based CUDA kernels (github.com/nvidia)

9

SereneDB Secures $2.1M to Challenge the Status Quo of Search and Analytics (serenedb.com)

1

Evidently 0.7.17: open-source LLM tracing and dataset management (evidentlyai.com)

8

How Good Are Chinese CPUs? Benchmarking the Loongson 3A6000 (lemire.me)

3

Symmetric Power Transformers (manifestai.com)

2

Drug-like antibody design against challenging targets with atomic precision (chaidiscovery.com)

1

EMMI: Where Experimentation Meets Machine Intelligence (terraytx.com)

1

The GPU Observability Gap: Why We Need eBPF on GPU Devices (eunomia.dev)

11

Performance hacks for faster Python code (jetbrains.com)

1

The Write Last, Read First Rule (tigerbeetle.com)

2

Before AI's Kepler Moment – Are LLMs the Epicycles of Intelligence? (ashvardanian.com)

3

We built a vector search engine that lets you choose precision at query time (clickhouse.com)

3

Powering AI at Scale: Benchmarking 1B Vectors in YugabyteDB (yugabyte.com)

1

I Built Fast Vector Search for Legal Documents (medium.com/adlumal)

3

Tuning TLS: AES-256 Now Beats ChaCha20 on Every Modern CPU (ashvardanian.com)

90

The state of SIMD in Rust in 2025 (shnatsel.medium.com)

2

Nvidia GPU: Discussing Blackwell Limitations and Predicting Rubin Arch (github.com/zartbot)

1

Benchmarking the AMD EPYC 9V64H: Azure HBv5's Custom AMD CPU with HBM3 (phoronix.com)

3

TheWhisper: High-Performance Speech-to-Text (github.com/thestageai)

6

Apple Plans to Open-Source an LLVM Tool to Security Harden Large C++ Codebases (phoronix.com)

1

When `-O3` is 2x slower than `-O2` – profiling binary search in Rust (cat-solstice.github.io)

2

Red Hat to Distribute Nvidia CUDA Across RHEL, Red Hat AI and OpenShift (phoronix.com)

4

Scaling Elections with GPUs and Mojo (ashvardanian.com)

1

Concurrency Step-by-Step: Conforming to Protocols (massicotte.org)

1

Novartis to acquire Avidity Biosciences for $12B (aviditybiosciences.com)

1

EXT4 Patches Enable Block Size Greater Than Page Size Support (phoronix.com)

2

Fixing Intel Foundry Is Like Stopping Tripping Down the Stairs (nextplatform.com)

1

Starter Guide for London Founders (makeinlondon.com)

1

SymSpell C99: Building the Fastest Spell Checker in Pure C (suman-pokhrel.com.np)

1

The Cost of Software Libraries: CLI Parsing in C vs. Rust (cgamedev.substack.com)

1

Water: A Zig chess library, framework, and engine (github.com/trevorswan11)

1

PKBoost: Gradient boosting that adjusts to concept drift in imbalanced data (github.com/pushp-kharat1)

3

A triangular space-filling curve (2012) (ideophilus.wordpress.com)

2

Edge – Vector Database in Zig Using USearch and RocksDB (github.com/antarys-ai)

33

VectorWare – from creators of `rust-GPU` and `rust-CUDA` (vectorware.com)

5

Are Unikernels the Answer for Next-Gen AI Cloud Workloads? (thenewstack.io)

1

Uber and Nebius investing $375M in Avride robotaxis (bloomberg.com)

2

C++26: Printing `std::tuple` with expression templates (cppstories.com)

2

Speculations on arenas and non-trivial destructors (nullprogram.com)

1

SorterHunter: An evolutionary approach to find small sorting networks (github.com/bertdobbelaere)

1

Kosame: Macro-Based Rust ORM Inspired by Prisma and Drizzle (github.com/pikaju)

1

The Impatient Programmer's Guide to Bevy and Rust: Chapter 1 (aibodh.com)

1

New Linux Kernel Patches from Intel Delivering an 18% Database Performance (phoronix.com)

1

Memory Allocation in Go (nghiant3223.github.io)

3

PyTorch 2.9 released with C ABI and better multi-GPU support (pytorch.org)

31

ChkTag: x86 Memory Safety (intel.com)

2

AMD Zen5 Turin CPUs arrive to AWS forming the M8a instance family (amazon.com)

1

MetaGraph: Scalable annotated de Bruijn graphs for DNA indexing and alignment (github.com/ratschlab)

1

PyPI Usage September 2025 (clickhouse.com)

5

2x Faster Hashes on AWS Graviton: Neon → SVE2 (ashvardanian.com)

1

Timetraveler – bridging chrono ↔ time for Rust date/time interop (raniz.blog)

1

Tokenization from First Principles (ggrigorev.me)

2

ClickHouse Extends Series C Financing and Expands Leadership Team to Fuel Growth (clickhouse.com)

2

Are LLMs the Epicycles of Intelligence? (ashvardanian.com)

1

Bulk Operations in Boost.Bloom (bannalia.blogspot.com)

3

Exploring .NET Core platform intrinsics: Accelerating SHA-256 on ARMv8 (2018) (mijailovic.net)

1

Correcting Outdated Facts in Wikidata (anj.ai)

1

The Molecular Revolution in Biology: The History, Structure, and Future (drchrisearl.substack.com)

4

C++26: `std::optional<T&>` (sandordargo.com)

3

Eigen 5.0 Released (gitlab.com/libeigen)

1

Before AI's Kepler Moment (ashvardanian.com)

1

Matrix Core Programming on AMD CDNA 3 and CDNA 4 Architecture (amd.com)

51

Tinker by Thinking Machines (thinkingmachines.ai)

1

Universal: A header-only C++ template library of custom arithmetic plug-in types (github.com/stillwater-sc)

5

Sonnet 4.5 Review: The first spec-driven model has arrived (zencoder.ai)

2

Rebuild Biotech for the AI Era (benchling.com)

43

Beyond OpenMP in C++ and Rust: Taskflow, Rayon, Fork Union (ashvardanian.com)

7

A Gentle Introduction to CUDA PTX (philipfabianek.com)

2

CUDA Hello World: Done Less Wrong (ashvardanian.com)

20

`std::flip` (morwenn.github.io)

25

Factory Raises $50M Series B (factory.ai)

1

Seven Years of Firecracker (brooker.co.za)

1

Understanding AddressSanitizer: Better memory safety for your code (2024) (trailofbits.com)

7

Modular Raises $250M to Scale AI's Unified Compute Layer (modular.com)

19

Archestra – open-source MCP orchestrator for everyone (github.com/archestra-ai)

1

Lessons from leaders who turned AI challenges into wins (fastcompany.com)

7

Taking a Look at Compression Algorithms (cefboud.com)

21

Processing Strings 109x Faster Than Nvidia on H100 (ashvardanian.com)

2

Why Do LLMs Design Mediocre Architecture? (recurse.ml)

1

Zencoder Lets Developers Bring Their CLI Coding Agent of Choice to Its Platform (thenewstack.io)

1

Proprietary sector indexes across tech and beyond (multiples.vc)

2

The Constexpr Debugger (jetbrains.com)

69

Optimizing ClickHouse for Intel's ultra-high core count processors (clickhouse.com)

1

{fmt} v12: 60% faster double formatting, full constexpr, and C++20 modules (github.com/fmtlib)

2

Stringwa.rs on GPUs: Databases and Bioinformatics (ashvardanian.com)

1

Fun with Google Scholar (diffuse.one)

2

AlbumentationsX: Next-generation Albumentations for image augmentations (github.com/albumentations-team)

58

A clickable visual guide to the Rust type system (rustcurious.com)

5

Adjacency Matrix and std::mdspan, C++23 (cppstories.com)

14

C++20 Modules: Practical Insights, Status and TODOs (chuanqixu9.github.io)

1

Nvidia's soaring stock generated two more billionaires in Jensen Huang's C-suite (fortune.com)

1

Laravel inventor tells devs to quit writing 'cathedrals of complexity' (theregister.com)

3

The crawl before the fall of referrals: AI's impact on content providers (cloudflare.com)

4

Fork Union: Beyond OpenMP in C++ and Rust? (ashvardanian.com)

2

Graph-Code: A Graph-Based RAG System for Any Codebases (github.com/vitali87)

8

Memory is slow, Disk is fast – Part 1 (bitflux.ai)

1

Publishing for Machines by Andrew White (diffuse.one)