19
Graham: Synchronizing Clocks by Leveraging Local Clock Properties (2022) [pdf] (usenix.org)
a month ago | mlerner | usenix.org | frontpage
6
Autothrottle: Resource Management for SLO-Targeted Microservices (usenix.org)
7 months ago | mlerner | usenix.org | frontpage
1
Resiliency at Scale: Managing Google's TPUv4 Machine Learning Supercomputer (micahlerner.com)
8 months ago | mlerner | micahlerner.com | newest
3
How the anti-cheat is anti-cheating so far (leagueoflegends.com)
a year ago | mlerner | leagueoflegends.com | newest
1
ServiceRouter: Hyperscale and Minimal Cost Service Mesh at Meta (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
1
ServiceRouter: Hyperscale and Minimal Cost Service Mesh at Meta (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
2
A Cloud-Scale Characterization of Remote Procedure Calls (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
1
A Cloud-Scale Characterization of Google's Remote Procedure Calls (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
1
Gemini, Amazon's system for fast failure recovery in distributed model training (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
89
Defcon: Preventing overload with graceful feature degradation (2023) (micahlerner.com)
a year ago | mlerner | micahlerner.com | best
1
MotherDuck: DuckDB in the Cloud and in the Client [pdf] (cidrdb.org)
a year ago | mlerner | cidrdb.org | newest
2
Gemini: Fast Failure Recovery in Distributed Training with In-Memory Checkpoints (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
1
Gemini: Fast Failure Recovery in Distributed Training with In-Memory Checkpoints (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
3
XFaaS: Hyperscale and Low Cost Serverless Functions at Meta (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
2
XFaaS: Hyperscale and Low Cost Serverless Functions at Meta (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
2
Blueprint: A Toolchain for Highly-Reconfigurable Microservice Applications (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
14
Gemini: Fast Failure Recovery in Distributed Training with In-Memory Checkpoints [pdf] (rice.edu)
a year ago | mlerner | rice.edu | best
1
XFaaS: Hyperscale and Low Cost Serverless Functions at Meta [pdf] (upenn.edu)
a year ago | mlerner | upenn.edu | newest
2
Efficient Memory Management for Large Language Model Serving with PagedAttention (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
0
Efficient Memory Management for Large Language Model Serving with PagedAttention (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
1
Blueprint: A Toolchain for Highly-Reconfigurable Microservice Applications (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
5
Blueprint: A Toolchain for Highly-Reconfigurable Microservice Applications (micahlerner.com)
a year ago | mlerner | micahlerner.com | newest
4
Defcon: Preventing Overload with Graceful Feature Degradation (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
3
Defcon: Preventing Overload with Graceful Feature Degradation (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Ask HN: Streaming reading academic CS papers?
2 years ago | mlerner | ycombinator.com | newest
11
Towards an adaptable systems architecture for memory tiering at warehouse-scale (micahlerner.com)
2 years ago | mlerner | micahlerner.com | frontpage
2
Sundial: Fault-Tolerant Clock Synchronization for Datacenters (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
Empowering Azure Storage with RDMA (usenix.org)
2 years ago | mlerner | usenix.org | newest
2
Sundial: Fault-Tolerant Clock Synchronization for Datacenters (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
Automatic Reliability Testing for Cluster Management Controllers (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
13
TelaMalloc: Efficient On-Chip Memory Allocation for Production ML Accelerators (micahlerner.com)
2 years ago | mlerner | micahlerner.com | best
1
Perseus: A Fail-Slow Detection Framework for Cloud Storage Systems (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
Towards an Adaptable Systems Architecture for Memory Tiering at Warehouse-Scale (acm.org)
2 years ago | mlerner | acm.org | newest
2
Elastic Cloud Services: Scaling Snowflake’s Control Plane (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
Meta’s Next-Generation Realtime Monitoring and Analytics Platform (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Towards an Adaptable Systems Architecture for Memory Tiering at Warehouse-Scale (acm.org)
2 years ago | mlerner | acm.org | newest
11
Ambry: LinkedIn’s Scalable Geo-Distributed Object Store (micahlerner.com)
2 years ago | mlerner | micahlerner.com | frontpage
2
Method Overloading the Circuit (charap.co)
2 years ago | mlerner | charap.co | newest
2
IASO: A Fail-Slow Detection and Mitigation Framework for Storage Services (usenix.org)
2 years ago | mlerner | usenix.org | frontpage
2
Pond: CXL-Based Memory Pooling Systems for Cloud Platforms (acm.org)
2 years ago | mlerner | acm.org | newest
2
Propeller: A Profile Guided Relinking Optimizer for Warehouse-Scale Applications (acm.org)
2 years ago | mlerner | acm.org | newest
1
Ambry: LinkedIn’s Scalable Geo-Distributed Object Store (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Fail-Slow at Scale: Hardware Performance Faults in Large Prod Systems (usenix.org)
2 years ago | mlerner | usenix.org | newest
1
Perseus: A Fail-Slow Detection Framework for Cloud Storage Systems (usenix.org)
2 years ago | mlerner | usenix.org | newest
2
Ambry: LinkedIn’s Scalable Geo-Distributed Object Store (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Ambry: LinkedIn’s Scalable Geo-Distributed Object Store [pdf] (uiuc.edu)
2 years ago | mlerner | uiuc.edu | newest
1
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Meta’s Next-Generation Realtime Monitoring and Analytics Platform (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Meta’s Next-Generation Realtime Monitoring and Analytics Platform (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Meta’s Next-Generation Realtime Monitoring and Analytics Platform (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
3
Meta’s Next-Generation Realtime Monitoring and Analytics Platform (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
5
Meta’s Next-Generation Realtime Monitoring and Analytics Platform (micahlerner.com)
2 years ago | mlerner | micahlerner.com | frontpage
2
EPaxos Revisited [pdf] (usenix.org)
2 years ago | mlerner | usenix.org | newest
1
The Tensor Data Platform: Towards an AI-Centric Database System [pdf] (cidrdb.org)
2 years ago | mlerner | cidrdb.org | newest
1
Elastic Cloud Services: Scaling Snowflake’s Control Plane (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
CS Conferences in 2023 (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Elastic Cloud Services: Scaling Snowflake’s Control Plane (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
4
Elastic Cloud Services: Scaling Snowflake’s Control Plane (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Google’s Datacenter Network: Clos Topologies and Centralized Control (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
The Snowflake Elastic Data Warehouse (acm.org)
2 years ago | mlerner | acm.org | newest
1
Building an Elastic Query Engine on Disaggregated Storage (usenix.org)
2 years ago | mlerner | usenix.org | newest
17
Borg, Omega, Kubernetes: Lessons learned over a decade (2016) (acm.org)
2 years ago | mlerner | acm.org | best
3
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Thinking about Availability in Large Service Infrastructures (research.google)
2 years ago | mlerner | research.google | newest
2
See it to believe it?: the role of visualisation in systems research (acm.org)
2 years ago | mlerner | acm.org | newest
2
How to fight prod incidents?: an empirical study on a large-scale cloud service (acm.org)
2 years ago | mlerner | acm.org | newest
1
Google’s Datacenter Network: Clos Topologies and Centralized Control (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
Google’s Datacenter Network: Clos Topologies and Centralized Control (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
Clos Topologies and Centralized Control in Google’s Datacenter Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
Clos Topologies and Centralized Control in Google’s Datacenter Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Clos Topologies and Centralized Control in Google’s Datacenter Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
Bluebird: High-Performance SDN for Bare-Metal Cloud Services (usenix.org)
2 years ago | mlerner | usenix.org | newest
1
Building an Elastic Query Engine on Disaggregated Storage (usenix.org)
2 years ago | mlerner | usenix.org | newest
1
Elastic cloud services: scaling snowflake's control plane (acm.org)
2 years ago | mlerner | acm.org | newest
1
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
A Scalable, Commodity Data Center Network Architecture (acm.org)
2 years ago | mlerner | acm.org | newest
2
Jupiter evolving: transforming Google's network via optical circuit switches/SDN (acm.org)
2 years ago | mlerner | acm.org | newest
4
Design and Evaluation of IPFS: A Storage Layer for the Decentralized Web (micahlerner.com)
2 years ago | mlerner | micahlerner.com | frontpage
2
Sundial: Fault-Tolerant Clock Synchronization for Datacenters (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
Seven years in the life of Hypergiants' off-nets (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
3
Building an Elastic Query Engine on Disaggregated Storage (usenix.org)
2 years ago | mlerner | usenix.org | newest
2
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | frontpage
1
Fail-Slow at Scale: Hardware Performance Faults in Large Prod Systems (acm.org)
2 years ago | mlerner | acm.org | newest
2
Using deep learning to annotate the protein universe [pdf] (nature.com)
2 years ago | mlerner | nature.com | frontpage
1
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
2
ByteGraph: A high-performance distributed graph database in ByteDance (acm.org)
2 years ago | mlerner | acm.org | newest
1
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
1
Kademlia: A Peer-to-peer information system based on the XOR Metric [pdf] (csail.mit.edu)
2 years ago | mlerner | mit.edu | newest
1
Seven years in the life of Hypergiants' off-nets (micahlerner.com)
2 years ago | mlerner | micahlerner.com | newest
3
SDN in the Stratosphere: Loon’s Aerospace Mesh Network (micahlerner.com)
2 years ago | mlerner | micahlerner.com | frontpage
1
Design and Evaluation of IPFS: A Storage Layer for the Decentralized Web
2 years ago | mlerner | arxiv.org | newest
1
Seven years in the life of Hypergiants' off-nets
2 years ago | mlerner | micahlerner.com | newest
1
Scaling Memcache at Facebook
3 years ago | mlerner | micahlerner.com | newest
1
Seven years in the life of Hypergiants' off-nets
3 years ago | mlerner | micahlerner.com | newest
1
Scaling Memcache at Facebook
3 years ago | mlerner | micahlerner.com | newest