Performance Testing · SRE · AI Performance
Make systems fast, reliable, and observable.
Deep, practical engineering on load testing, reliability, observability, and the performance of modern AI systems — plus free interactive calculators you'll actually use.
What we cover
Performance Testing
Load & stress testing with JMeter, k6, Gatling, LoadRunner and NeoLoad.
SRE
SLIs, SLOs, error budgets, capacity planning, incident response, chaos engineering.
AI Performance
LLM latency & throughput, token economics, GPU serving, RAG and vector-DB perf.
Observability
OpenTelemetry, Prometheus, Grafana, Datadog, Dynatrace, New Relic.
Concepts
Little's Law, queueing theory, percentiles, the Universal Scalability Law (USL), GC tuning.
Latest articles
Measuring LLM Inference Performance: Latency, Throughput, and Cost
The metrics that actually matter for LLM serving — TTFT, TPOT, tokens/sec, and cost per request — how they trade off, and how to load-test an inference endpoint.
Installing and Configuring JMeter for Real Load Testing
How to install Apache JMeter correctly, the JVM heap settings that matter, and the first configuration changes you should make before your first real test.
What Is Apache JMeter? An Introduction for Performance Testers
What Apache JMeter is, why it's still the most widely used open-source load testing tool, and where it fits next to k6, Gatling, and LoadRunner.