Writing

Blog

Articles on performance testing, SRE, observability, and AI systems performance.

AI Performance

Measuring LLM Inference Performance: Latency, Throughput, and Cost

The metrics that actually matter for LLM serving — TTFT, TPOT, tokens/sec, and cost per request — how they trade off, and how to load-test an inference endpoint.

Performance Testing

Installing and Configuring JMeter for Real Load Testing

How to install Apache JMeter correctly, the JVM heap settings that matter, and the first configuration changes you should make before your first real test.

Performance Testing

What Is Apache JMeter? An Introduction for Performance Testers

What Apache JMeter is, why it's still the most widely used open-source load testing tool, and where it fits next to k6, Gatling, and LoadRunner.

Performance Testing

JMeter Thread Groups Explained: Users, Ramp-Up, and Loops

How JMeter Thread Groups control virtual users, ramp-up time, and loop count, and how to choose values that actually model your real traffic pattern.

SRE

SLOs and Error Budgets: A Practical Guide for Performance Engineers

How to turn vague reliability goals into measurable SLIs, SLOs, and error budgets — and how that math directly governs release velocity and on-call load.

Performance Testing

JMeter Assertions: Validating Responses Under Load

How to use JMeter assertions to catch silent failures — wrong content, slow responses, and unexpected status codes — that a simple pass/fail check misses.

Performance Testing

The JMeter HTTP Request Sampler: A Deep Dive

Every important setting on JMeter's HTTP Request sampler, from implementation choice to connection reuse, explained for accurate load testing.

Performance Testing

JMeter Listeners: Collecting and Reporting Results Correctly

Which JMeter listeners to use during scripting versus load generation, and how to produce a trustworthy HTML report from a non-GUI run.

Performance Testing

JMeter Correlation: Handling Session Tokens and Dynamic Values

How to extract and reuse dynamic values like session tokens, CSRF tokens, and IDs in JMeter so recorded scripts work correctly under load.

Performance Testing

JMeter Parameterization with CSV Data Config

How to drive JMeter test data from CSV files so virtual users don't all hammer the same account, search term, or product ID.

Performance Testing

JMeter Timers: Pacing and Think Time Done Right

The difference between JMeter's Constant Timer, Uniform Random Timer, and Constant Throughput Timer, and which one actually controls throughput.

Performance Testing

Distributed Load Testing with JMeter

How JMeter's controller/agent (master/slave) distributed testing mode works, and what to check before trusting results from multiple load generators.

Performance Testing

JMeter Logic Controllers: If, Loop, and Transaction Controllers

How JMeter's Logic Controllers (If Controller, Loop Controller, Transaction Controller) shape test flow and how to use them without breaking your results.

Performance Testing

JMeter Plugins: Extending What JMeter Can Do

An overview of the JMeter Plugins ecosystem — the Plugins Manager, the most widely used plugins, and how to install them safely.

Performance Testing

JMeter Best Practices and Common Pitfalls

A checklist of JMeter mistakes that produce misleading results, and the practices experienced performance testers use to avoid them.

Performance Testing

Reading JMeter's HTML Dashboard Report Correctly

A guide to JMeter's generated HTML dashboard report — which graphs matter, which are easy to misread, and how to compare two runs properly.

Performance Testing

Running JMeter in CI/CD: Non-GUI Mode and Automation

How to run JMeter from the command line in a CI pipeline, fail builds on performance regressions, and avoid common automation pitfalls.

Performance Testing

Database Load Testing with JMeter's JDBC Sampler

How to load test a database directly with JMeter's JDBC Request sampler, including connection pooling configuration and common gotchas.

Performance Testing

JMeter Groovy Scripting: Beyond the GUI

How to use JSR223 Groovy scripting in JMeter for custom logic that the built-in components can't express, with practical examples.

Performance Testing

JMeter vs k6 vs Gatling: Choosing the Right Load Testing Tool

A practical comparison of JMeter, k6, and Gatling across scripting model, protocol support, CI fit, and team skill requirements.

Performance Testing

Load Testing REST APIs with JMeter: A Practical Walkthrough

A practical walkthrough of scripting a realistic REST API load test in JMeter, from authentication to JSON assertions to reporting.

Performance Testing

Analyzing JMeter Results: Why Percentiles Beat Averages

How to properly analyze JMeter result data using percentiles instead of averages, with a worked example showing how averages hide real problems.

Performance Testing

Testing WebSockets with JMeter

How to load test WebSocket connections in JMeter using the WebSocket Samplers plugin, and what makes WebSocket load testing different from HTTP.

Performance Testing

LoadRunner Architecture: How VuGen, Controller, and Analysis Fit Together

A deeper look at how LoadRunner's three main components — VuGen, Controller, and Analysis — work together in a typical performance testing workflow.

Performance Testing

Introduction to LoadRunner: OpenText's Performance Engineering Platform

What LoadRunner is, its core components (VuGen, Controller, Analysis), and where it fits in 2026 alongside open-source alternatives.

Performance Testing

Recording Your First Script in VuGen

A step-by-step guide to recording, replaying, and validating your first LoadRunner VuGen script.

Performance Testing

LoadRunner Correlation Techniques: Handling Dynamic Values

How to correlate dynamic values in LoadRunner scripts using the Correlation Studio, manual web_reg_save_param, and best practices for reliable scripts.

Performance Testing

LoadRunner Parameterization: Driving Scripts with Real Test Data

How to parameterize LoadRunner VuGen scripts to avoid testing with hardcoded, repeated data, including parameter types and data allocation strategies.

Performance Testing

LoadRunner Protocols: Choosing the Right One for Your Application

How to choose the correct LoadRunner protocol for web, Citrix, SAP, and other application types, and why this decision matters more than in open-source tools.

Performance Testing

LoadRunner Analysis: Reading Graphs and Reports Correctly

A guide to the most useful graphs in LoadRunner Analysis — transaction response time, throughput, and Vuser status — and how to merge them for real insight.

Performance Testing

LoadRunner Controller: Designing a Load Test Scenario

How to design a LoadRunner Controller scenario, including Vuser groups, scheduling, and load generator assignment.

Performance Testing

LoadRunner Rendezvous Points and Pacing Explained

How LoadRunner Rendezvous Points create synchronized concurrency spikes, and how Pacing controls iteration timing — two commonly confused concepts.

Performance Testing

LoadRunner Functions and Custom C Code in VuGen Scripts

How to extend LoadRunner scripts with custom C code and the lr_* runtime API for logic the recorder and built-in functions can't express.

Performance Testing

Integrating LoadRunner with Monitoring Tools for Root-Cause Analysis

How to correlate LoadRunner test results with server-side and APM monitoring data to find the real cause of performance regressions.

Performance Testing

LoadRunner vs Modern Load Testing Tools: When to Use It

A practical framework for deciding when LoadRunner is the right choice in 2026 versus open-source alternatives like k6, JMeter, and Gatling.

Performance Testing

Getting Started with k6: Modern Load Testing in JavaScript

An introduction to k6, Grafana's open-source load testing tool, and why its code-first JavaScript scripting model fits modern CI/CD workflows.

Performance Testing

k6 Scenarios and Executors: Modeling Realistic Load Shapes

How k6's scenarios and executors let you model open-system arrival-rate traffic and closed-system concurrent-user traffic precisely.

Performance Testing

k6 Thresholds and Checks: Automating Pass/Fail Criteria

How k6 thresholds turn performance budgets into automated pass/fail criteria for CI, and how they differ from checks.

Performance Testing

Introduction to Gatling: Scala-Based Load Testing

What Gatling is, its Scala/Java DSL approach to scripting, and where it fits for JVM-comfortable teams doing serious load testing.

Performance Testing

Running k6 in CI/CD and k6 Cloud

How to integrate k6 into a CI/CD pipeline, and when k6 Cloud's distributed execution is worth it over self-hosted runs.

Performance Testing

Extending k6 with xk6: Custom Protocols and Functionality

How xk6 lets you build custom k6 binaries with extended protocol support and functionality beyond what's built in.

Performance Testing

Gatling Assertions and Reports

How Gatling's global and per-request assertions provide CI-friendly pass/fail criteria, and how to read its generated HTML reports.

Performance Testing

Gatling Simulations and Injection Profiles

How Gatling's injection profiles (rampUsers, constantUsersPerSec, and more) model different load shapes, and how to choose the right one.

Performance Testing

Introduction to Locust: Python-Based Load Testing

What Locust is, how its Python-based, code-first approach compares to k6 and Gatling, and when it's the right choice for your team.

SRE

Chaos Engineering: Testing Reliability by Breaking Things on Purpose

What chaos engineering is, how to run a safe first experiment, and how it connects to error budgets and SLOs.

Performance Testing

Distributed Load Testing with Locust

How to run Locust in distributed mode across multiple machines, and the practical considerations for scaling beyond a single node.

Performance Testing

Introduction to NeoLoad: Tricentis's Performance Testing Platform

What NeoLoad is, its Design Studio and Controller-based workflow, and where it sits between LoadRunner and open-source tools.

SRE

Capacity Planning with the Universal Scalability Law

How the Universal Scalability Law models contention and coherency penalties to predict where a system's throughput will actually peak and decline.

SRE

Writing Incident Response Runbooks That Actually Get Used

What makes an incident runbook useful under real pressure versus one that gets ignored, with a practical structure to follow.

SRE

On-Call Best Practices That Prevent Burnout

Practical on-call practices — rotation design, alert quality, and post-incident follow-up — that keep on-call sustainable rather than dreaded.

SRE

Building a Genuine Blameless Postmortem Culture

What separates a blameless postmortem culture that actually works from one that's blameless only in name, and how to build the former.

SRE

SRE vs DevOps vs Platform Engineering: What Actually Differs

A clear-eyed comparison of SRE, DevOps, and platform engineering as organizational approaches, and where the real differences (and overlaps) lie.

SRE

Toil Reduction: Identifying and Eliminating Operational Toil

What SRE means by 'toil,' how to identify it systematically, and a practical framework for deciding what to automate first.

SRE

Monitoring vs Observability: A Practical Distinction

What actually separates monitoring from observability beyond the buzzword, and why the distinction matters for debugging unknown failure modes.

SRE

Runbooks vs Playbooks: A Useful Distinction for Incident Response

The practical difference between an incident runbook and a playbook, and when each is the right tool to write and maintain.

SRE

SRE Team Topologies: Embedded, Centralized, and Hybrid Models

How SRE teams are typically organized — embedded, centralized, and hybrid models — and the trade-offs each makes between context and consistency.

AI Performance

Continuous Batching: How Modern LLM Servers Achieve High Throughput

How continuous batching differs from static batching, why it's central to vLLM and TGI's throughput advantage, and what it costs individual requests.

AI Performance

Prompt Caching and KV Cache: Why Repeated Context Gets Cheaper

How prompt/KV caching reduces cost and latency for repeated context in LLM applications, and when it actually helps and when it doesn't.

AI Performance

Benchmarking Vector Database Performance for RAG Systems

What actually matters when benchmarking a vector database for retrieval-augmented generation — recall, latency, and indexing trade-offs.

AI Performance

GPU Utilization for LLM Model Serving: What to Actually Measure

Why GPU utilization percentage alone is a misleading metric for LLM serving, and what to measure instead to understand real efficiency.

AI Performance

Quantization and Performance Trade-offs in LLM Serving

How model quantization (INT8, INT4, and similar) trades accuracy for latency, throughput, and memory savings, and how to evaluate the trade-off.

AI Performance

Optimizing RAG Pipeline Latency: Where the Time Actually Goes

A breakdown of where latency accumulates in a retrieval-augmented generation pipeline, and the highest-leverage places to optimize it.

AI Performance

Benchmarking Open-Source LLM Inference Servers: vLLM, TGI, and Ollama

A practical comparison framework for benchmarking vLLM, TGI, and Ollama, and what each is actually optimized for.

AI Performance

Load Testing LLM APIs: A Practical Guide

How to design a load test specifically for LLM APIs, covering realistic prompt distributions, streaming measurement, and concurrency sweeps.

AI Performance

Token Economics 101: Understanding LLM API Cost Structure

How LLM API pricing actually works — input vs output token pricing, why output costs more, and the practical levers for controlling cost.

Observability

OpenTelemetry for Performance Engineers: A Practical Start

A practical introduction to OpenTelemetry's traces, metrics, and logs, and how to instrument a service for meaningful performance analysis.

Observability

Prometheus and Grafana Basics for Performance Monitoring

How Prometheus's pull-based metrics model and PromQL work, and how to build Grafana dashboards that actually answer performance questions.

Observability

The RED Method: Rate, Errors, Duration for Service Monitoring

How the RED method gives a simple, consistent framework for monitoring any request-driven service, and how it complements the USE method.

Observability

Distributed Tracing Explained: Spans, Context, and Sampling

How distributed tracing actually works under the hood — spans, trace context propagation, and sampling strategies — explained from first principles.

Observability

Structured Logging Best Practices for Debuggable Systems

Why structured logging (key-value fields, not free text) matters for debugging at scale, and practical conventions worth adopting.

Observability

The USE Method: Utilization, Saturation, Errors for Resource Monitoring

How Brendan Gregg's USE method systematically checks system resources for performance bottlenecks, and how it pairs with the RED method.

Observability

APM Tool Comparison: Datadog, Dynatrace, and New Relic

A practical comparison of how Datadog, Dynatrace, and New Relic approach instrumentation, AI-assisted root-cause analysis, and pricing.

Observability

Building SLO Dashboards That Drive Real Decisions

How to design an SLO dashboard that actually informs the ship/freeze decisions error budgets are meant to enable, not just display pretty graphs.

Concepts

Little's Law for Performance Engineers, with Worked Examples

An intuitive explanation of Little's Law (L = λW), how to derive concurrency, throughput, or latency from the other two, and common misuses.

Concepts

Amdahl's Law for Performance Engineers

How Amdahl's Law quantifies the limit parallelization can achieve when part of a workload is inherently serial, with practical examples.

Concepts

Queueing Theory Basics for Performance Engineers

An accessible introduction to queueing theory concepts — utilization, queue length, and waiting time — and why systems get dramatically slower near full utilization.

Concepts

Why p99 Matters: Understanding Latency Percentiles

What latency percentiles actually mean, why averages systematically mislead, and the pitfalls of averaging or combining percentiles incorrectly.

Concepts

Concurrency vs Parallelism: A Clear Distinction

The genuine technical distinction between concurrency and parallelism, why it matters for performance reasoning, and common confusions.

Concepts

Garbage Collection Tuning Fundamentals

The core concepts behind garbage collector tuning — generational collection, pause times, and throughput trade-offs — applicable across JVM, .NET, and Go.

Concepts

Throughput vs Latency: Why You Usually Can't Maximize Both

Why throughput and latency often trade off against each other through batching, and how to decide where to sit on that trade-off curve.

Performance Testing

Setting Performance Budgets for Web Applications

How to set practical performance budgets (page weight, load time, Core Web Vitals) and enforce them in CI before they regress in production.

Observability

Synthetic Monitoring vs Real User Monitoring (RUM)

How synthetic monitoring and real user monitoring complement each other for understanding production performance, and when to rely on each.

Performance Testing

Spike, Stress, and Soak Testing: Three Different Questions

How spike testing, stress testing, and soak testing each answer a different reliability question, and why a single load test can't cover all three.

Performance Testing

How to Write a Performance Test Plan That Answers a Real Question

A practical template for a performance test plan that starts from a specific question, not a generic checklist of tools and metrics.

Performance Testing

A Pre-Launch Performance Testing Checklist

A practical checklist to run through before considering a performance testing effort complete and ready to inform a launch decision.

Performance Testing

Top Performance Testing Mistakes (and How to Avoid Them)

A roundup of the most common, costly performance testing mistakes across tools and teams, distilled into a practical avoidance guide.

Concepts

Understanding Apdex: Translating Latency into User Satisfaction

What the Apdex score actually measures, how to set its thresholds meaningfully, and its limitations as a single summary metric.

SRE

How to Calculate an Error Budget, Step by Step

A step-by-step walkthrough of calculating an error budget from an SLO, with worked examples at different reliability targets.

Concepts

What is DevPerfOps? Performance as a First-Class Citizen

DevPerfOps extends DevOps by embedding performance engineering across the entire delivery pipeline — shifting it left from a pre-release gate to a continuous, shared responsibility.
