The USE Method: Utilization, Saturation, Errors for Resource Monitoring

How Brendan Gregg's USE method systematically checks system resources for performance bottlenecks, and how it pairs with the RED method.

By perf-test.com Editorial · AI-assisted
Tags: use-method, monitoring, performance

The USE method, developed by Brendan Gregg, is a systematic checklist for diagnosing resource-level performance bottlenecks. For every system resource (CPU, memory, disk, network), you check Utilization, Saturation, and Errors, instead of relying on ad hoc, intuition-driven investigation that might miss a relevant resource entirely.

Utilization

Utilization is the percentage of time a resource is busy doing work over a measurement interval. That is straightforward for CPU (percent busy) but needs a careful definition for other resources: for memory, utilization usually means the percentage of capacity in use; for a disk, the percentage of time spent servicing requests.
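To make the two definitions concrete, here is a minimal sketch in Python. The `BusySample` shape and both function names are assumptions made for illustration; they do not come from any particular monitoring tool, which will expose its own counters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusySample:
    """One reading of a cumulative busy-time counter (hypothetical shape)."""
    timestamp_s: float  # when the reading was taken, in seconds
    busy_s: float       # cumulative seconds the resource has spent busy


def utilization_percent(earlier: BusySample, later: BusySample) -> float:
    """Time-based utilization: percent of the interval the resource was busy."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return 100.0 * (later.busy_s - earlier.busy_s) / elapsed


def capacity_percent(used: float, total: float) -> float:
    """Capacity-based utilization: percent of total capacity in use."""
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return 100.0 * used / total
```

A disk that was busy for 2.5 of the last 10 seconds comes out at 25% under the time-based definition, while a host using 6 GB of 8 GB of memory comes out at 75% under the capacity-based one. The two numbers answer different questions, so it helps to note which definition a dashboard is using.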

Saturation

Saturation is the degree to which a resource has more work queued than it can service immediately, and it is the signal that utilization alone misses. A CPU at 100% utilization with no queued work is fully busy but not holding anything up; a CPU at 100% utilization with a long run queue of waiting processes is saturated, which is a worse and more actionable condition. Saturation metrics (queue length, wait time) are often more useful than utilization for spotting a genuine bottleneck.
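The distinction between "fully busy" and "saturated" can be sketched as a small classifier. The function name, the labels, and the 95% threshold are illustrative assumptions, not values prescribed by the USE method; `queued_work` stands for whatever queue metric the resource exposes (for a CPU, tasks waiting to run).

```python
def classify(utilization_pct: float, queued_work: int,
             busy_threshold_pct: float = 95.0) -> str:
    """Label a resource from its utilization and its queue of waiting work.

    The threshold and the labels are assumptions chosen for illustration.
    """
    if queued_work > 0:
        # Work is waiting, so the resource is the thing requests are stuck behind.
        return "saturated"
    if utilization_pct >= busy_threshold_pct:
        # Busy all the time, but nothing is waiting on it yet.
        return "fully busy"
    return "has headroom"
```

With these assumptions, `classify(100.0, 0)` yields `"fully busy"` and `classify(100.0, 12)` yields `"saturated"`, which mirrors the two CPU cases described above. Treating any queued work as saturation is a deliberately simple rule; a real check would likely compare queue length against a baseline.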

Errors

Errors are resource-level error counts: disk I/O errors, network interface errors, memory errors (ECC corrections, for instance). They can silently degrade performance through retries and fallback paths without showing up clearly in utilization or saturation metrics. They are easy to overlook, yet they are sometimes the surprising root cause of an otherwise mysterious performance issue.
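Error counters are typically cumulative, so what matters during an investigation is how much they grew over the window of interest. A minimal sketch, assuming the counters have already been collected into dictionaries (the function name and the counter names are hypothetical):

```python
def new_errors(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return only the error counters that increased between two snapshots."""
    increases = {}
    for name, value in after.items():
        delta = value - before.get(name, 0)
        if delta > 0:
            increases[name] = delta
    return increases


# Hypothetical counter names, for illustration only.
before = {"eth0.rx_errs": 3, "sda.io_errors": 0}
after = {"eth0.rx_errs": 3, "sda.io_errors": 7}
print(new_errors(before, after))
```

Here the network counter is nonzero but unchanged, so it is not reported; only the disk counter, which grew during the window, is. This sketch ignores counter resets, which a production collector would need to handle.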

Why check every resource systematically, not just the obvious one

A common diagnostic mistake is fixating on CPU (the most commonly monitored resource) when the actual bottleneck is disk I/O saturation, network bandwidth, or something less obvious such as file descriptor exhaustion or memory pressure that pushes the host into swap. The USE method’s value lies in being a checklist: it prompts you to check utilization, saturation, and errors for every resource, rather than jumping straight to whichever resource happens to be top of mind.

Applying USE during a load test

When a load test shows degraded performance at high concurrency (the kind of result covered throughout this site’s tool-specific load testing articles), run through USE on the host of the system under test: CPU, memory, disk, network, and, for databases specifically, connection pool saturation. This is a structured way to find the actual resource bottleneck rather than guessing, and it ties directly into the “correlate client-side results with server-side metrics” workflow covered in this site’s LoadRunner monitoring integration article.

USE for software resources, not just hardware

The method extends naturally to software-level resources: a thread pool’s utilization and saturation (the queue depth of waiting tasks), a database connection pool’s utilization and saturation, or a message queue’s depth (a direct saturation signal). These resources don’t map directly onto a hardware metric but behave analogously, so the same three-part check applies.
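As a sketch of the same three-part check on a software resource, the snapshot below models a connection or thread pool. The class and its field names are assumptions for illustration; a real pool library exposes its own statistics under its own names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolSnapshot:
    """Point-in-time view of a pool (hypothetical fields, not a real library API)."""
    size: int     # maximum connections or threads the pool can hand out
    in_use: int   # currently checked out
    waiting: int  # callers queued for a free slot
    errors: int   # e.g. acquisition timeouts since the last snapshot

    @property
    def utilization_pct(self) -> float:
        return 100.0 * self.in_use / self.size

    @property
    def saturated(self) -> bool:
        # Any caller waiting for a slot means demand exceeds the pool's capacity.
        return self.waiting > 0


pool = PoolSnapshot(size=20, in_use=20, waiting=5, errors=0)
print(pool.utilization_pct, pool.saturated, pool.errors)
```

A pool at 100% utilization with five callers waiting is the software analogue of a CPU with a long run queue: the waiting count, not the utilization figure, is what indicates requests are being held up.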

Where USE fits relative to RED

As covered in this site’s RED method article, RED addresses client-facing service health while USE addresses underlying resource health. When a RED dashboard shows degraded Duration or Errors, USE is the systematic next step for finding which specific resource is responsible.

A practical checklist format

For each resource (CPU, memory, disk, network, and relevant software resources such as connection and thread pools), ask three questions: what is the utilization, is there evidence of saturation (queue depth, wait time), and are there any errors? Working through these explicitly, resource by resource, catches bottlenecks that intuition-driven investigation often misses.
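One way to keep the checklist honest is to treat it as a grid of resource and check pairs and track which cells still have no observation. The sketch below is one possible representation; the function name, the resource names, and the observation strings are all illustrative assumptions.

```python
from itertools import product

CHECKS = ("utilization", "saturation", "errors")


def unchecked(resources, observations):
    """Return (resource, check) pairs with no recorded observation, in checklist order."""
    return [(resource, check)
            for resource, check in product(resources, CHECKS)
            if (resource, check) not in observations]


# Hypothetical investigation state: only the CPU has been looked at so far.
resources = ["cpu", "disk"]
observations = {
    ("cpu", "utilization"): "98%",
    ("cpu", "saturation"): "run queue 14",
}
for resource, check in unchecked(resources, observations):
    print(f"still to check: {resource} {check}")
```

In this example the remaining cells are CPU errors and all three disk checks, which is exactly the gap that tends to be left open when an investigation stops at the first suspicious CPU graph.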

Takeaway: the USE method’s resource-by-resource checklist (utilization, saturation, errors) is designed to catch the bottleneck that isn’t the resource you suspected first. Pair it with RED’s client-facing metrics for a complete diagnostic picture.
