Monitoring and Performance on Linux System

Linux Performance Monitoring: sar vs. The Alternatives

In our previous post, we looked at sar for CPU monitoring. However, on many Red Hat or CentOS minimal installs, sysstat might not be present. As a Linux engineer, you need a "Plan B."

Today, we’ll explore how to get comparable results using mpstat, the batch-mode capability of top, and the raw counters in /proc/stat.

The Original Goal: Monitoring CPU

You previously used:

sar -u 1 3

This gave you a snapshot of %user, %system, and %iowait. Let’s look at how to get that same data with different tools.


Alternative 1: mpstat (Multi-Processor Statistics)

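If you are on a modern RHEL system with multiple CPU cores, mpstat is often more descriptive than sar -u because it adds interrupt and per-core columns. Keep in mind that mpstat ships in the same sysstat package as sar, so it is an alternative view rather than a fallback: if sysstat is missing, mpstat is missing too, and you will need Alternative 2 or 3.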

The Command:

Bash
mpstat 1 3

The Output (first sample shown; the full run prints three samples followed by an Average line):

Plaintext
Linux 4.18.0 (cdk-pol-prod2)    02/24/2026    _x86_64_    (2 CPU)

09:45:01 AM  CPU    %usr   %nice    %sys   %iowait    %irq    %soft  %steal  %guest  %gnice   %idle
09:45:02 AM  all    0.50    0.00    1.00     0.00     0.00     0.00    0.00    0.00    0.00   98.50

Why use mpstat instead?

  1. Interrupt Analysis: It shows %irq and %soft, which are critical for troubleshooting network card bottlenecks and "interrupt storms."

  2. Per-Core Granularity: If you have 8 CPUs and want to see if just one is pegged at 100%, run mpstat -P ALL 1.
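To make the per-core check scriptable, here is a minimal sketch that reads mpstat -P ALL output and lists cores above a busy threshold. The function name busy_cores and the default threshold of 90 are illustrative choices, not part of sysstat. It assumes the default report layout, where %idle is the last column, and it only reads the "Average:" rows so that AM/PM timestamps do not shift the columns.

```shell
# busy_cores THRESHOLD (hypothetical helper): read `mpstat -P ALL` output on
# stdin and print each core whose average busy share (100 - %idle) is at or
# above THRESHOLD. Defaults to 90 if no threshold is given.
busy_cores() {
    awk -v limit="${1:-90}" '
        # Only the "Average:" rows for numbered CPUs; this skips the header
        # row and the "all" aggregate.
        $1 == "Average:" && $2 ~ /^[0-9]+$/ {
            busy = 100 - $NF          # assumption: %idle is the last column
            if (busy >= limit) printf "cpu%s %.1f%% busy\n", $2, busy
        }
    '
}

# Example (requires sysstat; LC_ALL=C keeps decimal points as "."):
#   LC_ALL=C mpstat -P ALL 1 3 | busy_cores 90
```

If nothing is printed, no single core averaged above the threshold during the sampling window.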


Alternative 2: top in Batch Mode

What if you cannot install any new packages and only have the standard top command? You can force top to act like sar by using Batch Mode (-b).

The Command:

Bash
top -b -n 3 -d 1 | grep "Cpu(s)"

Breakdown:

  • -b: Batch mode. This prevents the interactive screen and prints plain text.

  • -n 3: Run 3 iterations.

  • -d 1: 1-second delay.

  • grep "Cpu(s)": Filters out the process list so you only see the summary line.

The Output (one line is printed per iteration; a single line is shown here):

Plaintext
%Cpu(s):  0.3 us,  0.7 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

Note: Here us = user, sy = system, ni = nice, id = idle, wa = iowait, hi = hardware interrupts, si = software interrupts, and st = steal.
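One caveat when scripting this: the first Cpu(s) line that top prints in batch mode is commonly reported to reflect averages since boot rather than the most recent interval, so it is safer to take the last line. The sketch below does that and converts the idle figure into a single "busy" number. The function name cpu_busy_from_top is hypothetical, and the parsing assumes the procps-ng summary format shown above with "." as the decimal separator.

```shell
# cpu_busy_from_top (hypothetical helper): read `top -b` output on stdin,
# keep the LAST "Cpu(s)" summary line, and print 100 minus its idle value.
cpu_busy_from_top() {
    grep 'Cpu(s)' | tail -n 1 |
    tr ',' ' ' |                      # "ni,100.0 id" can appear with no space
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i == "id") { printf "%.1f\n", 100 - $(i - 1); exit }
    }'
}

# Example: two iterations, so the reported line covers a real 1-second interval.
#   LC_ALL=C top -b -n 2 -d 1 | cpu_busy_from_top
```

Replacing commas with spaces before splitting matters because top drops the space in front of a three-digit value such as 100.0.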


Alternative 3: The /proc Filesystem (The "Zero-Tool" Method)

If you are in a restricted environment where no monitoring binaries are allowed, you can read the raw kernel data directly.

The Command:

Bash
grep '^cpu ' /proc/stat

The first line of /proc/stat shows the cumulative time, in clock ticks (USER_HZ, typically 1/100 of a second), that the system has spent in each state since boot. Turning these counters into percentages takes two samples and a bit of math on the differences, but it is the ultimate "fail-safe" method.
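The percentage math can be sketched as follows: take two snapshots of the aggregate cpu line, subtract them, and divide the non-idle ticks by the total ticks in the interval. The function name cpu_busy_pct is hypothetical. The sketch counts iowait as idle time, which is one common convention, and it sums only the first eight counters because the kernel already includes guest time in user and nice.

```shell
# cpu_busy_pct "LINE1" "LINE2" (hypothetical helper): given two aggregate
# "cpu ..." lines from /proc/stat, print the busy percentage for the
# interval between them.
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Fields 2-9: user nice system idle iowait irq softirq steal.
            # Fields 10-11 (guest, guest_nice) are already part of user/nice.
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6            # assumption: iowait counts as idle
            if (NR == 2) {
                dt = total - prev_total
                if (dt > 0)
                    printf "%.1f\n", 100 * (dt - (idle - prev_idle)) / dt
            }
            prev_total = total; prev_idle = idle
        }'
}

# Example: sample twice, one second apart.
#   a=$(grep '^cpu ' /proc/stat); sleep 1; b=$(grep '^cpu ' /proc/stat)
#   cpu_busy_pct "$a" "$b"
```

Because only differences are used, the tick length (USER_HZ) cancels out and never needs to be looked up.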


⚠️ Production Caution: Choosing the Right Tool

While these alternatives report the same underlying kernel counters, their impact on production varies:

Tool         Best Use Case                                   Production Impact
sar          Historical trends and logging.                  Low (Background)
mpstat       Deep-dive into CPU/Interrupts.                  Low
top -b       Quick checks when no other tools exist.         Medium (top rescans every process on each iteration)
/proc/stat   Custom scripting and ultra-restricted shells.   Minimal

Pro-Tip for Production:

When troubleshooting F5 BIG-IP or high-traffic Red Hat gateways, always look at the Soft IRQ share (si in top, %soft in mpstat). If this number stays high, your CPU is spending too much time handling network packets and may need "Receive Side Scaling" (RSS) tuning.
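As a rough companion check, the kernel exposes per-CPU softirq counters in /proc/softirqs, and the NET_RX row shows how receive processing is spread across cores. The sketch below prints that row per CPU; the function name net_rx_per_cpu is hypothetical, and the counters are cumulative since boot, so compare two readings taken a few seconds apart rather than trusting a single one.

```shell
# net_rx_per_cpu (hypothetical helper): read /proc/softirqs-style text on
# stdin and print the cumulative NET_RX count next to each CPU column name.
net_rx_per_cpu() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }   # CPU0 CPU1 ...
        $1 == "NET_RX:" {
            # Data rows carry a leading label, so column i maps to header i-1.
            for (i = 2; i <= NF; i++) print name[i - 1], $i
        }'
}

# Example:
#   net_rx_per_cpu < /proc/softirqs
```

A count that grows on one CPU while the others stay nearly flat suggests receive work is landing on a single core, which is the situation RSS tuning is meant to address.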


