May 14, 2014

CPU Load Spiked with No Apparent Reason - Check RAID Battery

Scenario

If you got here in an emergency and need to restore your server’s health, skip to the Urgent Care section at the bottom of this page.

You are here because you can’t figure out what on earth is driving up your CPU load. You’ve already checked almost everything that can cause high CPU load, things like:

  • Busy application
  • Disk I/O bound
  • Slow database queries
  • System using swap instead of RAM
  • Dependency on a remote mount point being slow

When none of the above seems to be the cause of your CPU load problem, check the status of your RAID controller’s BBU (Battery Backup Unit) and whether it is going through a relearn cycle. The only kind of RAID controller covered here is LSI, which you would find in most Dell servers.

Root Cause

This is often seen on RAID controllers in Dell servers, specifically LSI controllers. By default, the LSI RAID controller periodically puts its battery backup unit (BBU) through a so-called ‘relearn cycle’. During the cycle, the battery is discharged and then recharged so that the controller can calibrate its reading of the battery’s charge level. While this is going on, the RAID write cache policy is switched from WriteBack to WriteThrough. In WriteThrough mode every write goes straight to the physical disks, whereas WriteBack uses the RAID cache as a write buffer.

Without the cache buffering writes, disk I/O slows down drastically and the CPU waits much longer on I/O for every application request. During busy hours, this shows up as high CPU load with no visible cause. There are a few things we can do to prevent this from happening again, but let’s begin with the right tool and commands for troubleshooting.
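One way to check whether the load really is the CPU waiting on disk is to look at the iowait share of CPU time. The sketch below is an illustration rather than part of the original troubleshooting steps: `iowait_percent` is a hypothetical helper name, and it assumes a Linux `/proc/stat` whose aggregate `cpu` line uses the usual column order (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
#!/bin/sh
# iowait_percent "<first cpu line>" "<second cpu line>"
# Prints the percentage of CPU time spent in iowait between two samples of
# the aggregate "cpu" line from /proc/stat. (Hypothetical helper name.)
iowait_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Columns 2..9 are user, nice, system, idle, iowait, irq, softirq, steal.
            # Guest time is already included in user, so it is not summed again.
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            t[NR] = total
            w[NR] = $6              # column 6 is iowait
        }
        END {
            dt = t[2] - t[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (w[2] - w[1]) / dt
        }'
}
```

To use it, sample the counters twice a few seconds apart, for example `a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat); iowait_percent "$a" "$b"`. A persistently high value with little real disk throughput would be consistent with the write cache having been switched off.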

MegaCli and BBU

MegaCli is the command line tool for managing LSI RAID controllers on Dell and a few other servers. First, verify that your server is actually using an LSI RAID controller.

Check RAID Controller Brand

lspci | grep -i raid

Sample Output:

01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)
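If you want to script this check across several servers, a small filter over the `lspci` output is enough. This is only a sketch: `has_megaraid` is a hypothetical function name, and the match on "LSI" or "MegaRAID" is an assumption based on the sample output above, so other controller generations may be labelled differently.

```shell
#!/bin/sh
# has_megaraid: reads `lspci` output on stdin and succeeds (exit status 0)
# when a RAID controller line mentions LSI or MegaRAID.
# (Hypothetical helper; the match strings are taken from the sample output.)
has_megaraid() {
    grep -i 'raid' | grep -Eiq 'lsi|megaraid'
}
```

Usage: `lspci | has_megaraid && echo "LSI MegaRAID controller found"`.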

Check Storage Device Detail

lshw -class storage

Sample Output:

  *-storage
       description: RAID bus controller
       product: MegaRAID SAS 1078
       vendor: LSI Logic / Symbios Logic
       physical id: 0
       bus info: pci@0000:01:00.0
       version: 04
       width: 64 bits
       clock: 33MHz
       capabilities: storage pciexpress msi msix pm vpd bus_master cap_list rom
       configuration: driver=megaraid_sas latency=0
       resources: irq:16 memory:fc480000-fc4bffff ioport:ec00(size=256) memory:fc440000-fc47ffff memory:fc300000-fc307fff

Install MegaCli Tool

Download the MegaCli package from the LSI website. For Linux servers, I would just get the latest MegaCli for Linux package.

Unzip the downloaded file and install it with the package manager that suits you. I am using Ubuntu here as an example:

unzip MegaCli_Linux.zip
dpkg -i megacli_8.07.08-1_all.deb

Linux Kernel >= 3.0

If your server is running kernel version 3.0 or newer, prefix your MegaCli commands as follows to work around a compatibility problem:

setarch x86_64 --uname-2.6 MegaCli
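To avoid typing the prefix every time, the version check can be wrapped in a small shell function. This is a sketch under assumptions: `needs_uname26` and the lowercase `megacli` wrapper are hypothetical names, and the check simply compares the major number reported by `uname -r`.

```shell
#!/bin/sh
# needs_uname26 "<kernel release>"
# Succeeds when the major kernel version is 3 or higher.
needs_uname26() {
    major=${1%%.*}                      # "3.2.0-4-amd64" -> "3"
    [ "$major" -ge 3 ] 2>/dev/null      # non-numeric input counts as "no"
}

# megacli: hypothetical wrapper that adds the setarch prefix only when needed
# and passes all arguments through to MegaCli unchanged.
megacli() {
    if needs_uname26 "$(uname -r)"; then
        setarch x86_64 --uname-2.6 MegaCli "$@"
    else
        MegaCli "$@"
    fi
}
```

With this in your shell profile, `megacli -EncInfo -aALL -NoLog` would run the same command with or without the prefix depending on the kernel.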

Check RAID and Battery Backup Unit Health

Check Enclosure Info

MegaCli -EncInfo -aALL -NoLog

Sample output:

    Number of enclosures on adapter 0 -- 1

    Enclosure 0:
    Device ID                     : 32
    Number of Slots               : 6
    Number of Power Supplies      : 0
    Number of Fans                : 0
    Number of Temperature Sensors : 0
    Number of Alarms              : 0
    Number of SIM Modules         : 0
    Number of Physical Drives     : 4
    Status                        : Normal
    Position                      : Unavailable
    Connector Name                : Unavailable
    Partner Device Id             : 65535

Show physical drive information of all connected drives

MegaCli -PDList -aALL -NoLog

Sample output:

Adapter #0

Enclosure Device ID: 32
Slot Number: 0
Device Id: 0
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 238.474 GB [0x1dcf32b0 Sectors]
Non Coerced Size: 237.974 GB [0x1dbf32b0 Sectors]
Coerced Size: 237.875 GB [0x1dbc0000 Sectors]
Firmware state: Online
SAS Address(0): 0x4433221107000000
Connected Port Number: 1(path0)
Inquiry Data: S1ATNSAF156143F     Samsung SSD 840 PRO Series              DXM05B0Q
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Foreign State: None
Device Speed: Unknown
Link Speed: Unknown
Media Type: Solid State Device
------------------truncated-----------------------

Show virtual drive information

MegaCli -LDInfo -Lall -aALL -NoLog

Sample output:

Adapter 0 -- Virtual Drive Information:
Virtual Disk: 0 (Target Id: 0)
Name:
RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0
Size:475.599 GB
State: Optimal
Stripe Size: 64 KB
Number Of Drives per span:2
Span Depth:2
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Access Policy: Read/Write
Disk Cache Policy: Disk's Default
Encryption Type: None

Make sure the Current Cache Policy is WriteBack when the battery is healthy. If you see WriteThrough instead, check the BBU health and whether it is running a battery relearn.
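For monitoring, the write policy can be pulled out of the `-LDInfo` output with a short filter. A minimal sketch, assuming the output format shown in the sample above; `current_write_policy` is a hypothetical name.

```shell
#!/bin/sh
# current_write_policy: reads `MegaCli -LDInfo -Lall -aALL -NoLog` output on
# stdin and prints the current write policy of each virtual drive, one per
# line (WriteBack or WriteThrough in the sample format above).
current_write_policy() {
    awk -F': *' '
        /^Current Cache Policy/ {
            split($2, policy, ", *")    # the first item is the write policy
            print policy[1]
        }'
}
```

Usage: `MegaCli -LDInfo -Lall -aALL -NoLog | current_write_policy`. Any line reading `WriteThrough` is the cue to look at the BBU.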

References (extracted from Dell OpenManage doc):

Write Policy: The write policies specify whether the controller sends a write-request completion signal as soon as the data is in the cache or after it has been written to disk.

  • Write-Back. When using write-back caching, the controller sends a write-request completion signal as soon as the data is in the controller cache but has not yet been written to disk. Write-back caching may provide improved performance since subsequent read requests can more quickly retrieve data from the controller cache than they could from the disk. Write-back caching also entails a data security risk, however, since a system failure could prevent the data from being written to disk even though the controller has sent a write-request completion signal. In this case, data may be lost. Other applications may also experience problems when taking actions that assume the data is available on the disk.
  • Write-Through. When using write-through caching, the controller sends a write-request completion signal only after the data is written to the disk. Write-through caching provides better data security than write-back caching, since the system assumes the data is available only after it has been safely written to the disk.

Read Policy: The read policies indicate whether or not the controller should read sequential sectors of the logical drive when seeking data.

  • Read-Ahead. When using read-ahead policy, the controller reads sequential sectors of the logical drive when seeking data. Read-ahead policy may improve system performance if the data is actually written to sequential sectors of the logical drive.
  • No-Read-Ahead. Selecting no-read-ahead policy indicates that the controller should not use read-ahead policy.
  • Adaptive Read-Ahead. When using adaptive read-ahead policy, the controller initiates read-ahead only if the two most recent read requests accessed sequential sectors of the disk. If subsequent read requests access random sectors of the disk, the controller reverts to no-read-ahead policy. The controller continues to evaluate whether read requests are accessing sequential sectors of the disk, and can initiate read-ahead if necessary.

Cache Policy: The Direct I/O and Cache I/O cache policies apply to reads on a specific virtual disk. These settings do not affect the read-ahead policy. The cache policies are as follows:

  • Cache I/O. Specifies that all reads are buffered in cache memory.
  • Direct I/O. Specifies that reads are not buffered in cache memory. When using direct I/O, data is transferred to the controller cache and the host system simultaneously during a read request. If a subsequent read request requires data from the same data block, it can be read directly from the controller cache. The direct I/O setting does not override the cache policy settings. Direct I/O is also the default setting.

Show battery information

MegaCli -AdpBbuCmd -GetBbuStatus -aALL -NoLog

Sample output:

BBU status for Adapter: 0

BatteryType: BBU
Voltage: 4023 mV
Current: 0 mA
Temperature: 20 C

BBU Firmware Status:

  Charging Status              : None
  Voltage                      : OK
  Temperature                  : OK
  Learn Cycle Requested        : No
  Learn Cycle Active           : No
  Learn Cycle Status           : OK
  Learn Cycle Timeout          : No
  I2c Errors Detected          : No
  Battery Pack Missing         : No
  Battery Replacement required : No
  Remaining Capacity Low       : No
  Periodic Learn Required      : No

Battery state:

GasGuageStatus:
  Fully Discharged        : No
  Fully Charged           : No
  Discharging             : Yes
  Initialized             : Yes
  Remaining Time Alarm    : No
  Remaining Capacity Alarm: No
  Discharge Terminated    : No
  Over Temperature        : No
  Charging Terminated     : No
  Over Charged            : No

Relative State of Charge: 83 %
Charger Status: Complete
Remaining Capacity: 699 mAh
Full Charge Capacity: 840 mAh
isSOHGood: Yes

The last section of the output summarizes how much charge remains in the battery and what its full charge capacity is.
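The fields that matter most for this problem are Learn Cycle Active, Battery Replacement required and Relative State of Charge. The sketch below extracts a single field by name; `bbu_field` is a hypothetical helper, and it assumes the `name : value` layout shown in the sample output.

```shell
#!/bin/sh
# bbu_field "<field name>"
# Reads `MegaCli -AdpBbuCmd -GetBbuStatus -aALL -NoLog` output on stdin and
# prints the value of the first line whose label matches exactly.
# Note: some labels (e.g. "Voltage") appear twice; only the first is printed.
bbu_field() {
    awk -F':' -v name="$1" '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)    # trim the label
            if (key == name) {
                val = $2
                gsub(/^[ \t]+|[ \t]+$/, "", val)  # trim the value
                print val
                exit
            }
        }'
}
```

Usage: `MegaCli -AdpBbuCmd -GetBbuStatus -aALL -NoLog | bbu_field "Learn Cycle Active"`.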

Show battery capacity information

MegaCli -AdpBbuCmd -GetBbuCapacityInfo -aALL -NoLog

Sample output:

BBU Capacity Info for Adapter: 0

Relative State of Charge: 57 %
Absolute State of charge: 48 %
Remaining Capacity: 921 mAh
Full Charge Capacity: 1619 mAh
Run time to empty: 65535 Min
Average time to empty: 65535 Min
Average Time to full: 65535 Min
Cycle Count: 11
Max Error: 100 %
Remaining Capacity Alarm: 190 mAh
Remaining Time Alarm: 10 Min

Get event log since last reboot

MegaCli -AdpEventLog -GetSinceReboot -f events.log -aALL
cat events.log

Sample output:

seqNum: 0x00000594
Time: Mon Jan 13 07:28:33 2014

Code: 0x0000009b
Class: 0
Locale: 0x08
Event Description: Battery relearn pending: Battery is under charge
Event Data:
===========
None


seqNum: 0x00000595
Time: Mon Jan 13 08:34:38 2014

Code: 0x00000093
Class: 0
Locale: 0x08
Event Description: Battery started charging
Event Data:
===========
None
-----------------------truncated-----------------------

Use the event log to work out when the BBU status changed.

Here is a list of other MegaCli commands for reference.
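The event log can be long, so it helps to reduce it to the battery-related entries with their timestamps. A rough sketch, assuming the `Time:` and `Event Description:` line format shown above; `battery_events` is a hypothetical name, and matching on the words "battery" or "bbu" is an assumption about how the firmware words these events.

```shell
#!/bin/sh
# battery_events: reads an events.log produced by
# `MegaCli -AdpEventLog -GetSinceReboot -f events.log -aALL` on stdin and
# prints "timestamp | description" for events that mention the battery.
battery_events() {
    awk '
        /^Time:/ {
            sub(/^Time:[ \t]*/, "")
            when = $0                   # remember the timestamp of this event
        }
        /^Event Description:/ {
            sub(/^Event Description:[ \t]*/, "")
            if (tolower($0) ~ /battery|bbu/) print when " | " $0
        }'
}
```

Usage: `battery_events < events.log`. Lining these timestamps up against your CPU load graphs should show whether the spike coincides with a relearn.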

Long Term Solution

Disable BBU Auto-Learn Mode

Show battery learn property:

MegaCli -AdpBbuCmd -GetBbuProperties -aALL

Sample output:

BBU Properties for Adapter: 0

Auto Learn Period: 7776000 Sec
Next Learn time: 455912927 Sec
Learn Delay Interval:0 Hours
Auto-Learn Mode: Enabled

From the above output, we can tell that automatic battery relearn is enabled, and that the next scheduled learn time is 455912927 seconds after 2000-01-01 UTC. If you want to find out exactly when that is, run the following:

date -d "UTC 2000-01-01 455912927 secs"

The result would be Thu Jun 12 11:28:47 PDT 2014. This is just to give you an idea of when the relearn is going to take place, but our purpose here is to disable the automation entirely.
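The same conversion can be done in UTC so that the result does not depend on the server’s time zone. A small sketch, assuming GNU `date` (for the `-d @epoch` form); `learn_time_utc` is a hypothetical name.

```shell
#!/bin/sh
# learn_time_utc <seconds>
# Converts a "Next Learn time" value (seconds since 2000-01-01 00:00:00 UTC)
# into a UTC timestamp. Requires GNU date.
learn_time_utc() {
    # 946684800 is the Unix timestamp of 2000-01-01 00:00:00 UTC.
    date -u -d "@$((946684800 + $1))" '+%Y-%m-%d %H:%M:%S UTC'
}
```

For the value above, `learn_time_utc 455912927` gives 2014-06-12 18:28:47 UTC, which is the same instant as the PDT result.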

Disable Auto-Learn Mode:

echo "autoLearnMode=1" > tmp.txt
MegaCli -AdpBbuCmd -SetBbuProperties -f tmp.txt -aALL

You can run the show battery learn property command again to verify that you’ve successfully disabled auto-learn.

Forced Scheduled BBU Relearn Cycle

Obviously, one would want to perform this action during less busy hours to avoid heavy disk I/O:

MegaCli -AdpBbuCmd -BbuLearn -aALL

Either put the above in a cron job that runs every 90 days, or schedule manual maintenance every 90 days. 90 days is the default BBU auto-relearn interval, but you can change the interval to suit your needs.
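A cron-driven relearn might look like the sketch below. Everything here beyond the two MegaCli commands from this article is an assumption: the script path, the `run_relearn` name, the `MEGACLI` override variable, and the quarterly schedule, which only approximates 90 days because cron cannot express "every 90 days" directly.

```shell
#!/bin/sh
# Hypothetical cron wrapper, e.g. saved as /usr/local/sbin/bbu-relearn.sh
# Example crontab entry (03:00 on the 1st of every third month):
#   0 3 1 */3 * /usr/local/sbin/bbu-relearn.sh

# Command used to invoke MegaCli. Point this at a wrapper script if your
# kernel needs the setarch prefix.
MEGACLI="${MEGACLI:-MegaCli}"

# run_relearn: starts a relearn cycle unless one is already in progress.
run_relearn() {
    if "$MEGACLI" -AdpBbuCmd -GetBbuStatus -aALL -NoLog |
        grep -Eq 'Learn Cycle Active *: *Yes'; then
        echo "relearn already active, skipping"
        return 0
    fi
    "$MEGACLI" -AdpBbuCmd -BbuLearn -aALL
}
```

In the actual script the last line would be a call to `run_relearn`; it is left out here so the functions can be sourced and tried without touching the controller.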

Urgent Care

Do a quick check on RAID cache policy:

MegaCli -LDInfo -Lall -aALL -NoLog

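If the Current Cache Policy starts with WriteThrough and ends with No Write Cache if Bad BBU, the WriteThrough mode is most likely an artifact of the BBU relearn process. If you need the drive back in WriteBack mode right away, you can override it. Keep in mind that forcing WriteBack while the battery cannot protect the cache risks data loss on a power failure.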
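To tell a temporary downgrade apart from a drive that was deliberately configured as WriteThrough, compare the Default and Current cache policy lines. A minimal sketch, assuming the `-LDInfo` format shown earlier; `policy_downgraded` is a hypothetical name.

```shell
#!/bin/sh
# policy_downgraded: reads `MegaCli -LDInfo -Lall -aALL -NoLog` output on
# stdin and succeeds (exit status 0) when at least one virtual drive defaults
# to WriteBack but is currently running in WriteThrough.
policy_downgraded() {
    awk -F': *' '
        /^Default Cache Policy/ { def = $2 }
        /^Current Cache Policy/ {
            if (def ~ /^WriteBack/ && $2 ~ /^WriteThrough/) found = 1
        }
        END { exit !found }'
}
```

Usage: `MegaCli -LDInfo -Lall -aALL -NoLog | policy_downgraded && echo "write cache downgraded"`.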

You need to first allow write cache if bad BBU, then change the write cache policy to WriteBack:

MegaCli -LDSetProp CachedBadBBU -Lall -aALL
MegaCli -LDSetProp WB -Lall -aALL

Restore original setting once the situation is under control:

MegaCli -LDSetProp NoCachedBadBBU -Lall -aALL

You should also consider implementing the long-term solution above.