CLI for 3Ware 9550SX-8LP
This article shows you how I accessed my 3Ware 9550SX-8LP controller
from a command line interface. You can perform RAID maintenance from within the 3Ware
BIOS, but the CLI allows you to do this while the system is up and running. This is
useful for monitoring the system. I also plan to create a NetSaint
plug-in for this RAID card, much like I did for another RAID product.
Given that we have a CLI, creating a plug-in is a rather simple procedure.
CLI? Why bother?
I took an interest in the CLI when I noticed this in /var/log/messages:

Aug 11 13:40:40 opti kernel: twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9A3F9E

This was worrying. Sector repair? That can't be good. That's when I decided to run with the CLI to find out what I could about the RAID array. There is a FreeBSD port for the 3Ware CLI. You want to install sysutils/tw_cli. I found it worked right out of the box, with no configuration required. You can run it as an interactive shell (just type tw_cli and press ENTER) or you can pass it commands as arguments.
CLI - the shell version
Here is what the shell version of the CLI looks like:

# tw_cli
//opti> help

Copyright(c) 2004, 2005 Applied Micro Circuits Corporation(AMCC). All rights reserved.

AMCC/3ware CLI (version 2.00.03.013)

Commands  Description
-------------------------------------------------------------------
info      Displays information about controller(s), unit(s) and port(s).
maint     Performs maintenance operations on controller(s), unit(s) and ports.
alarms    Displays current AENs.
set       Displays or modifies controller and unit settings.
sched     Schedules bachground tasks on controller(s) (9000 series)
quit      Exits the CLI.

---- New Command Syntax ----
focus     Changes from one object to another. For Interactive Mode Only!
show      Displays information about controller(s), unit(s) and port(s).
flush     Flush write cache data to units in the system.
rescan    Rescan all empty ports for new unit(s) and disk(s).
commit    Commit dirty DCB to storage on controller(s). (Windows only)
/cx       Controller specific commands.
/cx/ux    Unit specific commands.
/cx/px    Port specific commands.
/cx/bbu   BBU specific commands. (9000 only)

Type help <command> to get more details about a particular command.
For more detail information see tw_cli's documentation.

//opti> info

Ctl   Model        Ports   Drives   Units   NotOpt   RRate   VRate   BBU
------------------------------------------------------------------------
c0    9550SX-8LP   8       8        3       1        4       4       OK

//opti>
Controller zero (c0) is listed correctly as a 9550SX-8LP, with 8 ports, 8 drives,
three units (one of which was not optimal), and a BBU (battery backup unit).
A good start. Now what's going on inside this controller?
//opti> info c0

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    SPARE     OK             -      -       69.2404   -      OFF      -
u1    SPARE     OK             -      -       69.2404   -      OFF      -
u2    RAID-10   INITIALIZING   77     64K     195.548   ON     OFF      OFF

Port   Status   Unit   Size       Blocks      Serial
---------------------------------------------------------------
p0     OK       u2     69.25 GB   145226112   WD-WMAKE23790
p1     OK       u2     69.25 GB   145226112   WD-WMAKE23790
p2     OK       u2     69.25 GB   145226112   WD-WMAKE23943
p3     OK       u2     69.25 GB   145226112   WD-WMAKE23790
p4     OK       u2     69.25 GB   145226112   WD-WMAKE23790
p5     OK       u2     69.25 GB   145226112   WD-WMAKE23792
p6     OK       u0     69.25 GB   145226112   WD-WMAKE23790
p7     OK       u1     69.25 GB   145226112   WD-WMAKE23786

Name  OnlineState  BBUReady  Status  Volt  Temp  Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK      OK    OK    0      xx-xxx-xxxx

//opti>
There are three units: two spares (u0 and u1) and one RAID-10 array (u2) which is
initializing and is 77% completed. I watched progress for a while, and progress
seemed to go from 77% to 100% without any intermediate steps.
The last section of the output relates to the battery backup unit (BBU).
The LastCapTest refers to the Last Capacity Test, which has never been run.
A battery test takes at least 24 hours. I'll run that one night when the rest
of the family is away. I don't think they'll tolerate the server running when they
are home.
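
For monitoring purposes, the unit table in the info c0 output above is easy to pick apart. The following is only a minimal Python sketch, assuming the whitespace-separated column layout shown above; the function name parse_units and the SAMPLE text are my own illustration, not part of tw_cli.

```python
def parse_units(text):
    """Return (unit, type, status) tuples from `info c0`-style output."""
    units = []
    for line in text.splitlines():
        fields = line.split()
        # Unit rows start with a name such as u0, u1, u2; port rows (p0...)
        # and the bbu row are skipped by this test.
        if len(fields) >= 3 and fields[0].startswith("u") and fields[0][1:].isdigit():
            units.append((fields[0], fields[1], fields[2]))
    return units


# Unit section copied from the output shown earlier in this article.
SAMPLE = """\
Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    SPARE     OK             -      -       69.2404   -      OFF      -
u1    SPARE     OK             -      -       69.2404   -      OFF      -
u2    RAID-10   INITIALIZING   77     64K     195.548   ON     OFF      OFF
"""

# Units that are in any state other than OK.
not_ok = [u for u in parse_units(SAMPLE) if u[2] != "OK"]
print(not_ok)
```

With the sample above, only u2 is reported, because it was still initializing at the time.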
I found the following output very useful. It shows the RAID arrays within the
main RAID10 array:
# tw_cli info c0 u2

Unit     UnitType  Status  %Cmpl  Port  Stripe  Size(GB)  Blocks
-----------------------------------------------------------------------
u2       RAID-10   OK      -      -     64K     195.548   410093568
u2-0     RAID-1    OK      -      -     -       -         -
u2-0-0   DISK      OK      -      p0    -       65.1826   136697856
u2-0-1   DISK      OK      -      p1    -       65.1826   136697856
u2-1     RAID-1    OK      -      -     -       -         -
u2-1-0   DISK      OK      -      p2    -       65.1826   136697856
u2-1-1   DISK      OK      -      p3    -       65.1826   136697856
u2-2     RAID-1    OK      -      -     -       -         -
u2-2-0   DISK      OK      -      p4    -       65.1826   136697856
u2-2-1   DISK      OK      -      p5    -       65.1826   136697856
The CLI documentation has this to say about this command:
This command presents detailed information on the specified unit. If the unit
consists of sub-units as is the case in RAID 1, RAID 5, RAID 10, and RAID
50 arrays (applicable for 9000 controllers), then details about each sub-unit
are also presented. One application of this command is to see which sub-unit
of a degraded unit has caused the unit to degrade and which disk within that
sub-unit is the source of degradation.
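
The use case the documentation describes, finding the member that degraded a unit, could be sketched roughly as follows. This is an illustration only: it assumes the column order shown in the info c0 u2 output above (Unit, UnitType, Status, %Cmpl, Port), and failing_members is a hypothetical helper name, not a tw_cli feature.

```python
import re

# Unit and sub-unit names look like u2, u2-0, u2-0-1.
UNIT_NAME = re.compile(r"u\d+(-\d+)*")


def failing_members(text):
    """Return (name, type, status, port) for each unit row whose status is not OK."""
    failing = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and UNIT_NAME.fullmatch(fields[0]):
            name, unit_type, status, _cmpl, port = fields[:5]
            if status != "OK":
                failing.append((name, unit_type, status, port))
    return failing


# Part of the healthy array listing shown above.
HEALTHY = """\
u2       RAID-10   OK      -      -     64K     195.548   410093568
u2-0     RAID-1    OK      -      -     -       -         -
u2-0-0   DISK      OK      -      p0    -       65.1826   136697856
u2-0-1   DISK      OK      -      p1    -       65.1826   136697856
"""

print(failing_members(HEALTHY))
```

On the healthy array nothing is returned. On a degraded one, the rows with a different status would show both the affected sub-unit and the port of the disk behind it.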
You can also get very concise status reports:
[root@opti:~] # tw_cli info c0 u0 status
/c0/u0 status = OK

[root@opti:~] # tw_cli info c0 u1 status
/c0/u1 status = OK

[root@opti:~] # tw_cli info c0 u2 status
/c0/u2 status = OK

[root@opti:~] #
I will make use of that command when building the NetSaint plug-in.
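
To give an idea of how little is involved, here is a rough Python sketch of the decision a plug-in might make from one of those status lines. It is not the plug-in itself. The exit codes 0, 1 and 2 for OK, WARNING and CRITICAL follow the usual monitoring plug-in convention; the value used for UNKNOWN, the choice to treat INITIALIZING as a warning, and the name check_status_line are all my assumptions.

```python
import re

# Conventional plug-in return codes; the UNKNOWN value is an assumption.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Matches lines such as "/c0/u2 status = OK".
STATUS_LINE = re.compile(r"^/c(\d+)/u(\d+) status = (\S+)$")

# States treated as "busy but not broken" (assumption; only INITIALIZING
# actually appears in this article).
TRANSIENT = {"INITIALIZING"}


def check_status_line(line):
    """Map one `tw_cli info cX uY status` output line to (exit_code, message)."""
    line = line.strip()
    match = STATUS_LINE.match(line)
    if not match:
        return UNKNOWN, "RAID UNKNOWN - could not parse: " + line
    status = match.group(3)
    if status == "OK":
        return OK, "RAID OK - " + line
    if status in TRANSIENT:
        return WARNING, "RAID WARNING - " + line
    return CRITICAL, "RAID CRITICAL - " + line


for sample in ["/c0/u2 status = OK", "/c0/u2 status = INITIALIZING"]:
    code, message = check_status_line(sample)
    print(code, message)
```

A real plug-in would run tw_cli itself, print the message, and exit with the code; that part is left out here so the sketch can run without a controller.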
CLI - the argument version
Then I tried passing arguments on the command line:
# tw_cli /c0 show unitstatus

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    SPARE     OK             -      -       69.2404   -      OFF      -
u1    SPARE     OK             -      -       69.2404   -      OFF      -
u2    RAID-10   OK             -      64K     195.548   ON     OFF      OFF
As shown above, the status of the RAID10 array has changed to OK. The initialization
had completed. When I noticed the RAID array had settled, I checked /var/log/messages again and
found the following (the actual messages have been trimmed):
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9DCBA3
last message repeated 2 times
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9DCEBE
last message repeated 2 times
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2A48
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F307E
last message repeated 2 times
twa0: INFO: (0x04: 0x0007): Initialize completed: unit=2
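
Messages like these are easier to judge once they are tallied. Below is a small Python sketch that counts sector repairs per port and LBA, folding in syslog's "last message repeated N times" lines. It assumes the message format shown above and that each repeat line refers to the message immediately before it; count_repairs is a name I made up for the illustration.

```python
import re
from collections import Counter

REPAIR = re.compile(r"Sector repair completed: port=(\d+), LBA=(0x[0-9A-Fa-f]+)")
REPEAT = re.compile(r"last message repeated (\d+) times")


def count_repairs(lines):
    """Count sector repairs keyed by (port, LBA)."""
    counts = Counter()
    last = None  # key of the most recent repair message, if any
    for line in lines:
        repair = REPAIR.search(line)
        if repair:
            last = (int(repair.group(1)), repair.group(2))
            counts[last] += 1
            continue
        repeat = REPEAT.search(line)
        if repeat and last is not None:
            # The repeat line stands for N more copies of the previous message.
            counts[last] += int(repeat.group(1))
        else:
            # Any other message breaks the chain.
            last = None
    return counts


# A few of the lines shown above.
LOG = """\
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9DCBA3
last message repeated 2 times
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9DCEBE
last message repeated 2 times
twa0: INFO: (0x04: 0x0007): Initialize completed: unit=2
""".splitlines()

for (port, lba), n in count_repairs(LOG).items():
    print(port, lba, n)
```

Feeding it the whole log would show at a glance whether the repairs are confined to one port, as they were here with port 4.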
There were no actual HDD errors
Although I was initially concerned by the above messages, they do not appear to be hard errors
(that is, actual errors with the drive). When I was grepping for all of the messages, I found this:
twa0: WARNING: (0x04: 0x0008): Unclean shutdown detected: unit=2
Looking at the full log, I found these messages:
acd0: CDROM at ata1-master UDMA33
da0 at twa0 bus 0 target 2 lun 0
da0: Fixed Direct Access SCSI-3 device
da0: 100.000MB/s transfers
da0: 200241MB (410093568 512 byte sectors: 255H 63S/T 25527C)
Trying to mount root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted
WARNING: /tmp was not properly dismounted
WARNING: /usr was not properly dismounted
WARNING: /var was not properly dismounted
And sure enough, something did happen, but I don't recall what. I think I was playing
with IPMI and it caused a panic or something. I am not sure.
BBU charging
I also found this interesting message:
kernel: twa0: INFO: (0x04: 0x0056): Battery charging completed:
At least now I know I can run the BBU test.
The NetSaint plugin
Although I haven't written it, I'm quite sure it will be straightforward. It's a matter of
grepping out the right information. Stay tuned.