CLI for 3Ware 9550SX-8LP

This article shows how I accessed my 3Ware 9550SX-8LP controller from a command
line interface (CLI). You can perform RAID maintenance from within the 3Ware
BIOS, but the CLI lets you do this while the system is up and running, which also
makes it useful for monitoring. I also plan to create a NetSaint plug-in for this
RAID card, much like I did for another RAID product. Given that we have a CLI,
creating a plug-in should be a rather simple procedure.

CLI? Why bother?

I took an interest in the CLI when I noticed this in /var/log/messages:

Aug 11 13:40:40 opti kernel: twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9A3F9E

This was worrying. Sector repair? That can't be good. That's when I decided to run
with the CLI to find out what I could about the RAID array. There is a FreeBSD port
for the 3Ware CLI. You want to install sysutils/tw_cli. I found it worked right out
of the box, with no configuration required. You can run it as an interactive shell
(just type tw_cli and press ENTER) or you can pass it commands as arguments.

CLI - the shell version

Here is what the shell version of the CLI looks like:

# tw_cli
//opti> help

Copyright(c) 2004, 2005 Applied Micro Circuits Corporation(AMCC). All rights reserved.

AMCC/3ware CLI (version 2.00.03.013)


Commands  Description
-------------------------------------------------------------------
info      Displays information about controller(s), unit(s) and port(s).
maint     Performs maintenance operations on controller(s), unit(s) and ports.
alarms    Displays current AENs.
set       Displays or modifies controller and unit settings.
sched     Schedules bachground tasks on controller(s)             (9000 series)
quit      Exits the CLI.
           ---- New Command Syntax ----
focus     Changes from one object to another.  For Interactive Mode Only!
show      Displays information about controller(s), unit(s) and port(s).
flush     Flush write cache data to units in the system.
rescan    Rescan all empty ports for new unit(s) and disk(s).
commit    Commit dirty DCB to storage on controller(s).        (Windows only)
/cx       Controller specific commands.
/cx/ux    Unit specific commands.
/cx/px    Port specific commands.
/cx/bbu   BBU specific commands.                                  (9000 only)

Type help <command> to get more details about a particular command.
For more detail information see tw_cli's documentation.

//opti> info

Ctl   Model        Ports   Drives   Units   NotOpt   RRate   VRate   BBU
------------------------------------------------------------------------
c0    9550SX-8LP   8       8        3       1        4       4       OK

//opti>

Controller zero (c0) is listed correctly as a 9550SX-8LP, with 8 ports, 8 drives,
three units (one of which was not optimal), and a BBU (battery backup unit).
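
The NotOpt column is the quickest thing to watch from a script. Here is a minimal sketch, assuming the column layout shown above (Ctl in field 1, NotOpt in field 6); the function name is my own and other tw_cli versions may lay the table out differently.

```shell
# Print "<controller> <NotOpt>" for each controller row of `tw_cli info`
# output read from standard input.
# Assumption: the column order matches the listing above
# (Ctl is field 1, NotOpt is field 6).
notopt_per_controller() {
    awk '$1 ~ /^c[0-9]+$/ { print $1, $6 }'
}

# On a live system this would be fed from the CLI:
#   tw_cli info | notopt_per_controller
```

Header and separator lines are skipped because their first field does not look like a controller name.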

A good start. Now what's going on inside this controller?

//opti> info c0

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    SPARE     OK             -      -       69.2404   -      OFF      -
u1    SPARE     OK             -      -       69.2404   -      OFF      -
u2    RAID-10   INITIALIZING   77     64K     195.548   ON     OFF      OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u2     69.25 GB    145226112     WD-WMAKE23790
p1     OK               u2     69.25 GB    145226112     WD-WMAKE23790
p2     OK               u2     69.25 GB    145226112     WD-WMAKE23943
p3     OK               u2     69.25 GB    145226112     WD-WMAKE23790
p4     OK               u2     69.25 GB    145226112     WD-WMAKE23790
p5     OK               u2     69.25 GB    145226112     WD-WMAKE23792
p6     OK               u0     69.25 GB    145226112     WD-WMAKE23790
p7     OK               u1     69.25 GB    145226112     WD-WMAKE23786

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK        OK       OK       0      xx-xxx-xxxx

//opti>

There are three units: two spares (u0 and u1) and one RAID-10 array (u2), which is
initializing and is 77% complete. I watched for a while, and the reported progress
seemed to jump from 77% to 100% without any intermediate steps.

The last section of the output relates to the battery backup unit (BBU).
The LastCapTest refers to the Last Capacity Test, which has never been run.
A battery test takes at least 24 hours. I'll run that one night when the rest
of the family is away. I don't think they'll tolerate the server running when they
are home.
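
Whether a capacity test has ever been run can be read off the bbu row. The sketch below assumes that the placeholder xx-xxx-xxxx seen in my output means "never run" and that LastCapTest is always the last field; both function names are hypothetical.

```shell
# Print the LastCapTest field of the bbu row found in `tw_cli info c0`
# output on standard input.
# Assumption: LastCapTest is the last field of that row.
bbu_last_cap_test() {
    awk '$1 == "bbu" { print $NF }'
}

# Succeed (exit status 0) only if a capacity test date is recorded.
# Assumption: "xx-xxx-xxxx" is the placeholder for "never run".
bbu_cap_test_done() {
    last=$(bbu_last_cap_test)
    [ -n "$last" ] && [ "$last" != "xx-xxx-xxxx" ]
}

# On a live system:
#   tw_cli info c0 | bbu_cap_test_done || echo "BBU capacity test never run"
```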

I found the following output very useful. It shows the RAID-1 sub-units within the
main RAID-10 array:

# tw_cli info c0 u2

Unit     UnitType  Status         %Cmpl  Port  Stripe  Size(GB)  Blocks
-----------------------------------------------------------------------
u2       RAID-10   OK             -      -     64K     195.548   410093568
u2-0     RAID-1    OK             -      -     -       -         -
u2-0-0   DISK      OK             -      p0    -       65.1826   136697856
u2-0-1   DISK      OK             -      p1    -       65.1826   136697856
u2-1     RAID-1    OK             -      -     -       -         -
u2-1-0   DISK      OK             -      p2    -       65.1826   136697856
u2-1-1   DISK      OK             -      p3    -       65.1826   136697856
u2-2     RAID-1    OK             -      -     -       -         -
u2-2-0   DISK      OK             -      p4    -       65.1826   136697856
u2-2-1   DISK      OK             -      p5    -       65.1826   136697856

The CLI documentation has this to say about this command:

This command presents detailed information on the specified unit. If the unit
consists of sub-units as is the case in RAID 1, RAID 5, RAID 10, and RAID
50 arrays (applicable for 9000 controllers), then details about each sub-unit
are also presented. One application of this command is to see which sub-unit
of a degraded unit has caused the unit to degrade and which disk within that
sub-unit is the source of degradation.
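
That use case can be sketched as a small filter: print every DISK row whose status is not OK, together with its port. This assumes the column order of the listing above and a one-word status; the function name is my own, and I have not had a degraded array to try it against.

```shell
# Read `tw_cli info c0 u2` output on standard input and print
# "<sub-unit> <port> <status>" for every DISK row whose status is not OK.
# Assumption: columns are Unit, UnitType, Status, %Cmpl, Port, ... as in
# the listing above, and the status is a single word.
degraded_disks() {
    awk '$2 == "DISK" && $3 != "OK" { print $1, $5, $3 }'
}

# On a live system:
#   tw_cli info c0 u2 | degraded_disks
```

With the healthy array shown above, the filter prints nothing.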

You can also get very concise status reports:

[root@opti:~] # tw_cli info c0 u0 status
/c0/u0 status = OK

[root@opti:~] # tw_cli info c0 u1 status
/c0/u1 status = OK

[root@opti:~] # tw_cli info c0 u2 status
/c0/u2 status = OK

[root@opti:~] #

I will make use of that command when building the NetSaint plug-in.
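
As a rough idea of how that could look, here is a sketch of the two pieces such a plug-in needs: pulling the status word out of the reply and turning it into an exit code. The function names are hypothetical. The codes 0, 1 and 2 follow the usual OK/WARNING/CRITICAL plug-in convention; using 3 for an empty or unparsable reply is an assumption you should check against your NetSaint version. Only the two statuses seen in this article are treated specially.

```shell
# Extract the status word from a line such as "/c0/u2 status = OK".
# Prints nothing if the line does not match that shape.
unit_status() {
    sed -n 's|^/c[0-9]*/u[0-9]* status = \(.*\)$|\1|p'
}

# Map a status word to a plug-in exit code.
# Assumptions: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN;
# any status other than OK or INITIALIZING is treated as critical.
status_to_code() {
    case "$1" in
        OK)           echo 0 ;;
        INITIALIZING) echo 1 ;;
        "")           echo 3 ;;
        *)            echo 2 ;;
    esac
}

# On a live system:
#   status=$(tw_cli info c0 u2 status | unit_status)
#   echo "RAID u2: ${status:-no reply}"
#   exit "$(status_to_code "$status")"
```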

CLI - the argument version

Then I tried passing arguments on the command line:

# tw_cli /c0 show unitstatus

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    SPARE     OK             -      -       69.2404   -      OFF      -
u1    SPARE     OK             -      -       69.2404   -      OFF      -
u2    RAID-10   OK             -      64K     195.548   ON     OFF      OFF

As shown above, the status of the RAID-10 array has changed to OK: the initialization
had completed. When I noticed the RAID array had settled, I checked /var/log/messages
again and found the following (the messages shown here have been trimmed):

twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9DCBA3
last message repeated 2 times
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9DCEBE
last message repeated 2 times
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F23A9
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2A48
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F2D63
twa0: WARNING: (0x04: 0x0023): Sector repair completed: port=4, LBA=0x9F307E
last message repeated 2 times
twa0: INFO: (0x04: 0x0007): Initialize completed: unit=2
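
To see how widespread the repairs are, the log can be boiled down to a count of distinct repaired LBAs per port. This is a minimal sketch that assumes each message sits on one line in the format shown above; the function name is my own. Lines such as "last message repeated 2 times" are ignored, which is fine here because only distinct LBAs are counted.

```shell
# Read syslog lines on standard input and print, for each port, how many
# distinct LBAs were reported as repaired.
# Assumption: messages look like
#   "... Sector repair completed: port=4, LBA=0x9DCBA3"
sector_repairs_by_port() {
    sed -n 's/.*Sector repair completed: port=\([0-9]*\), LBA=\(0x[0-9A-Fa-f]*\).*/\1 \2/p' |
        sort -u |
        awk '{ n[$1]++ }
             END { for (p in n) print "port " p ": " n[p] " distinct LBA(s)" }'
}

# On a live system:
#   grep twa0 /var/log/messages | sector_repairs_by_port
```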

There were no actual HDD errors

Although I was initially concerned by the above messages, they do not appear to be hard errors
(that is, actual errors with the drive). When I was grepping for all of the messages, I found this:

twa0: WARNING: (0x04: 0x0008): Unclean shutdown detected: unit=2

Looking at the full log, I found these messages:

acd0: CDROM  at ata1-master UDMA33
da0 at twa0 bus 0 target 2 lun 0
da0:  Fixed Direct Access SCSI-3 device
da0: 100.000MB/s transfers
da0: 200241MB (410093568 512 byte sectors: 255H 63S/T 25527C)
Trying to mount root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted
WARNING: /tmp was not properly dismounted
WARNING: /usr was not properly dismounted
WARNING: /var was not properly dismounted

And sure enough, something did happen but I don't recall what. I think I was playing
with IPMI and it caused a panic or something. I am not sure.

BBU charging

I also found this interesting message:

kernel: twa0: INFO: (0x04: 0x0056): Battery charging completed:

At least now I know I can run the BBU test.

The NetSaint plugin

Although I haven't written it, I'm quite sure it will be straightforward. It's a matter of
grepping out the right information. Stay tuned.
