Of course, it is important to keep track of your RAID array status, so I decided to install the MegaCLI monitoring software under the ESX Server 3.5 Service Console. Here's how I did it and configured the monitoring on my system:
- The MegaCLI software can be downloaded from the LSI Logic website. I used version 1.01.39 for Linux, which comes as an RPM file.
- After uploading the RPM file to the service console, it was a matter of installing it using the "rpm" command:
rpm -i -v MegaCli-1.01.39-0.i386.rpm
This installs the "MegaCli" and "MegaCli64" commands in the /opt/MegaRAID/MegaCli/ directory of the service console. A few commands are useful for checking the adapters and drives:
- /opt/MegaRAID/MegaCli/MegaCli -AdpAllInfo -aALL
This lists the adapter information for all LSI Logic adapters found in your system.
- /opt/MegaRAID/MegaCli/MegaCli -LDInfo -LALL -aALL
This lists the logical drives for all LSI Logic adapters found in your system. The "State" should be set to "optimal" in order to have a fully operational array.
- /opt/MegaRAID/MegaCli/MegaCli -PDList -aALL
This lists all the physical drives for the adapters in your system; the "Firmware state" indicates whether the drive is online or not.
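The "State" check on the logical drives can also be scripted rather than read by eye. What follows is only a minimal sketch, not part of my original setup: the function name ld_all_optimal is made up for illustration, and it assumes that the -LDInfo output contains one line per logical drive that starts with "State" followed by a colon and the value. The exact spacing differs between MegaCli versions, so check it against your own output first.

```shell
#!/bin/sh
# Hypothetical helper (not part of MegaCli): reads the output of
# "MegaCli -LDInfo -LALL -aALL" on stdin and succeeds only if at least
# one "State" line is present and every one of them reads "Optimal".
# Assumption: state lines look like "State: Optimal", possibly with
# extra spaces around the colon.
ld_all_optimal() {
    awk -F: '
        /^[[:space:]]*State[[:space:]]*:/ {
            seen++
            value = $2
            gsub(/[[:space:]]/, "", value)      # strip padding around the value
            if (tolower(value) != "optimal") bad++
        }
        # No State line at all is treated as a failure as well.
        END { exit (seen > 0 && bad == 0) ? 0 : 1 }
    '
}
```

It could be used as: /opt/MegaRAID/MegaCli/MegaCli -LDInfo -LALL -aALL | ld_all_optimal || echo "array is not optimal". Treating missing State lines as a failure is a deliberate choice, so that a MegaCli run that prints nothing does not pass as healthy.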
- I created a file called "analysis.awk" in the /opt/MegaRAID/MegaCli directory with the following contents:
# This is a little AWK program that interprets MegaCLI output
# Each "Device Id" line starts a new drive record
/Device Id/ { counter += 1; device[counter] = $3 }
# Remember the firmware state and the inquiry string of the current drive
/Firmware state/ { state_drive[counter] = $3 }
/Inquiry/ { name_drive[counter] = $3 " " $4 " " $5 " " $6 }
# Print one status line per drive
END {
for (i=1; i<=counter; i+=1)
printf ( "Device %02d (%s) status is: %s <br/>\n", device[i], name_drive[i], state_drive[i]);
}
This awk program processes the output of MegaCli. You can test it by running the following command from the /opt/MegaRAID/MegaCli directory:
./MegaCli -PDList -aALL | awk -f analysis.awk
- Then I created the cron job by placing a file called raidstatus in /etc/cron.hourly, with the following contents:
#!/bin/sh
/opt/MegaRAID/MegaCli/MegaCli -PDList -aALL | awk -f /opt/MegaRAID/MegaCli/analysis.awk > /tmp/megarc.raidstatus
# Send a warning if any drive line does not contain ": Online"
if grep -qv ": Online" /tmp/megarc.raidstatus
then
/usr/local/bin/smtp_send.pl -t tim@pretnet.local -s "Warning: RAID status no longer optimal" -f esx@pretnet.local -m "`cat /tmp/megarc.raidstatus`" -r exchange.pretnet.local
fi
rm -f /tmp/megarc.raidstatus
exit 0
Don't forget to run a chmod a+x /etc/cron.hourly/raidstatus in order to make the file executable by all users.
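One weakness of the check in the cron job is that an empty report (for example, when MegaCli fails and prints nothing) contains no non-Online line, so no mail is sent. The sketch below shows one way the decision could be isolated and tested without touching the controller. It is an illustration under assumptions, not what I run: the function name raid_needs_alert is invented, and the sample lines in the usage note only mimic the format printed by analysis.awk with made-up drive names.

```shell
#!/bin/sh
# Hypothetical helper: reads the report produced by analysis.awk on stdin.
# Returns 0 (send a warning) if the report is empty or if any line does
# not contain ": Online"; returns 1 if every drive line reports Online.
raid_needs_alert() {
    report=$(cat)
    # An empty report means no drive was listed at all; treat it as a problem.
    [ -z "$report" ] && return 0
    # -v selects lines WITHOUT ": Online"; -q only sets the exit status.
    printf '%s\n' "$report" | grep -qv ': Online'
}
```

To try it, feed it a line such as "Device 00 (ATA DISK A) status is: Online <br/>" (no warning) or "Device 01 (ATA DISK B) status is: Failed <br/>" (warning). In the cron script, the "if grep ..." line could then become "if raid_needs_alert < /tmp/megarc.raidstatus".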