Sunday, August 16, 2009

Get serial number of disk using smartmontools

(source: http://cafenate.wordpress.com/2009/02/22/setting-up-smartmontools-on-opensolaris/)
I installed smartmontools 5.38 from source:
tar zxvf smartmontools-5.38.tar.gz
cd smartmontools-5.38
./configure
make
make install
(Smartmontools is by default installed in /usr/local)

The first change to make in /usr/local/etc/smartd.conf is to comment out the DEVICESCAN line. DEVICESCAN is fine if you want smartd to scan all disks in your system, but I found that smartmontools didn’t like my rpool disk, and it wanted me to declare the disk type as “scsi” before it would do anything at all. Next we have to tell smartd which disks to monitor, so I added the following lines to the end of the smartd.conf file:

/dev/rdsk/c8t0d0 -d scsi -H -m redalert
/dev/rdsk/c8t1d0 -d scsi -H -m redalert
/dev/rdsk/c8t2d0 -d scsi -H -m redalert
/dev/rdsk/c8t3d0 -d scsi -H -m redalert
/dev/rdsk/c8t4d0s0 -d scsi -H -m redalert
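
With more than a handful of disks, typing these entries by hand gets tedious. Here is a minimal sketch that prints one entry per disk in the same form as the lines above. The function name smartd_entries is my own invention, and it assumes every disk takes the same -d scsi -H -m redalert directives:

```shell
#!/bin/sh
# Hypothetical helper: print one smartd.conf entry per disk name.
# Assumes every disk uses the same directives as the entries above.
smartd_entries() {
    for disk in "$@"; do
        printf '/dev/rdsk/%s -d scsi -H -m redalert\n' "$disk"
    done
}

# Example (review the output before appending it to smartd.conf):
# smartd_entries c8t0d0 c8t1d0 c8t2d0 c8t3d0
```

The function only prints to standard output, so nothing is written to smartd.conf until you redirect the output yourself.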


Go to /usr/local/sbin and query a disk with smartctl:

root@dawbckup:/usr/local/sbin# ./smartctl -d scsi -a /dev/rdsk/c7t0d0
smartctl version 5.38 [i386-pc-solaris2.11] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Serial number: S13PJDWS647014
Device type: disk
Local Time is: Sun Aug 16 15:11:29 2009 WEST
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK

Current Drive Temperature: 31 C

Error Counter logging not supported
No self-tests have been logged
root@dawbckup:/usr/local/sbin#

c8t0d0: S13PJDWS647014
c8t1d0: S13PJDWS304477
c8t2d0: S13PJDWS647020
c8t3d0: S13PJDWS304478
(the disks of my SAN)
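
To build a list like the one above without running smartctl by hand for each disk, something along these lines should work. It is only a sketch: the function names extract_serial and list_serials are made up for this example, the smartctl path is the default install location mentioned earlier, and it assumes the output contains a "Serial number:" line as in the sample above. I use -i (device information only) instead of -a to keep the output short.

```shell
#!/bin/sh
# Default install location from this post; override by setting SMARTCTL.
SMARTCTL=${SMARTCTL:-/usr/local/sbin/smartctl}

# Read smartctl output on stdin and print the value of the
# "Serial number:" line (nothing is printed if the line is absent).
extract_serial() {
    sed -n 's/^Serial [Nn]umber:[ ]*//p'
}

# Print "<disk>: <serial>" for each disk name given as an argument.
list_serials() {
    for disk in "$@"; do
        serial=$("$SMARTCTL" -d scsi -i "/dev/rdsk/$disk" | extract_serial)
        printf '%s: %s\n' "$disk" "${serial:-unknown}"
    done
}

# Example: list_serials c8t0d0 c8t1d0 c8t2d0 c8t3d0
```

A disk that smartctl cannot read, or whose output lacks a serial number line, is reported as "unknown" rather than aborting the loop.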

So far so good, but what about having smartd run at bootup and continuously monitor the disk status? On Linux, you’d use an init.d script, but since this is OpenSolaris, we’ll use the Service Management Facility (SMF) instead. To do that, paste the manifest below into /var/svc/manifest/site/smartd.xml, change the file ownership to root:sys, and run:
pfexec svccfg -v import /var/svc/manifest/site/smartd.xml
Then check that the service is running (svcs smartd), and if not, enable it using pfexec svcadm enable smartd.

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type="manifest" name="smartd">
  <service
     name="site/smartd"
     type="service"
     version="1">
    <single_instance/>
    <dependency
       name="filesystem-local"
       grouping="require_all"
       restart_on="none"
       type="service">
      <service_fmri value="svc:/system/filesystem/local:default"/>
    </dependency>
    <exec_method
       type="method"
       name="start"
       exec="/usr/local/etc/rc.d/init.d/smartd start"
       timeout_seconds="60">
      <method_context>
        <method_credential user="root" group="root"/>
      </method_context>
    </exec_method>
    <exec_method
       type="method"
       name="stop"
       exec="/usr/local/etc/rc.d/init.d/smartd stop"
       timeout_seconds="60">
    </exec_method>
    <instance name="default" enabled="true"/>
    <stability value="Unstable"/>
    <template>
      <common_name>
        <loctext xml:lang="C">
          SMART monitoring service (smartd)
        </loctext>
      </common_name>
      <documentation>
        <manpage title="smartd" section="1M" manpath="/usr/local/share/man"/>
      </documentation>
    </template>
  </service>
</service_bundle>
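
The registration steps described above can be collected into a small script. This is a sketch, not a tested installer: the run and register_smartd functions and the DRY_RUN variable are my own additions, and the svccfg validate step is an extra check that is not part of the original procedure.

```shell
#!/bin/sh
# Sketch of the SMF registration steps.
# Set DRY_RUN=1 to print the commands without running them.
MANIFEST=/var/svc/manifest/site/smartd.xml

# Print a command, then run it unless DRY_RUN is set.
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# Fix ownership, validate and import the manifest, then show the
# service state. Stops at the first step that fails.
register_smartd() {
    run pfexec chown root:sys "$MANIFEST" &&
    run pfexec svccfg validate "$MANIFEST" &&
    run pfexec svccfg -v import "$MANIFEST" &&
    run svcs smartd
}

# Example: DRY_RUN=1 register_smartd
```

If the service ends up in the maintenance state, svcs -x smartd should explain why.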

At this point we have a managed service that checks the health of our disks, and if anything comes up, it will send an email to the redalert user.