[OpenIndiana-discuss] mpt_sas target to device mapping

Hetrick, Joseph P joseph-hetrick at uiowa.edu
Wed May 3 13:13:28 UTC 2017


This is very similar to how we do this same thing; so I’d be interested in hearing if somebody has a better way too.

We’ve also got a backup “locate” that uses sg3_utils: sg_vpd builds a serial/device/port map so we can turn LEDs on that way if, for some reason, the disk is unavailable to sas2ircu.
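
Roughly, the map-building half looks something like this (an untested sketch; the device glob and file paths are illustrative, not our actual script):

#!/bin/sh
# Sketch: build a serial-number -> device map with sg_vpd (sg3_utils) so a
# failed disk's serial (from prtconf/smartctl) can be tied back to a device
# even when sas2ircu can't see it.
for dev in /dev/rdsk/c*t*d0s0; do
    # VPD page 0x80 (unit serial number)
    serial=`sg_vpd -p sn "$dev" 2>/dev/null | sed -n 's/.*Unit serial number: *//p'`
    [ -n "$serial" ] && echo "$serial $dev"
done > /var/tmp/serial-device.map

The LED side then goes through the enclosure itself (SES via sg_ses, or sas2ircu locate when the controller can still see the slot).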

Our standard systems have about half as many disks as yours, but we’ve noticed similar hangs.  One thing we did to mitigate this was to watch iostat -ne output plus smartctl output and try to determine when a disk is going to fail.  The disks that ZFS eventually kicks out have typically been spewing errors via fmd/iostat and SMART for a while, so we’ve had really good luck pre-emptively failing them.  We set an arbitrary threshold of 20 iostat errors in a 24-hour period, which generates an alert; nine times out of ten we’ll then see many hundreds or thousands more fire off on that single disk within 24 hours, especially during our monthly scrubs.  The result has been fewer issues with hung pools, and we tend to catch disks that are on their way out anyway.
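
The check itself is nothing fancy.  Stripped way down, the idea is roughly this (our production version diffs the counters over a 24-hour window and mails the result; this just reads the running totals):

#!/bin/sh
# Sketch: flag any device whose cumulative iostat error total is over a
# threshold.
THRESHOLD=20
# "iostat -ne" prints per-device s/w, h/w, transport and total error counts
iostat -ne | awk -v max=$THRESHOLD '
    NR > 2 && $4+0 > max { printf "%s: %d total errors\n", $5, $4 }'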

Joe

On 5/3/17, 2:42 AM, "Liam Slusser" <lslusser at gmail.com> wrote:

    Hi All -
    
    We have a rather large zfs array of 276 disks plus 2 log devices (in a
    mirror) and 2 ssd cache devices.  The total volume is 798T.  The server is
    a 2u Dell r720xd with 5 LSI 9207-8e SAS controllers and 23 Dell MD1200
    12-disk arrays.  All the disks are SAS.  We're running OpenIndiana.
    
    The system works wonderfully for what we use it for (basically a big nfs
    server) with very few issues.  However, with so many disks, disk
    failures obviously happen.  Generally when we lose a disk ZFS marks the
    disk offline and I can use prtconf and the LSI tool sas2ircu to map the
    device back to a physical location in one of the MD1200s.
    
    However, every once in a while a disk fails in such a way that it hangs
    the zpool.  Modifying the I/O timeout values helped greatly in getting OI
    to drop a disk when it fails, but sometimes a disk still has issues that
    hang the system.
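
    (For anyone curious, the usual knob for this is the sd I/O timeout in
    /etc/system; the value here is only an example, not a recommendation:

    # grep sd_io_time /etc/system
    set sd:sd_io_time=10

    That lowers the per-command timeout from the default 60 seconds, and it
    takes a reboot to pick up.)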
    
    When this happens any command that tries to access the disk hangs.  So you
    can't use system tools like zpool or format.  You also can't modify the
    device state with cfgadm, so a cfgadm -c unconfigure command just hangs.
    Sometimes, if you wait long enough through enough retries, your command
    will finally succeed, but generally it just takes too long.  The event log
    shows error messages like this over and over:
    
    May  2 23:02:18 zintstore01 scsi: [ID 107833 kern.warning] WARNING: /pci@ff,0/pci8086,3c0a@3,2/pci1000,3080@0 (mpt_sas7):
    May  2 23:02:18 zintstore01     Disconnected command timeout for Target 70
    May  2 23:02:18 zintstore01 scsi: [ID 365881 kern.info] /pci@ff,0/pci8086,3c0a@3,2/pci1000,3080@0 (mpt_sas7):
    May  2 23:02:18 zintstore01     Log info 0x31140000 received for target 70.
    May  2 23:02:18 zintstore01     scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
    
    Mapping "target 70" back to a physical disk is a massive pain, they don't
    make it easy and I haven't found a command/utility to do this for me.  Here
    is my procedure:
    
    1)  I pull the mptsas mapping out of the kernel:
    
    # echo "::mptsas -t" | mdb -k > /tmp/mdbk-output
    
    2)  I found that the event log prints the target in decimal, but the mdb
    kernel output is in hex, so convert the target ID to hex:
    
    # printf "%x\n" 70
    46
    
    3)  Search through the output for that devhdl to get the sasaddress:
    
    # egrep "ON=|devhdl 46" /tmp/mdbk-output
    ffffff1156096000    0     0       0 ON=D0
                     devhdl 46, sasaddress 5000039608caef8a, phymask f0,devinfo
    401
    ffffff11569b9000    6     0       0 ON=D0
                     devhdl 46, sasaddress 50000c0f01c7310a, phymask f0,devinfo
    401
    ffffff1177028000    7     0       0 ON=D0
                     devhdl 46, sasaddress 5000c500579d2885, phymask f0,devinfo
    401
    
    4)  Now, I have 5 controllers, and three of them have a devhdl of 46.  To
    figure out which one it is, get a list of controllers and their device paths:
    
    # echo "::mptsas -d" | mdb -k
    ffffff1157c00000    6     0       0 ON=D0
    
                     Path in device tree /pci at ff,0/pci8086,3c02 at 1/pci1000,3040 at 0
    
            mptsas_t inst ncmds suspend  power
    ================================================================================
    ffffff1177fab000    7     0       0 ON=D0
    
                     Path in device tree /pci at ff,0/pci8086,3c0a at 3
    ,2/pci1000,3080 at 0
    ...
    
    Map the device path in the original event log,
    "/pci@ff,0/pci8086,3c0a@3,2/pci1000,3080@0", to the mptsas_t address
    ffffff1177fab000 (instance 7).  Now look back at step three, match that
    controller, and grab the sasaddress of the drive: 5000c500579d2885.
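
    Steps 3 and 4 can be collapsed into one awk pass once you know the
    controller instance.  A rough one-liner, using the instance and hex
    devhdl from this example:

    # echo "::mptsas -t" | mdb -k | awk -v inst=7 -v hdl="devhdl 46," \
          '/ON=/ {cur=$2} cur==inst && index($0, hdl)'
                     devhdl 46, sasaddress 5000c500579d2885, phymask f0,devinfo 401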
    
    5)  Once you have the sasaddress 5000c500579d2885 you can use prtconf -v
    and search for it, then scroll down a few lines to get the device path
    /dev/dsk/c1t5000C500579D2887d0.  Boom!  Now you can issue a "zpool remove
    pool c1t5000C500579D2887d0".
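
    The prtconf lookup can be scripted too.  Something like the following,
    assuming GNU grep for -A and guessing at how far below the address the
    dev_link line shows up:

    # prtconf -v | grep -i -A 40 5000c500579d2885 | grep dev_link | head -1

    The dev_link line it prints should contain the same
    /dev/dsk/c1t5000C500579D2887d0 device name.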
    
    Unfortunately, the only way I've figured out how to get the slot/enclosure
    information is with the sas2ircu tool, but it does not include the
    sasaddress, only the enclosure/slot and serial number.  So I grab the
    device serial number with prtconf, then look for it in the sas2ircu
    output.  I've found it helpful to keep a copy of the current disk layout
    from sas2ircu so I can map a drive to a slot even when the system is hung
    in such a way that I can't run the command.  Then I can call a data center
    tech to have
    them pull slot N in device Y.  Once the device is pulled the system
    immediately marks the disk bad and everything returns to normal.
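
    Keeping that snapshot current is easy to cron; something along these
    lines (the controller numbers come from sas2ircu LIST; this box has five):

    #!/bin/sh
    # Dump enclosure/slot/serial info for each HBA so the mapping is still on
    # disk if the system wedges.
    for c in 0 1 2 3 4; do
        sas2ircu $c DISPLAY > /var/tmp/sas2ircu-display.$c 2>&1
    done

    Given a serial from prtconf, a grep -B through those files then shows the
    Enclosure # / Slot # lines a few lines above the matching Serial No line.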
    
    Is there any easier way to do this?  I suppose I could script this and make
    a handy tool but I'm wondering if anybody has a better way.
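
    In case it's useful as a starting point, a script version of the above
    might look roughly like this.  Untested, and it assumes the mdb and
    prtconf output formats shown above plus GNU grep for -A:

    #!/bin/sh
    # target2disk.sh <mpt_sas instance> <decimal target id>
    # Rough sketch of the manual procedure: decimal target -> hex devhdl ->
    # sasaddress (::mptsas -t) -> /dev/dsk name (prtconf -v).

    inst=$1
    target=$2

    # step 2: the log prints the target in decimal, mdb shows it in hex
    hdl=`printf "%x" $target`

    # steps 1/3/4: controller stanzas in "::mptsas -t" start with a line
    # containing "ON=" whose second field is the instance number; target
    # lines look like "devhdl 46, sasaddress 5000c500579d2885, ..."
    sasaddr=`echo "::mptsas -t" | mdb -k | awk -v inst=$inst -v hdl="devhdl $hdl," '
        /ON=/                         { cur = $2 }
        cur == inst && index($0, hdl) { s = $4; sub(/,/, "", s); print s; exit }'`

    if [ -z "$sasaddr" ]; then
        echo "no devhdl 0x$hdl found on mpt_sas$inst" >&2
        exit 1
    fi

    # step 5: pull the /dev/dsk link listed near that address in prtconf -v
    # (the 40-line window is a guess at how far down the dev_link appears)
    disk=`prtconf -v | grep -i -A 40 "$sasaddr" |
        sed -n 's/.*dev_link=\(\/dev\/dsk\/.*\)s[0-9][0-9]*$/\1/p' | head -1`

    echo "mpt_sas$inst target $target (devhdl 0x$hdl) sasaddress $sasaddr -> ${disk:-not found}"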
    
    thanks!
    liam
    _______________________________________________
    openindiana-discuss mailing list
    openindiana-discuss at openindiana.org
    https://openindiana.org/mailman/listinfo/openindiana-discuss
    


