[OpenIndiana-discuss] Zfs stability - our scrub script

Jim Klimov jimklimov at cos.ru
Sat Oct 13 15:48:08 UTC 2012


2012-10-13 0:41, Doug Hughes wrote:
> yes, you should do a scrub and no, there isn't very much risk to this. This
> will scan your disks for bits that have gone stale or the like. You should
> do it. We do a scrub once per week.

Just in case this helps anyone, here's the script we use to
initiate scrubbing from cron (i.e. once a week on Fridays).
Just add a line to crontab and receive emails ;)

There's some config-initialization and include cruft at the
start (we have a large package of admin scripts). I hope the
absence of the config files (which can override the hardcoded
defaults) and of the libraries won't prevent the script from
running on systems without our package:

# cat /opt/COSas/bin/zpool-scrub.sh
-------------
#!/bin/bash

# $Id: zpool-scrub.sh,v 1.6 2010/11/15 14:32:19 jim Exp $
# this script will go through all pools and scrub them one at a time
#
# Use like this in crontab:
# 0 22 * * 5 [ -x /opt/COSas/bin/zpool-scrub.sh ] && /opt/COSas/bin/zpool-scrub.sh
#
# (C) 2007 nickus at aspiringsysadmin.com and commenters
# http://aspiringsysadmin.com/blog/2007/06/07/scrub-your-zfs-file-systems-regularly/
# (C) 2009 Jim Klimov, cosmetic mods and logging; 2010 - locking
#

#[ x"$MAILRECIPIENT" = x ] && MAILRECIPIENT=admin@domain.com
[ x"$MAILRECIPIENT" = x ] && MAILRECIPIENT=root

[ x"$ZPOOL" = x ] && ZPOOL=/usr/sbin/zpool
[ x"$TMPFILE" = x ] && TMPFILE="/tmp/scrub.sh.$$.$RANDOM"
[ x"$LOCK" = x ] && LOCK="/tmp/`basename "$0"`.`dirname "$0" | sed 's/\//_/g'`.lock"

COSAS_BINDIR=`dirname "$0"`
if [ x"$COSAS_BINDIR" = x./ -o x"$COSAS_BINDIR" = x. ]; then
         COSAS_BINDIR=`pwd`
fi

# Source optional config files
[ x"$COSAS_CFGDIR" = x ] && COSAS_CFGDIR="$COSAS_BINDIR/../etc"
if [ -d "$COSAS_CFGDIR" ]; then
     [  -f "$COSAS_CFGDIR/COSas.conf" ] && \
         . "$COSAS_CFGDIR/COSas.conf"
     [  -f "$COSAS_CFGDIR/`basename "$0"`.conf" ] && \
         . "$COSAS_CFGDIR/`basename "$0"`.conf"
fi

[ ! -x "$ZPOOL" ] && exit 1

### Include this after config files, in case of RUNLEVEL_NOKICK mask override
RUN_CHECKLEVEL=""
[ -s "$COSAS_BINDIR/runlevel_check.include" ] &&
     . "$COSAS_BINDIR/runlevel_check.include" &&
     block_runlevel

# Check LOCKfile
if [ -f "$LOCK" ]; then
     OLDPID=`head -n 1 "$LOCK"`
     BN="`basename "$0"`"
     # Match the saved PID as a whole word, so that e.g. 12 does not match 3125
     TRYOLDPID=`ps -ef | grep "$BN" | grep -v grep | awk '{ print $2 }' | grep -w "$OLDPID"`
     if [ x"$TRYOLDPID" != x ]; then

         LF=`cat "$LOCK"`

         echo "= ZPoolScrub wrapper aborted because another copy is running - lockfile found:
$LF
Aborting..." | wall
         exit 1
     fi
fi
echo "$$" > "$LOCK"

scrub_in_progress() {
         ### Check that we're not yet shutting down
         if [ x"$RUN_CHECKLEVEL" != x ]; then
             if [ x"`check_runlevel`" != x ]; then
                 echo "INFO: System is shutting down. Aborting scrub of pool '$1'!" >&2
                 $ZPOOL scrub -s "$1"
                 return 1
             fi
         fi

         if $ZPOOL status "$1" | grep "scrub in progress" >/dev/null; then
                 return 0
         else
                 return 1
         fi
}

RESULT=0
for pool in `$ZPOOL list -H -o name`; do
         echo "=== `TZ=UTC date` @ `hostname`: $ZPOOL scrub $pool started..."
         $ZPOOL scrub "$pool"

         while scrub_in_progress "$pool"; do sleep 60; done

         echo "=== `TZ=UTC date` @ `hostname`: $ZPOOL scrub $pool completed"

         if ! $ZPOOL status "$pool" | grep "with 0 errors" >/dev/null; then
                 $ZPOOL status "$pool" | tee -a "$TMPFILE"
                 RESULT=$(($RESULT+1))
         fi
done

if [ -s "$TMPFILE" ]; then
         cat "$TMPFILE" | mailx -s "zpool scrub on `hostname` generated errors" "$MAILRECIPIENT"
fi

rm -f "$TMPFILE"

# Be nice, clean up
rm -f "$LOCK"

exit $RESULT

-----
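
A note on the config overrides mentioned before the script: the defaults
are assigned first and the optional config files are sourced afterwards,
so a plain assignment in a config file wins over both the hardcoded
default and anything exported in the environment. Here is a minimal
sketch of that ordering. The temporary directory and the mail address
are made-up placeholders; on a real system the per-script file would be
"$COSAS_CFGDIR/zpool-scrub.sh.conf".

```shell
#!/bin/bash
# Sketch only: mirrors the script's "default first, config file last" order.
# demo_dir and the address below are placeholders, not part of our package.
demo_dir=`mktemp -d`
mkdir -p "$demo_dir/etc"

# Hypothetical per-script config file with a single override.
cat > "$demo_dir/etc/zpool-scrub.sh.conf" <<'EOF'
MAILRECIPIENT=storage-admins@example.com
EOF

# Same order as in the script: hardcoded default, then optional config.
MAILRECIPIENT=""
[ x"$MAILRECIPIENT" = x ] && MAILRECIPIENT=root
[ -f "$demo_dir/etc/zpool-scrub.sh.conf" ] && \
    . "$demo_dir/etc/zpool-scrub.sh.conf"

echo "Mail goes to: $MAILRECIPIENT"
rm -rf "$demo_dir"
```

Without any config file the same lines leave MAILRECIPIENT at "root".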

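The lockfile check in the script looks for the saved PID in the "ps"
output. An alternative, which is NOT what the script above does, is to
probe the saved PID with "kill -0": signal 0 delivers nothing and only
reports whether the process exists and may be signalled. A rough sketch,
where lock_is_held and the demo lock path are hypothetical names:

```shell
#!/bin/bash
# Sketch of a PID-based lock test using "kill -0".
# lock_is_held and demo_lock are illustrative names only.
lock_is_held() {
    [ -f "$1" ] || return 1            # no lockfile: not held
    pid=`head -n 1 "$1"`
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;       # empty or non-numeric: treat as stale
    esac
    kill -0 "$pid" 2>/dev/null         # succeeds only if the process is alive
}

demo_lock="/tmp/lock-demo.$$"
echo "$$" > "$demo_lock"               # our own PID is certainly alive
lock_is_held "$demo_lock" && echo "lock is held by a live process"
rm -f "$demo_lock"
lock_is_held "$demo_lock" || echo "no lockfile, free to run"
```

Unlike the "ps | grep" approach, this does not verify that the live PID
really belongs to another copy of the scrub script, so a recycled PID
can produce a false positive. It also assumes the caller is allowed to
signal the process, which holds for a root cron job.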

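The "while scrub_in_progress ...; do sleep 60; done" loop in the script
waits for as long as the scrub takes. If you would rather give up after
some maximum number of polls, a wrapper along these lines might do.
wait_while is a hypothetical helper, not part of our package, and the
fake check below only stands in for scrub_in_progress so that the sketch
runs without ZFS:

```shell
#!/bin/bash
# wait_while MAX_TRIES INTERVAL COMMAND [ARGS...]
# Polls COMMAND until it fails (returns 0) or MAX_TRIES polls have
# succeeded (returns 1). Hypothetical helper for illustration.
wait_while() {
    max_tries=$1; interval=$2; shift 2
    tries=0
    while "$@"; do
        tries=$(($tries+1))
        [ "$tries" -ge "$max_tries" ] && return 1
        sleep "$interval"
    done
    return 0
}

# Stand-in for scrub_in_progress: reports "busy" twice, then "done".
remaining=3
fake_scrub_in_progress() {
    remaining=$(($remaining-1))
    [ "$remaining" -gt 0 ]
}

wait_while 10 0 fake_scrub_in_progress && echo "scrub finished while polling"
```

In the real script the call would look like
wait_while 1440 60 scrub_in_progress "$pool"
which polls once a minute and gives up after about a day; both numbers
are arbitrary examples.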
HTH,
//Jim Klimov


