[OpenIndiana-discuss] Checksum errors on high-volume / high-speed writes

wim at vandenberge.us wim at vandenberge.us
Wed Aug 14 17:49:44 UTC 2013


Good morning,
Last week we put three identical oi_151a7 systems into pre-production. Each
system has 240 drives in 9-drive RAIDZ1 vdevs (I'm aware of the potential DR
issues with this configuration and I'm ok with them in this case). The drives
are Seagate Enterprise nearline SAS, 7200RPM. The servers are all identical
Supermicro servers with dual 4C Xeons, maxed-out memory and LSI 9200-8E HBAs.
Intel MLC SSDs for boot and cache, Intel SLC for ZIL. All of these are components
we have used many times before without issue.

While loading up the systems with data we started to see low numbers of checksum
errors across all drives. The first time we saw it we pulled the drives and ran
low-level tests on them: no errors. A scrub finds no issues. iostat -EXN shows
no hard, soft or transport errors. iostat -xnz shows no anomalous drives.
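
To see whether the checksum errors really are spread evenly or cluster on a
particular vdev, HBA or enclosure, the per-device CKSUM column of `zpool status`
can be tallied. The following is only a minimal sketch, assuming the usual
NAME / STATE / READ / WRITE / CKSUM column layout; the sample text, pool name and
device names are invented for illustration and are not output from these
systems, and `cksum_counts` is a hypothetical helper name.

```python
import re

# Hypothetical sample of `zpool status` output (pool and device names invented).
SAMPLE = """\
  pool: tank
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c1t0d0  ONLINE       0     0     3
            c1t1d0  ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     2
"""

# One config row: name, state, then three plain integer counters.
# Limitation: abbreviated counters such as "1.2K" are not matched by this sketch.
ROW = re.compile(r"^\s+(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def cksum_counts(status_text):
    """Return {name: cksum_count} for every row whose CKSUM column is nonzero."""
    counts = {}
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match and int(match.group(5)) > 0:
            counts[match.group(1)] = int(match.group(5))
    return counts


if __name__ == "__main__":
    for name, count in sorted(cksum_counts(SAMPLE).items()):
        print(name, count)
```

On a live system the same function could be fed the text captured from
`zpool status`, and the resulting names grouped by controller or enclosure to
check whether the errors follow a shared path rather than individual drives.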

During the load test we're pushing between 14 and 16Gb/sec to each system and
the CPU load average does go up significantly (to about 8), but that is to be
expected with a RAIDZ1 volume this big and this busy.

I don't want to put the systems into production until I figure out if I have a
problem or not. Thoughts / ideas?

Thank you,

Willem

