[OpenIndiana-discuss] ZFS stalls with oi_151?
George Wilson
george.wilson at delphix.com
Fri Oct 21 17:52:53 UTC 2011
It would be good to get a crash dump of this so that we can figure out what is really happening.
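For reference, a kernel crash dump can usually be captured on OpenIndiana either from the live system or by forcing a panic while the hang is in progress. A minimal sketch, assuming the default dumpadm(1M) configuration (paths and behavior may differ on your setup):

```shell
# Check where crash dumps go and that dumps are enabled
dumpadm

# Capture a snapshot of the running kernel to the dump device and
# extract it into the savecore directory -- no reboot needed
savecore -L

# Alternatively, force a panic so the hung state itself is captured:
#   reboot -d
# After the box comes back up, savecore writes unix.N/vmcore.N
# under the directory shown by dumpadm.
```

The resulting dump can then be opened with `mdb unix.N vmcore.N`; something like `::stacks -m zfs` should show which ZFS kernel threads are blocked and where.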
- George
On Oct 21, 2011, at 12:38 PM, Michael Stapleton wrote:
> Hi,
>
> I had similar hard lockups when I accidentally tried to delete a ZFS
> volume while running a zfs send at the same time.
> There seems to be a missing lock.
>
> In the end I had to ensure that I did not run concurrent zfs commands.
>
> Are there any other ZFS commands hung at the same time?
>
>
> Mike
>
> On Fri, 2011-10-21 at 10:16 +0200, Tommy Eriksen wrote:
>
>> Hi guys,
>>
>> I've got a bit of a ZFS problem:
>> All of a sudden, and it doesn't seem to be related to load or anything, the system stops writing to the disks in my storage pool. No error messages are logged (none that I can find, anyway): nothing in dmesg, messages, or the like.
>>
>> ZFS stalls: a simple snapshot command (or the like) just hangs indefinitely and can't be stopped with ctrl+c or kill -9.
>>
>> Today, the stall happened after I had been running 2 VMs on each box (on vSphere 5, connecting via iSCSI), each running iozone -s 200G just to generate a bunch of load. Happily, this morning I saw that they were still running without problems, and I stopped them. Then, when I asked vSphere to delete the VMs, all write I/O stalled. A bit too much irony for me :)
>>
>> However, and this puzzled me, everything else seems to run perfectly, even up to ZFS writing new data to the L2ARC devices while data is read.
>>
>> Boxes (2 identical) are:
>> Supermicro-based, 24-bay chassis
>> 2x Intel Xeon X5645
>> 48 GB of RAM
>> 3x LSI SAS2008 controllers attached to:
>> 20x Seagate Constellation ES 3TB SATA
>> 2x Intel 600GB SSD
>> 2x Intel 311 20GB SSD
>>
>> 18 of the 3TB drives are set up in mirrored vdevs, the last 2 are spares.
>>
>> Running oi_151a (I'm trying a downgrade to 148 today, I think, since I have 5 or so boxes running without problems on 148, but both my 151a boxes are playing up).
>>
>> /etc/system variables:
>> set zfs:zfs_vdev_max_pending = 4
>> set zfs:l2arc_noprefetch = 0
>> set zfs:zfs_vdev_cache_size = 0
>>
>>
>> I can write to a (spare) disk on the same controller without errors, so I take it it's not a general I/O stall on the controller:
>> root@zfsnas3:/var/adm# dd if=/dev/zero of=/dev/rdsk/c8t5000C50035DE14FAd0s0 bs=1M
>> ^C1640+0 records in
>> 1640+0 records out
>> 1719664640 bytes (1.7 GB) copied, 11.131 s, 154 MB/s
>>
>> iostat reported the following. Note that there are no writes to any of the other drives; all writes just stall.
>>
>> extended device statistics ---- errors ---
>> r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b s/w h/w trn tot device
>> 3631.6 167.2 14505.5 152337.1 0.0 2.2 0.0 0.6 0 157 0 0 0 0 c8
>> 109.0 0.8 472.9 0.0 0.0 0.0 0.0 0.5 0 3 0 0 0 0 c8t5000C50035B922CCd0
>> 143.0 0.8 567.1 0.0 0.0 0.1 0.0 0.5 0 3 0 0 0 0 c8t5000C50035CA8A5Cd0
>> 89.6 0.8 414.1 0.0 0.0 0.1 0.0 0.6 0 2 0 0 0 0 c8t5000C50035CAB258d0
>> 95.8 0.8 443.3 0.0 0.0 0.0 0.0 0.5 0 2 0 0 0 0 c8t5000C50035DE3DEBd0
>> 144.8 0.8 626.4 0.0 0.0 0.1 0.0 0.6 0 4 0 0 0 0 c8t5000C50035BE1945d0
>> 134.0 0.8 505.7 0.0 0.0 0.0 0.0 0.4 0 3 0 0 0 0 c8t5000C50035DDB02Ed0
>> 1.0 0.4 3.4 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 c8t5000C50035DE0414d0
>> 107.8 0.8 461.6 0.0 0.0 0.0 0.0 0.3 0 2 0 0 0 0 c8t5000C50035D40D15d0
>> 117.2 0.8 516.5 0.0 0.0 0.1 0.0 0.5 0 3 0 0 0 0 c8t5000C50035DE0C86d0
>> 64.2 0.8 261.2 0.0 0.0 0.0 0.0 0.6 0 2 0 0 0 0 c8t5000C50035DD6044d0
>> 2.0 0.8 6.8 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 c8t5001517959582943d0
>> 2.0 0.8 6.8 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 c8t5001517959582691d0
>> 109.8 0.8 423.5 0.0 0.0 0.0 0.0 0.3 0 2 0 0 0 0 c8t5000C50035C13A6Bd0
>> 765.0 0.8 3070.9 0.0 0.0 0.2 0.0 0.2 0 7 0 0 0 0 c8t5001517959699FE0d0
>> 1.0 149.2 3.4 152337.1 0.0 1.0 0.0 6.5 0 97 0 0 0 0 c8t5000C50035DE14FAd0
>> 210.4 0.8 775.4 0.0 0.0 0.1 0.0 0.4 0 3 0 0 0 0 c8t5000C50035CA1E58d0
>> 689.4 0.8 2776.6 0.0 0.0 0.1 0.0 0.2 0 7 0 0 0 0 c8t50015179596A8717d0
>> 108.6 0.8 430.5 0.0 0.0 0.0 0.0 0.4 0 2 0 0 0 0 c8t5000C50035CBD12Ad0
>> 165.6 0.8 561.5 0.0 0.0 0.1 0.0 0.4 0 3 0 0 0 0 c8t5000C50035CA90DDd0
>> 164.4 0.8 578.5 0.0 0.0 0.1 0.0 0.4 0 4 0 0 0 0 c8t5000C50035DDFC34d0
>> 125.6 0.8 477.7 0.0 0.0 0.0 0.0 0.4 0 2 0 0 0 0 c8t5000C50035DE2AD3d0
>> 93.2 0.8 371.3 0.0 0.0 0.0 0.0 0.4 0 2 0 0 0 0 c8t5000C50035B94C40d0
>> 113.2 0.8 445.3 0.0 0.0 0.1 0.0 0.5 0 3 0 0 0 0 c8t5000C50035BA02AEd0
>> 75.4 0.8 304.8 0.0 0.0 0.0 0.0 0.4 0 2 0 0 0 0 c8t5000C50035DDA579d0
>>
>>
>> …Is anyone else seeing similar?
>>
>> Thanks a lot,
>> Tommy
>> _______________________________________________
>> OpenIndiana-discuss mailing list
>> OpenIndiana-discuss at openindiana.org
>> http://openindiana.org/mailman/listinfo/openindiana-discuss
>
>