[OpenIndiana-discuss] Diagnosis help needed

michelle michelle at msknight.com
Sun Jun 24 17:00:35 UTC 2012


The motherboard is a Gigabyte GA-H55M-UD2H and has five SATA sockets, 
two IDE and one E-sata.

The SATA controller is, I believe, Intel.

The motherboard has five internal SATA ports. Two hold the rpool SSDs,
which have plenty of free space.

The other three are given over to a "tank" mounted at /mirror.

The external toaster is a Sharkoon QuickPort Duo II. Drive 1 is
connected via eSATA; when I connect the second, it goes via USB
because I only have one eSATA port.

I believe that disconnecting the USB has stopped the IRQ resource
problem, but now I am getting this...

Jun 24 15:59:46 jaguar ahci: [ID 687168 kern.warning] WARNING: ahci0: 
ahci port 3 is trying to do error recovery
Jun 24 15:59:46 jaguar ahci: [ID 693748 kern.warning] WARNING: ahci0: 
ahci port 3 task_file_status = 0x4041
Jun 24 15:59:46 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0: 
error recovery for port 3 succeed
Jun 24 15:59:46 jaguar ahci: [ID 811322 kern.info] NOTICE: ahci0: 
ahci_tran_reset_dport port 3 reset device
Jun 24 15:59:51 jaguar ahci: [ID 296163 kern.warning] WARNING: ahci0: 
ahci port 3 has task file error
Jun 24 15:59:51 jaguar ahci: [ID 687168 kern.warning] WARNING: ahci0: 
ahci port 3 is trying to do error recovery
Jun 24 15:59:51 jaguar ahci: [ID 693748 kern.warning] WARNING: ahci0: 
ahci port 3 task_file_status = 0x4041
Jun 24 15:59:51 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0: 
error recovery for port 3 succeed
Jun 24 15:59:54 jaguar ahci: [ID 296163 kern.warning] WARNING: ahci0: 
ahci port 3 has task file error
Jun 24 15:59:54 jaguar ahci: [ID 687168 kern.warning] WARNING: ahci0: 
ahci port 3 is trying to do error recovery
Jun 24 15:59:54 jaguar ahci: [ID 693748 kern.warning] WARNING: ahci0: 
ahci port 3 task_file_status = 0x4041
Jun 24 15:59:54 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0: 
error recovery for port 3 succeed
Jun 24 16:00:00 jaguar ahci: [ID 296163 kern.warning] WARNING: ahci0: 
ahci port 3 has task file error
Jun 24 16:00:00 jaguar ahci: [ID 687168 kern.warning] WARNING: ahci0: 
ahci port 3 is trying to do error recovery
Jun 24 16:00:00 jaguar ahci: [ID 693748 kern.warning] WARNING: ahci0: 
ahci port 3 task_file_status = 0x4041
Jun 24 16:00:00 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0: 
error recovery for port 3 succeed
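
For anyone reading the log above, the task_file_status value can be split
into an ATA status byte and an ATA error byte. What follows is a minimal
sketch in Python, assuming the driver prints the raw AHCI port task file
data register (status in bits 7:0, error in bits 15:8) and assuming the
usual ATA bit names. The function name decode_task_file_status is made up
for illustration and is not part of any illumos tool.

```python
# Bit names follow the common ATA status/error register layout.
# Treat both tables as assumptions, not as something the driver confirms.
STATUS_BITS = {7: "BSY", 6: "DRDY", 5: "DF", 3: "DRQ", 0: "ERR"}
ERROR_BITS = {7: "ICRC", 6: "UNC", 4: "IDNF", 2: "ABRT"}


def decode_task_file_status(value):
    """Split a task file value into (status, error, status_flags, error_flags)."""
    status = value & 0xFF         # low byte: ATA status register
    error = (value >> 8) & 0xFF   # next byte: ATA error register
    status_flags = [name for bit, name in sorted(STATUS_BITS.items(), reverse=True)
                    if status & (1 << bit)]
    error_flags = [name for bit, name in sorted(ERROR_BITS.items(), reverse=True)
                   if error & (1 << bit)]
    return status, error, status_flags, error_flags


status, error, sflags, eflags = decode_task_file_status(0x4041)
print(f"status=0x{status:02x} {sflags}  error=0x{error:02x} {eflags}")
```

If that reading of the register is right, 0x4041 would mean the drive on
port 3 is ready but reporting an uncorrectable data error (UNC). That would
point more towards the disk itself than towards the pool being full, so the
scrub result and the drive's SMART data would be worth checking.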


The set is usually automatically scrubbed once a month.

I have removed about 200 gig of data and it seems to be stable-ish.

I'll begin another scrub of the tank now. It will likely take 10 hours.



On 24/06/12 17:44, Jan Owoc wrote:
> On Sun, Jun 24, 2012 at 1:01 AM, michelle<michelle at msknight.com>  wrote:
>> Situation - home-type standard PC, 4gig of RAM, running two SSDs in a
>> mirrored root raid pool. Three 2tb hard drives in a raidz.
>>
>> System is...
>>
>>              OpenIndiana Development oi_151.1.4 X86 (powered by illumos)
> [...]
>> I have an external, "toaster" which takes two hard drives, one is connected
>> via e-sata; the other is running via USB (although for this instance, there
>> was no drive in the socket) because I've had a long running battle to try
>> and get an affordable (to me) e-sata card that will give me another e-sata
>> channel.
> So it's a 2-bay RAID enclosure with either USB or eSATA connections.
> One of the two bays is occupied, and how is the enclosure being
> connected?
>
>
>> The ZFS set was getting full; something like only 50gig free.
> You probably have two zpools - one is the mirrored "rpool", while the
> other is your data pool, say, "tank". Am I understanding that it's
> "tank" that has 50 GB (out of 4TB) free, while "rpool" does not have
> any problems?
>
>
>> I was starting
>> file copies off the server to an external drive via an SMB client, and going
>> to bed, to wake up and find the process had frozen. Diagnosis led me to the
>> server, which appeared to hang on any log on attempt. It didn't even
>> respond to the power button properly.
> Is this external drive the RAID enclosure discussed above, that you
> connected via Ethernet and it shows up as an SMB device, or is this on
> a separate computer?
>
>
> My thoughts are:
>
> 1) if "rpool" is not full, then the system should not freeze.
> Depending on any snapshots, even removing files from "tank" may fail
> and if you do it via SMB, as opposed to over a local command line, you
> won't know why. Could you try logging on to the system, and copying
> the files from the server to a client, so you see any local error
> messages on the command line?
>
> 2) if, for some reason, data loss crept in, ZFS will refuse to return
> bad data. Could you run a "zpool scrub" on each of the pools and then
> verify that they are healthy?
>
>
>> Could the lack of free space on the ZFS set also have caused a problem, or
>> is it likely that the weight of another problem, possibly the USB external
>> drive connection, caused it to keel over?
> Not sure if it will help, but could you give details on what the
> enclosure is (brand/model), and what motherboard/USB controller are on
> the server?
>
>
> Jan
>


