[OpenIndiana-discuss] What happens when a ZIL drive dies?

Nick Hall darknovanick at gmail.com
Tue Jun 5 17:32:20 UTC 2012


On Mon, Jun 4, 2012 at 10:48 AM, Jan Owoc <jsowoc at gmail.com> wrote:

>
> The data on the main pool is always consistent in that a certain
> operation either made it to the disk or it didn't. However, if your
> application depends on the fact that writes make it out to disk in a
> specific order (that's why it's sync'ing, right?), then it's the ZIL
> that would contain a log/journal of what should have been written to
> the disk and in what order. If you lose this, your file system remains
> consistent, but some writes may have made it out to the disk before
> others.
>
>
>
Thanks, everyone, for all the responses. They were very helpful. My main
use cases are ESXi, which as stated below does a lot of syncs, and
MySQL. I had previously used the zilstat tool, but this is the first time
I've heard of nfssrvtop, and I really appreciate the pointer, as it works
well for analyzing these usage patterns. After more analysis, it seems
that most of my writes are actually async, so adding an SLOG probably
wouldn't speed things up dramatically. For now I'm going to stick with
what I have, so thank you for that advice.

For my own knowledge, and for anyone else who finds this thread later,
I'd like some clarification on the above quote. Say I have an application
that writes to file A, then to file B, then to file C, and finally calls
fsync. If I'm understanding correctly, are you saying that if the computer
crashed and the SLOG got fried at the same time (after files A, B and C
were written to, but before the sync finished), then upon restart the
write to file B may have taken effect on the pool while the write to
file A had not? Or am I misunderstanding? Usually when I think of
journals, I would expect the change to file B to be rolled back, because
the journal has no record indicating that the sync was successful. I
understand the possibility of losing the last few seconds of writes in
this scenario -- I'm just trying to wrap my head around the possibility
of losing *part* of the last few seconds of data, and the much worse
implications this has. Thanks,

Nick
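
P.S. To make the scenario concrete, here is a minimal Python sketch of the
write pattern I have in mind. The function name write_in_order and the
file names are made up for illustration. Since fsync applies to a single
file descriptor, the sketch assumes the application syncs each of the
three files at the end. It says nothing about what ZFS does internally; it
only shows the order of calls the application makes.

```python
import os
import tempfile


def write_in_order(directory, names=("A", "B", "C")):
    """Write to each file in order, then fsync every descriptor at the end.

    Hypothetical illustration of the A -> B -> C -> fsync pattern from the
    question; not taken from any real application.
    """
    fds = []
    try:
        for name in names:
            path = os.path.join(directory, name)
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            fds.append(fd)
            # At this point the data may exist only in memory.
            os.write(fd, f"update to {name}\n".encode())

        # A crash anywhere above this line means none of the writes were
        # ever requested to be durable.
        for fd in fds:
            os.fsync(fd)  # returns once the OS reports this file as on stable storage
    finally:
        for fd in fds:
            os.close(fd)
    return [os.path.join(directory, name) for name in names]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_in_order(tmp):
            print(path)
```

The window I'm asking about is a crash after the three os.write calls but
before the fsync loop has finished.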

