[OpenIndiana-discuss] Disk Space Disappearing.

Dormition Skete (Hotmail) dormitionskete at hotmail.com
Wed Jul 30 17:48:46 UTC 2014


On Jul 30, 2014, at 4:49 AM, Jim Klimov <jimklimov at cos.ru> wrote:

> 
> Hello Peter, nice to hear from you again!
> 
> after a quick look at zfs-list'ings, I see that many of your current ZBE's have more and more used space while referenced remains the same. Do you have zfs-auto-snap running? To me it seems like your zones keep lots of short-lived data (access logs that are recompressed and deleted, email queues from local daemons, etc.) and that ends up referenced only from snapshots. On the up-side, modern zfs-snap services will actively destroy older auto-snaps to free up the threshold percentage of pool size (configurable in their smf settings).
> 
> It might help to enable compression on the zone-roots if you haven't done so already, and/or split the zone filesystem into more datasets similar in ideology to my split-root setup for global zones (there are implementation differences that i haven't published yet iirc). For example, if you split off the webserver log directories, you can manage a different compression as well as a different auto-snap schedule from the rest of your system.
> 
> HTH, Jim
> --
> Typos courtesy of K-9 Mail on my Samsung Android
> 

Jim, it’s nice to hear your friendly voice on this list again!  I haven’t seen you on it since I rejoined the list on July 4th.  I was starting to wonder if something had happened to you, or if you were just traveling or something.

I was up most of the night, so I’m really tired and not very coherent right now, but I have never quite understood what the REFER column in the “zfs list” output means.  I have always assumed it was whatever has changed from whatever the source (snapshot?) for that dataset was.  Am I even close?
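For what it’s worth, this is the listing I keep staring at, with my best guess at the columns in the comments; corrections welcome:

```shell
# My current understanding of the two columns -- please correct me:
zfs list -o name,used,refer,usedbysnapshots
# USED  - space this dataset and its descendants consume, *including*
#         space that is now held only by snapshots
# REFER - data the live filesystem currently points at; it can be
#         shared with snapshots, so it need not grow as USED grows

# A more detailed breakdown of where USED actually goes:
zfs list -o space
```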

A little background:

After almost a year of procrastinating, I decided last month to take a little time to write a script that would let me easily shut down a zone, detach it, take snapshots of it, do a “zfs send” on it to a file that I could save on another computer to restore from if necessary, reattach the zone, and reboot it.  I then ran that script on each of our non-global zones, and copied those snapshots to my laptop.
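The script is essentially the following sketch (the zone name, dataset, and backup path here are placeholders, not our real ones; the DRYRUN switch just prints each step instead of running it):

```shell
#!/bin/sh
# Sketch of the zone backup script described above.  Zone name,
# dataset layout, and the /backup path are hypothetical placeholders.
# DRYRUN=1 (the default here) only prints each step.
DRYRUN=${DRYRUN:-1}

run() {
    if [ "$DRYRUN" = 1 ]; then echo "$@"; else "$@"; fi
}

zone_backup() {
    zone=$1
    dataset=rpool/zones/$zone            # hypothetical zone dataset
    snap=backup-$(date +%Y%m%d)          # snapshot name
    outfile=/backup/$zone-$snap.zfs      # stream file to copy elsewhere

    run zoneadm -z "$zone" halt          # shut the zone down
    run zoneadm -z "$zone" detach        # detach it from the framework
    run zfs snapshot -r "$dataset@$snap" # recursive snapshot of the zone
    if [ "$DRYRUN" = 1 ]; then
        echo "zfs send -R $dataset@$snap > $outfile"
    else
        zfs send -R "$dataset@$snap" > "$outfile"  # stream to a file
    fi
    run zoneadm -z "$zone" attach        # reattach the zone
    run zoneadm -z "$zone" boot          # and boot it again
}

zone_backup myzone    # dry run: prints the six steps
```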

I think that both killed me and helped me.

Early in the morning of July 3rd or 4th, our main server went down.  Later analysis seems to show that I ran it out of disk space.  I’m guessing it is from the same problem I noticed last night on our current main server.

*I did have automatic snapshots turned on on that computer.  I have never had them turned on on the computer we’re currently using as our main server.*  I have since turned automatic snapshots off on all of our OI servers, and have deleted all of the automatic snapshots from them.

So anyway, after that server went down on the 3rd/4th — to make a long story short — I took those snapshots that I had saved on my laptop, did a “zfs receive” on them on what we are currently using as our main server, attached them, booted them, updated the data itself, etc.

So, the source of all of the non-global zones on our current main server is those snapshots.

Would the change in data (email, database files, etc.), or most especially, the nightly backups that I described in my previous post, be likely to have caused those big changes in referenced data?

And, since I raced down to the server room and deleted the source snapshots that those zones were created from (which freed up a lot of disk space), can I reasonably expect the server to be stable now, and that I won’t drop 11 GB tonight when our nightly backup routines back up all of our data again?  (The routine deletes EmailArchive02.tgz, renames EmailArchive01.tgz to EmailArchive02.tgz, and backs up the current data to EmailArchive01.tgz.)
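In case it matters, that nightly rotation is essentially this (the backup directory and the source path here are placeholders for our real ones):

```shell
#!/bin/sh
# Sketch of the nightly two-generation rotation described above.
# BACKUPDIR and SRCDIR are hypothetical stand-ins for our real paths.
BACKUPDIR=${BACKUPDIR:-/backup}
SRCDIR=${SRCDIR:-/var/mail}

rotate_and_backup() {
    rm -f "$BACKUPDIR/EmailArchive02.tgz"          # drop the oldest copy
    [ -f "$BACKUPDIR/EmailArchive01.tgz" ] &&
        mv "$BACKUPDIR/EmailArchive01.tgz" \
           "$BACKUPDIR/EmailArchive02.tgz"         # age last night's 01 -> 02
    tar czf "$BACKUPDIR/EmailArchive01.tgz" \
        -C "$SRCDIR" .                             # fresh backup into 01
}

# rotate_and_backup   # (invoked nightly from cron on the real server)
```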

Also, if I do a “beadm destroy -s OpenIndiana-151a7-2014-0705A”, will that be a safe thing for me to do, and if so, would it free up 10 GB of space on the drive?

Currently, our system shows:

# beadm list
BE                           Active Mountpoint Space Policy Created
OpenIndiana-151a7-2014-0705A -      -          10.9M static 2014-07-05 17:13
OpenIndiana-151a7-2014-0706A NR     /          4.39G static 2014-07-06 22:05
openindiana                  -      -          12.0M static 2014-07-04 17:39
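Before I run that destroy, my plan is to double-check what the old BE actually holds first (assuming the usual rpool/ROOT layout; my understanding is that -s also destroys the BE’s snapshots, but please correct me):

```shell
# See the BE's subordinate datasets and snapshots, not just the summary:
beadm list -a

# See which snapshots under the boot environments are holding space
# (rpool/ROOT assumed; the pool name may differ on other setups):
zfs list -t snapshot -r rpool/ROOT

# Then, if it all looks disposable:
beadm destroy -s OpenIndiana-151a7-2014-0705A
```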

I only got a couple of hours of sleep before having to get up for other duties, so I’m going to try to grab an hour or two of sleep now; but if somebody could shed some light on this mystery for me, I would really appreciate it.

This has been a brutal month of server woes.  I want to go back to programming!



