All posts for the month September, 2019

I have a small Linux server at home, chiefly for file storage and running a Plex server. In the past, I’ve used a Linux software RAID-1 (mirror) of two drives to provide more-robust storage for important files (documents, photos, etc.), and a single drive for media files (which could be re-ripped from their original CD/DVD media, if lost). Recently I wanted to transition to a combined ZFS storage array for everything, for the following reasons:

  • Filesystem checksums (during normal read/write operation, plus weekly scheduled full-array scrubs) to ensure that there is no “bit rot” where files silently change over time.
  • Full mirroring protection of all files
  • In-place expansion capability

After discussions of my needs and capabilities with a ZFS “expert” friend, here’s the plan I decided to go with:

  • Create a single ZFS pool (top-level filesystem)
  • This ZFS pool will consist of two virtual devices (VDEVs in ZFS parlance). Each VDEV is a “mirroring” VDEV, where all drives in the VDEV are mirrored with each other, providing RAID-like drive redundancy within each VDEV.
  • Set up cron jobs to periodically scrub the ZFS pool to verify the checksums and ensure the pool is in good health.
  • Initially set up 2x 4TB drives in one VDEV, and 2x 2TB drives in the other VDEV, resulting in 6 TB total storage in the pool.
  • When more space is required:
    • Add 2x larger drives to the smaller VDEV
    • Wait for the VDEV to resilver [1]
    • Remove the smaller drives from the VDEV
    • Expand the VDEV to the size of the new larger drives
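
To sanity-check the capacity numbers in this plan, here’s a rough sketch in POSIX shell. It assumes the usual rule of thumb that a mirror VDEV is only as big as its smallest drive, and that the pool is the sum of its VDEVs. The function names mirror_capacity and pool_capacity are my own (hypothetical), sizes are whole TB, and filesystem overhead is ignored.

```shell
#!/bin/sh
# Rough capacity arithmetic for a pool built from mirror VDEVs.
# Assumption: sizes are whole numbers of TB and overhead is ignored.

# mirror_capacity SIZE...: a mirror VDEV is only as big as its smallest drive.
mirror_capacity() {
    min=""
    for size in "$@"; do
        if [ -z "$min" ] || [ "$size" -lt "$min" ]; then
            min=$size
        fi
    done
    echo "$min"
}

# pool_capacity "SIZES" "SIZES"...: each argument describes one mirror VDEV
# as a space-separated list of drive sizes; the pool is the sum of its VDEVs.
pool_capacity() {
    total=0
    for vdev in "$@"; do
        # $vdev is intentionally unquoted so it splits into one word per drive.
        total=$((total + $(mirror_capacity $vdev)))
    done
    echo "$total"
}
```

For example, pool_capacity "4 4" "2 2" prints 6 (the initial layout), pool_capacity "4 4" "2 2 4 4" still prints 6 (the new drives are attached but the old ones are still in the mirror), and pool_capacity "4 4" "4 4" prints 8 once the small drives are gone and the VDEV has been expanded.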

To help plan this out, and learn the ZFS terminology, I created a series of statements about ZFS:

  1. A ZFS pool is made from one or more virtual devices (VDEVs), which are, in our case, 2+ physical drives mirrored together.
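  2. A ZFS pool expands when its existing VDEVs become larger, or when you add another VDEV. Treat VDEVs as permanent: older ZFS releases can’t remove a data VDEV from a pool at all, and newer OpenZFS releases only support it in limited cases (zpool remove for mirror and single-drive VDEVs). What you can do is *replace* a drive within a VDEV, which is swapping one drive out for another, using zpool replace.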
  3. You can add a new drive to a mirroring VDEV and it’ll “re-silver”, copying the existing data onto the new drive until it is a full member of the mirror. This takes a while.
  4. You can remove a drive from a mirroring VDEV and it keeps going. Of course, if you remove the last drive of a mirroring VDEV, it can’t keep going.
  5. With autoexpand=on (it’s a pool property), a mirror VDEV expands when all the member drives are large enough. Or leave autoexpand off and do it manually using zpool online -e.
  6. zpool replace swaps one drive for another inside a VDEV; it does not change the internal architecture of the VDEV (you can’t use it to switch from RAIDZx to/from mirror). If you just add new big drives to the existing VDEV, let them resilver, and then remove the old smaller drives, it of course stays as a mirroring VDEV.
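
A low-risk way to check statements like these is to try them on a throwaway pool built from sparse image files instead of real drives (ZFS accepts plain files as devices when given absolute paths). The sketch below is only that, a sketch: the directory /tmp/zfs-lab, the pool name labpool, the image sizes, and the run and zfs_lab helpers are all names I made up for illustration. By default it only prints the commands; nothing touches ZFS unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Practice lab: try mirror attach/detach/expand on image files, not real drives.
# Assumptions (hypothetical names): lab directory, pool name, and image sizes.
LAB_DIR="${LAB_DIR:-/tmp/zfs-lab}"
LAB_POOL="${LAB_POOL:-labpool}"

# run CMD...: print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

zfs_lab() {
    run mkdir -p "$LAB_DIR"
    # Two small "drives" and two larger ones, created as sparse files.
    run truncate -s 256M "$LAB_DIR/small1.img" "$LAB_DIR/small2.img"
    run truncate -s 512M "$LAB_DIR/big1.img" "$LAB_DIR/big2.img"
    # Statement 1: a pool made from one mirror VDEV.
    run zpool create "$LAB_POOL" mirror "$LAB_DIR/small1.img" "$LAB_DIR/small2.img"
    # Statement 3: attach the bigger drives; ZFS resilvers onto them.
    run zpool attach "$LAB_POOL" "$LAB_DIR/small1.img" "$LAB_DIR/big1.img"
    run zpool attach "$LAB_POOL" "$LAB_DIR/small1.img" "$LAB_DIR/big2.img"
    run zpool status "$LAB_POOL"
    # Statement 4: detach the small drives (on a real pool, wait for the
    # resilver to finish first); the mirror keeps going on the big ones.
    run zpool detach "$LAB_POOL" "$LAB_DIR/small1.img"
    run zpool detach "$LAB_POOL" "$LAB_DIR/small2.img"
    # Statement 5: expand manually instead of relying on autoexpand.
    run zpool online -e "$LAB_POOL" "$LAB_DIR/big1.img"
    run zpool online -e "$LAB_POOL" "$LAB_DIR/big2.img"
    run zpool list "$LAB_POOL"
    # Clean up the throwaway pool.
    run zpool destroy "$LAB_POOL"
}
```

Calling zfs_lab on its own just lists the steps. Running it for real (DRY_RUN=0 zfs_lab, as root) should only ever be done against a pool name and directory you are happy to destroy.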

Here are the commands used to build, maintain, and expand the ZFS pool:

  • Create the zpool from two mirrors of two drives each:
    • sudo zpool create mypool mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd
    • Note: The command above uses the /dev/sdX device names, which may change based on device initialization order at boot time, so it’s strongly suggested to instead use the device files in /dev/disk/by-id/, which will not change.
  • Create the ZFS filesystem (dataset) in the zpool:
    • sudo zfs create mypool/zfs
  • Set the mountpoint:
    • sudo zfs set mountpoint=/whatever mypool/zfs
    • Note that you don’t need to put anything into /etc/fstab for this ZFS mountpoint; it’ll be mounted automatically when ZFS starts up at boot.
    • I don’t know how to use ZFS for your boot drive (/) as I only use it for non-OS data.
  • Add an optional drive for ZFS intent log:
    • sudo zpool add -f mypool log /dev/nvme0n1
    • A friend loaned me a nifty PCIe SSD to experiment with, so I added it to store my ZFS intent log (much like the ext4 journal). I don’t think my typical “frequent-read with rare-writes” workload really takes advantage of this cool device; it was mostly a fun experiment.
  • Add these entries to root’s crontab with sudo crontab -e:
    • # Every Monday at 00:00, start a ZFS scrub
      0 0 * * 1 /sbin/zpool scrub mypool
      # Every Monday at 18:00 (6pm), email the zpool status output (cron mails a job's output)
      0 18 * * 1 /sbin/zpool status
  • When we want to upgrade the 2 x 2 TB drives to 2 x 4 TB drives:
    • The syntax here is somewhat odd. We attach each new drive to one of the old smaller drives, which is how the new drives get added to the mirror.
    • sudo zpool attach mypool -f oldDriveName1 newDriveName1
      sudo zpool attach mypool -f oldDriveName1 newDriveName2
    • Use sudo zpool status to monitor the resilvering process. You should see that the VDEV with the smaller drives now has the newly-added larger drives.
    • Once resilver is complete, run a scrub (just in case), to confirm everything is working right and your data is safe.
    • Remove the old drives:
      • sudo zpool detach mypool oldDriveName1
        sudo zpool detach mypool oldDriveName2
    • Expand the new drives:
      • sudo zpool online -e mypool newDriveName1
        sudo zpool online -e mypool newDriveName2
    • You can now use sudo zpool list or df -h to see that the pool has expanded in size, and now you can store more data!
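
Since the old drives shouldn’t be detached until the resilver is done, it can be handy to have the shell do the waiting. Here’s a minimal sketch, assuming that zpool status reports a running resilver with the phrase “resilver in progress” on its scan line. The helper names resilver_in_progress and wait_for_resilver, and the 60-second default polling interval, are my own choices and not part of ZFS.

```shell
#!/bin/sh
# Wait for a resilver to finish before detaching the old drives.
# Assumption: `zpool status` prints "resilver in progress" while one is running.

# resilver_in_progress: read `zpool status` text on stdin and succeed (exit 0)
# only if it reports a resilver that is still running.
resilver_in_progress() {
    grep -q "resilver in progress"
}

# wait_for_resilver POOL [SECONDS]: poll the pool until no resilver is running.
wait_for_resilver() {
    pool=$1
    interval=${2:-60}
    while zpool status "$pool" | resilver_in_progress; do
        echo "resilver still running on $pool; checking again in ${interval}s"
        sleep "$interval"
    done
    echo "no resilver in progress on $pool"
}
```

Something like wait_for_resilver mypool 300 would check every five minutes. It’s still worth reading the full sudo zpool status output afterwards (and running the scrub) before detaching anything, because this loop only looks for that one phrase and doesn’t check for errors.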

Hope this helps! Feel free to leave a comment if you notice any typos, or have any suggestions to add.

[1]. To “re-silver” is what you’d do to an antique mirror when it became degraded and needed to be restored :-)