Installing NixOS with ZFS mirrored boot

// TODO: add PlantUML diagrams

Overview

In this post, we're going to set up a ZFS mirrored boot system with full-disk encryption that is unlockable remotely.

Preparing the installation medium

This step may vary depending on the system you're installing NixOS on.

This post assumes that you're installing this on a normal server, with a minimal NixOS image.

The community-maintained NixOS wiki contains guides for installing NixOS under other circumstances, such as onto a server with only remote access.

You will need a USB stick before proceeding to the next step.

First, download the latest NixOS image, and flash it:

$ curl -L https://channels.nixos.org/nixos-unstable/latest-nixos-minimal-x86_64-linux.iso -o nixos.iso
$ dd if=./nixos.iso of=/dev/sdX bs=1M status=progress

If your target machine's architecture is not x86_64, replace it in the URL with the appropriate one (e.g. i686, aarch64).

After the image has been successfully flashed onto your installation medium, unplug it and boot the target machine from it.

Preparing Disks

We'll start by defining variables pointing to each disk by ID.

According to the ArchWiki, if you create zpools using device names (e.g. /dev/sda), ZFS may intermittently fail to detect them on boot.

You can grab the ID via ls -lh /dev/disk/by-id/.

DISK1=/dev/disk/by-id/ata-VENDOR-ID-OF-THE-FIRST-DRIVE
DISK2=/dev/disk/by-id/ata-VENDOR-ID-OF-THE-SECOND-DRIVE
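
Before partitioning, it may be worth double-checking that both variables resolve to the intended drives (a quick sanity check, not part of the original steps):

ls -lh "$DISK1" "$DISK2"
lsblk -o NAME,SIZE,MODEL,SERIAL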

Partitioning

Then we'll partition our disks. Since this is a mirrored setup, we have to perform exactly the same operations twice. Fortunately, a bash function comes to the rescue.

The partition structure is the following:

1GiB Boot | Remaining space: ZFS

partition() {
    sgdisk --zap-all "$1"
    sgdisk -n 1:0:+1GiB -t 1:EF00 -c 1:boot "$1"
    # Swap is omitted.
    sgdisk -n 2:0:0 -t 2:BF01 -c 2:zfs "$1"
    sgdisk --print "$1"
}

partition $DISK1
partition $DISK2

Creating vfat filesystems for boot

Boot partitions should be formatted as vfat, since UEFI firmware expects a FAT filesystem on the EFI system partition.

mkfs.vfat $DISK1-part1
mkfs.vfat $DISK2-part1

Configuring ZFS pool

This dataset structure is based on Erase your darlings.

Now that we're done partitioning our disks, we'll create a mirrored ZFS pool named 'rpool'. This will prompt you to enter a passphrase for your new ZFS pool.

zpool create \
    -o ashift=12 \
    -O mountpoint=none -O atime=off -O acltype=posixacl -O xattr=sa \
    -O compression=lz4 -O encryption=aes-256-gcm -O keyformat=passphrase \
    rpool mirror \
    $DISK1-part2 $DISK2-part2
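
At this point you can sanity-check the pool; the mirror layout and the properties we just set should show up:

zpool status rpool
zfs get compression,encryption rpool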

Then we create a 'root dataset', which will become / (root) on the target machine, and snapshot its empty state as 'blank'.

zfs create -p -o mountpoint=legacy rpool/local/root
zfs snapshot rpool/local/root@blank

Note the 'local' after rpool. In this setup, 'local' holds unimportant data that can be recreated (e.g. packages, the root filesystem), whereas 'safe' holds important data that needs to be backed up.

And mount it:

mount -t zfs rpool/local/root /mnt

Then we create mount points for the boot partitions we made earlier and mount them:

mkdir /mnt/boot
mkdir /mnt/boot-fallback

mount $DISK1-part1 /mnt/boot
mount $DISK2-part1 /mnt/boot-fallback

Create and mount a dataset for /nix:

zfs create -p -o mountpoint=legacy rpool/local/nix
mkdir /mnt/nix
mount -t zfs rpool/local/nix /mnt/nix

And a dataset for /home:

zfs create -p -o mountpoint=legacy rpool/safe/home
mkdir /mnt/home
mount -t zfs rpool/safe/home /mnt/home

And a dataset for state that needs to persist between boots:

zfs create -p -o mountpoint=legacy rpool/safe/persist
mkdir /mnt/persist
mount -t zfs rpool/safe/persist /mnt/persist
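
Everything should now be mounted under /mnt. To review the dataset tree and the resulting mounts:

zfs list -r rpool
findmnt -R /mnt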

Note: Once the rollback described below is configured, everything on the root dataset will be wiped on each boot. Make sure to put any state that must survive a reboot under /persist.

Configuring NixOS

Now that we're done with partitions and ZFS, it's time to declaratively configure the machine. This step may vary depending on your machine; please consult the docs when in doubt.

Getting the base configuration

In this post, we're going to use plain nixos-generate-config to generate the base configuration files for the machine.

nixos-generate-config --root /mnt
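
This writes two files under /mnt/etc/nixos: configuration.nix, which we'll be editing below, and hardware-configuration.nix, which contains the detected filesystems and kernel modules:

ls /mnt/etc/nixos
# configuration.nix  hardware-configuration.nix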

Erasing your darlings

In the previous step, we took a snapshot of the blank root dataset; rolling back to it on each boot is what keeps the system stateless.

Add this to configuration.nix to wipe the root dataset on each boot by rolling back to the blank snapshot after the devices are made available (make sure lib is among your module arguments, e.g. { config, lib, pkgs, ... }:):

{
  boot.initrd.postDeviceCommands = lib.mkAfter ''
    zfs rollback -r rpool/local/root@blank
  '';
}

Configuring Bootloader

In order to get ZFS to work, we need the following options to be set:

{
  boot.supportedFilesystems = [ "zfs" ];
  networking.hostId = "<8-random-hex-chars>";
}

The hostId must be exactly eight hexadecimal characters; you can derive it from your machine ID at /etc/machine-id.
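
For example:

head -c 8 /etc/machine-id
# or, if you'd rather use fresh randomness:
tr -dc 0-9a-f < /dev/urandom | head -c 8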

Then we'll configure grub:

{
  # Whether the installer can modify the EFI variables.
  # If you encounter errors, set this to `false`.
  boot.loader.efi.canTouchEfiVariables = true;

  boot.loader.grub.enable = true;
  boot.loader.grub.efiSupport = true;
  boot.loader.grub.device = "nodev";

  # This should be done automatically, but explicitly declare it just in case.
  boot.loader.grub.copyKernels = true;
  # Make sure that you've listed all of the boot partitions here.
  boot.loader.grub.mirroredBoots = [
    { path = "/boot"; devices = ["/dev/disk/by-uuid/<ID-HERE>"]; }
    { path = "/boot-fallback"; devices = ["/dev/disk/by-uuid/<ID-HERE>"]; }
    # ...
  ];
}
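
To fill in the UUIDs, look up the boot partitions we formatted earlier (reusing the $DISK variables from before):

blkid $DISK1-part1 $DISK2-part1
# or list all of them:
ls -lh /dev/disk/by-uuid/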

Handling boot partitions gracefully

By default, NixOS will throw an error when a configured partition or disk is missing. Since we want the server to boot smoothly even if a boot partition is missing, we need to set the 'nofail' option on those filesystems:

{
  fileSystems."/boot".options = [ "nofail" ];
  fileSystems."/boot-fallback".options = [ "nofail" ];
}

Enabling Remote ZFS Unlock

On each boot, ZFS will ask for a passphrase to unlock the ZFS pool. To avoid needing physical console access, we can start an SSH server in the initrd that lives until the pool is unlocked.

Note: If you rename the key files later, you may have trouble rolling back to previous generations.

To achieve that, we'll first have to generate an SSH host key for the initrd:

ssh-keygen -t ed25519 -N "" -f /mnt/boot/initrd-ssh-key

# Each boot partition should have the same key
cp /mnt/boot/initrd-ssh-key /mnt/boot-fallback/initrd-ssh-key

Then configure initrd:

{
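  # Use the kernel module for your NIC here (e.g. e1000e, igb, r8169);
  # you can find yours with `lspci -k` on the target machine.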
  boot.kernelModules = [ "<YOUR-NETWORK-CARD>" ];
  boot.initrd.kernelModules = [ "<YOUR-NETWORK-CARD>" ];

  # DHCP configuration (comment these out if using a static IP)
  networking.networkmanager.enable = false;
  networking.useDHCP = true;

  # Uncomment if using a static IP
  # boot.kernelParams = [
  #   # See <https://www.kernel.org/doc/Documentation/filesystems/nfs/nfsroot.txt> for documentation.
  #   # ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>:<dns1-ip>:<ntp0-ip>
  #   # The server ip refers to the NFS server -- not needed in this case.
  #   "ip=<YOUR-IPV4-ADDR>::<YOUR-IPV4-GATEWAY>:<YOUR-IPV4-NETMASK>:<YOUR-HOSTNAME>-initrd:<YOUR-NETWORK-INTERFACE>:off:<DNS-IP>"
  # ];

  boot.initrd.network.enable = true;
  boot.initrd.network.ssh = {
    enable = true;

    # Using the same port as the actual SSH will cause clients to throw errors
    # related to host key mismatch.
    port = 2222;

    # This option takes paths, not strings.
    hostKeys = [
      /boot/initrd-ssh-key
      /boot-fallback/initrd-ssh-key
      # ...
    ];

    # Public ssh key to log into the initrd ssh
    authorizedKeys = [ "<YOUR-SSH-PUBKEY>" ];
  };
  boot.initrd.network.postCommands = ''
    cat <<EOF > /root/.profile
    if pgrep -x "zfs" > /dev/null
    then
      zfs load-key -a
      killall zfs
    else
      echo "ZFS is not running -- this could be a sign of failure."
    fi
    EOF
  '';
}
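
Once this is deployed, unlocking remotely looks something like the following; the .profile written above runs on login and prompts for the pool passphrase:

ssh -p 2222 root@<YOUR-SERVER-IP>
# enter the passphrase at the `zfs load-key -a` prompt; `killall zfs` then
# stops the blocked console prompt and the machine continues booting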

Installing NixOS

Run nixos-install, then reboot your machine.

Note: Make sure that you've configured SSH and networking for your machine; failure to do so may result in an inaccessible system.
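
Before rebooting, it's also a good idea to unmount everything and export the pool so that it imports cleanly on first boot (a common precaution with ZFS installs, not part of the original steps):

umount -Rl /mnt
zpool export rpool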

That's it! Enjoy your fresh NixOS machine!

Troubleshooting

Failed to import pool - more than one matching pool

This error might occur when:

  • one of your disks was previously used in another ZFS pool, and its metadata wasn't properly removed
  • you made a mistake during the install and repartitioned the disk without removing its ZFS metadata.

This is because the ZFS metadata doesn't live on a partition but on the disk itself, so repartitioning alone doesn't remove it.

Note: the following operations will irrevocably delete ANY data on your disk!

To remove those left behind:

sgdisk --zap-all $DISK
# Overwrite the first 256M of the disk, removing the metadata.
# In some cases just `wipefs -a` works, but I found this to be the most
# reliable way to wipe it, no matter what operations were performed on the
# disk before.
dd if=/dev/urandom bs=1M count=256 of=$DISK
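
If the ZFS partition itself is still intact, zpool labelclear alone may also do the job (a lighter alternative to the dd above):

zpool labelclear -f $DISK-part2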

And then you can try the installation again.

Conclusion

I wrote this article because I've noticed that I always forget some steps when installing NixOS on a newly acquired server.

Acknowledgements

I've compiled the resources listed below into a step-by-step guide for a setup I find 'optimal'. Please do check out those resources!


Should you have any questions or comments, please reach out to me by sending an email or via Matrix.