Installing NixOS with ZFS mirrored boot
Table of Contents
- Overview
- Preparing the installation medium
- Preparing Disks
- Configuring NixOS
- Installing NixOS
- Troubleshooting
- Conclusion
- Acknowledgements
Overview
In this post, we're going to set up a ZFS mirrored boot system with full-disk encryption that is unlockable remotely.
Preparing the installation medium
This step may vary depending on the system you're installing NixOS on.
This post assumes that you're installing on a typical server, using a minimal NixOS image.
The community-maintained NixOS wiki contains guides for installing NixOS under other conditions, such as on a server with only remote access.
You will need a USB stick before proceeding to the next step.
First, download the latest NixOS image and flash it:
$ curl -L -o nixos.iso https://channels.nixos.org/nixos-unstable/latest-nixos-minimal-x86_64-linux.iso
$ dd if=./nixos.iso of=/dev/sdX bs=1M status=progress
If your target machine's architecture is not x86_64, replace it in the URL with your desired architecture (e.g. i686, aarch64).
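Before booting from the medium, it can be worth verifying the download. A minimal sketch; the `verify_image` helper and the `nixos.iso.sha256` filename are assumptions for illustration, not part of any official instructions:

```shell
# Hypothetical helper: verify a downloaded image against a SHA-256 checksum
# file of the form "<hash>  <filename>" (the format sha256sum itself emits).
verify_image() {
  sumfile="$1"
  # sha256sum -c re-hashes the listed file and compares it to the recorded hash
  if sha256sum -c "$sumfile"; then
    echo "image OK"
  else
    echo "checksum mismatch -- re-download the image" >&2
    return 1
  fi
}

# Example (assuming you saved the published checksum next to the image):
# verify_image nixos.iso.sha256
```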
After the image has been successfully flashed onto your installation medium, unplug it and boot the target machine from it.
Preparing Disks
We'll start by defining variables pointing to each disk by ID.
According to the Arch Linux wiki, if you create zpools using device names (e.g. /dev/sda), ZFS may intermittently fail to detect the pools on boot.
You can find the IDs via ls -lh /dev/disk/by-id/.
DISK1=/dev/disk/by-id/ata-VENDOR-ID-OF-THE-FIRST-DRIVE
DISK2=/dev/disk/by-id/ata-VENDOR-ID-OF-THE-SECOND-DRIVE
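Since everything from here on is destructive, a quick sanity check that both IDs resolve to real devices can save a disk. A small sketch; the `require_disks` helper name is ours:

```shell
# Abort early if any of the given paths does not exist.
require_disks() {
  for d in "$@"; do
    if [ ! -e "$d" ]; then
      echo "missing disk: $d" >&2
      return 1
    fi
  done
  echo "all disks present"
}

# Example:
# require_disks "$DISK1" "$DISK2"
```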
Partitioning
Then we'll partition our disks. Since this is a mirrored setup, we have to perform exactly the same operations twice. Fortunately, a Bash function comes to the rescue.
The partition structure is the following:
1 GiB boot | remaining space: ZFS
partition() {
sgdisk --zap-all "$1"
sgdisk -n 1:0:+1GiB -t 1:EF00 -c 1:boot "$1"
# Swap is omitted.
sgdisk -n 2:0:0 -t 2:BF01 -c 2:zfs "$1"
sgdisk --print "$1"
}
partition $DISK1
partition $DISK2
Creating vfat filesystem for boot
Boot partitions should be formatted as vfat, since EFI system partitions must use a FAT filesystem for the firmware to read them.
mkfs.vfat $DISK1-part1
mkfs.vfat $DISK2-part1
Configuring ZFS pool
This dataset structure is based on Erase your darlings.
Now that we're done partitioning our disks, we'll create a ZFS pool named 'rpool', which is mirrored. This will prompt you to enter a passphrase for your new ZFS pool.
zpool create \
-o ashift=12 \
-O mountpoint=none -O atime=off -O acltype=posixacl -O xattr=sa \
-O compression=lz4 -O encryption=aes-256-gcm -O keyformat=passphrase \
rpool mirror \
$DISK1-part2 $DISK2-part2
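A note on `ashift=12`: the value is the base-2 exponent of the sector size ZFS will assume for the vdev, so 12 corresponds to 4096-byte (4 KiB) sectors, which matches most modern drives:

```shell
# ashift is the base-2 exponent of the sector size ZFS will use.
ashift=12
sector_size=$((1 << ashift))
echo "$sector_size"  # prints 4096
```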
Then we create a 'root dataset', which will be / (root) on the target machine, and snapshot its empty state as 'blank'.
zfs create -p -o mountpoint=legacy rpool/local/root
zfs snapshot rpool/local/root@blank
Note the 'local' after rpool. In this setup, 'local' holds unimportant data that can be recreated (e.g. packages, the root filesystem), whereas 'safe' holds important data that needs to be backed up.
And mount it:
mount -t zfs rpool/local/root /mnt
Then we mount the multiple boot partitions we created:
mkdir /mnt/boot
mkdir /mnt/boot-fallback
mount $DISK1-part1 /mnt/boot
mount $DISK2-part1 /mnt/boot-fallback
Create and mount a dataset for /nix:
zfs create -p -o mountpoint=legacy rpool/local/nix
mkdir /mnt/nix
mount -t zfs rpool/local/nix /mnt/nix
And a dataset for /home:
zfs create -p -o mountpoint=legacy rpool/safe/home
mkdir /mnt/home
mount -t zfs rpool/safe/home /mnt/home
And a dataset for state that needs to persist between boots:
zfs create -p -o mountpoint=legacy rpool/safe/persist
mkdir /mnt/persist
mount -t zfs rpool/safe/persist /mnt/persist
Note: with this setup, all state outside these datasets will be wiped on each boot.
Make sure to put state that needs to survive reboots under /persist.
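If anything needs /persist early in boot (e.g. host keys read during stage 1), the dataset has to be mounted before stage 2 starts. A sketch of what that could look like in configuration.nix; whether you need it depends on what you store there:

```nix
{
  # Mount the persistent dataset during the initrd, before stage-2 services start.
  fileSystems."/persist".neededForBoot = true;
}
```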
Configuring NixOS
Now that we're done with partitioning and ZFS, it's time to declaratively configure the machine. This step may vary depending on your machine; consult the documentation when in doubt.
Getting the base configuration
In this post, we're going to use plain nixos-generate-config to generate base configuration files for the machine.
nixos-generate-config --root /mnt
Erasing your darlings
In a previous step, we made a snapshot of the blank root dataset so we can roll back to it on each boot, keeping the system stateless.
Add this to configuration.nix to wipe the root dataset on each boot by rolling back to the blank snapshot after the devices become available:
{
# Note: `lib` must be in your module's argument list,
# e.g. `{ config, lib, pkgs, ... }:`.
boot.initrd.postDeviceCommands = lib.mkAfter ''
zfs rollback -r rpool/local/root@blank
'';
}
Configuring Bootloader
In order to get ZFS to work, we need the following options to be set:
{
boot.supportedFilesystems = [ "zfs" ];
networking.hostId = "<8 random hex chars>";
}
hostId must be exactly 8 hexadecimal characters; you can use the first 8 characters of your machine ID from /etc/machine-id.
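The value can be derived directly on the shell; a small sketch that falls back to random hex when /etc/machine-id is absent:

```shell
# networking.hostId wants exactly 8 hexadecimal characters.
if [ -r /etc/machine-id ]; then
  # Take the first 8 chars of the (hex) machine id.
  hostid=$(head -c 8 /etc/machine-id)
else
  # Fall back to 4 random bytes, printed as one 8-digit hex word.
  hostid=$(head -c 4 /dev/urandom | od -A n -t x4 | tr -d ' \n')
fi
echo "$hostid"
```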
Then we'll configure GRUB:
{
# Whether the installer can modify EFI variables.
# If you encounter errors, set this to `false`.
boot.loader.efi.canTouchEfiVariables = true;
boot.loader.grub.enable = true;
boot.loader.grub.efiSupport = true;
boot.loader.grub.device = "nodev";
# This should be done automatically, but explicitly declare it just in case.
boot.loader.grub.copyKernels = true;
# Make sure that you've listed all of the boot partitions here.
boot.loader.grub.mirroredBoots = [
{ path = "/boot"; devices = ["/dev/disk/by-uuid/<ID-HERE>"]; }
{ path = "/boot-fallback"; devices = ["/dev/disk/by-uuid/<ID-HERE>"]; }
# ...
];
}
Handling boot partitions gracefully
By default, NixOS will throw an error when a partition or disk is missing. Since we want the server to boot smoothly even with a missing boot partition, we need to set the 'nofail' option on those partitions:
{
fileSystems."/boot".options = [ "nofail" ];
fileSystems."/boot-fallback".options = [ "nofail" ];
}
Enabling Remote ZFS Unlock
On each boot, ZFS will ask for a passphrase to unlock the ZFS pool.
To work around this, we can start an SSH server in the initrd that lives until the pool is unlocked.
Note: If you rename the keys afterwards, you may have trouble rolling back to previous generations: See here for details.
To achieve that, we'll first have to generate an SSH host key for the initrd:
ssh-keygen -t ed25519 -N "" -f /mnt/boot/initrd-ssh-key
# Each boot partition should have the same key
cp /mnt/boot/initrd-ssh-key /mnt/boot-fallback/initrd-ssh-key
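It's worth double-checking that every boot partition ended up with an identical key; a mismatch would make SSH clients complain about the host key depending on which ESP the firmware booted from. A tiny sketch; the `keys_match` helper name is ours:

```shell
# Compare two host-key files byte-for-byte; cmp -s is silent on success
# and returns non-zero on any difference.
keys_match() {
  cmp -s "$1" "$2"
}

# Example (paths from the step above):
# keys_match /mnt/boot/initrd-ssh-key /mnt/boot-fallback/initrd-ssh-key \
#   && echo "initrd host keys match"
```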
Then configure initrd:
{
boot.kernelModules = [ "<YOUR-NETWORK-CARD>" ];
boot.initrd.kernelModules = [ "<YOUR-NETWORK-CARD>" ];
# DHCP configuration; comment this out when using a static IP
networking.networkmanager.enable = false;
networking.useDHCP = true;
# Uncomment when using a static IP
# boot.kernelParams = [
# # See <https://www.kernel.org/doc/Documentation/filesystems/nfs/nfsroot.txt> for documentation.
# # ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>:<dns1-ip>:<ntp0-ip>
# # The server ip refers to an NFS server -- not needed in this case.
# "ip=<YOUR-IPV4-ADDR>::<YOUR-IPV4-GATEWAY>:<YOUR-IPV4-NETMASK>:<YOUR-HOSTNAME>-initrd:<YOUR-NETWORK-INTERFACE>:off:<DNS-IP>"
# ];
boot.initrd.network.enable = true;
boot.initrd.network.ssh = {
enable = true;
# Using the same port as the actual SSH will cause clients to throw errors
# related to host key mismatch.
port = 2222;
# This takes 'path's, not 'string's.
hostKeys = [
/boot/initrd-ssh-key
/boot-fallback/initrd-ssh-key
# ...
];
# Public ssh key to log into the initrd ssh
authorizedKeys = [ "<YOUR-SSH-PUBKEY>" ];
};
boot.initrd.network.postCommands = ''
cat <<EOF > /root/.profile
if pgrep -x "zfs" > /dev/null
then
zfs load-key -a
killall zfs
else
echo "ZFS is not running -- this could be a sign of failure."
fi
EOF
'';
}
Installing NixOS
Run nixos-install, then reboot your machine.
Note: Make sure that you've configured SSH and networking for your machine; failure to do so may result in an inaccessible system.
That's it! Enjoy your fresh NixOS machine!
Troubleshooting
Failed to import pool - more than one matching pool
This error might occur when:
- one of your disks was previously used in another ZFS pool and its metadata wasn't properly removed
- you made a mistake during install and repartitioned the disk without removing its ZFS metadata.
This is because the ZFS metadata doesn't live on a partition, but on the disk itself.
Note: the following operations will irrevocably delete ANY data on your disk!
To remove those left behind:
sgdisk --zap-all $DISK
# Overwrite first 256M of the disk, removing metadata
# In some cases just `wipefs -a` works, but I found this to be the most
# reliable way to wipe them no matter what operations were performed on the disk
# before.
dd if=/dev/urandom bs=1M count=256 of=$DISK
And then you can try the installation again.
Conclusion
Acknowledgements
I wrote this article because I've noticed that I always forget some steps when installing NixOS on a newly acquired server.
I've compiled the resources listed below into a step-by-step guide for a setup I find 'optimal'. Please do check out those resources!