Build logs show:
> configure: WARNING: non-linux system; not building mount
> configure: WARNING: non-linux system; not building swapon
So skip these on non-Linux systems.
Using getOutput prevents eval failures on other platforms.
Things should stay eval'able with NIXPKGS_ALLOW_UNSUPPORTED_SYSTEM=1
Co-authored-by: Artturin <Artturin@artturin.com>
This got broken in 6ea1a2a1be, which changed
runCommandCC to runCommand, but it was not
noticed because it was failing silently.
runCommand doesn't include CC or bintools.
This reverts commit da905d4cf9.
See the commit linked above for further information on why this was
needed. Apparently it is not needed anymore because `LD_LIBRARY_PATH`
(which `modprobe(8)` needed to find `libpthread.so.0`) is no longer
required.
Since d33e52b253 the library path of each
binary in extra-utils is patched correctly.
To reduce size, stage 1 (the initrd) is populated by copying specific
binaries in, then copying the libraries specifically needed by those
binaries. `patchelf` is then used to make the binaries search in the
directory where these libraries are copied to instead of their original
store paths.
Some filesystems (e.g. ZFS) do not guarantee that copying the same files
in the same order into a given directory will result in `find` returning
them in any particular order (though the order appears consistent so
long as the directory is not modified).
Therefore, when the binaries are scanned for libraries to copy in, they
might be scanned in a different order each time the derivation is built.
If two binaries need two different libraries with the same name, then a
different instance of the library might be copied in first, changing the
derivation contents and breaking reproducibility.
This turns out to be the case with `libudev.so.1` from both `systemd`
(needed by e.g. `mdadm`) and `systemdMinimal` (needed by e.g.
`dmsetup`). This issue is fixed by sorting the list of binaries to be
scanned instead of relying on filesystem order so that the same instance
always gets seen and copied first.
Both before this change (at least on ext4) and after this change
(without any options that affect stage 1), this is the `libudev.so.1`
from `systemdMinimal` by way of `dmsetup`. Whether this is appropriate
and how much the two different systemd configurations and udev libraries
need to be involved is a topic left for future work.
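The idea behind the fix can be sketched as follows. This is only an illustration, not the actual stage 1 code: the helper name `list_binaries_sorted` and the use of `LC_ALL=C` are assumptions made here so that the order depends neither on the filesystem nor on the locale.

```shell
# Hypothetical helper: list regular files under a directory in a stable,
# byte-wise sorted order instead of relying on the order `find` returns.
list_binaries_sorted() {
    # -print0 / sort -z (GNU extensions) keep unusual file names intact.
    find "$1" -type f -print0 | LC_ALL=C sort -z | tr '\0' '\n'
}
```

With such an ordering, `dmsetup` is always scanned before `mdadm`, regardless of the order in which the files were copied in.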
conversions were done using https://github.com/pennae/nix-doc-munge
using (probably) rev f34e145 running
nix-doc-munge nixos/**/*.nix
nix-doc-munge --import nixos/**/*.nix
the tool ensures that only changes that could affect the generated
manual *but don't* are committed; other changes require manual review
and are discarded.
mostly no rendering changes. some lists (like simplelist) don't have an
exact translation to markdown, so we use a comma-separated list of
literals instead.
commit 0507725061 ("setup-hooks/strip.sh: run RANLIB on static
archives after stripping") added an extra argument to `stripDirs()`
helper.
I did not realize it's used outside the strip hook itself. Restore
stripping by passing $RANLIB as a new argument.
we can't embed syntactic annotations of this kind in markdown code
blocks without yet another extension. replaceable is rare enough that
this is not worth it, so we'll go with «thing» instead. the module
system already uses this format for its placeholder names in attrsOf
paths.
the conversion procedure is simple:
- find all things that look like options, i.e. calls to either `mkOption`
  or `lib.mkOption` that take an attrset. remember the attrset as the
  option
- for all options, find a `description` attribute whose value is not a
  call to `mdDoc` or `lib.mdDoc`
- textually convert the entire value of the attribute to MD with a few
  simple regexes (the set from mdize-module.sh)
- if the change produced a change in the manual output, discard
- if the change kept the manual unchanged, add some text to the
  description to make sure we've actually found an option. if the
  manual changes this time, keep the converted description
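the keep/discard decision in the last two steps can be sketched like this. it is a minimal illustration, not code from the tool: the function name and the idea of comparing three rendered manual files with `cmp` are assumptions.

```shell
# Hypothetical sketch of the keep/discard decision.
#   $1: manual rendered before the conversion
#   $2: manual rendered after converting the description to MD
#   $3: manual rendered after also adding marker text to the description
decide_conversion() {
    baseline=$1 converted=$2 marked=$3
    if ! cmp -s "$baseline" "$converted"; then
        echo discard   # the conversion changed the rendered manual
    elif cmp -s "$baseline" "$marked"; then
        echo discard   # the marker had no effect, so this was not an option
    else
        echo keep      # output unchanged, and the marker proves it is an option
    fi
}
```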
this procedure converts 80% of nixos options to markdown. around 2000
options remain to be inspected, but most of those fail the "does not
change the manual output" check: currently the MD conversion process
does not faithfully convert docbook tags like <code> and <package>, so
any option using such tags will not be converted at all.
`extra-utils` composes the set of needed programs and libraries by
1. copying over all programs
2. copying over all libraries any program directly links against
3. setting the runtime path for every program to the library directory
It seems that this approach misses the case where a library itself links
against another library. That is to say, `extra-utils` assumes that
either only programs link against libraries or that every library linked
to by a library is already linked to by a program.
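To make the gap concrete, here is a small sketch of walking dependencies transitively. It is not the `extra-utils` code: the function name and the input format (a text file with one "object needed-library" pair per line, which could be derived from `objdump -p` or `patchelf --print-needed`) are assumptions for illustration.

```shell
# Hypothetical sketch: print an object followed by everything it needs,
# directly or through other libraries.
#   $1: file with "object needed-library" pairs, one per line
#   $2...: objects to start from
needed_closure() {
    deps=$1; shift
    todo="$*"
    seen=""
    while [ -n "$todo" ]; do
        set -- $todo                 # split the work list into words
        cur=$1; shift; todo="$*"
        case " $seen " in *" $cur "*) continue ;; esac   # already visited
        seen="$seen $cur"
        # queue every library that the current object lists as needed
        for n in $(awk -v o="$cur" '$1 == o { print $2 }' "$deps"); do
            todo="$todo $n"
        done
    done
    echo $seen
}
```

Stopping after the first level would list `libcrypto` for `mount.zfs` but never reach `libdl`.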
`mount.zfs` linking against `libcrypto`, in turn linking against `libdl`
shows how the current approach falls short:
```
$ objdump -p $(which mount.zfs) | grep NEEDED | grep -e libdl -e libcrypto
NEEDED libcrypto.so.1.1
$ ldd $(which mount.zfs) | grep libdl
libdl.so.2 => /nix/store/ybkkrhdwdj227kr20vk8qnzqnmj7a06x-glibc-2.34-115/lib/libdl.so.2 (0x00007f9967a9a000)
```
Using `mount.zfs` directly in stage 1 init still works since
`LD_LIBRARY_PATH` overrides this (as intended).
util-linux's `mount` however executes `mount.zfs` with LD_LIBRARY_PATH
removed from its environment as can be seen with strace(1) in an
interactive stage 1 init shell (`boot.shell_on_fail` kernel parameter):
```
# env -i LD_LIBRARY_PATH=$LD_LIBRARY_PATH $(which strace) -ff -e trace=/exec -v -qqq $(which mount) /mnt-root
execve("/nix/store/3gqbb3swgiy749fxd5a4k6kirkr2jr9n-extra-utils/bin/mount", ["/nix/store/3gqbb3swgiy749fxd5a4k"..., "/mnt-root"], ["LD_LIBRARY_PATH=/nix/store/3gqbb"...]) = 0
[pid 1026] execve("/sbin/mount.zfs", ["/sbin/mount.zfs", "<redacted>", "/mnt-root", "-o", "rw,zfsutil"], []) = 0
/sbin/mount.zfs: error while loading shared libraries: libdl.so.2: cannot open shared object file: No such file or directory
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1026, si_uid=0, si_status=127, si_utime=0, si_stime=0} ---
```
env(1) is used for clarity (hence subshells for absolute paths).
While `mount` uses the right library path, `mount.zfs` is stripped of
it, so ld.so(8) fails to resolve `libdl` (as required by `libcrypto`).
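The effect of the emptied environment can be reproduced without ZFS. The following is a minimal sketch; the helper name and the `/extra-utils/lib` path are made up for illustration.

```shell
# Hypothetical demo: print LD_LIBRARY_PATH as seen by a child that is
# started with only the given assignments (env -i clears everything else).
child_library_path() {
    env -i "$@" /bin/sh -c 'printf "%s\n" "${LD_LIBRARY_PATH-<unset>}"'
}
```

`child_library_path LD_LIBRARY_PATH=/extra-utils/lib` corresponds to how `mount` is started in the trace above, while `child_library_path` without arguments corresponds to the empty environment `mount.zfs` receives.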
To fix this and not rely on `LD_LIBRARY_PATH` to be set, fix the library
path inside libraries as well.
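A rough sketch of what this amounts to, assuming `patchelf --set-rpath` is applied to libraries in the same way as to programs. The function name and the `PATCHELF` override (which only exists so the command can be swapped out for a dry run) are hypothetical.

```shell
# Hypothetical sketch: point the runtime path of every given object
# (programs and libraries alike) at the shared library directory.
#   $1: library directory
#   $2...: ELF objects to patch
set_rpath_all() {
    libdir=$1; shift
    for obj in "$@"; do
        ${PATCHELF:-patchelf} --set-rpath "$libdir" "$obj"
    done
}
```

Running it with `PATCHELF=echo` prints the arguments that would be passed to `patchelf` instead of modifying anything.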
This finally mounts all ZFS filesystems using `zfsutil` with correct and
intended mount options.
Consider ZFS filesystems meant to be mounted with zfs.mount(8), e.g.
```
config.fileSystems."/media".options = [ "zfsutil" ];
config.fileSystems."/nix".options = [ "zfsutil" ];
```
`zfsutil` uses dataset properties as mount options such that zfsprops(7)
do not have to be duplicated in fstab(5) entries or manual mount(8)
invocations.
Given the example configuration above, /media is correctly mounted with
`setuid=off` translated into `nosuid`:
```
$ zfs get -Ho value setuid /media
off
$ findmnt -t zfs -no options /media
rw,nosuid,nodev,noexec,noatime,xattr,posixacl
```
/nix however was mounted with default mount(8) options:
```
$ zfs get -Ho value setuid /nix
off
$ findmnt -t zfs -no options /nix
rw,relatime,xattr,noacl
```
This holds true for all other ZFS properties/mount options, including
`exec/[no]exec`, `devices/[no]dev`, `atime/[no]atime`, etc.
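The correspondence between the properties named above and mount options can be written down as a small lookup. This is only an illustration of the mapping, not how mount.zfs(8) implements it, and the function name is made up.

```shell
# Hypothetical lookup: translate a ZFS property value into the matching
# mount option, covering only the properties mentioned above.
zfs_prop_to_mount_opt() {
    case "$1=$2" in
        setuid=on)   echo suid ;;
        setuid=off)  echo nosuid ;;
        exec=on)     echo exec ;;
        exec=off)    echo noexec ;;
        devices=on)  echo dev ;;
        devices=off) echo nodev ;;
        atime=on)    echo atime ;;
        atime=off)   echo noatime ;;
        *)           return 1 ;;   # not covered by this sketch
    esac
}
```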
/nix is mounted using BusyBox's `mount` during stage 1 init while /media
is mounted later using proper systemd and/or util-linux's `mount`.
Tracing stage 1 init showed that BusyBox never tried to execute
mount.zfs(8) as intended by `zfsutil`.
Replacing it with util-linux's `mount` and adding the mount helper
showed attempts to execute mount.zfs(8).
Ensure ZFS filesystems are mounted with correct options iff `zfsutil` is
used.
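For reference, the helper lookup that util-linux's `mount` performs (an external `/sbin/mount.<type>` program, as described in mount(8)) can be approximated as below. The function name and the optional directory argument are assumptions for illustration.

```shell
# Hypothetical sketch: print the external mount helper for a filesystem
# type if one is installed, and fail otherwise.
#   $1: filesystem type, e.g. zfs
#   $2: helper directory (defaults to /sbin)
mount_helper_for() {
    helper="${2:-/sbin}/mount.$1"
    if [ -x "$helper" ]; then
        printf '%s\n' "$helper"
    else
        return 1
    fi
}
```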
cpio includes the number of directory hard links in archives it creates.
Some filesystems, like btrfs, do not count directory hard links the same
way as more common filesystems like ext4 or tmpfs, so archives built
when /tmp is on such a filesystem do not reproduce. This patch replaces
cpio with bsdtar, which does not have this issue. The specific
invocation is from this page:
https://reproducible-builds.org/docs/archives/
This effectively fixes the majority of all VM tests which were broken
because `/dev/vda` (or any other block device) wasn't mountable:
machine # mounting /dev/vda on /...
machine # mount: mounting /dev/vda on /mnt-root/ failed: No such device[ 2.820976] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
machine # [ 2.821757] CPU: 0 PID: 1 Comm: init Not tainted 5.10.72 #1-NixOS
machine # [ 2.821757] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
machine # [ 2.821757] Call Trace:
machine # [ 2.821757] dump_stack+0x6b/0x83
machine # [ 2.821757] panic+0x101/0x2c8
machine # [ 2.821757] do_exit.cold+0x14/0xb3
machine # [ 2.821757] do_group_exit+0x33/0xa0
machine # [ 2.821757] __x64_sys_exit_group+0x14/0x20
machine # [ 2.821757] do_syscall_64+0x33/0x40
machine # [ 2.821757] entry_SYSCALL_64_after_hwframe+0x44/0xa9
machine # [ 2.821757] RIP: 0033:0x7f67ec2800f6
machine # [ 2.821757] Code: 00 4c 8b 0d 2c 5d 11 00 eb 19 66 2e 0f 1f 84 00 00 00 00 00 89 d7 89 f0 0f 05 48 3d 00 f0 ff ff 77 22 f4 89 d7 44 89 c0 0f 05 <48> 3d 00 f0 ff ff 76 e2 f7 d8 64 41 89 01 eb da 66 2e 0f 1f 84 00
machine # [ 2.821757] RSP: 002b:00007fff8f5a71d8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
machine # [ 2.821757] RAX: ffffffffffffffda RBX: 0000000000699704 RCX: 00007f67ec2800f6
machine # [ 2.821757] RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001
machine # [ 2.821757] RBP: 0000000000000004 R08: 00000000000000e7 R09: ffffffffffffff80
machine # [ 2.821757] R10: 00007f67ec33f3e0 R11: 0000000000000202 R12: 000000000000000b
machine # [ 2.821757] R13: 00007fff8f5a75a8 R14: 0000000000000000 R15: 00000000004fc198
machine # [ 2.821757] Kernel Offset: 0x31e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
machine # [ 2.821757] Rebooting in 1 seconds..
This happened because the kernel failed to load modules such as `ext4`
from `boot.initrd.availableKernelModules`[1] on e.g. a `mount(2)` syscall.
The problem is that `kmod` isn't linked against `libpthread.so.0`
anymore because it got merged into `libc.so.6` (however, the .so still
exists), but still needs it:
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # writev(2, [{iov_base="/nix/store/kdc9n48ksdc1a8y8w512w"..., iov_len=69}, {iov_base=": ", iov_len=2}, {iov_base="error while loading shared libra"..., iov_len=36}, {iov_base=": ", iov_len=2}, {iov_base="libpthread.so.0", iov_len=15}, {iov_base=": ", iov_len=2}, {iov_base="cy
machine # ) = 184
machine # exit_group(127) = ?
machine # +++ exited with 127 +++
machine # mount: mounting /dev/vda on /mnt-root/ failed: No such device
machine # [ 19.167180] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
machine # [ 19.167711] CPU: 0 PID: 1 Comm: init Not tainted 5.10.72 #1-NixOS
This is not a problem
* inside stage-1 because `LD_LIBRARY_PATH` points to `$out/lib` of
extra-utils where `libpthread.so.0` also exists.
* on a running system because `${pkgs.glibc}/lib` is part of kmod's
rpath.
However, this is a problem for the kernel, which calls `modprobe` (in
our case `kmod`) to load modules and doesn't know about
`LD_LIBRARY_PATH`. Also, the rpath-reference was nuked.
To work around this, the kernel's `modprobe`
(i.e. `/proc/sys/kernel/modprobe`) now points to a wrapper which
explicitly declares `LD_LIBRARY_PATH`. We can't use `makeWrapper` here
because `modprobe` itself must not be renamed. Otherwise, `kmod` (which
is the link-target of `modprobe`) won't work because it expects
`argv[0] == "modprobe"` to perform modprobe's tasks.
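A minimal sketch of such a wrapper, written as a generator so the path can be filled in. The function name and its arguments are hypothetical; the first argument stands in for the extra-utils store path.

```shell
# Hypothetical sketch: write a wrapper script that sets LD_LIBRARY_PATH
# and then execs the real modprobe under its original name.
#   $1: extra-utils directory (contains bin/modprobe and lib/)
#   $2: where to write the wrapper
write_modprobe_wrapper() {
    extra_utils=$1
    dest=$2
    cat > "$dest" <<EOF
#!/bin/sh
export LD_LIBRARY_PATH="$extra_utils/lib"
exec "$extra_utils/bin/modprobe" "\$@"
EOF
    chmod +x "$dest"
}
```

Because the wrapper execs `bin/modprobe` directly instead of a renamed copy, `argv[0]` still ends in `modprobe`. The path of the wrapper is what `/proc/sys/kernel/modprobe` would then point to.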
[1] https://nixos.org/manual/nixos/stable/options.html#opt-boot.initrd.availableKernelModules
This option behaves exactly like `boot.extraModprobeConfig`, except that it also includes the generated modprobe.d file in the initrd.
Many years ago, someone tried to include the normal modprobe.d/nixos.conf file generated by `boot.extraModprobeConfig` in the initrd: 0aa2c1dc46. This file contains a reference to a directory with firmware files inside. Including firmware in the initrd made it too big, so the commit was reverted again in 4a4c051a95.
The `boot.extraModprobeConfig` option not changing the initrd caused me much confusion because I tried to set the maximum cache size for ZFS and it didn't work.
Closes https://github.com/NixOS/nixpkgs/issues/25456.
I was confused why I couldn't find a mention of udev.log_priority in
systemd-udevd.service(8). It turns out that it was renamed[1] to
udev.log_level. The old name is still accepted, but it'll avoid
further confusion if we use the new name in our documentation.
[1]: 64a3494c3d
The multipath-tools package had existed in Nixpkgs for some time but
without a NixOS module to configure/drive it. This module provides
attributes to drive the majority of multipath configuration options
and is being successfully used in stage-1 and stage-2 boot to mount
/nix from a multipath-serviced iSCSI volume.
Credit goes to @grahamc for early contributions to the module and
authoring the NixOS module test.
This modifies initialRamdiskSecretAppender to stage secrets in
/.initrd-secrets/ and stage-1-init to copy them into place after mounting
special file systems. This allows secrets to be copied into ramfs mounts
like /run/keys for use after stage-1 finishes without copying them to disk
(which would not be very secure).
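The copy step can be sketched roughly as follows. This is not the actual stage-1-init code: the function name and the parameterised staging and root directories are assumptions that make the idea testable outside an initrd.

```shell
# Hypothetical sketch: copy every staged secret to the same relative path
# under the target root, creating parent directories as needed.
#   $1: staging directory (e.g. /.initrd-secrets)
#   $2: target root (e.g. /)
copy_staged_secrets() {
    staging=$1
    root=$2
    [ -d "$staging" ] || return 0    # nothing was staged
    (cd "$staging" && find . -type f) | while IFS= read -r f; do
        mkdir -p "$root/$(dirname "$f")"
        cp -a "$staging/$f" "$root/$f"
    done
}
```

Running this only after the special file systems are mounted means a secret staged as `run/keys/...` lands on the ramfs rather than being hidden by the mount.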
Renaming an interface must be done in stage-1: otherwise udev will
report the interface as ready and network daemons (networkd, dhcpcd,
etc.) will bring it up. Once up the interface can't be changed and the
renaming will fail.
Note: link files are read directly by udev, so they can be used even
without networkd enabled.