Installation guide for warewulf4
Installation guide for warewulf4
https://mslacken.github.io/2024/03/26/install-wartewulf4.html
Mar 26, 2024 • Christian Goll
Warewulf Preface In High Performance Computing (HPC), computing tasks are usually distributed among many compute threads which are spread across multiples cores, sockets and machines. These threads are tightly coupled together. Therefore, compute clusters consist of a number of largely identical machines that need to be managed to maintain a well-defined and identical setup across all nodes. Once clusters scale up, there are many scalability factors to overcome. Warewulf is there to address this ‘administrative scaling’.
Warewulf is an operating system-agnostic installation and management system for HPC clusters. It is quick and easy to learn and use as many settings are pre-configured to sensible defaults. Also, it still provides the flexibility allowing fine tuning the configuration to local needs. It is released under the BSD licence, its source code is available at https://github.com/warewulf/warewulf. This is where the development happens as well.
This article gives an overview on how to set up Warewulf on openSUSE Leap 15.5 or openSUSE Tumbleweed.
Installing Warewulf Compute clusters consist of at least one management (or head) node which is usually multi-homed connected both to an external network and a cluster private network, as well as multiple compute nodes which reside solely on the private network. Other private networks dedicated to high speed tasks like RDMA and storage access may exist as well. Warewulf gets installed on one of the management nodes of a cluster to manage and oversee the installation and management of the compute nodes. To install Warewulf on a cluster which is running openSUSE Leap 15.5 or openSUSE Tumbleweed, simply run:
zypper install warewulf This package seamlessly integrates into a SUSE system and should therefore be preferred over packages provided on Github.
During the installation, the actual network configuration is written to /etc/warewulf/warewulf.conf. These settings should be verified, as for multi homed hosts a sensible pre-configuration is not always possible.
Check /etc/warewulf/warewulf.conf for the following values:
ipaddr: 172.16.16.250 netmask: 255.255.255.0 network: 172.16.16.0 where ipaddr should be the IP address of this management host. Also check the values of netmask and network - these should match this network.
Additionally, you may want to configure the IP addresse range for dynamic/unknown hosts:
dhcp: range start: 172.16.26.21 range end: 172.16.26.50 If the ISC DHCP server (dhcpd) is used (default on SUSE), make sure the value of DHCPD_INTERFACE in the file /etc/sysconfig/dhcpd has been set to the correct value.
You are now ready to start the warewulfd service itself which delivers the images to the nodes:
systemctl enable –now warewulfd.service Now wwctl can be used to configure the the remaining services needed by Warewulf. Run:
wwctl configure –all which will configure all Warewulf related services.
To conveniently log into compute nodes, you should now log out of and back into the Warewulf host, as this will create an ssh key on the Warewulf host which allows password-less login to the compute nodes. Note however, that this key is not yet pass-phrase protected. If you require protecting your private key by a pass phrase, it is probably a good idea to do so now:
ssh-keygen -p -f $HOME/.ssh/cluster Adding nodes and profiles to Warewulf Warewulf uses the concept of profiles which hold the generalised settings of the individual nodes. It comes with a predefined profile default, to which all new node will be assigned, if not set otherwise. You may obtain the values of the default profile with:
wwctl profile list default Now, a node can be added with the command assigning it an IP address:
wwctl node add node01 -I 172.16.16.101 if the MAC address is known for this node, you can specify this as well:
wwctl node add node01 -I 172.16.16.101 -H cc:aa:ff:ff:ee For adding several nodes at once you may also use a node range, e.g.
wwctl node add node[01-10] -I 172.16.16.101 This will add the nodes with ip addresses starting at the specified address and incremented by Warewulf.
Importing a container Warewulf uses a special 1 container as base to build OS images for the compute nodes. This is self contained and independent of the operating system installed on the Warewulf host.
To import an openSUSE Leap 15.5 container use the command
wwctl container import docker://registry.opensuse.org/science/warewulf/leap-15.5/containers/kernel:latest leap15.5 –setdefault This will import the specified container for the default profile.
Alternative container sources Alternative containers are available from the openSUSE registry under the science project at:
https://registry.opensuse.org/cgi-bin/cooverview?srch_term=project%3D%5Escience%3A
or from the upstream Warewulf community repository:
https://github.com/orgs/warewulf/packages?repo_name=warewulf-node-images
It is also possible to import an image from a local installation into a directory (chroot directories) by using the path to this directory as argument for wwctl import.
Booting nodes As a final preparation you should rebuild the container image, now, by running:
wwctl container build leap15.5 as well as all the configuration overlays with the command:
wwctl overlay build just in case the build of the image may have failed earlier due to an error. If you didn’t assign a hardware address to a node before, you should set the node into the discoverable state before powering it on. This is done with:
wwctl node set node01 –discoverable Also you should run:
wwctl configure hostlist to add the new nodes to the file /etc/hosts. Now you should make sure that the node(s) will boot over PXE from the network interface connected to the specified network and power on the node(s) to boot into assigned image.
Additional configuration The configuration files for the nodes are managed as Golang text templates. The resulting files are overlayed over the node images. There are two types of overlays depending on how they are added to the compute node:
the system overlay which is ‘baked’ into the image during boot as part of the wwinit process. the runtime overlay which is updated on the nodes on a regular base (1 minute per default) via the wwclient service. In the default configuration the overlay called wwinit is used as system overlay. You may list the files in this overlays with the command:
wwctl overlay list wwinit -a which will show a list of all the files in the overlays. Files ending with the suffix .ww are interpreted as template by Warewulf, the suffix is removed in the rendered file. To inspect the content of an overlay file use the command:
wwctl overlay show wwinit /etc/issue.ww To render the template using the values for node01 use:
wwctl overlay show wwinit /etc/issue.ww -r node01 The overlay template itself may be edited using the command:
wwctl overlay edit wwinit /etc/issue.ww Please note that after editing templates, the overlays aren’t updated automatically and you should trigger a rebuild with the command:
wwctl overlay build The variables available in a template can be listed with
wwctl overlay show debug /warewulf/template-variables.md.ww Modifying the container The node container is a self contained operating system image. You can open a shell in the image with the command:
wwctl container shell leap15.5 After you have opened a shell, you may install additional software using zypper.
The shell command provides the option –bind which allows mounting arbitrary host directories into the container during the shell session.
Please note that if a command exits with a non-zero status, the image won’t be rebuilt automatically. Therefore, it is advised to rebuild the container with:
wwctl conainer build leap15.5 after any change.
Network configuration Warewulf allows configuring multiple network interfaces for the compute nodes. Therefore, you can add another network interface for example for infiniband using the command:
wwctl node set node01 –netname infininet -I 172.16.17.101 –netdev ib0 –mtu 9000 –type infiniband This will add the infiniband interface ib0 to the node node01. You can now list the network interfaces of the node:
wwctl node list -n As changes in the settings are not propagated to all configuration files, the node overlays should be rebuilt after this change by running the command:
wwctl overlay build After a reboot, these changes will be present on the nodes; in the above case the Infiniband interface will be active on the node.
A more elegant way to get the same result is to create a profile to hold those values which are identical for all interfaces. In this case, these are mtu and netdev. Create a new profile for an Infiniband network using the command:
wwctl profile add infiniband-nodes –netname infininet –netdev ib0 –mtu 9000 –type infiniband You may now add this profile to a node and remove the node specific settings which are now part of the common profile by executing:
wwctl node set node01 –netname infininet –netdev UNDEF –mtu UNDEF –type UNDEF –profiles default,infiniband-nodes To list the data in a profile use the command:
wwctl profile list -A infiniband-nodes Secure Boot Switch to grub boot By default, Warewulf boots nodes via iPXE, which isn’t signed by SUSE and can’t be used when secure boot is enabled. In order to switch to grub as the boot method you must add or change the following value in /etc/warewulf/warewulf.conf
warewulf: grubboot: true After this change, you will have to reconfigure dhcpd and tftp executing:
wwctl configure dhcp wwctl configure tftp and rebuild the overlays with the command:
wwctl overlay build Also make sure that the packages shim and grub2-x86_64-efi (for x86-64) or grub2-arm64-efi (for aarch64) are installed in the container. shim is required by secure boot.
Cross product secure boot With secure boot is enabled on the compute nodes, if you need to boot different products, you need to make sure that the compute nodes boot with the so-called ‘http’ boot method: For secure boot the signed shim needs to match the signature of the other pieces of the boot chain - including the kernel. However, different products will have different sets of signatures in their boot chain. The ‘http’ boot method is handled by warewulfd. This will look up the image to boot and pick the right shim from the image to deploy to a particular node. Therefore, you need to make sure, that the node container contains the shim package. The default boot method will extract the initial shim for PXE boot from the host running the warewulfd server. Note, however, that the host system shim will also be used for nodes which are in the discoverable state and subsequently have no hardware address assigned, yet.
Disk management It is possible to manage the disks of the compute nodes with Warewulf. Here, Warewulf itself doesn’t manage the disks, but creates a configuration and service files for ignition to do this job.
Prepare container As ignition and its dependencies aren’t installed in most of the containers, you should install the packages ignition and gptfdisk in the container.
wwctl container exec
physical storage device(s) to be used partition(s) on the disks filesystem(s) to be used Warewulf doesn’t manage these entities itself, but creates a configuration and service files for ignition to perform this task. Therefore, you need to make sure to install ignition and gptfdisk on the compute node. Open a shell in the container and run:
zypper -n in -y zypper install ignition gptfdisk Disks The path to the device e.g. /dev/sda must be used for diskname. The only valid configuration option for disks is diskwipe, which should be self-explanatory.
Partitions The partname is the name to the partition which iginition uses as the path for the device files, e.g. /dev/disk/by-partlabel/$PARTNAME.
Additionally, the size and number of the partition need be specified for all but the last partition (the one with the highest number) in which case this partition will be extended to the maximal size possible.
You should also set the boolean variable –partcreate so that a parition is created if it doesn’t exist.
Filesystems Filesystems are defined by the partition which contains them, so the name should have the format /dev/disk/by-partlabel/$PARTNAME. A filesystem needs to have a path if it is to be mounted, but its not mandatory.
Ignition will fail if there is no filesystem type defined.
Examples You can add a scratch partition with
wwctl node set node01
–diskname /dev/vda –diskwipe
–partname scratch –partcreate
–fsname scratch –fsformat btrfs –fspath /scratch –fswipe
This will be the only (and last) partition, therefore it does not require a size. To add another partition as a swap partition, you may run:
wwctl node set n01
–diskname /dev/vda
–partname swap –partsize=1024 –partnumber 1
–fsname swap –fsformat swap –fspath swap
This adds the partition number 1 which will be placed before the scratch partition.
This container is special only in that it is bootable, i.e. it contains a kernel and an init-implementation (systemd). ↩
openSUSE for HPC openSUSE for HPC cgoll@suse.com mslacken Adminitstration of High Perfomance Computing is always trapped between using the newest hardware and supporting a rock solid software stack. Tools and software from openSUSE makes this task easier. Have a lot of Fun!