systemd-cryptsetup Crash: tpm2-measure-bank=sha256 Fix
Hey everyone! Today, we're diving into a tricky issue encountered with systemd-cryptsetup when using the tpm2-measure-bank=sha256 option in crypttab. This problem can lead to boot failures, and we're going to break down the cause and how to address it.
The Problem: Systemd-Cryptsetup Crashing
The core issue revolves around a crash in systemd-cryptsetup during the boot process. Specifically, this occurs when the crypttab file includes an entry with the tpm2-measure-bank=sha256 option. Let's take a closer look at the scenario.
The Crypttab Configuration
In the reported case, the crypttab entry looks something like this:
persistent /dev/disk/by-partuuid/xxx - tpm2-measure-pcr=yes,tpm2-device=auto,tpm2-measure-bank=sha256
This configuration aims to use TPM2 to securely unlock a disk partition during boot and to measure the volume key into a PCR. The options used are:
tpm2-measure-pcr=yes: Enables measuring the volume key into a TPM2 PCR (PCR 15 when set to yes).
tpm2-device=auto: Specifies that the system should automatically detect the TPM2 device.
tpm2-measure-bank=sha256: Selects the PCR bank to measure into, identified by its digest name (here SHA256). This is the option that triggers the crash.
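To make the layout of that entry concrete, here is a minimal sketch of how the line breaks down into the four crypttab fields (name, device, key file, options). The struct and function names are made up for illustration and are not systemd's; the sketch also assumes all four fields are present, whereas real crypttab entries may omit the last two.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the four whitespace-separated crypttab fields. */
struct crypttab_entry {
    char name[64];
    char device[256];
    char keyfile[256];
    char options[512];
};

/* Splits one crypttab line into its fields.
 * Returns 0 on success, -1 if fewer than four fields were found. */
static int parse_crypttab_line(const char *line, struct crypttab_entry *e) {
    assert(line);
    assert(e);

    int n = sscanf(line, "%63s %255s %255s %511s",
                   e->name, e->device, e->keyfile, e->options);
    return n == 4 ? 0 : -1;
}

/* Counts the comma-separated options in the fourth field. */
static size_t count_options(const char *options) {
    assert(options);

    if (*options == '\0')
        return 0;

    size_t n = 1;
    for (const char *p = options; *p; p++)
        if (*p == ',')
            n++;
    return n;
}
```

Fed the entry above, this yields the name persistent, a key file of - (meaning none), and three options, each of which is then parsed on its own.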
The Unexpected Crash
When the system attempts to boot with this configuration, a crash occurs, presenting the following error message:
Aug 14 09:30:33 host systemd-cryptsetup[197]: Assertion 's' failed at src/basic/strv.c:413, function strv_split_full(). Aborting.
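This error indicates a failure within the strv_split_full() function, which splits a string into an array of strings. The failed assertion 's' is a check on the input string, so the abort means the function was handed a NULL pointer instead of a string. The crash prevents the volume from being unlocked and mounted, leading to a boot failure.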
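If you haven't run into assertion-style preconditions before, the following sketch shows the mechanism in isolation. count_fields() is a hypothetical stand-in, not systemd's strv_split_full(); it only shares the "input must not be NULL" check. The helper runs it in a child process so the abort can be observed without taking down the caller. It assumes a POSIX system and a build without NDEBUG.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical splitter stand-in: counts sep-delimited fields in s.
 * Like the real function, it insists on a non-NULL input. */
static size_t count_fields(const char *s, char sep) {
    assert(s); /* calls abort() when s is NULL (unless NDEBUG is defined) */

    size_t n = 1;
    for (; *s; s++)
        if (*s == sep)
            n++;
    return n;
}

/* Runs count_fields() in a child process.
 * Returns the signal that killed the child, 0 if it exited normally,
 * or -1 if fork()/waitpid() failed. */
static int signal_from_split(const char *s) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        (void) count_fields(s, ':');
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling signal_from_split("sha1:sha256") returns 0, while signal_from_split(NULL) reports SIGABRT, which is the same kind of termination the boot log shows.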
Debugging the Crash
To understand the crash, some debugging was performed using a patch that adds logging to the cryptsetup.c file in the systemd source code. The patch is:
diff --git a/src/cryptsetup/cryptsetup.c b/src/cryptsetup/cryptsetup.c
index 1da9e21d8e..e59c2cbafa 100644
--- a/src/cryptsetup/cryptsetup.c
+++ b/src/cryptsetup/cryptsetup.c
@@ -171,6 +171,8 @@ static int parse_one_option(const char *option) {
         assert(option);
+        log_info("parse_one_option() '%s'", option);
+
         /* Handled outside of this tool */
         if (STR_IN_SET(option, "noauto", "auto", "nofail", "fail", "_netdev", "keyfile-timeout"))
                 return 0;
@@ -533,7 +535,8 @@ static int parse_one_option(const char *option) {
 #if HAVE_OPENSSL
                 _cleanup_strv_free_ char **l = NULL;
-                l = strv_split(optarg, ":");
+                log_info("strv_split(optarg=%s)", optarg);
+                l = strv_split(optarg, ":"); // This fails.
                 if (!l)
                         return log_oom();
This patch adds logging statements to the parse_one_option() function, which is responsible for parsing the options in the crypttab entry. The log output shows:
Aug 14 09:30:33 host systemd-cryptsetup[197]: parse_one_option() 'tpm2-measure-pcr=yes'
Aug 14 09:30:33 host systemd-cryptsetup[197]: parse_one_option() 'tpm2-device=auto'
Aug 14 09:30:33 host systemd-cryptsetup[197]: parse_one_option() 'tpm2-measure-bank=sha256'
Aug 14 09:30:33 host systemd-cryptsetup[197]: strv_split(optarg=(null))
Aug 14 09:30:33 host systemd-cryptsetup[197]: Assertion 's' failed at src/basic/strv.c:413, function strv_split_full(). Aborting.
The critical line here is strv_split(optarg=(null)). This indicates that strv_split() is being called with a NULL argument, which leads to the assertion failure and the crash. strv_split() splits a string on a delimiter, and here it receives NULL where it expects a string. Since the preceding log line shows the option arriving intact as 'tpm2-measure-bank=sha256', the value gets lost inside systemd-cryptsetup, somewhere between parsing the option and calling strv_split().
Root Cause Analysis
The investigation suggests that the issue lies within the cryptsetup.c code, specifically in how the tpm2-measure-bank option is handled. The code attempts to split a string on the colon (:) delimiter, but it does so with a NULL optarg, even though the crypttab entry supplies a value (sha256). This points to a logic error in which the code does not pass along the value it extracted for the tpm2-measure-bank option.
The relevant code snippet from cryptsetup.c (shown here with the debugging log line from the patch) is:
#if HAVE_OPENSSL
                _cleanup_strv_free_ char **l = NULL;
                log_info("strv_split(optarg=%s)", optarg);
                l = strv_split(optarg, ":"); // This fails.
                if (!l)
                        return log_oom();
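It seems the optarg variable is NULL when strv_split() is called. The most plausible explanation is that optarg is simply the wrong variable. The option branches in parse_one_option() match a prefix with startswith(), which hands back a pointer to the text after the prefix, and that pointer is what should be split. optarg, by contrast, looks like the global that getopt(3) uses for option arguments, which stays NULL unless getopt() has stored something in it. If that reading is right, the value sha256 is extracted correctly but never used, and the NULL global is what reaches strv_split(), leading to the crash.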
Workarounds and Solutions
Now that we understand the problem, let's explore how to work around it and potential solutions.
Immediate Workaround
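The simplest workaround is to remove the tpm2-measure-bank=sha256 option from the crypttab entry. This prevents the crash and allows the system to boot. You lose the explicit bank selection, but according to crypttab(5) the available banks are determined automatically when none is specified, so the measurement requested by tpm2-measure-pcr=yes should still take place. So, if you're facing this issue, removing this option is your best bet to get your system back up.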
Identifying the Correct Options
It's also worth double-checking the syntax and usage of the tpm2-measure-bank option in your specific systemd version. The crypttab(5) man page is the place to look: it describes tpm2-measure-bank= as taking one or more digest names (such as sha1 or sha256) separated by colons. By that description the entry above is well-formed, so the crash is a bug rather than a configuration mistake. Still, make sure you type the options and values exactly as documented, as even a small typo can cause issues.
Patching Systemd (Advanced)
For those comfortable with patching systemd, a potential fix involves modifying the cryptsetup.c code so that strv_split() never receives NULL. Simply adding a NULL check before the call would stop the crash, but it would also leave the option without effect. The better approach is to pass strv_split() the value that was actually extracted for the tpm2-measure-bank option. However, patching systemd requires careful consideration and testing to avoid introducing new issues. This is a more advanced approach, so be sure you know what you're doing before you dive in.
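As a rough idea of what a corrected branch has to do, here is a self-contained sketch that extracts the value after tpm2-measure-bank=, rejects a missing or empty value, and splits the rest on colons. It is not the upstream patch: the function names, the fixed-size buffers and the -1 error convention are all assumptions made for the example, and the real code additionally validates each digest name via OpenSSL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BANKS 8
#define BANK_NAME_MAX 32

/* Simplified stand-in for systemd's startswith(). */
static const char *startswith_sketch(const char *s, const char *prefix) {
    size_t n = strlen(prefix);
    return strncmp(s, prefix, n) == 0 ? s + n : NULL;
}

/* Parses "tpm2-measure-bank=<name>[:<name>...]" into banks[].
 * Returns the number of banks found, or -1 if the option does not
 * match, has no usable value, or does not fit the buffers. */
static int parse_measure_banks(const char *option,
                               char banks[MAX_BANKS][BANK_NAME_MAX]) {
    assert(option);
    assert(banks);

    /* Use the extracted value, and guard against it being absent. */
    const char *val = startswith_sketch(option, "tpm2-measure-bank=");
    if (!val || *val == '\0')
        return -1;

    int count = 0;
    const char *p = val;
    for (;;) {
        size_t len = strcspn(p, ":");
        if (len > 0) { /* skip empty fields such as "sha1::sha256" */
            if (count >= MAX_BANKS || len >= BANK_NAME_MAX)
                return -1;
            memcpy(banks[count], p, len);
            banks[count][len] = '\0';
            count++;
        }
        if (p[len] == '\0')
            break;
        p += len + 1;
    }

    return count > 0 ? count : -1;
}
```

With this shape, tpm2-measure-bank=sha1:sha256 produces two banks, and a malformed tpm2-measure-bank= is reported as an error instead of aborting the process.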
Reporting the Issue
It's crucial to report this issue to the systemd developers so that a proper fix can be implemented in future releases. Providing detailed information about the issue, including the systemd version, kernel version, and steps to reproduce the problem, will help the developers address the bug effectively. You can usually report issues through the distribution's bug tracker or the systemd project's issue tracker.
Digging Deeper into the Code
To further understand the issue, let's examine the relevant parts of the cryptsetup.c code more closely. The parse_one_option() function parses a single option from the crypttab entry and is called once for each option. It uses startswith() to check whether the option begins with a specific prefix, such as tpm2-measure-bank=. If a match is found, the code takes the value that follows the prefix and processes it accordingly.
In the case of tpm2-measure-bank, the value after the = sign is present (sha256), and the option does need one, since it names the PCR bank or banks to measure into. The trouble is that the branch appears to hand optarg to strv_split() instead of the value it just matched, and optarg is NULL at that point.
The other tpm2- options, such as tpm2-measure-pcr=yes and tpm2-device=auto, are parsed without trouble, as the log output shows. This suggests the bug is confined to the tpm2-measure-bank branch rather than being a general problem with option parsing.
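The prefix-matching pattern described in this section can be sketched as follows. This is a simplified model, not the real parse_one_option(): the enum, classify_option() and startswith_sketch() are hypothetical names, and only the list of options handled outside the tool is taken from the patch shown earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum option_kind {
    OPT_HANDLED_ELSEWHERE,
    OPT_TPM2_DEVICE,
    OPT_TPM2_MEASURE_PCR,
    OPT_TPM2_MEASURE_BANK,
    OPT_UNKNOWN
};

/* Simplified stand-in for systemd's startswith(). */
static const char *startswith_sketch(const char *s, const char *prefix) {
    size_t n = strlen(prefix);
    return strncmp(s, prefix, n) == 0 ? s + n : NULL;
}

/* Classifies one option. On a prefix match, *value points at the text
 * after the '='; otherwise *value is set to NULL. */
static enum option_kind classify_option(const char *option, const char **value) {
    static const char *const elsewhere[] = {
        "noauto", "auto", "nofail", "fail", "_netdev", "keyfile-timeout"
    };
    const char *val;

    assert(option);
    assert(value);
    *value = NULL;

    for (size_t i = 0; i < sizeof elsewhere / sizeof elsewhere[0]; i++)
        if (strcmp(option, elsewhere[i]) == 0)
            return OPT_HANDLED_ELSEWHERE;

    if ((val = startswith_sketch(option, "tpm2-device="))) {
        *value = val;
        return OPT_TPM2_DEVICE;
    }
    if ((val = startswith_sketch(option, "tpm2-measure-pcr="))) {
        *value = val;
        return OPT_TPM2_MEASURE_PCR;
    }
    if ((val = startswith_sketch(option, "tpm2-measure-bank="))) {
        *value = val;
        return OPT_TPM2_MEASURE_BANK;
    }
    return OPT_UNKNOWN;
}
```

The point to notice is that each branch gets its value from the same place, the pointer returned by the prefix check. A branch that reads its value from anywhere else is the odd one out.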
Broader Implications and Lessons Learned
This issue underscores the importance of thorough error handling and input validation in software development. The systemd-cryptsetup code should have included a check for NULL before calling strv_split() to prevent the crash. This simple check could have saved users from encountering boot failures and the associated frustration.
Additionally, this case highlights the value of debugging tools and techniques. The use of logging statements in the patch helped pinpoint the exact location of the crash and the value of optarg at the time of the failure. Debugging is an essential skill for any software developer or system administrator.
Community Contributions and Fixes
The open-source community plays a crucial role in identifying and fixing bugs like this. User reports, patches, and discussions help improve the quality and reliability of software. If you encounter a similar issue, don't hesitate to share your findings and contribute to the community.
Conclusion
The systemd-cryptsetup crash caused by tpm2-measure-bank=sha256 is a prime example of how a seemingly small coding error can have significant consequences. By understanding the root cause of the issue and implementing appropriate workarounds and fixes, we can prevent such problems from occurring in the future. Remember to report any issues you encounter to help the community and the developers make the software better for everyone.
Keep an eye out for updates and patches from the systemd team to address this issue. In the meantime, if you're facing this problem, the workaround of removing the tpm2-measure-bank=sha256 option should get you back on your feet.
Thanks for reading, and happy troubleshooting!