FOD sandbox bypass
Original report
Impact
Severe. In theory, if two malicious fixed-output derivations (FODs) are scheduled around the same time on the same Hydra builder, this bug would let them poison, for instance, the next version of the bash sources.
Do note that since some version of Nix after 2.3, this is no longer as severe an issue: you can no longer race the daemon to convince it to remove the CA flag from a CA output (and thus have it be considered non-corrupted if `nix-store --verify` is run).
Root cause
Abstract Unix sockets are namespaced by network namespace, so processes can send each other fds merely by being in the same net namespace, without any filesystem in common.
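To make the root cause concrete, here is a minimal sketch of how a process binds an abstract Unix socket on Linux. The helper name `bind_abstract` and the socket name passed to it are made up for illustration. The only special part is the leading NUL byte in `sun_path`: no file is created, so any other process in the same net namespace can connect by name regardless of its mount namespace or chroot.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Bind a stream socket to an abstract Unix address (Linux-specific).
 * `name` is an arbitrary label chosen by the caller (hypothetical here).
 * Returns the bound fd, or -1 on error. */
int bind_abstract(const char *name)
{
    struct sockaddr_un addr;
    size_t n = strlen(name);

    if (n == 0 || n > sizeof(addr.sun_path) - 1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* sun_path[0] stays '\0': the leading NUL selects the abstract
     * namespace, which is scoped by network namespace, not by filesystem. */
    memcpy(addr.sun_path + 1, name, n);

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* The address length covers the family, the NUL and the name bytes. */
    socklen_t len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
    if (bind(fd, (struct sockaddr *)&addr, len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Binding the same name twice within one net namespace fails with `EADDRINUSE`, which shows the name is shared across everything in that namespace.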
Mitigation
There are three possible mitigations. We consider FODs communicating with each other to be uninteresting, because they can just use TCP; the offensive part is sending file handles amongst themselves.
Block abstract unix sockets
This is the most surgical solution, and probably the one that causes the least breakage, especially if we only apply the blocking to FODs, which already shouldn't be doing much other than just fetching from the network.
Advantages:
Disadvantages:
The approaches to do this all involve a Linux Security Module (LSM); seccomp cannot do this by design, since the socket address argument is a pointer and seccomp filters cannot dereference it. However, there is an LSM hook that does get the data, directly on bind(2) entry: https://github.com/torvalds/linux/blob/5db8752c3b81bd33a549f6f812bab81e3bb61b20/net/socket.c#L1833-L1854.
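As a rough userspace sketch of the check such a hook would need to make once it has the copied-in address (the function name `bind_addr_is_abstract` is hypothetical, and this is plain C, not BPF or kernel code): an address is abstract if it is `AF_UNIX` and its path starts with a NUL byte. Per unix(7), a bind whose length covers only the family field triggers autobind, which also assigns an abstract address, so a real policy would probably want to treat that case as abstract too.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Decide whether a bind(2) address would land in the abstract namespace.
 * Sketch only: an LSM hook would apply the same test to the address the
 * kernel has already copied in. */
bool bind_addr_is_abstract(const struct sockaddr *addr, socklen_t len)
{
    const size_t path_off = offsetof(struct sockaddr_un, sun_path);

    /* Too short to even hold the family, or not a Unix socket at all. */
    if (addr == NULL || (size_t)len < path_off || addr->sa_family != AF_UNIX)
        return false;

    /* Family only, no path bytes: Linux autobinds to an abstract name. */
    if ((size_t)len == path_off)
        return true;

    /* Explicit abstract address: first path byte is NUL. */
    return ((const struct sockaddr_un *)addr)->sun_path[0] == '\0';
}
```

Ordinary pathname sockets such as `/tmp/s` fall through to `false`, so only the abstract cases would be denied.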
There are two viable LSMs that can be used to restrict this functionality:
AppArmor
AppArmor is definitely the older of the available LSMs.
This could be achieved with the following rule:
Sounds trivial?
Well.
Consider how the container implementations do it:
In short, they write a file to `/etc/apparmor.d` on the outer system and run `apparmor_parser -Kr` to load the new profile. Then, after fork and prior to `exec`, they put the new profile name in `/proc/self/attr/apparmor/exec` (through `aa_change_onexec` or otherwise), and `execve`.
The significant problem here is that Nix would have to own part of `/etc/apparmor.d`, which is some rather ugly mutable state in `/etc` shared with the rest of the system. We could either add it as a static file in a package or write it at runtime, neither of which is easy for Nix to do.
To add a profile there, we would have to silently reload apparmor profiles behind the sysadmin's back; in principle this is not too unsafe but it feels gross. Also, we would depend on the apparmor userspace tools, which is somewhat unfortunate.
BPF-LSM
Systemd uses this for various sandboxing, and it seems viable. It can be attached to one single cgroup, and it doesn't have any ugly userspace state.
I haven't checked precisely, but I think it's likely it works back to kernel 6.0, which is quite far back; certainly it's available on the latest LTS kernel.
Here's an example blog post: https://kinvolk.io/blog/2021/04/extending-systemd-security-features-with-ebpf/
Here is basically the actual code that would be required, but not cgroupified due to "that seems like a pain to write the string manipulation necessary into a PoC in C": https://gist.github.com/lf-/bf569280dfc7f863fe274bc3def65e3d
To make it work with cgroups, you would have to do the following:
Attach the program to the cgroup with `struct bpf_link *bpf_program__attach_cgroup(const struct bpf_program *prog, int cgroup_fd)`:
`skel->links.socket_bind = bpf_program__attach_cgroup(skel->progs.socket_bind, cg_fd);`
Advantages:
Disadvantages:
Tomoyo
Tomoyo can also do this, but it isn't built into at least the Arch Linux kernel, so it isn't really viable.
Put the FOD builder in a netns
This can be done with `slirp4netns`, which sticks the container in NAT (with IPv6 support). Note that there are a couple of caveats; for example, `/etc/resolv.conf` needs to be set up in the container.
Example code in my project Clipper: https://github.com/lf-/clipper/blob/main/crates/wire_blahaj/src/unprivileged.rs#L279
Advantages:
Disadvantages:
Neutralize the effect of FD smuggling
Another way to mitigate this is to make it so that the exploit doesn't achieve anything. We could do this by copying the output paths after the builder is done, but before hashing.
Thus, any retained handles have no effect, since we are hashing something that the now-former builder has no access to.
This should probably be done regardless of other mitigations.
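A minimal single-file sketch of why the copy helps, assuming a hypothetical helper `copy_to_fresh_file` (a real implementation would have to walk the whole output tree and preserve metadata, which is not shown): the copy is created with `O_EXCL`, so it is a brand-new inode that no fd held by the builder can refer to.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the bytes of `src` into a newly created `dst`.
 * O_EXCL makes the call fail if `dst` already exists, so the result is
 * always a fresh inode that pre-existing file descriptors cannot reach.
 * Returns 0 on success, -1 on error. */
int copy_to_fresh_file(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    int out = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0444);
    if (out < 0) {
        close(in);
        return -1;
    }

    char buf[65536];
    ssize_t n;
    int rc = 0;

    while (rc == 0 && (n = read(in, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {               /* handle short writes */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                rc = -1;
                break;
            }
            off += w;
        }
    }
    if (rc == 0 && n < 0)
        rc = -1;                        /* read error */

    close(in);
    if (close(out) < 0)
        rc = -1;
    if (rc != 0)
        unlink(dst);                    /* do not leave a partial copy */
    return rc;
}
```

If a former builder keeps a writable fd to `src` and scribbles on it after the copy, `dst` is unaffected, so hashing `dst` gives a result the builder can no longer influence.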
Advantages:
Disadvantages:
Finding current usage
It might be a good idea to figure out how much people are actually using abstract namespace Unix sockets, to see if we would be breaking anyone if we banned them in FODs.
We could use auditd on Hydra and audit all unix socket bind/connect, then analyze the audit logs:
which can then be analyzed like so; `010000` at the start of the address in the log indicates an abstract Unix socket (`0100` is the `AF_UNIX` family, followed by the leading NUL byte of the path)
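As a sketch of that prefix check, assuming the address appears in the log as a hex dump of the raw `sockaddr` bytes from a little-endian machine (so `AF_UNIX` = 1 shows up as `0100`): the function names below are hypothetical, and the decoder stops at the first NUL in the name, treating it as padding, even though abstract names may legitimately contain NUL bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the hex-encoded address starts with AF_UNIX (0100, little-endian)
 * followed by a NUL first path byte (00). */
bool saddr_is_abstract_unix(const char *saddr_hex)
{
    return saddr_hex != NULL && strncmp(saddr_hex, "010000", 6) == 0;
}

static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode the abstract socket name that follows the 010000 prefix into `out`.
 * Returns 0 on success, -1 if the address is not abstract or is malformed. */
int decode_abstract_name(const char *saddr_hex, char *out, size_t outlen)
{
    if (outlen == 0 || !saddr_is_abstract_unix(saddr_hex))
        return -1;

    const char *p = saddr_hex + 6;
    size_t i = 0;

    while (p[0] != '\0' && p[1] != '\0' && i + 1 < outlen) {
        int hi = hex_val(p[0]);
        int lo = hex_val(p[1]);
        if (hi < 0 || lo < 0)
            return -1;
        char byte = (char)(hi * 16 + lo);
        if (byte == '\0')
            break;                      /* assume trailing padding */
        out[i++] = byte;
        p += 2;
    }
    out[i] = '\0';
    return 0;
}
```

For example, `0100006E6978` decodes to the abstract name `nix`, while `01002F746D70` (a path starting with `/tmp`) is rejected as a regular pathname socket.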