/proc/[tid] subdirectories
Linux: 5.18-rc5, commit 1728c0567f70 ("net: phy: smsc: add LAN8742 phy support.")
ypl@home:~$ cat /proc/$$/stat | cut -d' ' -f2
(bash)
ypl@home:~$ cat /proc/self/stat | cut -d' ' -f2
(cat)
ypl@home:~$ cut -d' ' -f2 < /proc/self/stat
(cut)
This post briefly explains why ls doesn't show /proc/[tid] subdirectories for child threads.
According to man proc(5):
The proc filesystem is a pseudo-filesystem which provides an interface to kernel data structures. It is commonly mounted at /proc.
ypl@home:~$ ls /proc
1 177 34 45 58 686 acpi irq net
10 18 35 450 59 69 buddyinfo kallsyms pagetypeinfo
11 183 36 46 60 690 bus kcore partitions
114 184 37 466 61 7 cgroups key-users schedstat
116 19 38 469 615 70 cmdline keys self
...
See those numbers? Each of these so-called [pid] subdirectories corresponds to a process or, more precisely, to a thread group leader (TGL). However, ls /proc doesn't show [tid] subdirectories. For example, imagine an application with 2 threads:
ypl@home:~$ ls /proc/662/task
662 663
Here, 662 is the thread group leader, and 663 is a child thread. ls /proc only shows 662:
ypl@home:~$ ls /proc | grep 662
662
The 663 subdirectory is not shown, but somehow you can cd into it:
ypl@home:~$ ls /proc | grep 663
ypl@home:~$ cd /proc/663
ypl@home:/proc/663$ ls
arch_status environ mountinfo personality statm
attr exe mounts projid_map status
autogroup fd mountstats root syscall
...
It's there, just "invisible" to ls, as also documented in man proc(5):
The /proc/[tid] subdirectories are not visible when iterating through /proc with getdents(2) (and thus are not visible when one uses ls(1) to view the contents of /proc).
I found this behavior very interesting. How is it implemented?
(disclaimer: for recreational purposes only! :-)
Apply this to your kernel:
diff --git a/fs/proc/base.c b/fs/proc/base.c
index c1031843cc6a..579ee323b797 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3420,7 +3420,7 @@ static struct tgid_iter next_tgid(struct pid_namespace *ns, struct tgid_iter ite
 	pid = find_ge_pid(iter.tgid, ns);
 	if (pid) {
 		iter.tgid = pid_nr_ns(pid, ns);
-		iter.task = pid_task(pid, PIDTYPE_TGID);
+		iter.task = pid_task(pid, PIDTYPE_PID);
 		if (!iter.task) {
 			iter.tgid += 1;
 			goto retry;
Now ls /proc shows both [pid] and [tid] directories. Yay!
ypl@home:~$ ls /proc/662/task
662 663
ypl@home:~$ ls /proc | grep 662
662
ypl@home:~$ ls /proc | grep 663
663
It's probably gonna break a lot of stuff based on procfs though…
(My) ls uses the getdents64(2) system call to read directory entries from /proc:
stat("/proc", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
openat(AT_FDCWD, "/proc", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
fstat(3, {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
getdents64(3, /* 184 entries */, 32768) = 4808
getdents64(3, /* 0 entries */, 32768) = 0
close(3) = 0
getdents64(2) is defined in fs/readdir.c:
SYSCALL_DEFINE3(getdents64, unsigned int, fd,
		struct linux_dirent64 __user *, dirent, unsigned int, count)
{
	struct fd f;
	struct getdents_callback64 buf = {
		.ctx.actor = filldir64,
		.count = count,
		.current_dir = dirent
	};
	int error;

	f = fdget_pos(fd);
	if (!f.file)
		return -EBADF;

	error = iterate_dir(f.file, &buf.ctx);
...
It calls iterate_dir(), which first checks whether the open file supports directory iteration at all, i.e. whether /proc is actually a directory:
int iterate_dir(struct file *file, struct dir_context *ctx)
{
	struct inode *inode = file_inode(file);
	bool shared = false;
	int res = -ENOTDIR;
	if (file->f_op->iterate_shared)
		shared = true;
	else if (!file->f_op->iterate)
		goto out;
...
If neither .iterate_shared nor .iterate is implemented, iterate_dir() returns -ENOTDIR. In our case though, it then calls /proc's own .iterate_shared implementation, proc_root_readdir():
static int proc_root_readdir(struct file *file, struct dir_context *ctx)
{
	if (ctx->pos < FIRST_PROCESS_ENTRY) {
		int error = proc_readdir(file, ctx);
		if (unlikely(error <= 0))
			return error;
		ctx->pos = FIRST_PROCESS_ENTRY;
	}

	return proc_pid_readdir(file, ctx);
}
Here, proc_pid_readdir() uses next_tgid() to take care of those [pid] subdirectories in a loop:
...
	for (iter = next_tgid(ns, iter);
	     iter.task;
	     iter.tgid += 1, iter = next_tgid(ns, iter)) {
		char name[10 + 1];
		unsigned int len;

		cond_resched();
		if (!has_pid_permissions(fs_info, iter.task, HIDEPID_INVISIBLE))
			continue;

		len = snprintf(name, sizeof(name), "%u", iter.tgid);
		ctx->pos = iter.tgid + TGID_OFFSET;
		if (!proc_fill_cache(file, ctx, name, len,
				     proc_pid_instantiate, iter.task, NULL)) {
			put_task_struct(iter.task);
			return 0;
		}
	}
...
Yep! This is where our TL;DR diff comes into play. Take another look at next_tgid():
...
retry:
	iter.task = NULL;
	pid = find_ge_pid(iter.tgid, ns);
	if (pid) {
		iter.tgid = pid_nr_ns(pid, ns);
		iter.task = pid_task(pid, PIDTYPE_TGID);
		if (!iter.task) {
			iter.tgid += 1;
			goto retry;
...
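It skips pid if no task uses it as a PIDTYPE_TGID (thread group ID): for the ID of a child thread, pid_task(pid, PIDTYPE_TGID) returns NULL, so next_tgid() bumps iter.tgid and retries. In other words, proc_pid_readdir() only reports thread group leaders. This is exactly why ls /proc doesn't show [tid] subdirectories!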
fs/readdir.c:SYSCALL_DEFINE3(getdents64)
:iterate_dir() /* file->f_op->iterate_shared() */
fs/proc/root.c:proc_root_readdir()
fs/proc/base.c:proc_pid_readdir()
:next_tgid()