# Introduction to Gthulhu

:::warning
- This page is a translated version of [Gthulhu 介紹](https://hackmd.io/@cndi2025/Syt8xHSige).
- The translation was produced with GitHub Copilot.
:::

[TOC]

<!-- ## Course Objectives
- Understand the kernel packet processing mechanism
- Build a custom scheduler in Go
- Integration case: Integrate with free5GC to achieve URLLC on the Data Plane -->

<!-- ![image](https://hackmd.io/_uploads/HJcOIYjpll.png) -->

<!-- ## Understand the Kernel’s Packet Processing Mechanism
### How does the Linux Kernel receive packets? -->

## What is a Scheduler?

- Picks runnable tasks for you
- Decides the priority of runnable tasks
- Decides how much CPU time they get
- Decides which CPU should execute them
- Must ensure fairness
- Must preserve interactivity

:::info
Further reading: [Linux Kernel Design: Schedulers do more than pick tasks](https://hackmd.io/@sysprog/linux-scheduler)
:::

## Introduction to eBPF

eBPF has been called the “JavaScript of the Linux kernel.” It fundamentally changes how we interact with the Linux kernel.

![image](https://hackmd.io/_uploads/SkWu5soTgx.png)
eBPF system overview. Source: eBPF official docs

Linux adds hooks all over the kernel so that users can dynamically insert eBPF programs and have them run at the right observation points, such as:

- When a specific kernel function runs
- When a packet enters the network stack
- When a system call is handled

Meanwhile, the eBPF verifier guarantees that any loaded eBPF program will never:

- Call an unknown function
- Contain an unbounded loop
- Contain unreachable instructions
- Jump to a destination outside the program
- Fall through from one function into the next

These guarantees ensure the safety and stability of eBPF programs at runtime.

## Extensible Scheduler Class: `sched_ext`

Since Linux v6.12, sched_ext (Scheduler Extension) lets us change the OS scheduler dynamically from user space:

- Customize and hot-plug an OS scheduler as an eBPF program.
- A kernel watchdog prevents deadlocks and starvation (it uses [cmwq](https://www.kernel.org/doc/html/v4.10/core-api/workqueue.html) to periodically check for starvation). If your custom scheduler fails to schedule all tasks within a time budget, the system evicts the injected scx scheduler.
- BPF enforces safety (no memory errors, no kernel panics).
- Source code: https://github.com/sched-ext/scx

<video src="https://github.com/sched-ext/scx/assets/1051723/42ec3bf2-9f1f-4403-80ab-bf5d66b7c2d5" controls="controls" muted="muted" class="d-block rounded-bottom-2 border-top width-fit" style="max-height:640px; min-height: 200px"></video>
SCX DEMO: Improving the Linux gaming experience

### Scheduling Cycle

![Screen recording 2025-10-14 17:50](https://hackmd.io/_uploads/HyWsp5iTel.gif)
DSQ workflow illustration

sched_ext introduces the concept of Dispatch Queues (DSQs). With multiple DSQs, you can implement FIFO or priority queues:

- By default, the system has a global DSQ `SCX_DSQ_GLOBAL` and a per-CPU local DSQ `SCX_DSQ_LOCAL`.
- A BPF scheduler can create additional DSQs with `scx_bpf_create_dsq()` and destroy them with `scx_bpf_destroy_dsq()`.
- A CPU always pulls runnable work from its local DSQ. Tasks in other DSQs must be moved to the local DSQ before they can run.

1. When a task is woken up, we enter the select-cpu phase and the eBPF program for `.select_cpu` runs. If an idle CPU is selected, it will be woken. If the task has a cpu_mask, this selection can be overridden.
2. After choosing the target CPU, we enter the `.enqueue` phase, and the eBPF program for `.enqueue` runs. It can:
    1. Call `scx_bpf_dispatch()` to insert the task into the global DSQ `SCX_DSQ_GLOBAL` or a CPU’s local DSQ `SCX_DSQ_LOCAL`
    2. Store the task in a custom data structure
    3. Insert the task into a custom DSQ
3. When a CPU is ready to accept work, it checks its local DSQ first. If tasks exist, it dequeues and runs one. Otherwise it checks the global DSQ.
4. If neither the local nor the global DSQ has runnable tasks, the `.dispatch` phase runs the eBPF program for `.dispatch`. It can:
    1. Use `scx_bpf_dispatch()` to send specific tasks to any DSQ
    2. Use `scx_bpf_consume()` to move tasks from a specified DSQ to the local DSQ
5. After `.dispatch`, the local and global DSQs are checked again. If a task exists, it’s dequeued and executed.
6. If step 4 dispatched any tasks, go back to step 3 and try to acquire a task again. Otherwise, if the previously running task is an SCX task and still runnable, keep running it. If everything fails, the CPU goes idle.

## Building a Custom Scheduler in Go

Inspired by Andrea Righi’s talk “Crafting a Linux kernel scheduler in Rust,” I saw the potential of eBPF: we can load very different schedulers depending on the scenario. The default scheduler focuses on fairness, which can sacrifice performance for some workloads; a targeted scheduler can deliberately be “unfair” in favor of the workloads you care about.

### Gthulhu Overview

Gthulhu is a system scheduler dedicated to cloud-native workloads. The mission I gave Gthulhu is to “make it easier for operators to maximize the utilization of compute resources.”

:::info
- Note: The name Gthulhu comes from Cthulhu; since it’s written in Go, I swapped the ‘C’ for a ‘G’, hoping this project will steer the Kubernetes helm like an octopus.
- GitHub Repo: https://github.com/Gthulhu/Gthulhu
:::

```mermaid
timeline
    title Gthulhu 2025 Roadmap
    section 2025 Q1 - Q2 <br> Gthulhu -- bare metal
        scx_goland (qumun) : ☑️ 24x7 test : ☑️ CI/CD pipeline
        Gthulhu : ☑️ CI/CD pipeline : ☑️ Official docs
        K8s integration : ☑️ Helm chart support : ☑️ API Server
    section 2025 Q3 - Q4 <br> Cloud-Native Scheduling Solution
        Gthulhu : ☑️ plugin mode : ☑️ Running on Ubuntu 25.04
        K8s integration : ☑️ Container image release : ☑️ MCP tool : Multiple node management system
        Release 1 : ☑️ R1 DEMO (free5GC) : ☑️ R1 DEMO (MCP) : R1 DEMO (Agent Builder)
```

Gthulhu Roadmap

#### Feature (1) Custom scheduler via open interfaces

scx_goland delivers runnable tasks from kernel space to a user-space scheduler:

![image](https://hackmd.io/_uploads/ryBc5ispel.png)
scx_goland architecture

The user-space scheduler decides for each task:

- How much CPU time it gets
- Which CPU it runs on
- When it can obtain CPU time

This design makes the scheduler much more flexible.
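To make those three decisions concrete, here is a minimal Go sketch of the per-task answer the user-space side hands back to the eBPF program. The field names mirror the `DispatchedTask` used later in this article; the `decide` helper itself is only illustrative, not Gthulhu’s actual code.

```go
package main

import "fmt"

// QueuedTask is a simplified view of what the eBPF program queues to user space.
type QueuedTask struct {
	Pid   int32
	Vtime uint64
}

// DispatchedTask carries the three scheduling decisions back to the kernel side.
type DispatchedTask struct {
	Pid     int32
	Cpu     int32  // which CPU it runs on
	SliceNs uint64 // how much CPU time it gets
	Vtime   uint64 // ordering hint for when it obtains CPU time
}

// decide turns a queued task into a dispatch decision.
func decide(t QueuedTask, targetCPU int32, sliceNs uint64) DispatchedTask {
	return DispatchedTask{Pid: t.Pid, Cpu: targetCPU, SliceNs: sliceNs, Vtime: t.Vtime}
}

func main() {
	d := decide(QueuedTask{Pid: 1234, Vtime: 42}, 0, 5_000_000) // 5 ms slice on CPU 0
	fmt.Printf("dispatch decision: %+v\n", d)
}
```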
You can modularize scheduling logic and inject it as a plugin into the user-space scheduler: ![image](https://hackmd.io/_uploads/ByP61YsTxl.png) Gthulhu/plugins concept Gthulhu provides a plugin interface that lets you implement the following and have Gthulhu load your scheduler plugin: ```go= type Sched interface { DequeueTask(task *models.QueuedTask) DefaultSelectCPU(t *models.QueuedTask) (error, int32) } type CustomScheduler interface { // Drain the queued task from eBPF and return the number of tasks drained DrainQueuedTask(s Sched) int // Select a task from the queued tasks and return it SelectQueuedTask(s Sched) *models.QueuedTask // Select a CPU for the given queued task, After selecting the CPU, the task will be dispatched to that CPU by Scheduler SelectCPU(s Sched, t *models.QueuedTask) (error, int32) // Determine the time slice for the given task DetermineTimeSlice(s Sched, t *models.QueuedTask) uint64 // Get the number of objects in the pool (waiting to be dispatched) // GetPoolCount will be called by the scheduler to notify the number of tasks waiting to be dispatched (NotifyComplete) GetPoolCount() uint64 } ``` ![image](https://hackmd.io/_uploads/H13I0dj6lx.png) Gthulhu/plugin execution flow :::info Dynamic plugin injection has a clear benefit: complete decoupling between the scheduler implementation and Gthulhu. Using an Apache 2.0-licensed plugin is friendly to users who don’t want to open-source their own scheduler—they can develop a closed-source scheduler against the exposed interfaces or extend an existing Gthulhu scheduler. ::: :::spoiler Gthulhu’s eBPF scheduler uses two kinds of DSQs: - SHARED DSQ: shared by all CPUs; lower priority than LOCAL DSQ. - LOCAL DSQ: one per CPU. You can implement a FIFO or a simple weighted-deadline scheduler using the SHARED DSQ. ```go type CustomScheduler interface { // Drain the queued task from eBPF and return the number of tasks drained DrainQueuedTask(s Sched) int // Select a task from the queued tasks and return it SelectQueuedTask(s Sched) *models.QueuedTask // Select a CPU for the given queued task, After selecting the CPU, the task will be dispatched to that CPU by Scheduler SelectCPU(s Sched, t *models.QueuedTask) (error, int32) // Determine the time slice for the given task DetermineTimeSlice(s Sched, t *models.QueuedTask) uint64 // Get the number of objects in the pool (waiting to be dispatched) // GetPoolCount will be called by the scheduler to notify the number of tasks waiting to be dispatched (NotifyComplete) GetPoolCount() uint64 } ``` Let’s walk through how to implement these hooks: ```go // DrainQueuedTask drains tasks from the scheduler queue into the task pool func (s *SimplePlugin) DrainQueuedTask(sched plugin.Sched) int { count := 0 // Keep draining until the pool is full or no more tasks available for { var queuedTask models.QueuedTask sched.DequeueTask(&queuedTask) // Validate task before processing to prevent corruption if queuedTask.Pid <= 0 { // Skip invalid tasks return count } // Create task and enqueue it task := s.enqueueTask(&queuedTask) s.insertTaskToPool(task) count++ s.globalQueueCount++ } } ``` We take tasks out of the RingBuffer eBPF map until there’s nothing left to schedule (`queuedTask.Pid <= 0`). Extracted tasks are appended to a global slice. 
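The `enqueueTask` and `insertTaskToPool` helpers referenced above are not shown in the excerpt. For a plain FIFO pool they could be as small as the sketch below; the `Task` wrapper and its fields are assumptions, not Gthulhu’s actual types.

```go
// Hypothetical helpers for a FIFO task pool; the Task wrapper is an assumption.
type Task struct {
	QueuedTask *models.QueuedTask
}

// enqueueTask copies the dequeued record so the pool owns its own data.
func (s *SimplePlugin) enqueueTask(qt *models.QueuedTask) *Task {
	copied := *qt
	return &Task{QueuedTask: &copied}
}

// insertTaskToPool appends the task in arrival order, which is what makes the
// pop-from-the-front logic in getTaskFromPool behave as FIFO.
func (s *SimplePlugin) insertTaskToPool(t *Task) {
	s.taskPool = append(s.taskPool, *t)
}
```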
When the scheduler calls `SelectQueuedTask()`, it will pop tasks in insertion order, achieving FIFO: ```go // getTaskFromPool retrieves the next task from the pool func (s *SimplePlugin) getTaskFromPool() *models.QueuedTask { if len(s.taskPool) == 0 { return nil } // Get the first task task := &s.taskPool[0] // Remove the first task from slice selectedTask := task.QueuedTask s.taskPool = s.taskPool[1:] // Update running task vtime (for weighted vtime scheduling) if !s.fifoMode { // Ensure task vtime is never 0 before updating global vtime if selectedTask.Vtime == 0 { selectedTask.Vtime = 1 } s.updateRunningTask(selectedTask) } return selectedTask } ``` CPU selection: I want all tasks to enter the SHARED DSQ so idle CPUs can pull from it. So CPU selection always returns ANY CPU (`1<<20`): ```go // SelectCPU selects a CPU for the given task func (s *SimplePlugin) SelectCPU(sched plugin.Sched, task *models.QueuedTask) (error, int32) { return nil, 1 << 20 } ``` Time slice assignment: the simple scheduler always returns the default time slice: ```go // DetermineTimeSlice determines the time slice for the given task func (s *SimplePlugin) DetermineTimeSlice(sched plugin.Sched, task *models.QueuedTask) uint64 { // Always return default slice return s.sliceDefault } ``` Thanks to the plugin mechanism, you can implement a scheduler in about 200 lines of code—without writing any eBPF. If you’re interested, contributions are welcome! ::: #### Feature (2) Policy Server ![image](https://hackmd.io/_uploads/Hk8_Adopxg.png) Gthulhu/api server architecture Gthulhu ships an API Server that accepts user intent and translates it into Scheduling Policies the Gthulhu scheduler understands: - What is the timeslice? - Which PIDs are included? - Is this a high-priority workload? From the user’s perspective, what they care about is: - Which cloud-native workloads (Pods) to optimize - How large the timeslice should be - Whether low latency is required These three are what we call user intent. #### Feature (3) MCP Integration ![image](https://hackmd.io/_uploads/B1dyCKopel.png) MCP concept. Source: https://www.bnext.com.tw/article/82706/what-is-mcp Gthulhu provides an MCP implementation so users can express intent in natural language, and the Agent converts semantics into information the API Server understands. <iframe width="560" height="315" src="https://www.youtube.com/embed/p7cPlWHQrDY?si=WmI7TXsxTixD3E2C" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe> ## Integration Case: Optimizing Per-Slice Data Plane with Gthulhu Over the past year, my mentees and I experimented with integrating eBPF and GTP5G. Details: - https://free5gc.org/blog/20241224/ explores debugging GTP5G with eBPF - https://free5gc.org/blog/20250913/20250913/ further observes which process context GTP5G packet processing runs in using eBPF: ![image](https://hackmd.io/_uploads/r1haadsplx.png) GTP5G uplink/downlink packet processing With those results, we can identify which contexts cause GTP5G scheduling delays. 
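One way to turn those trace results into something a scheduling policy can act on is to scrape the PIDs and TGIDs out of the kernel trace output. The sketch below is illustrative and not part of Gthulhu; it assumes the `bpf_trace_printk` format shown in the trace excerpt later in this section and reads the standard tracefs `trace_pipe`.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
)

func main() {
	// trace_pipe streams the bpf_trace_printk output emitted by the tracer.
	f, err := os.Open("/sys/kernel/debug/tracing/trace_pipe")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Matches lines such as:
	//   gtp5g_xmit_skb_ipv4: PID=168420, TGID=168410, CPU=16
	re := regexp.MustCompile(`gtp5g_xmit_skb_ipv4: PID=(\d+), TGID=(\d+)`)
	seen := map[string]bool{}

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if m := re.FindStringSubmatch(sc.Text()); m != nil && !seen[m[2]] {
			seen[m[2]] = true
			fmt.Printf("GTP5G transmit handled in TGID %s (thread %s)\n", m[2], m[1])
		}
	}
}
```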
To showcase Gthulhu, we add some constraints so the process contexts that handle GTP5G remain stable: - Deploy UERANSIM on Kubernetes - Deploy 5GC on Kubernetes - Place UERANSIM and 5GC on the same machine - Use MACVLAN (Multus CNI) for RAN N3 and UPF N3 - Use ping to observe processing latency: to avoid external interference, add an IP to the UPF Pod, so ICMP packets traverse the `uesimtun` device established by the UE through the PDU Session and reach that IP. Using GTP5G-tracer shows that uplink packets from the UE go from `uesimtun` to the UPF’s `N3` device, to `upfgtp`, and finally to `N6`’s IP. The ICMP reply follows the reverse path. All that processing happens in the `nr-gnb` process context: ```shell nr-gnb-168420 [016] b.s41 6158463.012636: bpf_trace_printk: gtp5g_xmit_skb_ipv4: PID=168420, TGID=168410, CPU=16 nr-gnb-168420 [016] b.s41 6158464.012282: bpf_trace_printk: gtp5g_xmit_skb_ipv4: PID=168420, TGID=168410, CPU=16 nr-gnb-168420 [017] b.s41 6158465.012408: bpf_trace_printk: gtp5g_xmit_skb_ipv4: PID=168420, TGID=168410, CPU=17 nr-gnb-168420 [017] b.s41 6158466.012551: bpf_trace_printk: gtp5g_xmit_skb_ipv4: PID=168420, TGID=168410, CPU=17 nr-gnb-168420 [016] b.s41 6158467.012401: bpf_trace_printk: gtp5g_xmit_skb_ipv4: PID=168420, TGID=168410, CPU=16 nr-gnb-168420 [006] b.s41 6158468.012565: bpf_trace_printk: gtp5g_xmit_skb_ipv4: PID=168420, TGID=168410, CPU=6 nr-gnb-168420 [006] b.s41 6158469.012700: bpf_trace_printk: gtp5g_xmit_skb_ipv4: PID=168420, TGID=168410, CPU=6 nr-gnb-168420 [006] b.s41 6158470.012549: bpf_trace_printk: gtp5g_xmit_skb_ipv4: PID=168420, TGID=168410, CPU=6 nr-gnb-168420 [006] b.s41 6158471.012763: bpf_trace_printk: gtp5g_xmit_skb_ipv4: PID=168420, TGID=168410, CPU=6 nr-gnb-168420 [006] b.s41 6158472.012862: bpf_trace_printk: gtp5g_xmit_skb_ipv4: PID=168420, TGID=168410, CPU=6 ``` So when designing the Scheduling Policy, we only need to adjust processes in the UERANSIM Pod: ```yaml= { "server": { "port": ":8080", "read_timeout": 15, "write_timeout": 15, "idle_timeout": 60 }, "logging": { "level": "info", "format": "text" }, "jwt": { "private_key_path": "./config/jwt_private_key.key", "token_duration": 24 }, "strategies": { "default": [ { "priority": true, "execution_time": 20000, "selectors": [ { "key": "app", "value": "ueransim-macvlan" } ], "command_regex": "nr-gnb|nr-ue|ping" } ] } } ``` We also include `nr-ue`, because after the UPF sends ICMP to `nr-gnb`, the response still goes back to `nr-ue` via IPC. If we only optimize `nr-gnb`, `nr-ue` might still be scheduled too late and introduce latency. With the strategy understood, let’s watch the video to validate whether Gthulhu effectively reduces round-trip latency: <iframe width="560" height="315" src="https://www.youtube.com/embed/MfU64idQcHg?si=_dW1Uvbig5RDOAAN" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe> ## Porting Challenges ### Page Fault At first, I chose to build a Linux scheduler in Go for a simple reason—it would be cool if I could make it work. However, even with Aqua Security’s libbpfgo, porting the results from rustland still ran into unexpected problems. Setting aside the missing libbpfgo APIs for the moment, here’s the first thing I overlooked during the port. After I filled in the missing APIs, the scx_rustland eBPF program could indeed be loaded into the kernel by my user-space app. 
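For reference, the load path with libbpfgo looks roughly like the sketch below. It is simplified: the `queued` ring buffer matches the map name in the skeleton shown later, but the struct_ops attach step (the part that needed the then-missing libbpfgo APIs) is omitted.

```go
package main

import (
	"log"

	bpf "github.com/aquasecurity/libbpfgo"
)

func main() {
	// Open and load the compiled scheduler object.
	mod, err := bpf.NewModuleFromFile("main.bpf.o")
	if err != nil {
		log.Fatalf("open object: %v", err)
	}
	defer mod.Close()

	if err := mod.BPFLoadObject(); err != nil {
		log.Fatalf("load object: %v", err)
	}

	// Tasks queued by the eBPF side arrive on this ring buffer.
	events := make(chan []byte, 64)
	rb, err := mod.InitRingBuf("queued", events)
	if err != nil {
		log.Fatalf("init ring buffer: %v", err)
	}
	rb.Start()
	defer rb.Stop()

	for raw := range events {
		log.Printf("queued task record: %d bytes", len(raw))
	}
}
```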
But as soon as it was loaded, the system would hang for about 5 seconds until the watchdog evicted scx_rustland. Initially, I spent a lot of time debugging both the user ring buffer and the ring buffer itself. Later, by validating both with a syscall-type eBPF program, I ruled them out. With no clear lead, I asked Andrea Righi for help. It turned out he had faced the same issue, and he immediately confirmed it was a deadlock caused by page faults. To mitigate the performance cost introduced by page faults, Andrea even implemented a buddy allocator.

It’s worth noting that page faults only increased overhead in scx_rustland, but for scx_goland they directly caused a deadlock. Both used the same eBPF program yet behaved very differently. After some investigation, I found cgo was the indirect cause:

1. libbpfgo is a wrapper over libbpf; every function call is a cgo call, and cgo calls can cause the Go runtime to spawn additional OS threads.
2. Goroutines are scheduled by the Go runtime, so the OS thread that actually runs them can change.

scx_rustland needs to know the PID of the user-space scheduler so that the eBPF program can schedule it directly, without the scheduler having to schedule itself. As a result, the threads spawned by the Go runtime were not recognized by the eBPF program, because a Go runtime thread’s PID differs from the PID of the user-space scheduler. When a page fault occurs, the faulting thread enters kernel mode to resolve it. During that time, the threads backing the user-space scheduler cannot run their goroutines, which ultimately leads to a deadlock.

The fix is to match by TGID: if a task’s TGID equals the user-space scheduler’s PID, always let the eBPF program schedule it. For performance, use as much pre-allocated memory as possible and combine it with mlockall to reduce the frequency of page faults.

Note:
- Older versions of scx_rustland applied special handling to processes in a page fault:
    - https://github.com/Gthulhu/qumun/blob/96cebdd3348b46ae96044d0269cd824213e56772/main.bpf.c#L854
    - https://github.com/Gthulhu/qumun/blob/96cebdd3348b46ae96044d0269cd824213e56772/main.bpf.c#L277
- This mechanism was later removed in the commit below because it could cause starvation for tasks not in a page fault:
    - https://github.com/sched-ext/scx/commit/67c058c1ba802490764275fe319e3fafd357faed

### watchdog failed to check in for default timeout

![](https://github.com/Gthulhu/qumun/raw/main/assets/demo.gif)

After a series of efforts, I finally resolved the scheduler hang issue (the page faults). Here is the second problem I encountered while porting scx_rustland.

> Note:
> I first reimplemented scx_rustland in Go and named it scx_goland. Later, following jserv’s suggestion, I renamed it to qumun (Bunun for “heart”).

I found that after running for a while, qumun would always be evicted by the watchdog, accompanied by a “runnable task stall” error. Here’s a quick primer on the scx watchdog design:

1. The watchdog runs on the Concurrency Managed Workqueue (CMWQ) mechanism. See: [Linux Kernel Design: Timer and its management](https://hackmd.io/@sysprog/linux-timer) and [Linux Kernel Design: Concurrency Managed Workqueue (CMWQ)](https://hackmd.io/@RinHizakura/H1PKDev6h).
2. The watchdog evicts the scheduler if a runnable task cannot be scheduled within the configured timeout.
3. Because of (1), if the watchdog’s own check cannot finish within the timeout, it will also evict the scheduler.
For point (3), my initial WORKAROUND was quite heavy-handed: ![image](https://hackmd.io/_uploads/SkBj-8Eneg.png) I forced the kworker running events_unbound (which handles CMWQ tasks) to be scheduled directly by the eBPF program to prevent the user-space scheduler from giving it too low a priority, which could cause eviction. This worked but only treated the symptom. I later refactored the user-space scheduler loop: ```go for true { select { case <-ctx.Done(): log.Println("context done, exiting scheduler loop") return default: } bpfModule.DrainQueuedTask() t = bpfModule.SelectQueuedTask() if t == nil { bpfModule.BlockTilReadyForDequeue(ctx) } else if t.Pid != -1 { task = core.NewDispatchedTask(t) // Evaluate used task time slice. nrWaiting := core.GetNrQueued() + core.GetNrScheduled() + 1 task.Vtime = t.Vtime // Check if a custom execution time was set by a scheduling strategy customTime := bpfModule.DetermineTimeSlice(t) if customTime > 0 { // Use the custom execution time from the scheduling strategy task.SliceNs = min(customTime, (t.StopTs-t.StartTs)*11/10) } else { // No custom execution time, use default algorithm task.SliceNs = max(SLICE_NS_DEFAULT/nrWaiting, SLICE_NS_MIN) } err, cpu = bpfModule.SelectCPU(t) if err != nil { log.Printf("SelectCPU failed: %v", err) } task.Cpu = cpu err = bpfModule.DispatchTask(task) if err != nil { log.Printf("DispatchTask failed: %v", err) continue } err = core.NotifyComplete(bpfModule.GetPoolCount()) if err != nil { log.Printf("NotifyComplete failed: %v", err) } } } ``` Initially, to avoid the loop hogging CPU, I only proceeded when the task queue had multiple items, but that increased scheduling delay. I then introduced the key function BlockTilReadyForDequeue: ```go func (s *Sched) BlockTilReadyForDequeue(ctx context.Context) { select { case t, ok := <-s.queue: if !ok { return } s.queue <- t return case <-ctx.Done(): return } } ``` The idea is simple: if a task arrives from the user ring buffer (dispatched by the eBPF program), continue; otherwise block the loop. This keeps scheduler latency as low as possible, preventing watchdog starvation and allowing me to remove the ugly WORKAROUND. > Note: > This “latency” is the “bubble” Andrea Righi mentions in his blog; both refer to the time from runnable to running. It is especially important for low-latency applications. When we later applied Gthulhu to a 5G URLLC case, latency was the first issue we addressed. ### Data Race After overcoming the previous two major challenges, Gthulhu became fairly stable (it survived long 7x24-hour runs on my main development machine). However, one issue still bothered me. In general, an eBPF program’s global variables are placed into different segments depending on their declaration: ```c struct { struct bpf_map *cpu_ctx_stor; struct bpf_map *task_ctx_stor; struct bpf_map *queued; struct bpf_map *dispatched; struct bpf_map *priority_tasks; struct bpf_map *running_task; struct bpf_map *usersched_timer; struct bpf_map *rodata; struct bpf_map *data_uei_dump; struct bpf_map *data; struct bpf_map *bss; struct bpf_map *goland; } maps; ``` The code above is part of the skeleton file generated by bpftool. 
You can see the eBPF program used by Gthulhu has at least: - bss - data - rodata Among them, the data in .bss is critical: ``` struct main_bpf__bss { u64 usersched_last_run_at; u64 nr_queued; u64 nr_scheduled; u64 nr_running; u64 nr_online_cpus; u64 nr_user_dispatches; u64 nr_kernel_dispatches; u64 nr_cancel_dispatches; u64 nr_bounce_dispatches; u64 nr_failed_dispatches; u64 nr_sched_congested; } *bss; ``` The fields nr_scheduled and nr_queued affect how the user-space scheduler allocates a time slice for a task (as noted earlier, scx_rustland adjusts the time slice based on the number of pending tasks). However, libbpfgo APIs treat a .bss section as a single eBPF map. If I read and then update this map while the eBPF program concurrently increments/decrements its fields, a data race occurs. In database terms this resembles oversell/overbuy; in DB scenarios you would solve it with a transaction or lock. scx_rustland directly uses the skeleton API and can update an individual field in the .bss map. To address this in Go, I leveraged the eBPF skeleton: ``` // wrapper.c #include "wrapper.h" struct main_bpf *global_obj; void *open_skel() { struct main_bpf *obj = NULL; obj = main_bpf__open(); main_bpf__create_skeleton(obj); global_obj = obj; return obj->obj; } u32 get_usersched_pid() { return global_obj->rodata->usersched_pid; } void set_usersched_pid(u32 id) { global_obj->rodata->usersched_pid = id; } void set_kugepagepid(u32 id) { global_obj->rodata->khugepaged_pid = id; } void set_early_processing(bool enabled) { global_obj->rodata->early_processing = enabled; } void set_default_slice(u64 t) { global_obj->rodata->default_slice = t; } void set_debug(bool enabled) { global_obj->rodata->debug = enabled; } void set_builtin_idle(bool enabled) { global_obj->rodata->builtin_idle = enabled; } u64 get_nr_scheduled() { return global_obj->bss->nr_scheduled; } u64 get_nr_queued() { return global_obj->bss->nr_queued; } void notify_complete(u64 nr_pending) { global_obj->bss->nr_scheduled = nr_pending; } void sub_nr_queued() { if (global_obj->bss->nr_queued){ global_obj->bss->nr_queued--; } } void destroy_skel(void*skel) { main_bpf__destroy(skel); } ``` Although Go cannot use the skeleton API directly like Rust, I can wrap these APIs and call them via cgo. ``` wrapper: bpftool gen skeleton main.bpf.o > main.skeleton.h clang -g -O2 -Wall -fPIC -I scx/build/libbpf/src/usr/include -I scx/build/libbpf/include/uapi -I scx/scheds/include -I scx/scheds/include/arch/x86 -I scx/scheds/include/bpf-compat -I scx/scheds/include/lib -c wrapper.c -o wrapper.o ar rcs libwrapper.a wrapper.o ``` With the above, the wrapper is built as a static library for Gthulhu: ``` CGOFLAG = CC=clang CGO_CFLAGS="-I$(BASEDIR) -I$(BASEDIR)/$(OUTPUT)" CGO_LDFLAGS="-lelf -lz $(LIBBPF_OBJ) -lzstd $(BASEDIR)/libwrapper.a" ``` This allows Go to call the wrapped APIs: ```go func (s *Sched) AssignUserSchedPid(pid int) error { C.set_kugepagepid(C.u32(KhugepagePid())) C.set_usersched_pid(C.u32(pid)) return nil } func (s *Sched) SetDebug(enabled bool) { C.set_debug(C.bool(enabled)) } func (s *Sched) SetBuiltinIdle(enabled bool) { C.set_builtin_idle(C.bool(enabled)) } func (s *Sched) SetEarlyProcessing(enabled bool) { C.set_early_processing(C.bool(enabled)) } func (s *Sched) SetDefaultSlice(t uint64) { C.set_default_slice(C.u64(t)) } ```
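The read side works the same way. Below is a sketch of how the getters and `notify_complete` might be exposed to Go; the `core` package name follows the calls used in the scheduler loop earlier, but the exact file layout in Gthulhu may differ.

```go
package core

/*
#include "wrapper.h"
*/
import "C"

// GetNrQueued reads the nr_queued counter straight from the skeleton's .bss,
// touching only that field instead of rewriting the whole map.
func GetNrQueued() uint64 {
	return uint64(C.get_nr_queued())
}

// GetNrScheduled reads the nr_scheduled counter the same way.
func GetNrScheduled() uint64 {
	return uint64(C.get_nr_scheduled())
}

// NotifyComplete tells the eBPF side how many tasks are still waiting to be
// dispatched. Writing a single field avoids the read-modify-write race
// described above.
func NotifyComplete(nrPending uint64) error {
	C.notify_complete(C.u64(nrPending))
	return nil
}
```

With these wrappers in place, the scheduler loop shown earlier can compute `nrWaiting` from `GetNrQueued()` and `GetNrScheduled()` and report back through `NotifyComplete()` without ever serializing the whole `.bss` map through libbpfgo.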