Protect the System Call, Protect (most of) the World with BASTION
Abstract
- enforces system call integrity at runtime
- 3 contexts
- (Call Type): which system calls may be called, and how each is called
- (Control Flow)
- (Argument Integrity)
- Bastion
- a compiler
- a runtime monitor system
- Case study
- NGINX, SQLite, and vsFTPd
- result: low overhead of 0.60%, 2.01%, and 1.65%, respectively
Introduction
- Traditional methods:
- debloating, system call filtering, and system call sandboxing
- idea: disabling unused system calls
- problem: system calls the application still needs remain invocable by an attacker (e.g., via code-reuse)
- Contribution:
- Novel system call contexts for system call integrity
- 3 system call contexts, namely Call Type, Control Flow, and Argument Integrity contexts
- Bastion defense enforcing system call integrity
- compiler pass
- analyzes all system call usage, performs instrumentation, and generates metadata
- runtime monitor
- static and dynamic aspects of each system call context
- Security & performance evaluation
- NGINX, SQLite, and vsFTPd
Background
System call Usage in Attacks
- 400+ system calls in recent Linux kernel
- Only a few system calls are desired by attackers; these are called sensitive system calls
Current system call protection mechanisms
- Attack surface reduction
- Debloating techniques
- Remove unused code
- static program analysis or dynamic coverage analysis
- Problem: many sensitive system calls are used for library loading (mmap, mprotect), so they cannot be removed
- System call filtering
- Seccomp: system call filtering framework
- User needs to define an allowlist/denylist given an application
- Can restrict a system call argument
- Problem: cannot remove sensitive-but-necessary system calls
- Problem: restricting arguments to constant values applies across the entire application scope. E.g., if there are two callsites of mprotect, one read-only and the other read-executable, then the application-wide policy must allow read-executable
- Control-flow integrity (CFI)
- LLVM support (backward)
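- performs analysis to generate an allowed set of targets per callsite, called an equivalence class (EC)
- Problem: security depends on EC size; the larger the EC, the more targets remain available to an attacker, and shrinking ECs requires more precise (and costlier) analysis
- Problem: runtime overhead, e.g., approaches that rely on Intel PT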
- Data-flow integrity (DFI)
- instruments every load and store instruction
- Problem: overhead
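To make the Seccomp scope problem above concrete, here is a minimal Python model. It is not Seccomp's API: the callsite names and the helper functions are hypothetical, and only the PROT_* values match Linux. It shows that an application-wide argument filter must accept the union of what every callsite needs, while a per-callsite check can reject the same value at the wrong callsite.

```python
# Linux protection flag values for mprotect.
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4

# Hypothetical callsites and the protection value each one legitimately uses.
callsite_needs = {
    "callsite_a": PROT_READ,
    "callsite_b": PROT_READ | PROT_EXEC,
}

def app_wide_allowed(needs):
    """An application-wide filter has to accept every value any callsite uses."""
    return set(needs.values())

def filter_allows(allowed, prot):
    """Application-wide check: the callsite is not part of the decision."""
    return prot in allowed

def per_callsite_allows(needs, callsite, prot):
    """Per-callsite check: the value must match what this callsite needs."""
    return needs.get(callsite) == prot

allowed = app_wide_allowed(callsite_needs)
```

With this model, a read-executable request passes the application-wide filter no matter where it comes from, but a per-callsite check rejects it at callsite_a.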
Contexts for system call integrity
Q: What's a legitimate use of a system call?
A: two variants: (1) control-flow integrity (2) data integrity
In this paper, three contexts are defined accordingly
- Call-type context
- Control-flow context
- Argument integrity context
Call-type context
Only permitted system calls are called, and only in the right manner.
- direct call or indirect call
- direct call:
int ret = chmod("AAA", S_IWOTH);
- indirect call: function pointer
- a system call is one of the categories:
- not-callable
- directly-callable
- indirectly-callable
- It is rare for system calls to be called from an indirect call site
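A minimal sketch of the call-type idea, written as a Python model. The function names and the input sets are assumptions made for illustration; the real analysis works on compiler IR, not on sets of names. A system call with no permitted call kind is not-callable.

```python
def permitted_call_kinds(syscall, direct_calls, address_taken):
    """Return the ways `syscall` may legitimately be invoked.

    direct_calls:  system calls that appear at a direct callsite.
    address_taken: system calls whose address is stored in a function pointer.
    An empty result means the system call is not-callable.
    """
    kinds = set()
    if syscall in direct_calls:
        kinds.add("direct")
    if syscall in address_taken:
        kinds.add("indirect")
    return kinds

def call_type_ok(syscall, observed_kind, direct_calls, address_taken):
    """Check an observed invocation against the statically derived kinds."""
    return observed_kind in permitted_call_kinds(syscall, direct_calls, address_taken)

# Hypothetical program: chmod is only ever called directly.
direct_calls = {"chmod"}
address_taken = set()
```

Here an indirect invocation of chmod, or any invocation of a system call the program never uses, fails the check.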
Control-flow context
- Record the valid call paths to every sensitive system call, and enforce this context at runtime
Argument integrity context
- A system call argument is either (1) a direct argument or (2) an extended argument
- direct argument: e.g., constant, local variable
- extended argument: e.g., pointer
- if the argument is a struct, its fields must be tracked as well
Real world code examples
- ctx->output_filter = ngx_execute_proc
- ctx->path = "/bin/sh"
BASTION:
- Call-type context: static analysis shows that "execve" has to be a direct call, but at runtime it is an indirect call
- Control-Flow context: detect
- Argument integrity context: detect
- buffer overflow: modify index, such that v[index].get_handler = mprotect
- r = memory region to exploit
- change permission
BASTION:
- mprotect has no indirect callsite
- the control-flow path is invalid
- the argument is wrong
Threat model and assumptions
- arbitrary memory read/write
- Data Execution Prevention(DEP)
- attackers cannot inject or modify code due to DEP
- Address Space Layout Randomization (ASLR)
- Shadow Stack (CET)
- Hardware and OS kernel are trusted
- Attackers going for OS kernel and hardware (Spectre) are out of scope
- BASTION protects a subset of available system calls
BASTION design and implementation
- Choose 20 sensitive system calls (Table 1)
- BASTION = BASTION-compiler pass + BASTION-monitor
BASTION compiler
- Generate context metadata
- Leverage light-weight library API for dynamic tracking of sensitive data (Argument integrity)
Analysis for Call-Type Context
Analysis for Control-Flow Context
- performed only when a system call is invoked -> reduces overhead (CFI, by contrast, is enforced on every indirect control-flow transfer)
compile time:
- generate CFG
- identify all function callee→caller relationships that reach system call callsites
- For each callsite, recursively records all callee→caller associations
- stops once reaching either main() or an indirect function call
runtime:
- unwind stack frame
- verifies callee→caller relations until the bottom of the stack (i.e., main), or an indirect callsite
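The compile-time and runtime steps above can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions: it tracks function names rather than callsite offsets, and it decides "entered via an indirect call" from a hypothetical set of address-taken functions instead of inspecting the call instruction. All names are illustrative.

```python
def build_metadata(direct_calls, syscall_callers):
    """Compile time: record callee -> valid direct callers, walking backwards
    from the functions that contain a sensitive system call callsite.

    direct_calls:    set of (caller, callee) direct-call edges.
    syscall_callers: functions containing a sensitive callsite.
    A function with no direct callers is only reachable indirectly, so the
    walk stops there; it also stops at main.
    """
    callers_of = {}
    for caller, callee in direct_calls:
        callers_of.setdefault(callee, set()).add(caller)
    metadata, worklist = {}, list(syscall_callers)
    while worklist:
        callee = worklist.pop()
        if callee in metadata or callee == "main":
            continue
        metadata[callee] = callers_of.get(callee, set())
        worklist.extend(metadata[callee])
    return metadata

def verify_stack(stack, metadata, indirect_targets):
    """Runtime: `stack` lists function names, innermost frame first."""
    for callee, caller in zip(stack, stack[1:]):
        if caller in metadata.get(callee, set()):
            continue                  # valid callee -> caller relation
        if callee in indirect_targets:
            return True               # reached an indirect call: stop here
        return False                  # no valid relation: violation
    return True                       # vetted the whole stack down to main
```

A stack that matches the recorded callee→caller relations passes; a frame with an unrecorded caller fails unless the callee is a known indirect-call target, at which point the unwinding stops.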
Analysis for Argument Integrity Context
- checks not only system call arguments but also the arguments' data-dependent variables -> sensitive variables
- maintains a shadow copy of the sensitive variable’s legitimate value in a shadow memory region
- updates the shadow copy whenever the sensitive variable is updated legitimately
- binds each argument to a certain position for the system call so the Bastion runtime monitor can check argument integrity

1. enumerates all variables used in system call arguments
2. performs a backward data-flow analysis, traversing the use-def chains to derive any other variables used to define sensitive variables
3. adds newly identified data-dependent variables to the set of sensitive variables
- if there is a write to a field of a struct (e.g., size field of gshm in Figure 2), that write is added to the sensitive variables
4. repeats steps 2 and 3 until no new sensitive variables are found
- Once all sensitive variables are identified, Bastion instruments ctx_write_mem after any memory-backed sensitive variable store to keep its shadow copy up-to-date
- Before each sensitive system call callsite, Bastion instruments ctx_bind_mem_X or ctx_bind_const_X to bind each argument to its respective argument position X
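The "repeat until no new sensitive variables are found" loop is a standard worklist fixed point. Below is a minimal Python sketch; the dictionary representation of use-def chains and all variable names are assumptions for illustration (the real pass works on LLVM IR).

```python
def sensitive_variables(arg_vars, defs):
    """Compute the set of sensitive variables by backward data-flow.

    arg_vars: variables used directly as sensitive system call arguments.
    defs:     variable -> set of variables used to define it (use-def chains).
    """
    sensitive = set(arg_vars)
    worklist = list(arg_vars)
    while worklist:                       # repeat until nothing new is found
        var = worklist.pop()
        for dep in defs.get(var, ()):     # walk one step back along use-def
            if dep not in sensitive:
                sensitive.add(dep)
                worklist.append(dep)
    return sensitive

# Hypothetical chains: len is defined from a struct field, which is in turn
# defined from a configuration value; `unrelated` never feeds an argument.
defs = {
    "len": {"gshm.size"},
    "gshm.size": {"cfg_size"},
    "unrelated": {"x"},
}
result = sensitive_variables({"len", "prot"}, defs)
```

Each variable enters the worklist at most once, so the loop terminates even when the use-def chains contain cycles.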
Bastion runtime monitor
Initializing the Bastion Monitor
- Loading metadata:
- The monitor retrieves ELF, DWARF, and linked library file information to recover symbol addresses
- loads Bastion context metadata into the monitor’s memory
- Launching a Bastion-protected application:
- performs fork to spawn a child process where the child runs the Bastion-enabled application
- initializes a shadow memory region under a segmentation register
- initializes seccomp: trap on sensitive system calls in the child process
- ptrace: access the application’s state
- Trapping a system call invocation:
- custom seccomp-BPF filter to trap on the application’s sensitive system call
- SECCOMP_RET_ALLOW: non-sensitive system calls, ignored by the monitor
- SECCOMP_RET_KILL: disables any not-callable system calls
- SECCOMP_RET_TRACE: directly- and indirectly-callable system calls, so these system calls can be verified by the Bastion monitor
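The mapping from call type to seccomp action can be sketched as a small Python function. The action names are the real seccomp return values, used here only as strings; the function and its inputs are hypothetical and do not build an actual BPF filter.

```python
def seccomp_action(syscall, sensitive, call_kinds):
    """Pick the filter action for one system call.

    sensitive:  the set of sensitive system calls under protection.
    call_kinds: syscall -> set of permitted call kinds ("direct", "indirect");
                a missing or empty entry means not-callable.
    """
    if syscall not in sensitive:
        return "SECCOMP_RET_ALLOW"   # non-sensitive: let it through
    if not call_kinds.get(syscall):
        return "SECCOMP_RET_KILL"    # sensitive but never used by the program
    return "SECCOMP_RET_TRACE"       # sensitive and used: trap to the monitor

# Hypothetical program that uses mprotect directly and never uses execve.
sensitive = {"mprotect", "execve"}
call_kinds = {"mprotect": {"direct"}}
```

Only the calls that return SECCOMP_RET_TRACE reach the monitor, which is why the monitor's cost is paid per sensitive system call rather than per system call.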
Enforcing Call-Type Context
- Take $PC, look up the metadata, check the call type
Enforcing Control-Flow Context
- stack trace: unwinds and gets each function callsite offset
- CFG metadata: a list of callees and their respective valid callers
- until the entire stack has been vetted or an indirect call is encountered
Enforcing Argument Integrity Context
- verifies integrity of all sensitive variables in the current call stack
- Take $PC, check the associated argument integrity context metadata
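A minimal Python model of the shadow-copy check. The method names echo the ctx_write_mem and ctx_bind_mem_X calls mentioned in these notes, but the signatures, the dictionary-based "memory", and the 0-based argument positions are assumptions made for this sketch.

```python
class ShadowMemory:
    def __init__(self):
        self.shadow = {}      # address -> last legitimately written value
        self.bindings = {}    # argument position -> address of its variable

    def ctx_write_mem(self, addr, value):
        """Instrumented after a legitimate store to a sensitive variable."""
        self.shadow[addr] = value

    def ctx_bind_mem(self, position, addr):
        """Instrumented before the callsite: bind a variable to a position."""
        self.bindings[position] = addr

    def verify(self, memory, args):
        """Monitor side: memory and arguments must both match the shadow copy."""
        for position, addr in self.bindings.items():
            if addr not in self.shadow:
                return False                      # never legitimately written
            if memory.get(addr) != self.shadow[addr]:
                return False                      # variable was corrupted
            if args[position] != self.shadow[addr]:
                return False                      # argument does not match
        return True

# Hypothetical mprotect(addr, len, prot) call whose prot lives at 0x1000.
memory = {0x1000: 1}
sm = ShadowMemory()
sm.ctx_write_mem(0x1000, 1)
sm.ctx_bind_mem(2, 0x1000)
```

A write that bypasses ctx_write_mem, such as an attacker's arbitrary memory write, changes the program's memory but not the shadow copy, so the next verification fails.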
Implementation
- Linux x86-64 v5.19.14
- LLVM Module
- hardware-based shadow stack (-fcf-protection=full)
- Intel Tiger Lake and AMD Ryzen 7 processors
- Glibc v2.28+
- Binutils v2.29+
Efforts:
- LLVM module: 3,939 lines of code
- Bastion’s C runtime library: 659 lines of code
- Bastion runtime monitor (a C program): 7,313 lines of code
EVALUATION
Evaluation Methodology
- 8-core (16-hardware thread) machine featuring an AMD Ryzen 7 PRO 5850U processor and 16 GB DDR4 memory
- Bastion LLVM compiler
- Results are reported as the average over five runs
- NGINX, SQLite, and vsftpd
NGINX:
- wrk, HTTP benchmarking tool
- sends concurrent HTTP requests
- measure throughput
- NGINX maximum of 1,024 connections per processor
- 32 worker threads
- never incurred more than 0.60% degradation compared to the unprotected NGINX baseline
- Argument Integrity context adds the most overhead
- uses many sensitive system calls (e.g., mprotect, mmap) during its initialization phase but seldom uses them when idle or processing requests -> Bastion is rarely triggered at runtime
- average call-depth is only 5.2 frames, with 4 and 9 being minimum and maximum stack call-depths
SQLite:
- DBT2, database transaction processing benchmark
- mix of read and write SQL operations for large data warehouse transactions
- 10 second new thread delay and a 10 minute workload duration
- number of new-order transactions per-minute (NOTPM) for performance
- Overhead:
- Call-Type: 0.92%
- Control-Flow: 1.48%
- Argument Integrity: 2.01%
vsftpd:
- dkftpbench, FTP benchmark program
- fetch a 100 MB file from vsftpd launching clients one after another for a 120 second duration
- Overhead: worst 1.65%

- Argument Integrity context is most costly
- LLVM CFI is expensive because it is triggered at every indirect callsite (NGINX, however, does not have many indirect calls)


SECURITY EVALUATION
ROP Attacks
- libc library call system = fork + execl
- exec-type system calls, to gain access to a root shell
- mprotect or chmod system calls, to change memory or file permissions to be executable
Direct Attacker Manipulation of System Calls
- Go after system calls directly, setup callsites and arguments to desired values
- The CsCFI attack leverages mprotect to make the entire libc readable, writable, and executable, revealing the code layout to perform arbitrary code execution
- AOCR's Attack 1 uses open and write to reveal the code layout of NGINX and execute arbitrary code
- Control-Flow and Argument integrity contexts to detect
- LLVM CFI cannot defend against either attack.
- In the CsCFI attack, although mprotect is never used, its address is still taken, because this system call is necessary to support dynamic loading of shared libraries
Indirect Attacker Manipulation of System Calls
- full-function code re-use, data-oriented attacks, and COOP
- The NEWTON CPI attack avoids corrupting any code or data pointers. It corrupts the index variable of an array of function pointers to make the array index point to a system call location
- Call-Type context blocks the invocation of a system call never used in the program code base
DISCUSSION AND LIMITATIONS
Bastion under Arbitrary Memory Corruption
- To bypass all three of Bastion's contexts, the attacker realistically would need to perform arbitrary reads/writes many times to match the expected context values without violating static constraints
- The main challenge is that this type of system call is called much more frequently
- Full Bastion context checking incurs high overhead – e.g., 96.7% for NGINX
- A majority of the overhead results from fetching protected process state using ptrace (< 95.7%, delta between Rows 1 and 2)
- additional context switching overhead to access the protected program
- one way to eliminate the ptrace overhead would be to run the Bastion monitor inside the kernel