compiler
gcc
This article aims to explain how does GNU GCC plugins work, what can it do, and how to write one.
According to kernel.org
In the following article, we will discuss the concepts of callbacks, GIMPLE, IPA, and RTL passes
note that: the experiments are based on GCC-11
There are a lot of modules in the open source projects. For instance, one can find some modules in the linux/scripts/gcc-plugins
or in gcc-11.3.0/gcc/testsuite/gcc.dg/plugin/one_time_plugin.c
In this section, let's try to understand the mechanism behind GCC plugin by tracing these works
one_time_plugin.c:
Accorind to gccint.pdf:
So we should always define this global variable
plugin_init() is the entry of a plugin, and for each event of interest, the plugin should call register_callback specifying the name of the event and address of the callback function that will handle that event.
There are two arguments to int plugin_init (struct plugin_name_args *plugin_info, struct plugin_gcc_version *version)
One common way to check the compatibility is:
The prototype of a callback is:
and there are several pre-defined events
In this example, we choose PLUGIN_PASS_MANAGER_SETUP
to hook pass manager
See void register_callback (const char *plugin_name, int event, plugin_callback_func callback, void *user_data)
in plugin.c
it takes void *user_data
, which is a struct register_pass_info
that we pass, and pass it to void register_pass (struct register_pass_info *pass_info)
in passes.c
and in the register_pass
function, it does the following things:
Plus, remember the PASS_POS_INSERT_AFTER argument? in position_pass (struct register_pass_info *new_pass_info, opt_pass **pass_list), opt_pass is a linked list structure and this argument determines where is this pass inserted.
take all_lowering_passes
for example, it will be call at execute_pass_list (cfun, passes->all_lowering_passes)
in cgraphunit.c:cgraph_node::add_new_function()
and execute_pass_list(function *fn, opt_pass *pass)
take a function fn, make it go through the pass, so that the function can be optimized or what. and execute_pass_list
roughly does the following things:
bool execute_one_pass (opt_pass *pass)
in passes.c
pass->gate (cfun)
to check whether gate check should be avoided. (gate
is a virutal function that user can override)pass->execute (cfun)
to do the pass (execute
is also a virtual function that user can override)In this section we see the rough picture behind a real plugin module by looking into the API function.
We've learned how to compose an easy plugin by defining a init function and calling a regist API.
In additional, we look into the mechanism behine the register function and see the general picture about how's a customized pass inserted into the five pass list and how does it get executed.
gcc.gnu.org:Plugins
gcc.gnu.org
TBN
gcc.gnu.org: GIMPLE
GIMPLE is a three-address representation. (see Three address code in Compiler to understand what's three-address representation)
The compiler pass which converts GENERIC into GIMPLE is referred to as the gimplifier
. The gimplifier works recursively, generating GIMPLE tuples out of the original GENERIC expressions.
The GENERIC representation of a function is stored in the DECL_SAVED_TREE field of the associated FUNCTION_DECL tree node. It is converted to GIMPLE by a call to gimplify_function_tree
.
The C and C++ front ends currently convert directly from front end trees to GIMPLE, and hand that off to the back end rather than first converting to GENERIC. (So C/C++ skip the generic part)
Here's an example of hello world by give gcc the -fdump-tree-gimple
flag
complie:
and let's change the main.c a little bit and see the difference!
We can see that printf("Hello World %d\n", (argc + 1) * 5)
is not a three addresses representation and it's converted to …
TBN
TBN
TBN
TBN
In this section, we explore the methods to generate CFG (control-flow graph).
We will start with GCC API to explore how's it generate a CFG. Then we will try to dicuss how to modify this method to make it more flexible.
print_graph_cfg
Refer to this PoC, which uses print_graph_cfg()
to generate CFG and dump to a file.
The prototype can be found in graph.h
And the implementations is in graph.c
We see that this function call print_graph_cfg (FILE *fp, struct function *fun)
to do the work, and store the CFG in the format for Graphviz (https://en.wikipedia.org/wiki/Graphviz), which is a package of open-source tools initiated by AT&T Labs Research for drawing graphs specified in DOT language scripts
So this function firstly gets the function name and creates a subgraph for this function, and then it draws nodes, which should be basic blocks. Finally, it draws edges, which should be jump instructions(?)
static void draw_cfg_nodes (pretty_printer *pp, struct function *fun)
if there's a loop tree, it calls draw_cfg_nodes_for_loop
to:
body
which can be considered as an array of basic blocks
basic block
, calls draw_cfg_node
to draw the basic blocksnote: see static void draw_cfg_nodes_for_loop (pretty_printer *pp, int funcdef_no, class loop *loop)
in graph.c
for detail
For the case that there's no loop tree for a function, it tries to computer the topological order starting from the entry block of the given function using DFS.
Then, according to such post order, it calls draw_cfg_node
to draws basic blocks
XNEWVEC (int, n_basic_blocks_for_fn (fun));
: allocate an array of int
with the size of n_basic_blocks_for_fn (fun)
include/libiberty.h:366:#define XNEWVEC(T, N) ((T *) xmalloc (sizeof (T) * (N)))
auto_sbitmap visited (last_basic_block_for_fn (cfun));
:
class auto_sbitmap
, a class that ties the lifetime of a sbitmap to its scopesbitmap.h
n = pre_and_rev_post_order_compute_fn (fun, NULL, rpo, true)
:
cfganal.c
NULL
implies that we don't care the pre-orderrpo
suggests that we want to store the post order herefor (i = n_basic_blocks_for_fn (fun) - n; i < n_basic_blocks_for_fn (fun); i++)
:
basic_block bb = BASIC_BLOCK_FOR_FN (cfun, rpo[i]);
to retrieve the basic blockdraw_cfg_node (pp, fun->funcdef_no, bb);
to draw the basick blockif (n != n_basic_blocks_for_fn (fun))
: this part handles those basic blocks which r unreachableAnd how's it draw a BB? let's take a look at draw_cfg_node
in graph.h
We can see that this function treats the entry and exit blocks specially by giving them special shape, color, and corresponding names.
As for the rest of the BBs it calls dump_bb_for_graph
to do the work
void dump_bb_for_graph (pretty_printer *pp, basic_block bb)
is defined in cfghooks.c
with the implementation below:
cfg_hooks->dump_bb_for_graph (pp, bb)
suggests this callback function was registered somewhere. I've found two implementations so far:
void rtl_dump_bb_for_graph (pretty_printer *pp, basic_block bb)
in print-rtl.c
void gimple_dump_bb_for_graph (pretty_printer *pp, basic_block bb)
in gimple-pretty-print.c
In this section, I wil explore the rtl implementation for sake of simplicity
On the second thought, let's take a look at the GIMPLE implementation gimple_dump_bb_for_graph
void gimple_dump_bb_for_graph (pretty_printer *pp, basic_block bb)
in gimple-pretty-print.c
for this function, we see it firstly prints the bb's index.
then it deals with phi nodes, finally gimple statement.
PHI function or PHI node is used in SSA (Static Single Assignment) when we found the flow of control makes it impossible to determine the most recent version of a variable.
for example:
the SSA renamer creates a new version a_4 which is assigned the result of “merging” a_1, a_2 and a_3. Hence, PHI nodes mean “one of these operands. I don’t know which”.
for more details, please refer to:
note: on Ubunto22.04, gcc11