compiler
gcc
This article aims to explain how does GNU GCC plugins work, what can it do, and how to write one.
According to kernel.org
GCC plugins are loadable modules that provide extra features to the
compiler [1]_. They are useful for runtime instrumentation and static analysis.
We can analyse, change and add further code during compilation via
callbacks [2]_, GIMPLE [3]_, IPA [4]_ and RTL passes [5]_.
...
Plugin source files have to be compilable by a C++ compiler.
In the following article, we will discuss the concepts of callbacks, GIMPLE, IPA, and RTL passes
note that: the experiments are based on GCC-11
There are a lot of modules in the open source projects. For instance, one can find some modules in the linux/scripts/gcc-plugins
or in gcc-11.3.0/gcc/testsuite/gcc.dg/plugin/one_time_plugin.c
In this section, let's try to understand the mechanism behind GCC plugin by tracing these works
one_time_plugin.c:
/* Plugin that prints message if it inserted (and invoked) more than once. */
#include "config.h"
#include "gcc-plugin.h"
#include "system.h"
#include "coretypes.h"
#include "tree.h"
#include "tm.h"
#include "toplev.h"
#include "hash-table.h"
#include "vec.h"
#include "ggc.h"
#include "basic-block.h"
#include "tree-ssa-alias.h"
#include "internal-fn.h"
#include "gimple-fold.h"
#include "tree-eh.h"
#include "gimple-expr.h"
#include "is-a.h"
#include "gimple.h"
#include "tree-pass.h"
#include "intl.h"
#include "context.h"
int plugin_is_GPL_compatible;
namespace {
const pass_data pass_data_one_pass =
{
GIMPLE_PASS, /* type */
"cfg", /* name */
OPTGROUP_NONE, /* optinfo_flags */
TV_NONE, /* tv_id */
PROP_gimple_any, /* properties_required */
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
0, /* todo_flags_finish */
};
class one_pass : public gimple_opt_pass
{
public:
one_pass(gcc::context *ctxt)
: gimple_opt_pass(pass_data_one_pass, ctxt),
counter(0)
{}
/* opt_pass methods: */
virtual bool gate (function *);
virtual unsigned int execute (function *);
private:
int counter;
}; // class one_pass
} // anon namespace
bool one_pass::gate (function *)
{
return true;
}
unsigned int
one_pass::execute (function *)
{
if (counter > 0) {
printf ("Executed more than once \n");
}
counter++;
return 0;
}
gimple_opt_pass *
make_one_pass (gcc::context *ctxt)
{
return new one_pass (ctxt);
}
int plugin_init (struct plugin_name_args *plugin_info,
struct plugin_gcc_version *version)
{
struct register_pass_info p;
p.pass = make_one_pass (g);
p.reference_pass_name = "cfg";
p.ref_pass_instance_number = 1;
p.pos_op = PASS_POS_INSERT_AFTER;
register_callback ("one_pass", PLUGIN_PASS_MANAGER_SETUP, NULL, &p);
return 0;
}
Accorind to gccint.pdf:
Every plugin should define the global symbol plugin_is_GPL_compatible to assert that
it has been licensed under a GPL-compatible license. If this symbol does not exist, the
compiler will emit a fatal error and exit with the error message:
fatal error: plugin name is not licensed under a GPL-compatible license
name: undefined symbol: plugin_is_GPL_compatible
compilation terminated
So we should always define this global variable
plugin_init() is the entry of a plugin, and for each event of interest, the plugin should call register_callback specifying the name of the event and address of the callback function that will handle that event.
There are two arguments to int plugin_init (struct plugin_name_args *plugin_info, struct plugin_gcc_version *version)
struct plugin_name_args
{
char *base_name; /* Short name of the plugin
(filename without .so suffix). */
const char *full_name; /* Path to the plugin as specified with
-fplugin=. */
int argc; /* Number of arguments specified with
-fplugin-arg-.... */
struct plugin_argument *argv; /* Array of ARGC key-value pairs. */
const char *version; /* Version string provided by plugin. */
const char *help; /* Help string provided by plugin. */
}
struct plugin_gcc_version
{
const char *basever;
const char *datestamp;
const char *devphase;
const char *revision;
const char *configuration_arguments;
};
One common way to check the compatibility is:
#include "plugin-version.h"
...
int
plugin_init (struct plugin_name_args *plugin_info,
struct plugin_gcc_version *version)
{
if (!plugin_default_version_check (version, &gcc_version))
return 1;
}
The prototype of a callback is:
/* The prototype for a plugin callback function.
gcc_data - event-specific data provided by GCC
user_data - plugin-specific data provided by the plug-in. */
typedef void (*plugin_callback_func)(void *gcc_data, void *user_data);
and there are several pre-defined events
but you can also check the individual fields if you want a less strict check.
24.2.3 Plugin callbacks
Callback functions have the following prototype:
/* The prototype for a plugin callback function.
gcc_data - event-specific data provided by GCC
user_data - plugin-specific data provided by the plug-in. */
typedef void (*plugin_callback_func)(void *gcc_data, void *user_data);
Callbacks can be invoked at the following pre-determined events:
enum plugin_event
{
PLUGIN_START_PARSE_FUNCTION, /* Called before parsing the body of a function. */
PLUGIN_FINISH_PARSE_FUNCTION, /* After finishing parsing a function. */
PLUGIN_PASS_MANAGER_SETUP, /* To hook into pass manager. */
PLUGIN_FINISH_TYPE, /* After finishing parsing a type. */
PLUGIN_FINISH_DECL, /* After finishing parsing a declaration. */
PLUGIN_FINISH_UNIT, /* Useful for summary processing. */
PLUGIN_PRE_GENERICIZE, /* Allows to see low level AST in C and C++ frontends. */
PLUGIN_FINISH, /* Called before GCC exits. */
PLUGIN_INFO, /* Information about the plugin. */
PLUGIN_GGC_START, /* Called at start of GCC Garbage Collection. */
PLUGIN_GGC_MARKING, /* Extend the GGC marking. */
PLUGIN_GGC_END, /* Called at end of GGC. */
PLUGIN_REGISTER_GGC_ROOTS, /* Register an extra GGC root table. */
PLUGIN_ATTRIBUTES, /* Called during attribute registration */
PLUGIN_START_UNIT, /* Called before processing a translation unit. */
PLUGIN_PRAGMAS, /* Called during pragma registration. */
/* Called before first pass from all_passes. */
PLUGIN_ALL_PASSES_START,
/* Called after last pass from all_passes. */
PLUGIN_ALL_PASSES_END,
/* Called before first ipa pass. */
PLUGIN_ALL_IPA_PASSES_START,
/* Called after last ipa pass. */
PLUGIN_ALL_IPA_PASSES_END,
/* Allows to override pass gate decision for current_pass. */
PLUGIN_OVERRIDE_GATE,
/* Called before executing a pass. */
PLUGIN_PASS_EXECUTION,
/* Called before executing subpasses of a GIMPLE_PASS in
execute_ipa_pass_list. */
PLUGIN_EARLY_GIMPLE_PASSES_START,
/* Called after executing subpasses of a GIMPLE_PASS in
execute_ipa_pass_list. */
PLUGIN_EARLY_GIMPLE_PASSES_END,
/* Called when a pass is first instantiated. */
PLUGIN_NEW_PASS,
/* Called when a file is #include-d or given via the #line directive.
This could happen many times. The event data is the included file path,
as a const char* pointer. */
PLUGIN_INCLUDE_FILE,
/* Called when -fanalyzer starts. The event data is an
ana::plugin_analyzer_init_iface *. */
PLUGIN_ANALYZER_INIT,
PLUGIN_EVENT_FIRST_DYNAMIC /* Dummy event used for indexing callback
array. */
};
In this example, we choose PLUGIN_PASS_MANAGER_SETUP
to hook pass manager
See void register_callback (const char *plugin_name, int event, plugin_callback_func callback, void *user_data)
in plugin.c
void
register_callback (const char *plugin_name,
int event,
plugin_callback_func callback,
void *user_data)
{
switch (event)
{
case PLUGIN_PASS_MANAGER_SETUP:
gcc_assert (!callback);
register_pass ((struct register_pass_info *) user_data);
break;
...
it takes void *user_data
, which is a struct register_pass_info
that we pass, and pass it to void register_pass (struct register_pass_info *pass_info)
in passes.c
and in the register_pass
function, it does the following things:
Plus, remember the PASS_POS_INSERT_AFTER argument? in position_pass (struct register_pass_info *new_pass_info, opt_pass **pass_list), opt_pass is a linked list structure and this argument determines where is this pass inserted.
/* Construct the pass tree. The sequencing of passes is driven by
the cgraph routines:
finalize_compilation_unit ()
for each node N in the cgraph
cgraph_analyze_function (N)
cgraph_lower_function (N) -> all_lowering_passes
If we are optimizing, compile is then invoked:
compile ()
ipa_passes () -> all_small_ipa_passes
-> Analysis of all_regular_ipa_passes
* possible LTO streaming at copmilation time *
-> Execution of all_regular_ipa_passes
* possible LTO streaming at link time *
-> all_late_ipa_passes
expand_all_functions ()
for each node N in the cgraph
expand_function (N) -> Transformation of all_regular_ipa_passes
-> all_passes
*/
take all_lowering_passes
for example, it will be call at execute_pass_list (cfun, passes->all_lowering_passes)
in cgraphunit.c:cgraph_node::add_new_function()
and execute_pass_list(function *fn, opt_pass *pass)
take a function fn, make it go through the pass, so that the function can be optimized or what. and execute_pass_list
roughly does the following things:
bool execute_one_pass (opt_pass *pass)
in passes.c
pass->gate (cfun)
to check whether gate check should be avoided. (gate
is a virutal function that user can override)pass->execute (cfun)
to do the pass (execute
is also a virtual function that user can override)In this section we see the rough picture behind a real plugin module by looking into the API function.
We've learned how to compose an easy plugin by defining a init function and calling a regist API.
In additional, we look into the mechanism behine the register function and see the general picture about how's a customized pass inserted into the five pass list and how does it get executed.
gcc.gnu.org:Plugins
gcc.gnu.org
TBN
gcc.gnu.org: GIMPLE
GIMPLE is a three-address representation. (see Three address code in Compiler to understand what's three-address representation)
The compiler pass which converts GENERIC into GIMPLE is referred to as the gimplifier
. The gimplifier works recursively, generating GIMPLE tuples out of the original GENERIC expressions.
The GENERIC representation of a function is stored in the DECL_SAVED_TREE field of the associated FUNCTION_DECL tree node. It is converted to GIMPLE by a call to gimplify_function_tree
.
The C and C++ front ends currently convert directly from front end trees to GIMPLE, and hand that off to the back end rather than first converting to GENERIC. (So C/C++ skip the generic part)
Here's an example of hello world by give gcc the -fdump-tree-gimple
flag
#include <stdio.h>
int main(int argc, char **argv)
{
printf("Hello World\n");
return 0;
}
int main (int argc, char * * argv)
{
int D.4243;
{
printf ("Hello World\n");
D.4243 = 0;
return D.4243;
}
D.4243 = 0;
return D.4243;
}
complie:
gcc -O0 main.c -o main -fdump-tree-gimple
and let's change the main.c a little bit and see the difference!
#include <stdio.h>
int main(int argc, char **argv)
{
printf("Hello World %d\n", (argc + 1) * 5);
return 0;
}
int main (int argc, char * * argv)
{
int D.4243;
{
_1 = argc + 1;
_2 = _1 * 5;
printf ("Hello World %d\n", _2);
D.4243 = 0;
return D.4243;
}
D.4243 = 0;
return D.4243;
}
We can see that printf("Hello World %d\n", (argc + 1) * 5)
is not a three addresses representation and it's converted to …
_1 = argc + 1;
_2 = _1 * 5;
printf ("Hello World %d\n", _2);
TBN
TBN
TBN
TBN
In this section, we explore the methods to generate CFG (control-flow graph).
In a control-flow graph each node in the graph represents a basic block,
i.e. a straight-line piece of code without any jumps or jump targets;
jump targets start a block, and jumps end a block.
Directed edges are used to represent jumps in the control flow.
We will start with GCC API to explore how's it generate a CFG. Then we will try to dicuss how to modify this method to make it more flexible.
print_graph_cfg
Refer to this PoC, which uses print_graph_cfg()
to generate CFG and dump to a file.
The prototype can be found in graph.h
extern void print_graph_cfg (const char *, struct function *);
extern void clean_graph_dump_file (const char *);
extern void finish_graph_dump_file (const char *);
And the implementations is in graph.c
/* Print a graphical representation of the CFG of function FUN.
First print all basic blocks. Draw all edges at the end to get
subgraphs right for GraphViz, which requires nodes to be defined
before edges to cluster nodes properly. */
void
print_graph_cfg (const char *base, struct function *fun)
{
FILE *fp = open_graph_file (base, "a");
print_graph_cfg (fp, fun);
fclose (fp);
}
We see that this function call print_graph_cfg (FILE *fp, struct function *fun)
to do the work, and store the CFG in the format for Graphviz (https://en.wikipedia.org/wiki/Graphviz), which is a package of open-source tools initiated by AT&T Labs Research for drawing graphs specified in DOT language scripts
/* Print a graphical representation of the CFG of function FUN.
First print all basic blocks. Draw all edges at the end to get
subgraphs right for GraphViz, which requires nodes to be defined
before edges to cluster nodes properly. */
void DEBUG_FUNCTION
print_graph_cfg (FILE *fp, struct function *fun)
{
pretty_printer graph_slim_pp;
graph_slim_pp.buffer->stream = fp;
pretty_printer *const pp = &graph_slim_pp;
const char *funcname = function_name (fun);
pp_printf (pp, "subgraph \"cluster_%s\" {\n"
"\tstyle=\"dashed\";\n"
"\tcolor=\"black\";\n"
"\tlabel=\"%s ()\";\n",
funcname, funcname);
draw_cfg_nodes (pp, fun);
draw_cfg_edges (pp, fun);
pp_printf (pp, "}\n");
pp_flush (pp);
}
So this function firstly gets the function name and creates a subgraph for this function, and then it draws nodes, which should be basic blocks. Finally, it draws edges, which should be jump instructions(?)
static void draw_cfg_nodes (pretty_printer *pp, struct function *fun)
/* Draw all the basic blocks in the CFG in case the loop tree is available.
All loop bodys are printed in clusters. */
static void
draw_cfg_nodes (pretty_printer *pp, struct function *fun)
{
if (loops_for_fn (fun))
draw_cfg_nodes_for_loop (pp, fun->funcdef_no, get_loop (fun, 0));
else
draw_cfg_nodes_no_loops (pp, fun);
}
if there's a loop tree, it calls draw_cfg_nodes_for_loop
to:
body
which can be considered as an array of basic blocks
basic block
, calls draw_cfg_node
to draw the basic blocksnote: see static void draw_cfg_nodes_for_loop (pretty_printer *pp, int funcdef_no, class loop *loop)
in graph.c
for detail
For the case that there's no loop tree for a function, it tries to computer the topological order starting from the entry block of the given function using DFS.
Then, according to such post order, it calls draw_cfg_node
to draws basic blocks
/* Draw all the basic blocks in the CFG in case loops are not available.
First compute a topological order of the blocks to get a good ranking of
the nodes. Then, if any nodes are not reachable from ENTRY, add them at
the end. */
static void
draw_cfg_nodes_no_loops (pretty_printer *pp, struct function *fun)
{
int *rpo = XNEWVEC (int, n_basic_blocks_for_fn (fun));
int i, n;
auto_sbitmap visited (last_basic_block_for_fn (cfun));
bitmap_clear (visited);
n = pre_and_rev_post_order_compute_fn (fun, NULL, rpo, true);
for (i = n_basic_blocks_for_fn (fun) - n;
i < n_basic_blocks_for_fn (fun); i++)
{
basic_block bb = BASIC_BLOCK_FOR_FN (cfun, rpo[i]);
draw_cfg_node (pp, fun->funcdef_no, bb);
bitmap_set_bit (visited, bb->index);
}
free (rpo);
if (n != n_basic_blocks_for_fn (fun))
{
/* Some blocks are unreachable. We still want to dump them. */
basic_block bb;
FOR_ALL_BB_FN (bb, fun)
if (! bitmap_bit_p (visited, bb->index))
draw_cfg_node (pp, fun->funcdef_no, bb);
}
}
XNEWVEC (int, n_basic_blocks_for_fn (fun));
: allocate an array of int
with the size of n_basic_blocks_for_fn (fun)
include/libiberty.h:366:#define XNEWVEC(T, N) ((T *) xmalloc (sizeof (T) * (N)))
auto_sbitmap visited (last_basic_block_for_fn (cfun));
:
class auto_sbitmap
, a class that ties the lifetime of a sbitmap to its scopesbitmap.h
n = pre_and_rev_post_order_compute_fn (fun, NULL, rpo, true)
:
cfganal.c
NULL
implies that we don't care the pre-orderrpo
suggests that we want to store the post order herefor (i = n_basic_blocks_for_fn (fun) - n; i < n_basic_blocks_for_fn (fun); i++)
:
basic_block bb = BASIC_BLOCK_FOR_FN (cfun, rpo[i]);
to retrieve the basic blockdraw_cfg_node (pp, fun->funcdef_no, bb);
to draw the basick blockif (n != n_basic_blocks_for_fn (fun))
: this part handles those basic blocks which r unreachableAnd how's it draw a BB? let's take a look at draw_cfg_node
in graph.h
/* Draw a basic block BB belonging to the function with FUNCDEF_NO
as its unique number. */
static void
draw_cfg_node (pretty_printer *pp, int funcdef_no, basic_block bb)
{
const char *shape;
const char *fillcolor;
if (bb->index == ENTRY_BLOCK || bb->index == EXIT_BLOCK)
{
shape = "Mdiamond";
fillcolor = "white";
}
else
{
shape = "record";
fillcolor =
BB_PARTITION (bb) == BB_HOT_PARTITION ? "lightpink"
: BB_PARTITION (bb) == BB_COLD_PARTITION ? "lightblue"
: "lightgrey";
}
pp_printf (pp,
"\tfn_%d_basic_block_%d "
"[shape=%s,style=filled,fillcolor=%s,label=\"",
funcdef_no, bb->index, shape, fillcolor);
if (bb->index == ENTRY_BLOCK)
pp_string (pp, "ENTRY");
else if (bb->index == EXIT_BLOCK)
pp_string (pp, "EXIT");
else
{
pp_left_brace (pp);
pp_write_text_to_stream (pp);
dump_bb_for_graph (pp, bb);
pp_right_brace (pp);
}
pp_string (pp, "\"];\n\n");
pp_flush (pp);
}
We can see that this function treats the entry and exit blocks specially by giving them special shape, color, and corresponding names.
As for the rest of the BBs it calls dump_bb_for_graph
to do the work
void dump_bb_for_graph (pretty_printer *pp, basic_block bb)
is defined in cfghooks.c
with the implementation below:
/* Dumps basic block BB to pretty-printer PP, for use as a label of
a DOT graph record-node. The implementation of this hook is
expected to write the label to the stream that is attached to PP.
Field separators between instructions are pipe characters printed
verbatim. Instructions should be written with some characters
escaped, using pp_write_text_as_dot_label_to_stream(). */
void
dump_bb_for_graph (pretty_printer *pp, basic_block bb)
{
if (!cfg_hooks->dump_bb_for_graph)
internal_error ("%s does not support dump_bb_for_graph",
cfg_hooks->name);
/* TODO: Add pretty printer for counter. */
if (bb->count.initialized_p ())
pp_printf (pp, "COUNT:" "%" PRId64, bb->count.to_gcov_type ());
pp_write_text_to_stream (pp);
if (!(dump_flags & TDF_SLIM))
cfg_hooks->dump_bb_for_graph (pp, bb);
}
cfg_hooks->dump_bb_for_graph (pp, bb)
suggests this callback function was registered somewhere. I've found two implementations so far:
void rtl_dump_bb_for_graph (pretty_printer *pp, basic_block bb)
in print-rtl.c
void gimple_dump_bb_for_graph (pretty_printer *pp, basic_block bb)
in gimple-pretty-print.c
In this section, I wil explore the rtl implementation for sake of simplicity
/* Dumps basic block BB to pretty-printer PP in slim form and without and
no indentation, for use as a label of a DOT graph record-node. */
void
rtl_dump_bb_for_graph (pretty_printer *pp, basic_block bb)
{
rtx_insn *insn;
bool first = true;
/* TODO: inter-bb stuff. */
FOR_BB_INSNS (bb, insn)
{
if (! first)
{
pp_bar (pp);
pp_write_text_to_stream (pp);
}
first = false;
print_insn_with_notes (pp, insn);
pp_write_text_as_dot_label_to_stream (pp, /*for_record=*/true);
}
}
/* Pretty-print a slim dump of X (an insn) to PP, including any register
note attached to the instruction. */
void
print_insn_with_notes (pretty_printer *pp, const rtx_insn *x)
{
pp_string (pp, print_rtx_head);
print_insn (pp, x, 1);
pp_newline (pp);
if (INSN_P (x) && REG_NOTES (x))
for (rtx note = REG_NOTES (x); note; note = XEXP (note, 1))
{
pp_printf (pp, "%s %s ", print_rtx_head,
GET_REG_NOTE_NAME (REG_NOTE_KIND (note)));
if (GET_CODE (note) == INT_LIST)
pp_printf (pp, "%d", XINT (note, 0));
else
print_pattern (pp, XEXP (note, 0), 1);
pp_newline (pp);
}
}
On the second thought, let's take a look at the GIMPLE implementation gimple_dump_bb_for_graph
void gimple_dump_bb_for_graph (pretty_printer *pp, basic_block bb)
in gimple-pretty-print.c
/* Dumps basic block BB to pretty-printer PP with default dump flags and
no indentation, for use as a label of a DOT graph record-node.
??? Should just use gimple_dump_bb_buff here, except that value profiling
histogram dumping doesn't know about pretty-printers. */
void
gimple_dump_bb_for_graph (pretty_printer *pp, basic_block bb)
{
pp_printf (pp, "<bb %d>:\n", bb->index);
pp_write_text_as_dot_label_to_stream (pp, /*for_record=*/true);
for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
gsi_next (&gsi))
{
gphi *phi = gsi.phi ();
if (!virtual_operand_p (gimple_phi_result (phi))
|| (dump_flags & TDF_VOPS))
{
pp_bar (pp);
pp_write_text_to_stream (pp);
pp_string (pp, "# ");
pp_gimple_stmt_1 (pp, phi, 0, dump_flags);
pp_newline (pp);
pp_write_text_as_dot_label_to_stream (pp, /*for_record=*/true);
}
}
for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
gsi_next (&gsi))
{
gimple *stmt = gsi_stmt (gsi);
pp_bar (pp);
pp_write_text_to_stream (pp);
pp_gimple_stmt_1 (pp, stmt, 0, dump_flags);
pp_newline (pp);
pp_write_text_as_dot_label_to_stream (pp, /*for_record=*/true);
}
dump_implicit_edges (pp, bb, 0, dump_flags);
pp_write_text_as_dot_label_to_stream (pp, /*for_record=*/true);
}
for this function, we see it firstly prints the bb's index.
then it deals with phi nodes, finally gimple statement.
PHI function or PHI node is used in SSA (Static Single Assignment) when we found the flow of control makes it impossible to determine the most recent version of a variable.
for example:
if (…)
a_1 = 5;
else if (…)
a_2 = 2;
else
a_3 = 13;
# a_4 = PHI <a_1, a_2, a_3>
return a_4;
the SSA renamer creates a new version a_4 which is assigned the result of “merging” a_1, a_2 and a_3. Hence, PHI nodes mean “one of these operands. I don’t know which”.
for more details, please refer to:
note: on Ubunto22.04, gcc11
apt-get install gcc make -y
apt-get install gcc-11-plugin-dev -y