# MISRA-C Rule 20 Preprocessing directives
###### tags: `GNU GCC`

## :memo: MISRA-C Rule 20

Rule 20 covers the preprocessing directives. Before implementing these rules, we first need to understand some of GCC's internal structures and what data their members hold.

How an include header is mainly stored:

![permalink setting demo](https://i.imgur.com/ZJtx5kL.png)

### Rule 20.1 #include directives should only be preceded by preprocessor directives or comments

### Rule 20.2 The ', " or \ characters and the /* or // character sequences shall not occur in a header file name

The rule forbids specific characters and character sequences in a header file name, so we create a pointer to char that points to `fname` and walk the string looking for any of them.

- Location to modify (`gcc-source` stands for your development version, e.g. `gcc-7.5.0`)
```c
gcc-source/libcpp/directives.c
static const char *
parse_include (cpp_reader *pfile, int *pangle_brackets,
               const cpp_token ***buf, source_location *location)
```
- Rule description
```c=
/* Following statement is Non-compliant */
#include "fi'le.h"
```
- Solution
```c=
const cpp_token *header;
char *misra_c_tmp_fname; // for Misra-C Rule 20.2

// Useful members: header->src_loc (location), header->val.str.text (spelling)
misra_c_tmp_fname = fname;
while (*misra_c_tmp_fname) {
    // Flag ', " and \ plus the two-character sequences slash-star and
    // slash-slash. A lone '/' is only a path separator, so it is allowed.
    if (*misra_c_tmp_fname == '\'' || *misra_c_tmp_fname == '"' ||
        *misra_c_tmp_fname == '\\' ||
        (*misra_c_tmp_fname == '/' &&
         (misra_c_tmp_fname[1] == '*' || misra_c_tmp_fname[1] == '/'))) {
        if (CPP_OPTION(pfile, Wmisra_cpp_trigger)) {
            cpp_pedwarning(pfile, CPP_W_NONE, "Misra-C Rule 20.2\n");
            cpp_warning_with_line(pfile, CPP_W_NONE, *location, 0, "Misra-C Rule 20.2\n");
        }
    }
    misra_c_tmp_fname++;
}
```

### Rule 20.3 The #include directive shall be followed by either a \<filename> or "filename" sequence

We add a variable that holds `header->val.str.text`, which stores the complete spelling of the included file name. We then only need to check whether its first character is one of the permitted delimiters, `"` or `<`.

- Location to modify (`gcc-source` stands for your development version, e.g. `gcc-7.5.0`)
```c
gcc-source/libcpp/directives.c
static const char *
parse_include (cpp_reader *pfile, int *pangle_brackets,
               const cpp_token ***buf, source_location *location)
```
- Rule description
```c=
#include FILENAME /* Non-compliant */
#include another.h /* Non-compliant */
#include/* a comment */"R_20_03.h" /* Compliant - comment permitted */
```
- Solution
```c=
misra_c_tmp_header_text = (char *)header->val.str.text;
// The spelling must start with a double quote or '<'; anything else violates the rule.
if (*misra_c_tmp_header_text != '"' && *misra_c_tmp_header_text != '<' &&
    CPP_OPTION(pfile, Wmisra_cpp_trigger)) {
    cpp_warning_with_line(pfile, CPP_W_NONE, *location, 0, "Misra-C Rule 20.3\n");
}
```

### Rule 20.4 A macro shall not be defined with the same name as a keyword

To cover every possible reserved word (keyword), we put all the C keywords into an array. GCC's internal keyword enumeration could be used instead, but that integer array is interleaved with many entries that are not defined by the C language, and readability was also a concern. We therefore rebuilt the keyword list as an array of strings and compare the macro name against each keyword in turn. An exact match is reported as a violation.

- Location to modify (`gcc-source` stands for your development version, e.g. `gcc-7.5.0`)
```c
gcc-source/libcpp/directives.c
static void do_define (cpp_reader *pfile)
```
- Rule description
```c=
#define while( E ) for ( ; ( E ) ; ) /* Non-compliant - redefined while */
#define inline /* Non-compliant C99, Compliant C90 */
```
- Solution
```c=
/* Return 0 when the two strings are identical, nonzero otherwise. */
int misra_compare_string(const char *str1, const char *str2) {
    while (*str1 && *str2) {
        if (*str1 != *str2) {
            return 1;
        }
        str1++;
        str2++;
    }
    /* Both strings must end together; otherwise one is only a prefix
       of the other (e.g. "form" vs. "for"). */
    return *str1 != *str2;
}

cpp_hashnode *node = lex_macro_node (pfile, true);
/* lex_macro_node returns NULL on a malformed #define, so guard the access. */
const char *misra_tmp_macro_name = node ? (const char *)node->ident.str : "";
int c_keyword_size = 0;
int c_keyword_number = 0;
int misra_c_define = 0; // misra-c 20.4
//fprintf(stderr, (char *)node->ident.str);
const char *misra_tmp_c_keyword[] = {
    "static", "unsigned", "long", "const", "extern", "register", "typedef",
    "short", "inline", "volatile", "signed", "auto", "restrict", "noreturn",
    "atomic", "int", "char", "float", "double", "void", "enum", "struct",
    "union", "if", "else", "while", "do", "for", "switch", "case", "default",
    "break", "continue", "return", "goto", "sizeof"
};
c_keyword_size = sizeof(misra_tmp_c_keyword);
c_keyword_number = c_keyword_size / sizeof(char *);
if (CPP_OPTION(pfile, Wmisra_cpp_trigger)) {
    /* Names that start with '_' are skipped. */
    if (*misra_tmp_macro_name != '_') {
        for (int i = 0; i < c_keyword_number; i++) {
            if (!misra_compare_string(misra_tmp_c_keyword[i], misra_tmp_macro_name)) {
                cpp_pedwarning (pfile, CPP_W_NONE, "Misra-C Rule 20.4\n");
            }
        }
    }
}
```

### Rule 20.5 #undef should not be used

We only need to emit a diagnostic at the point where GCC handles `#undef`.

- Location to modify
(`gcc-source` stands for your development version, e.g. `gcc-7.5.0`)
```c
gcc-source/libcpp/directives.c
static void do_undef (cpp_reader *pfile)
```
- Rule description
```c=
#define QUALIFIER volatile
#undef QUALIFIER /* Non-compliant */
#ifdef QUALIFIER
```
- Solution
```c=
if (CPP_OPTION(pfile, Wmisra_cpp_trigger))
    cpp_pedwarning (pfile, CPP_W_NONE,
        "\n================================Misra-c 2012 rule violation:20.5================================\n"
        "Rule 20.5:#undef should not be used\n"
        "Category: Advisory\n"
        "Analysis: Decidable, Single translation unit\n"
        "Applies to:C90, C99\n");
```

### Rule 20.6 Tokens that look like a preprocessing directive shall not occur within a macro argument

First we check `fun_like` to confirm that the macro is function-like. We then create a variable, `misra_macro_temp`, that holds the value of `pfile->buffer->cur`, because we need to walk the text without disturbing the original pointer. If the text it points to looks like a preprocessing directive, we report the violation.

- Location to modify (`gcc-source` stands for your development version, e.g. `gcc-7.5.0`)
```c
gcc-source/libcpp/macro.c
static _cpp_buff *
collect_args (cpp_reader *pfile, const cpp_hashnode *node,
              _cpp_buff **pragma_buff, unsigned *num_args)
```
- Rule description
```c=
#include <stdio.h>
#define M( A ) ( void ) printf( #A ) /* breaks D.4.9, R.21.6 and R.20.10 */
void R_20_6 ( void )
{
    M (
#ifdef SW /* Non-compliant */
        "Message 1"
#else /* Non-compliant */
        "Message 2"
#endif /* Non-compliant */
    );
}
```
- Solution
```c=
if (node->value.macro->fun_like && CPP_OPTION(pfile, Wmisra_cpp_trigger)) { // MISRA-C 20.6
    misra_macro_temp = (unsigned char *)pfile->buffer->cur;
    /* Skip leading whitespace and stop right after a '#'. */
    while (*misra_macro_temp) {
        if (*misra_macro_temp == '\n' || *misra_macro_temp == '\t' || *misra_macro_temp == ' ') {
            misra_macro_temp++;
        } else if (*misra_macro_temp == '#') {
            misra_macro_temp++;
            misra_preprocesser = 1;
            break;
        } else {
            break;
        }
    }
    /* "if", "el" and "en" cover #if/#ifdef/#ifndef, #else/#elif and #endif. */
    if (misra_preprocesser && *misra_macro_temp) {
        if ((*misra_macro_temp == 'i' && *(misra_macro_temp + 1) == 'f') ||
            (*misra_macro_temp == 'e' && *(misra_macro_temp + 1) == 'l') ||
            (*misra_macro_temp == 'e' && *(misra_macro_temp + 1) == 'n')) {
            //inform(token->src_loc, "Misra-C 20.6\n");
            cpp_error (pfile, CPP_DL_NOTE, "Misra-C Rule 20.6\n");
            SYNTAX_WARNING_AT(node->value.macro->line, "The macro is defined here\n");
        }
    }
}
```

### Rule 20.7 Expressions resulting from the expansion of macro parameters shall be enclosed in parentheses

### Rule 20.8 The controlling expression of a #if or #elif preprocessing directive shall evaluate to 0 or 1

The rule requires the controlling expression of `#if` and `#elif` to be Boolean in nature, with its value restricted to 0 or 1. We therefore look at `top`, which is used to walk the operator stack and ends up holding the final value of the expression.

The function parses and evaluates a C expression read from `pfile` and returns the truth value of the expression. The implementation is an operator precedence parser, i.e. a bottom-up parser, that uses a stack for the not-yet-reduced tokens. The stack base is `op_stack` and the current stack pointer is `top`. There is a stack element for each operator (only), and the most recently pushed operator is `top->op`. An operand (value) is stored in the `value` member of the stack element of the operator that precedes it.

The types involved are:
```c=
struct op
{
  const cpp_token *token;   /* The token forming op (for diagnostics). */
  cpp_num value;            /* The value logically "right" of op. */
  source_location loc;      /* The location of this value. */
  enum cpp_ttype op;
};

typedef uint64_t cpp_num_part;
typedef struct cpp_num cpp_num;
struct cpp_num
{
  cpp_num_part high;
  cpp_num_part low;
  bool unsignedp;  /* True if value should be treated as unsigned. */
  bool overflow;   /* True if the most recent calculation overflowed. */
};
```
We inspect the `low` member, which holds the final result of the evaluation.

- Location to modify
```c
gcc-source/libcpp/expr.c
bool _cpp_parse_expr (cpp_reader *pfile, bool is_if)
```
- Rule description
```c=
#if 10 /* Non-compliant */
#endif
#if ! defined ( X ) /* Compliant */
#endif
#if A > B /* Compliant */
#endif
```
- Solution
```c=
#define SYNTAX_WARNING_AT(loc, msgid) \
    do { cpp_error_with_line (pfile, CPP_DL_NOTE, (loc), 0, msgid); } \
    while (0)

/* Any final value other than 0 or 1 violates the rule. */
if (top->value.low != 1 && top->value.low != 0 &&
    CPP_OPTION(pfile, Wmisra_cpp_trigger)) {
    SYNTAX_WARNING_AT(pfile->directive_line, "Misra-C Rule 20.8\n");
}
```

### Rule 20.9 All identifiers used in the controlling expression of #if or #elif preprocessing directives shall be #define'd before evaluation

- Location to modify
```c
gcc-source/libcpp/expr.c
static cpp_num
eval_token (cpp_reader *pfile, const cpp_token *token,
            source_location virtual_location)
```
- Rule description
```c=
#if M == 0 /* Non-compliant */
/* Does 'M' expand to zero or is it undefined?
 */
#endif
#if defined ( M ) /* Compliant - M is not evaluated */
#if M == 0 /* Compliant - M is known to be defined */
/* 'M' must expand to zero. */
#endif
#endif
```
- Solution
```c=
/* libcpp/internal.h, as printed by "ptype pfile->state" in gdb: */
struct lexer_state {
    unsigned char in_directive;
    unsigned char directive_wants_padding;
    unsigned char skipping;
    unsigned char angled_headers;
    unsigned char in_expression;
    unsigned char save_comments;
    unsigned char va_args_ok;
    unsigned char poisoned_ok;
    unsigned char prevent_expansion;
    unsigned char parsing_args;
    unsigned char in__has_include__;
    unsigned char discarding_output;
    /* Nonzero to skip evaluating part of an expression. */
    unsigned int skip_eval;
    unsigned char in_deferred_pragma;
    unsigned char pragma_allow_expansion;
};

/* Only report identifiers in the part of the expression that is actually evaluated. */
if (CPP_OPTION(pfile, Wmisra_cpp_trigger) && !pfile->state.skip_eval) {
    SYNTAX_WARNING_AT(virtual_location, "Misra-C Rule 20.9\n");
}
```

### Rule 20.10 The # and ## preprocessor operators should not be used

The rule says that `#` and `##` should not appear in a macro definition. We therefore only need to find where GCC parses the preprocessor operators, check whether the token type is `CPP_HASH`, and issue a warning at that token's location.

- Location to modify
```c
gcc-source/libcpp/macro.c
static bool create_iso_definition (cpp_reader *pfile, cpp_macro *macro)

gcc-source/libcpp/directives.c
static void do_define (cpp_reader *pfile)
```
- Rule description
```c=
#define A( x ) #x /* Non-compliant */
#define B( x, y ) x##y = 0 /* Non-compliant */
```
- Solution
```c=
if (macro->count > 1 && token[-1].type == CPP_HASH && macro->fun_like) {
    if (token->type == CPP_MACRO_ARG && CPP_OPTION(pfile, Wmisra_cpp_trigger)) {
        //cpp_warning (pfile, CPP_W_NONE, "Misra-C Rule 20.10\n");
        SYNTAX_WARNING_AT(token[-1].src_loc, "Misra-C Rule 20.10\n");
    }
.
.
.
misra_macro_20_10 = (unsigned char *)pfile->buffer->cur;
if (CPP_OPTION(pfile, Wmisra_cpp_trigger)) {
    if (*misra_tmp_macro_name != '_') {
        /* Scan the rest of the directive line for two consecutive '#' characters. */
        while (*misra_macro_20_10) {
            if (*misra_macro_20_10 == '#' && *(misra_macro_20_10 + 1) == '#') {
                cpp_pedwarning (pfile, CPP_W_NONE, "Misra-C Rule 20.10\n");
            } else if (*misra_macro_20_10 == '\n') {
                break;
            }
            misra_macro_20_10++;
        }
    }
}
```

### Rule 20.11 A macro parameter immediately following a # operator shall not immediately be followed by a ## operator

Continuing from the previous rule, we keep walking the macro parameters. Since the previous rule has already located a `#` operator, we only need to find a `##` (two consecutive `#` characters) after it to report the violation.

- Location to modify
```c
gcc-source/libcpp/macro.c
static bool create_iso_definition (cpp_reader *pfile, cpp_macro *macro)
```
- Rule description
```c=
#define A( x ) #x /* Compliant */
#define B( x, y ) x ## y /* Compliant */
#define C( x, y ) #x ## y /* Non-compliant */
```
- Solution
```c=
if (token->type == CPP_MACRO_ARG && CPP_OPTION(pfile, Wmisra_cpp_trigger)) {
    //cpp_warning (pfile, CPP_DL_NOTE, "Misra-C Rule 20.10\n");
    SYNTAX_WARNING_AT(token[-1].src_loc, "Misra-C Rule 20.10\n");
    misra_macro_20_11 = (unsigned char *)pfile->buffer->cur;
    /* A '#' operator was just seen; look for a following "##" on the same line. */
    while (*misra_macro_20_11) {
        if (*misra_macro_20_11 == '#' && *(misra_macro_20_11 + 1) == '#') {
            cpp_warning (pfile, CPP_W_NONE, "Misra-C Rule 20.11\n");
        } else if (*misra_macro_20_11 == '\n') {
            break;
        }
        misra_macro_20_11++;
    }
}
```

### Rule 20.13 A line whose first token is # shall be a valid preprocessing directive

We start from the `do_else` function, which handles the C preprocessor's `#else`. Dumping `buffer->cur` and `buffer->line_base` shows the following information.

![](https://i.imgur.com/FgedicD.png)

Working backwards, we can tell that `cur` points to the memory just past `#else`. To verify that the token after `#` is valid, we create a new pointer to `line_base` and use it to inspect the directive.

Alternatively, we can go up to the caller, `_cpp_handle_directive`, and use the information available there to check the rule. We create a pointer to char named `misra_20_13`, initialized to `(void *)0`, and then watch the `const directive *dir` variable. If it is null, CPP has parsed an unknown directive, and we point `misra_20_13` at the name of the token currently being parsed.
```c=
(gdb) p dir
$5 = (const directive *) 0x0
Then misra_20_13 points to the token name
(gdb) p dname->val.node->node->ident.str
$12 =
(const unsigned char *) 0x7ffff7418090 "else1"
```

Using `ptype` together with `print`, we can drill down layer by layer through a complex struct to see the types of its members as well as its own type.
```c=
(gdb) p dname->val.node->node->ident
$11 = {str = 0x7ffff7418090 "else1", len = 5, hash_value = 4051658144}
(gdb) ptype dname->val.node->node->ident
type = struct ht_identifier {
    const unsigned char *str;
    unsigned int len;
    unsigned int hash_value;
}
(gdb) p dname->val.node->node->ident.str
$12 = (const unsigned char *) 0x7ffff7418090 "else1"
```

![](https://i.imgur.com/obUlxGc.png)

Printing `dname` shows that it is a pointer to `const cpp_token`, where `cpp_token` is a struct. Dereferencing it layer by layer leads to the final struct, `ht_identifier`, whose member `str` points to the name we are looking for (`else1`).

At this point we have intercepted the code that violates MISRA-C Rule 20.13.

![](https://i.imgur.com/cVyVgc3.png)

Going back to the `dname` dump, there is a member of type `source_location` whose value is 11522692. This is the virtual location that CPP assigned to the name during parsing. Using the `inform` function, we can see that it does point to the start of `else1`.
```c=
(gdb) p *dname
$16 = {src_loc = 11522692, type = CPP_NAME, flags = 0, val = {node = {node = 0x7ffff7413e78,
      spelling = 0x7ffff7413e78}, source = 0x7ffff7413e78, str = {len = 4148248184,
      text = 0x7ffff7413e78 "\220\200A\367\377\177"}, macro_arg = {arg_no = 4148248184,
      spelling = 0x7ffff7413e78}, token_no = 4148248184, pragma = 4148248184}}
(gdb) p inform(11522692, "Misra-C Rule 20.13")
```
- Location to modify
```c=
gcc-source/libcpp/directives.c
int _cpp_handle_directive (cpp_reader *pfile, int indented)
```
- Rule description
```c=
#ifndef AAA
x = 1;
#else1 /* Non-compliant */
x = AAA;
#endif
```

### Rule 20.14 All #else, #elif and #endif preprocessor directives shall reside in the same file as the #if, #ifdef or #ifndef directive to which they are related

We look at the `if_stack` structure, whose type is as follows:
```c=
(gdb) ptype ifs
type = struct if_stack {
    if_stack *next;
    source_location line;
    const cpp_hashnode *mi_cmacro;
    bool skip_elses;
    bool was_skipping;
    int type;
} *
```
The member `mi_cmacro` is the macro name of the `#ifndef`, and `line` is again of type `source_location`, so it likewise records the corresponding position in the source code. In a normal parse, `if_stack` does not point to 0x0 (NULL) but looks like this:
```c=
(gdb) p *ifs->mi_cmacro
$6 = {ident = {str = 0x7ffff73b31b0 "_STDC_PREDEF_H", len = 14, hash_value =
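4173406337}, is_directive = 0, directive_index = 0, rid_code = 0 '\000', type = NT_MACRO,
  flags = 0, value = {macro = 0x7ffff73b4ba0, answers = 0x7ffff73b4ba0, builtin = 4147858336,
    arg_index = 19360}}
(gdb) p inform(69760, "Misra test")
/usr/include/stdc-predef.h:18:0: note: Misra test
 #ifndef _STDC_PREDEF_H
$7 = void
```
CPP touches a great deal of GCC-internal state while parsing, so once we know the execution flow of the parsing functions well, we can place a conditional breakpoint to intercept exactly the part we are looking for. A null `if_stack` means that the `#if`, `#ifdef` or `#ifndef` matching the current `#else`, `#elif` or `#endif` is in another file or does not exist at all, which violates the rule.
```c=
(gdb) b 2281 if ifs == 0
```
Placing the conditional breakpoint and watching whether the warning fires is enough to verify that our change for this rule is correct. However, because `ifs` is 0 in that case, we cannot use it to obtain the corresponding source location. We need to get the exact location from somewhere else: the `source_location directive_line` member of `struct cpp_reader` holds the location of the line starting with `#` that is currently being parsed.
```c=
(gdb) ptype pfile
type = struct cpp_reader {
    cpp_buffer *buffer;
    cpp_buffer *overlaid_buffer;
    lexer_state state;
    line_maps *line_table;
    source_location directive_line;
    ... (rest omitted)
```
- Location to modify
```c=
gcc-source/libcpp/directives.c
static void do_endif (cpp_reader *pfile)
static void do_elif (cpp_reader *pfile)
static void do_else (cpp_reader *pfile)
```
- Rule description
```c=
#if 1 /* Non-compliant */
#include "R_20_14_2.h"
```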
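
The Rule 20.14 reasoning above can be tried outside GCC. The block below is a minimal standalone sketch, not libcpp code: `struct file_buffer`, `enum directive_kind` and `misra_20_14_check` are hypothetical names introduced only for illustration. A per-file counter, `open_ifs`, stands in for the `if_stack`, so a count of zero plays the role of `ifs == 0`. The sketch assumes that every included file starts with its own empty stack, and it reports the violation at the line of the directive itself, mirroring the use of `directive_line`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for one source file being preprocessed. */
struct file_buffer {
    const char *name;   /* file name used in the diagnostic */
    unsigned open_ifs;  /* conditionals opened in this file and not yet closed */
};

enum directive_kind { DIR_IF, DIR_ELIF, DIR_ELSE, DIR_ENDIF };

/* Process one conditional directive seen in `buf` at `directive_line`.
   DIR_IF stands for #if, #ifdef and #ifndef.
   Returns true when the directive violates Rule 20.14. */
bool misra_20_14_check(struct file_buffer *buf, enum directive_kind kind,
                       unsigned directive_line)
{
    assert(buf != NULL);

    if (kind == DIR_IF) {
        buf->open_ifs++;
        return false;
    }

    /* #elif, #else or #endif with nothing open in this file:
       the related #if lives in another file or does not exist. */
    if (buf->open_ifs == 0) {
        fprintf(stderr, "%s:%u: note: Misra-C Rule 20.14\n",
                buf->name, directive_line);
        return true;
    }

    if (kind == DIR_ENDIF)
        buf->open_ifs--;
    return false;
}
```

With two buffers, one for the main file and one for `R_20_14_2.h`, calling the function for `#if 1` in the main file and then for an `#endif` in the header (assuming the header holds the closing directive) makes the second call return true, because the header's own counter is still zero. The sketch only covers the direction discussed above; detecting an `#if` that is still open at the end of its file would need an extra check when the buffer is finished.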