How to C in 2016

Outdated. Needs a rewrite.

tags: C tips

Compiler options

** Warnings **

  • -Wall, -Wextra, -pedantic / -Wpedantic
    • -Wpedantic: checks whether the code conforms to the ISO C standard
  • During testing, add -Werror and -Wshadow on all your platforms
  • -Wstrict-overflow, -fno-strict-aliasing
  • Arch
    • -march=native
    • Gives the compiler permission to use your CPU's full feature set
      • Again, performance testing and regression testing (then comparing the results across multiple compilers and/or compiler versions) are important to make sure any enabled optimizations don't have adverse side effects.

Writing

Types

    #include <stdint.h>

  • The common standard types are:
    1. int8_t, int16_t, int32_t, int64_t: signed integers
    2. uint8_t, uint16_t, uint32_t, uint64_t: unsigned integers
    3. float: standard 32-bit floating point
    4. double: standard 64-bit floating point
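To make the list concrete, here is a minimal sketch of what the exact-width names buy you. The helper names (BITS_OF, fixed_widths_hold, wrapping_increment, widening_multiply) are made up for this note and are not part of any library.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Number of bits in a type (BITS_OF is a name made up for this sketch). */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Returns 1 if the exact-width types have the widths their names promise. */
int fixed_widths_hold(void) {
    return BITS_OF(int8_t) == 8 && BITS_OF(uint16_t) == 16 &&
           BITS_OF(int32_t) == 32 && BITS_OF(uint64_t) == 64;
}

/* Unsigned fixed-width arithmetic wraps modulo 2^N: 255 + 1 becomes 0. */
uint8_t wrapping_increment(uint8_t value) {
    return (uint8_t)(value + 1u);
}

/* Widen before multiplying so the product of two 32-bit values fits. */
uint64_t widening_multiply(uint32_t a, uint32_t b) {
    uint64_t product = (uint64_t)a * b;
    assert(b == 0 || product / b == a); /* no wraparound happened */
    return product;
}
```

The point of the last function is that the cast has to happen before the multiplication; casting the result afterwards would widen a value that has already wrapped.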

** Special Standard Types **

  • In addition to standard fixed-width types like uint16_t and int32_t, we also have fast and least types defined in the stdint.h specification.
  • Fast types (the fastest type with at least the requested number of bits) are:
    int_fast8_t, int_fast16_t, int_fast32_t, int_fast64_t: signed integers
    uint_fast8_t, uint_fast16_t, uint_fast32_t, uint_fast64_t: unsigned integers
  • Least types (the smallest type with at least the requested number of bits) are:
    int_least8_t, int_least16_t, int_least32_t, int_least64_t: signed integers
    uint_least8_t, uint_least16_t, uint_least32_t, uint_least64_t: unsigned integers
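A small sketch of how the fast and least types might be used. The only thing you can rely on is the minimum width, so the check below tests "at least N bits", never an exact size. Both function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Returns 1 if the least/fast types offer at least the requested width. */
int least_and_fast_are_wide_enough(void) {
    return sizeof(int_least16_t) * CHAR_BIT >= 16 &&
           sizeof(int_fast16_t) * CHAR_BIT >= 16 &&
           sizeof(uint_least32_t) * CHAR_BIT >= 32 &&
           sizeof(uint_fast32_t) * CHAR_BIT >= 32;
}

/* A counter where only the minimum range matters, not the storage size. */
uint_fast16_t count_set_bits(uint_least32_t value) {
    uint_fast16_t count = 0;
    while (value != 0) {
        count += (uint_fast16_t)(value & 1u);
        value >>= 1;
    }
    assert(count <= sizeof(value) * CHAR_BIT); /* cannot exceed bit width */
    return count;
}
```

Because the actual sizes differ between platforms, these types are a poor fit for anything written to disk or sent over the network; the exact-width types are the safer choice there.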

** One Exception to never-char **
The only acceptable use of char in 2016 is if a pre-existing API requires char (e.g. strncat, printf'ing "%s", ...) or if you're initializing a read-only string (e.g. const char *hello = "hello";) because the C type of string literals ("hello") is char[].

ALSO: In C11 we have native Unicode support, and the type of UTF-8 string literals is still char[] even for multibyte sequences like const char *abcgrr = u8"abc😬";
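The practical consequence is that one visible character can occupy several char elements. A minimal sketch, assuming UTF-8 text; the multibyte sequence is spelled with explicit byte escapes instead of a u8 prefix so the snippet does not depend on the compiler's language mode, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "abc" followed by one code point encoded as four UTF-8 bytes. */
const char *sample_text(void) {
    return "abc\xF0\x9F\x98\xAC";
}

/* Count code points by skipping UTF-8 continuation bytes (10xxxxxx). */
size_t utf8_code_points(const char *s) {
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0u) != 0x80u) {
            count++;
        }
    }
    return count;
}
```

strlen(sample_text()) reports bytes, while utf8_code_points(sample_text()) reports code points, which is why string lengths and display widths should not be mixed up.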

Pointers as Integers

The correct type for pointer math is uintptr_t, defined by <stdint.h>, while the also useful ptrdiff_t is defined by <stddef.h>.

Instead of:

    long diff = (long)ptrOld - (long)ptrNew;

Use:

    ptrdiff_t diff = (uintptr_t)ptrOld - (uintptr_t)ptrNew;
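A slightly fuller sketch of the same idea, assuming a flat address space (true on mainstream platforms) and two pointers into the same array. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte distance between two pointers, computed on their integer values. */
ptrdiff_t byte_distance(const void *from, const void *to) {
    assert(from != NULL && to != NULL);
    return (ptrdiff_t)((uintptr_t)to - (uintptr_t)from);
}

/* Element distance: plain pointer subtraction already yields ptrdiff_t. */
ptrdiff_t element_distance(const uint32_t *from, const uint32_t *to) {
    assert(from != NULL && to != NULL);
    return to - from;
}
```

When both pointers have the same element type, the plain subtraction in element_distance is usually all you need; the uintptr_t route is for byte-level arithmetic on untyped pointers.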

** System-Dependent Types **
If we skip over the line of thinking where you are deliberately introducing difficult-to-reason-about code by using two different sizes depending on platform, you still don't want to use long for system-dependent types.
In these situations, you should use intptr_t: the integer type capable of holding a pointer value for your platform.

  • On modern 32-bit platforms, intptr_t is int32_t.
  • On modern 64-bit platforms, intptr_t is int64_t.
  • intptr_t also comes in a uintptr_t flavor.

For holding pointer offsets, we have the aptly named ptrdiff_t which is the proper type for storing values of subtracted pointers.
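As a quick illustration of "capable of holding a pointer value", here is a sketch of a round trip through uintptr_t. round_trip is a made-up name; the guarantee it leans on is that a valid void pointer converted to uintptr_t and back compares equal to the original.

```c
#include <assert.h>
#include <stdint.h>

/* Store a pointer in an integer and recover it unchanged. */
void *round_trip(void *ptr) {
    uintptr_t asInteger = (uintptr_t)ptr; /* wide enough on any platform */
    void *back = (void *)asInteger;
    assert(back == ptr);
    return back;
}
```

The same round trip through long happens to work on some platforms and silently truncates on others, which is the reason to avoid long here.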

** Maximum Value Holders **
The safest container for any integer is intmax_t (also uintmax_t). You can assign or cast any signed integer to intmax_t with no loss of precision, and you can assign or cast any unsigned integer to uintmax_t with no loss of precision.

** That Other Type **
The other system-dependent type you will meet everywhere is size_t, the unsigned type of a sizeof result. Other uses include: size_t is the type of the argument to malloc, and ssize_t is the return type of read() and write() (except on Windows where ssize_t doesn't exist and the return values are just int).

Printing Types

  • size_t - %zu
  • ssize_t - %zd
  • ptrdiff_t - %td
  • raw pointer value - %p (prints hex in modern compilers; cast your pointer to (void *) first)
  • int64_t - "%"PRId64
  • uint64_t - "%"PRIu64
  • intptr_t - "%"PRIdPTR
  • uintptr_t - "%"PRIuPTR
  • intmax_t - "%"PRIdMAX
  • uintmax_t - "%"PRIuMAX

One note about the PRI* formatting specifiers (from <inttypes.h>): they are macros, and the macros expand to the proper printf type specifiers on a platform-specific basis. This means you can't do:

    printf("Local number: %PRIdPTR\n\n", someIntPtr);

but instead, because they are macros, you do:

    printf("Local number: %" PRIdPTR "\n\n", someIntPtr);

Notice you put the % inside your format string literal, but the type specifier is outside your format string literal because all adjacent string literals get concatenated by the compiler into one final combined string literal.
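Putting several of these specifiers together, a minimal sketch that formats into a buffer with snprintf so the result can be inspected. format_examples and its sample values are invented for this note.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Format one value of each kind into buf; returns what snprintf returns. */
int format_examples(char *buf, size_t bufLen) {
    assert(buf != NULL);
    size_t count = 3;
    ptrdiff_t offset = -2;
    int64_t big = -INT64_C(9000000000);
    uintmax_t widest = UINTMAX_C(42);

    /* "%" stays inside the literal; each PRI* macro sits between literals. */
    return snprintf(buf, bufLen,
                    "%zu %td %" PRId64 " %" PRIuMAX,
                    count, offset, big, widest);
}
```

With a large enough buffer the formatted text is "3 -2 -9000000000 42", and the return value is its length in characters.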

** C99 allows variable declarations anywhere **
Declare each variable at the point where it is first needed. One caveat: if you have tight loops, test the placement of your initializers. Sometimes scattered declarations can cause unexpected slowdowns.
So, do NOT do this:

    void test(uint8_t input) {
        uint32_t b;
        if (input > 3) {
            return;
        }
        b = input;
    }

do THIS instead:

    void test(uint8_t input) {
        if (input > 3) {
            return;
        }
        uint32_t b = input;
    }

** Modern compilers support #pragma once **

So, do NOT do this:

    #ifndef PROJECT_HEADERNAME
    #define PROJECT_HEADERNAME
    ...
    #endif /* PROJECT_HEADERNAME */

Do THIS instead:

    #pragma once

#pragma once tells the compiler to only include your header once, so you do not need three lines of header guards anymore. This pragma is widely supported by compilers on all major platforms and is recommended over manually naming header guards.

For more details, see list of supported compilers at pragma once.

C allows static initialization of auto-allocated arrays

So, do NOT do this:

    uint32_t numbers[64];
    memset(numbers, 0, sizeof(numbers));

Do THIS instead:

    uint32_t numbers[64] = {0};
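A short sketch showing that {0} really does zero every element, plus the related C99 designated-initializer form where only the named slots get a value and the rest are zeroed. NUMBERS_LEN and both function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUMBERS_LEN 64

/* Returns 1 if every element of a {0}-initialized local array is zero. */
int zero_initialized_array_is_zero(void) {
    uint32_t numbers[NUMBERS_LEN] = {0};
    assert(sizeof(numbers) / sizeof(numbers[0]) == NUMBERS_LEN);
    for (size_t i = 0; i < NUMBERS_LEN; i++) {
        if (numbers[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Designated initializers (C99) set chosen slots; the rest are zeroed. */
uint32_t sum_with_designated_slots(void) {
    uint32_t numbers[NUMBERS_LEN] = {[3] = 7, [10] = 5};
    uint32_t total = 0;
    for (size_t i = 0; i < NUMBERS_LEN; i++) {
        total += numbers[i];
    }
    return total;
}
```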

C allows static initialization of auto-allocated structs

So, do NOT do this:

    struct thing {
        uint64_t index;
        uint32_t counter;
    };

    struct thing localThing;

    void initThing(void) {
        memset(&localThing, 0, sizeof(localThing));
    }

Do THIS instead:

    struct thing {
        uint64_t index;
        uint32_t counter;
    };

    struct thing localThing = {0};

IMPORTANT NOTE: If your struct has padding, the {0} method is not guaranteed to zero out extra padding bytes. For example, struct thing has 4 bytes of padding after counter (on a 64-bit platform) because the struct's size is rounded up to a multiple of its strictest member alignment (8 bytes for uint64_t). If you need to zero out an entire struct including unused padding, use memset(&localThing, 0, sizeof(localThing)) because sizeof(localThing) == 16 bytes even though the addressable contents is only 8 + 4 = 12 bytes.
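The padding arithmetic can be checked from code. This is a sketch rather than a portable guarantee: the amount of padding depends on the platform ABI, so the helper computes it instead of hard-coding it. thing_padding_bytes and thing_fully_zeroed_after_memset are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct thing {
    uint64_t index;
    uint32_t counter;
};

/* Bytes in the struct that belong to no member (padding). */
size_t thing_padding_bytes(void) {
    assert(sizeof(struct thing) >= sizeof(uint64_t) + sizeof(uint32_t));
    return sizeof(struct thing) - (sizeof(uint64_t) + sizeof(uint32_t));
}

/* Zero every byte, padding included, then verify byte by byte. */
int thing_fully_zeroed_after_memset(void) {
    struct thing localThing;
    memset(&localThing, 0, sizeof(localThing));

    const unsigned char *raw = (const unsigned char *)&localThing;
    for (size_t i = 0; i < sizeof(localThing); i++) {
        if (raw[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

This matters mostly when the raw bytes of the struct are compared with memcmp, hashed, or written out, since stray padding bytes then become visible.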

If you need to re-initialize already allocated structs, declare a global zero-struct for later assignment:

    struct thing {
        uint64_t index;
        uint32_t counter;
    };

    static const struct thing localThingNull = {0};
    struct thing localThing = {.counter = 3};
    localThing = localThingNull;

If you are lucky enough to be in a C99 (or newer) environment, you can use compound literals instead of keeping a global "zero struct" around (also see, from 2001, The New C: Compound Literals).
Compound literals allow your compiler to automatically create temporary anonymous structs, then copy them onto a target value:

    localThing = (struct thing){0};
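Wrapped into functions, the compound-literal reset might look like the sketch below. thing_reset and thing_restart_at are made-up helpers; the second one shows that a compound literal accepts designated initializers too, with unnamed members zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct thing {
    uint64_t index;
    uint32_t counter;
};

/* Reset every member to zero by assigning from a compound literal. */
void thing_reset(struct thing *t) {
    assert(t != NULL);
    *t = (struct thing){0};
}

/* Set one member and zero the rest in a single assignment. */
void thing_restart_at(struct thing *t, uint64_t index) {
    assert(t != NULL);
    *t = (struct thing){.index = index};
}
```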

C99 added variable length arrays (C11 made them optional)

So, do NOT do this (if you know your array is tiny or you are just testing something quickly):

    uintmax_t arrayLength = strtoumax(argv[1], NULL, 10);
    void **array = malloc(sizeof(*array) * arrayLength);
    /* remember to free(array) when you're done using it */

Do THIS instead:

    uintmax_t arrayLength = strtoumax(argv[1], NULL, 10);
    void *array[arrayLength];
    /* no need to free array */
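The "tiny" qualifier matters because a variable length array typically lives on the stack, so an unchecked length can exhaust it. A minimal sketch that caps the length before declaring the array; MAX_VLA_ELEMENTS and sum_of_squares are invented for this note, and the cap value is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap chosen for this sketch; keeps the VLA small. */
#define MAX_VLA_ELEMENTS 256

/* Sum the first n squares using a VLA as scratch space.
 * Returns 0 when n is 0 or above the cap instead of risking the stack. */
uint64_t sum_of_squares(size_t n) {
    if (n == 0 || n > MAX_VLA_ELEMENTS) {
        return 0;
    }
    uint64_t squares[n]; /* sized at run time, released on return */
    assert(sizeof(squares) == n * sizeof(squares[0])); /* run-time sizeof */
    for (size_t i = 0; i < n; i++) {
        squares[i] = (uint64_t)(i + 1) * (i + 1);
    }
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += squares[i];
    }
    return total;
}
```

Note that a VLA cannot take an initializer such as {0}, so the elements must be assigned before they are read.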

C99 allows annotating non-overlapping pointer parameters

See the restrict keyword (often __restrict)
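A minimal sketch of what the annotation looks like in practice. add_arrays is a hypothetical function; the restrict qualifiers are a promise from the caller that the two buffers do not overlap, which lets the compiler reorder or vectorize the loop. Passing overlapping buffers breaks that promise and is undefined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* dst and src are promised never to overlap. */
void add_arrays(uint32_t *restrict dst, const uint32_t *restrict src,
                size_t len) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < len; i++) {
        dst[i] += src[i];
    }
}
```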

Parameter Types

If a function accepts arbitrary input data and a length to process, don't restrict the type of the parameter.
So, do NOT do this:

    void processAddBytesOverflow(uint8_t *bytes, uint32_t len) {
        for (uint32_t i = 0; i < len; i++) {
            bytes[0] += bytes[i];
        }
    }

Do THIS instead:

    void processAddBytesOverflow(void *input, uint32_t len) {
        uint8_t *bytes = input;
        for (uint32_t i = 0; i < len; i++) {
            bytes[0] += bytes[i];
        }
    }
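The benefit shows up at the call site: with a void * parameter, callers can pass any buffer without a cast. A sketch with a hypothetical checksum function that follows the same pattern (convert to a byte pointer inside the function):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Accepts any object's bytes; callers need no cast at the call site. */
uint32_t checksum(const void *input, size_t len) {
    assert(input != NULL || len == 0);
    const uint8_t *bytes = input;
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += bytes[i];
    }
    return sum;
}
```

A char array, a uint8_t array, and a uint16_t array can all be passed as checksum(buffer, sizeof buffer) with no conversion noise.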

Return Parameter Types

C99 gives us the power of <stdbool.h>, which defines true as 1 and false as 0.

So, do NOT do this:

    void *growthOptional(void *grow, size_t currentLen, size_t newLen) {
        if (newLen > currentLen) {
            void *newGrow = realloc(grow, newLen);
            if (newGrow) {
                /* resize success */
                grow = newGrow;
            } else {
                /* resize failed, free existing and signal failure through NULL */
                free(grow);
                grow = NULL;
            }
        }
        return grow;
    }

Do THIS instead:

    /* Return value:
     *  - 'true' if newLen > currentLen and attempted to grow
     *    - 'true' does not signify success here, the success is still in '*_grow'
     *  - 'false' if newLen <= currentLen */
    bool growthOptional(void **_grow, size_t currentLen, size_t newLen) {
        void *grow = *_grow;
        if (newLen > currentLen) {
            void *newGrow = realloc(grow, newLen);
            if (newGrow) {
                /* resize success */
                *_grow = newGrow;
                return true;
            }

            /* resize failure */
            free(grow);
            *_grow = NULL;

            /* for this function,
             * 'true' doesn't mean success, it means 'attempted grow' */
            return true;
        }

        return false;
    }

Or, even better, Do THIS instead:

    typedef enum growthResult {
        GROWTH_RESULT_SUCCESS = 1,
        GROWTH_RESULT_FAILURE_GROW_NOT_NECESSARY,
        GROWTH_RESULT_FAILURE_ALLOCATION_FAILED
    } growthResult;

    growthResult growthOptional(void **_grow, size_t currentLen, size_t newLen) {
        void *grow = *_grow;
        if (newLen > currentLen) {
            void *newGrow = realloc(grow, newLen);
            if (newGrow) {
                /* resize success */
                *_grow = newGrow;
                return GROWTH_RESULT_SUCCESS;
            }

            /* resize failure, don't remove data because we can signal error */
            return GROWTH_RESULT_FAILURE_ALLOCATION_FAILED;
        }

        return GROWTH_RESULT_FAILURE_GROW_NOT_NECESSARY;
    }
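To show why the enum version is nicer to call, here is a sketch of a caller. The enum and growthOptional are repeated from the notes only so the block compiles on its own; appendZeros is a hypothetical caller added for illustration, and it assumes len + extra does not overflow.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum growthResult {
    GROWTH_RESULT_SUCCESS = 1,
    GROWTH_RESULT_FAILURE_GROW_NOT_NECESSARY,
    GROWTH_RESULT_FAILURE_ALLOCATION_FAILED
} growthResult;

growthResult growthOptional(void **_grow, size_t currentLen, size_t newLen) {
    void *grow = *_grow;
    if (newLen > currentLen) {
        void *newGrow = realloc(grow, newLen);
        if (newGrow) {
            *_grow = newGrow;
            return GROWTH_RESULT_SUCCESS;
        }
        return GROWTH_RESULT_FAILURE_ALLOCATION_FAILED;
    }
    return GROWTH_RESULT_FAILURE_GROW_NOT_NECESSARY;
}

/* Hypothetical caller: append `extra` zero bytes to a buffer.
 * Returns the new length, or the old length if nothing changed. */
size_t appendZeros(void **buf, size_t len, size_t extra) {
    assert(buf != NULL);
    switch (growthOptional(buf, len, len + extra)) {
    case GROWTH_RESULT_SUCCESS:
        memset((char *)*buf + len, 0, extra);
        return len + extra;
    case GROWTH_RESULT_FAILURE_GROW_NOT_NECESSARY:
    case GROWTH_RESULT_FAILURE_ALLOCATION_FAILED:
        break; /* *buf is untouched and still owned by the caller */
    }
    return len;
}
```

Each outcome has a name, so the switch documents itself, and a failed allocation no longer destroys the caller's existing data.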

Formatting

The only usable C formatter as of 2016 is clang-format. clang-format has the best defaults of any automatic C formatter and is still actively developed.
To list clang-format's options:

    clang-format -help

An easy way to create the .clang-format file is:

    clang-format -style=llvm -dump-config > .clang-format

There is an integration for vim which lets you run the clang-format standalone tool on your current buffer, optionally selecting regions to reformat. The integration has the form of a python-file which can be found under clang/tools/clang-format/clang-format.py.
This can be integrated by adding the following to your .vimrc:

    map <C-K> :pyf <path-to-this-file>/clang-format.py<cr>
    imap <C-K> <c-o>:pyf <path-to-this-file>/clang-format.py<cr>

Script for patch reformatting

    usage: clang-format-diff.py [-h] [-i] [-p NUM] [-regex PATTERN] [-style STYLE]

So to reformat all the lines in the latest git commit, just do:

    git diff -U0 HEAD^ | clang-format-diff.py -i -p1

Misc thoughts

Never use malloc

You should always use calloc. In most cases there is no measurable performance penalty for getting zeroed memory. If you don't like the function prototype of calloc(object count, size per object) you can wrap it with

    #define mycalloc(N) calloc(1, N)
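A small sketch of the two-argument form in use. counters_create and counters_all_zero are made-up names. One practical advantage of passing the count and the element size separately is that calloc performs the multiplication itself and returns NULL when the request cannot be satisfied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate `count` zeroed counters; the caller frees the result. */
uint32_t *counters_create(size_t count) {
    return calloc(count, sizeof(uint32_t));
}

/* Returns 1 if all `count` counters are zero. */
int counters_all_zero(const uint32_t *counters, size_t count) {
    assert(counters != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (counters[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```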

Never memset (if you can avoid it)

Never memset(ptr, 0, len) when you can statically initialize a structure (or array) to zero (or reset it back to zero by assigning from an in-line compound literal or by assigning from a global zeroed-out structure).

Though, memset() is your only choice if you need to zero out a struct including its padding bytes (because {0} only sets defined fields, not undefined offsets filled by padding).