2016q3 Homework3

contributed by <SarahCheng>

tags: sarah,2016q3_hw3

Reading notes

Modern Microprocessors

  • VLIW designs are not interlocked.
    • They do not check for dependencies between instructions.
    • They have no way of stalling instructions other than stalling the whole processor on a cache miss.
    • The programmable shaders in graphics processors (GPUs) are sometimes VLIW designs, as are many digital signal processors.

Not fully understood: "This complicates the compiler somewhat, because it is doing something that a superscalar processor normally does at runtime, however the extra code in the compiler is minimal and it saves precious resources on the processor chip."

  • Branches & Branch Prediction
    • Making the guess (two alternatives):
      • static: decided by the compiler
      • dynamic: an on-chip branch prediction table containing the addresses of recent branches and a bit indicating whether each branch was taken or not last time.

      In reality, most processors actually use two bits, so that a single not-taken occurrence doesn't reverse a generally taken prediction (important for loop back edges).
      Note: the most advanced modern processors often implement several branch predictors.

  • Use empty cycles
    • Reordering in hardware at runtime: out-of-order execution (OOO); software need not be recompiled.

      Cost: this adds complex logic to the processor, making it harder to design, larger in chip area, and more power-hungry.

    • Rearranging the instructions in the compiler (possibly along multiple paths).

      Benefit: the simpler hardware means more cores, or extra cache, could be placed onto the same amount of chip area.
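The two-bit prediction scheme noted above can be sketched as a saturating counter. This is a minimal illustration only: the names `counter2_t`, `predict_taken`, `update_counter` and `count_mispredictions` are hypothetical, and the state encoding (0 and 1 predict not-taken, 2 and 3 predict taken) is one common convention, not something the article specifies.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding: 0 = strongly not-taken, 1 = weakly not-taken,
 *                   2 = weakly taken,       3 = strongly taken. */
typedef unsigned char counter2_t;

/* Predict "taken" when the counter is in one of the two upper states. */
bool predict_taken(counter2_t c)
{
    return c >= 2;
}

/* Move one step toward the actual outcome, saturating at 0 and 3. */
counter2_t update_counter(counter2_t c, bool taken)
{
    assert(c <= 3);
    if (taken)
        return (counter2_t)(c < 3 ? c + 1 : 3);
    return (counter2_t)(c > 0 ? c - 1 : 0);
}

/* Replay a sequence of branch outcomes and count the wrong guesses. */
int count_mispredictions(const bool *outcomes, int n, counter2_t start)
{
    counter2_t c = start;
    int misses = 0;

    for (int i = 0; i < n; i++) {
        if (predict_taken(c) != outcomes[i])
            misses++;
        c = update_counter(c, outcomes[i]);
    }
    return misses;
}
```

With a loop back edge that is taken many times and then not taken once at loop exit, the counter drops from 3 to 2 and still predicts "taken" when the loop is entered again, so the single exit costs one misprediction rather than two.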

SIMD

  • The difficult parts
    • Finding Parallelism in Algorithm
    • Portability between different intrinsics
    • Boundary handling
      • Padding, Predication, Fallback
    • Divergence
      • Predication, Fallback to scalar
    • Register Spilling
      • multi-stages, # of variables + braces, assembly
    • Non-Regular Access/Processing Pattern/Dependency
      • Multi-stages/Reduction ISA/Enhanced DMA
    • Unsupported Operations
      • Division, high-level functions (e.g. math functions)
    • Floating-Point
      • Unsupported/cross-device compatibility
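Two of the boundary-handling options listed above, padding and scalar fallback, can be sketched in plain C. This is only an illustration under stated assumptions: `LANES`, `padded_length` and `add_arrays` are hypothetical names, a vector width of 4 elements is assumed, and the inner loop over lanes stands in for what would be a single vector instruction on real SIMD hardware.

```c
#include <assert.h>
#include <stddef.h>

#define LANES ((size_t)4)   /* assumed vector width, in elements */

/* Padding: round n up to a multiple of LANES, so that a buffer of this
 * length can be processed entirely by full-width vector operations. */
size_t padded_length(size_t n)
{
    return (n + LANES - 1) / LANES * LANES;
}

/* Fallback: process full groups of LANES elements in the main loop,
 * then handle the n % LANES leftover elements one at a time. */
void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    size_t vec_end = n - (n % LANES);

    for (; i < vec_end; i += LANES) {
        /* stand-in for one vector add over LANES elements */
        for (size_t lane = 0; lane < LANES; lane++)
            dst[i + lane] = a[i + lane] + b[i + lane];
    }

    for (; i < n; i++)      /* scalar fallback for the tail */
        dst[i] = a[i] + b[i];
}
```

Padding trades extra memory for a simpler loop, while the fallback keeps the buffers unchanged at the cost of a second, scalar code path.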

mergesort-concurrent

Based on the implementation by LanKuDot (a senior classmate).

Test

  • Problem:
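    • Make sure the -g flag is passed at compile time, so that an executable containing debug info is actually produced.

      Adding -g to the OBJS and sort rules in the Makefile had no effect.

    • Do not modify git's master directly; create a local branch and push it as a remote branch:

      git checkout -b testbysarah
      git push -u origin testbysarah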

Assignment

  • $ uniq words.txt | sort -R > input.txt

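argc is short for argument count: the number of arguments, including the command itself. The system computes it automatically from what was typed.
argv is short for argument vector and holds the argument values,
that is, the strings the user typed on the command line, separated by whitespace.
The system automatically assigns the program's own name to argv[0], then assigns the arguments that follow the program name to argv[1], argv[2], and so on, in order.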
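The layout of argc and argv described above can be sketched with a small helper. The function name `list_arguments` is hypothetical; it is meant to be called from `main(int argc, char *argv[])` with the values the system passes in.

```c
#include <stdio.h>

/* Hypothetical helper: prints every entry of argv and returns the number
 * of arguments that follow the program name (argc counts argv[0] too). */
int list_arguments(int argc, char *argv[])
{
    for (int i = 0; i < argc; i++)
        printf("argv[%d] = %s\n", i, argv[i]);

    return argc > 0 ? argc - 1 : 0;
}
```

Calling `list_arguments(argc, argv)` as the first statement of `main` lists the program name as argv[0], followed by each whitespace-separated argument from the command line.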

Doxygen

  • install
    • git clone https://github.com/doxygen/doxygen.git
    • cd doxygen
    • mkdir build
    • cd build
    • cmake -G "Unix Makefiles" ..

    Configuring incomplete, errors occurred! (unresolved)

    • make


SuperMalloc

  • error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t {aka long long unsigned int}’ [-Werror=format=]
    # if __WORDSIZE == 64
    typedef long int  int64_t;
    # else
    __extension__
    typedef long long int  int64_t;
    # endif
In a 32-bit compile with GCC (and with 32- and 64-bit MSVC), a program that reports whether each type is the same type as int64_t prints:

int:           0
int64_t:       1
long int:      0
long long int: 1

* Possibly the wrong OS build was installed: uname -a reports a 32-bit (i686) system.

Linux sarahcheng-MacBookPro 4.4.0-38-generic #57-Ubuntu SMP Tue Sep 6 15:41:41 UTC 2016 i686 i686 i686 GNU/Linux
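Independently of the OS question, one common way to make such a format string portable is the `PRIu64` macro from `<inttypes.h>`, which expands to the correct conversion for `uint64_t` on both 32-bit and 64-bit targets. The sketch below is a general illustration, not the fix used in SuperMalloc itself; `format_u64` is a hypothetical helper name.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes v in decimal into buf and returns the
 * number of characters snprintf would have written. */
int format_u64(char *buf, size_t size, uint64_t v)
{
    /* "%" PRIu64 becomes "%lu" or "%llu" depending on the target,
     * so -Werror=format no longer triggers on i686. */
    return snprintf(buf, size, "%" PRIu64, v);
}
```

The same pattern applies to `printf` and `fprintf`: replace `"%lu"` with `"%" PRIu64` wherever the argument is a `uint64_t`.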

  • error: Cannot open the file

    The file is named input, not input.txt.

  • error: input unsorted data line-by-line
  • error: x range not ....
  • error: redeclaration of ‘fp’ with no linkage
  • error: Segmentation fault (core dumped): the program made an invalid access to a memory segment