huang-me

@huang-me

Joined on Oct 20, 2019

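  • Student ID: 311551173 Name: 黃牧恩 Q1 What are the pros and cons of the three methods? Give an assumption about their performances. Method 1 Pros: each thread only has to process one pixel. Cons: threads with little computation are busy only briefly, while only the few threads with heavier computation stay busy for long, so overall thread utilization is low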
  • Q1.1: How do you control the number of MPI processes on each node? MPI launches processes according to the hostfile, assigning them to the machines listed there one by one, in order. machine_1 machine_1 machine_2 If we run 8 copies of the program with the hostfile defined as above, program_1 and program_2 are assigned to machine_1, program_3 to machine_2, program_4 to machine_1, and so on.
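The placement rule described in this answer can be modeled in a few lines of C. This is only a sketch of the rule as the answer states it, not a call into any MPI API: `host_for_rank`, the 0-based rank numbering, and the wrap-around behavior are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: returns the hostfile entry that the process with
 * the given 0-based rank lands on, assuming entries are consumed in order
 * and the list wraps around once it is exhausted.
 * n_entries must be nonzero. */
const char *host_for_rank(const char *const hostfile[], size_t n_entries,
                          size_t rank)
{
    return hostfile[rank % n_entries];
}
```

With the three-entry hostfile from the answer, ranks 0 and 1 map to machine_1, rank 2 to machine_2, and rank 3 wraps back to machine_1.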
  • Q1: Is speedup linear in the number of threads used? In your writeup hypothesize why this is (or is not) the case? As the graph above shows, the speedup is almost linear. Since all threads do the same kind of work, adding threads reduces the load on each thread, so the execution time falls roughly in inverse proportion to the thread count. Q2: How do your measurements explain the speedup graph you previously created?
  • Contributed by < huang-me > Difference between using atomic and not #include <stdatomic.h> int v = 0; atomic_bool v_ready = false; int bv; void *threadA() {
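  • contributed by < huang-me > kecho The given kecho already uses CMWQ; describe its advantages and usage. Advantages: CMWQ uses a thread pool, which greatly reduces the overhead of creating threads. The unbound thread mechanism lets a thread move to an idle CPU, which improves utilization of system resources. With scheduling introduced, resources are not monopolized by long-running threads.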
  • contributed by <huang-me> Problem 1 (average) Solution 1 #include <stdint.h> uint32_t average(uint32_t a, uint32_t b) { return (a >> 1) + (b >> 1) + (a & b & 1); }
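For comparison, a commonly used alternative also avoids overflow using only bitwise operations. This sketch is not taken from the note; `average_bitwise` is a name chosen here. It relies on the identity a + b = 2(a & b) + (a ^ b).

```c
#include <stdint.h>

/* Bits set in both inputs (a & b) contribute fully to the average;
 * bits set in only one input (a ^ b) contribute half.
 * Neither term can overflow 32 bits, and the result rounds down. */
uint32_t average_bitwise(uint32_t a, uint32_t b)
{
    return (a & b) + ((a ^ b) >> 1);
}
```

Like Solution 1, this version never forms the full sum a + b, so it stays correct even when both inputs are `UINT32_MAX`.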
  • contributed by < huang-me > twoSum Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target. struct hlist_node { struct hlist_node *next, **pprev; }; struct hlist_head { struct hlist_node *first; }; typedef struct { int bits; struct hlist_head *ht; } map_t; #define MAP_HASH_SIZE(bits) (1 << (bits))
  • contributed by < huang-me > :::danger Follow the assignment writing conventions! :notes: jserv ::: Prerequisites $ sudo apt install build-essential git clang-format aspell colordiff valgrind
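  • contributed by <huang-me>, <OliveLake> :::info Expectation [x] RVVM ( https://github.com/LekKit/RVVM ) supports both RV64 and RV32; this course focuses on RV32, so prepare the development tools for RV32 (the GNU Toolchain can be reused) [x] Pay attention to the Linux boot flow: OpenSBI -> U-Boot -> Linux kernel -> userspace; study the relevant background and explain the role of each component, especially OpenSBI [ ] Prepare a Linux kernel for RV32; you may run into technical difficulties, so record them and try to resolve them [ ] Explain which peripherals, besides the CPU, the Linux kernel interacts with while booting on RV32/RVVM (such as the interrupt controller and timer)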
  • contributed by: <huang-me> Rewrite Leetcode 191. Number of 1 Bits Write a function that takes an unsigned integer and returns the number of '1' bits it has (also known as the Hamming weight). C code int cntbit(int n) { if(n == 0) return 0; int cnt = 0;
  • Two Sum Approach 1: Iterate through all pairs of elements and check whether their sum equals the target value; if it does, return the indices of the two numbers. Time complexity: O($n^2$). For each element, we try to find its complement by looping through the rest of the array, which takes O(n) time. Therefore, the time complexity is O($n^2$). vector<int> twoSum(vector<int>& nums, int target) { for(int i=0; i<nums.size(); i++) { for(int j=i+1; j<nums.size(); j++) { if(nums[i]+nums[j] == target) return {i, j};
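The brute-force approach can be sketched in C as follows. The function name `two_sum_bruteforce` and the out-parameter interface are assumptions made for illustration; the note's own version is the C++ snippet in the preview.

```c
#include <stdbool.h>
#include <stddef.h>

/* Checks every pair (i, j) with i < j: O(n^2) time, O(1) extra space.
 * On success stores the indices in *out_i and *out_j and returns true;
 * returns false when no pair adds up to target. */
bool two_sum_bruteforce(const int *nums, size_t n, int target,
                        size_t *out_i, size_t *out_j)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            /* Widen to long long so the sum cannot overflow int. */
            if ((long long) nums[i] + nums[j] == target) {
                *out_i = i;
                *out_j = j;
                return true;
            }
        }
    }
    return false;
}
```

Returning a flag instead of an array keeps the sketch free of heap allocation, at the cost of a slightly wider signature.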
  • Chrome: add a button to toggle the GPU. Command to start Chrome with the GPU disabled: open -a "Google Chrome" --args --load-extension="${extensionPath}" --disable-gpu You can check at chrome://gpu whether the GPU is currently in use. ==Under normal conditions it looks like the figure below:== The GPU in use is Intel(R) UHD Graphics 617
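  • <Contributed by huang-me> Noise suppression Below is the common noise suppression pipeline described in the paper. A VAD module first separates the segments that contain speech from those that do not. The result is then passed to a Noise Spectral Estimation module, which computes the spectral characteristics of the noise. Finally, the outputs of the two modules are combined, and the noise features are subtracted from the audio features to suppress the noise. However, in this seemingly simple structure the algorithm parameters of every module must be tuned very carefully, otherwise the whole system easily breaks down, so deep learning has also begun to be used to help choose the parameter settings. Digital signal processing IIR: Define: $y[k] = \sum_{p=0}^N{a_p}x[k-p] + \sum_{p=1}^M{b_p}y[k-p]$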
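The IIR definition above translates directly into a loop over past inputs and past outputs. The following is a minimal sketch: `iir_filter` and its argument layout are assumptions, samples before index 0 are treated as zero, and `b[0]` is left unused so that the array indices match the formula.

```c
#include <stddef.h>

/* y[k] = sum_{p=0..N} a[p]*x[k-p] + sum_{p=1..M} b[p]*y[k-p]
 * a has N+1 entries, b has M+1 entries (b[0] is ignored).
 * Terms that would index before the start of x or y are skipped,
 * i.e. the filter starts from zero initial conditions. */
void iir_filter(const double *a, size_t N, const double *b, size_t M,
                const double *x, double *y, size_t len)
{
    for (size_t k = 0; k < len; k++) {
        double acc = 0.0;
        for (size_t p = 0; p <= N && p <= k; p++)
            acc += a[p] * x[k - p];   /* feed-forward part */
        for (size_t p = 1; p <= M && p <= k; p++)
            acc += b[p] * y[k - p];   /* feedback part */
        y[k] = acc;
    }
}
```

For example, with N = 0, a = {1}, M = 1, b = {0, 0.5}, a unit impulse input produces an output that halves at every step, which shows the feedback term keeping the response alive after the input has ended.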
  • Contributed by < huang-me > quiz6 Problem set Problem 1 bfloat16 float fp32tobf16(float x) { float y = x; int *py = (int *) &y; unsigned int exp, man; exp = *py & 0x7F800000u;
  • Contributed by < huang-me > quiz5 Problem set Problem 1 Floating point division How the code works #include <stdio.h> #include <stdlib.h> double divop(double orig, int slots) { if (slots == 1 || orig == 0)
  • Contributed by < huang-me > Problem 1 Hamming Distance int hammingDistance(int x, int y) { return __builtin_popcount(x ^ y); } The Hamming distance is the number of bit positions at which the binary representations of two values differ. XORing the two inputs sets a bit to 1 only where the two values differ, so applying __builtin_popcount to the result gives the correct Hamming distance. C99 implementation without the gcc extension
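One portable way to count the set bits without `__builtin_popcount` is to clear the lowest set bit repeatedly. This sketch is an assumption about what such a version could look like, not necessarily the implementation in the note; `hamming_distance_portable` is a name chosen here.

```c
/* Each iteration of v &= v - 1 clears the lowest set bit, so the loop
 * runs once per differing bit. The XOR is done on unsigned values to
 * avoid depending on the representation of negative ints. */
int hamming_distance_portable(int x, int y)
{
    unsigned int v = (unsigned int) x ^ (unsigned int) y;
    int count = 0;
    while (v) {
        v &= v - 1;
        count++;
    }
    return count;
}
```

The loop cost is proportional to the number of differing bits rather than to the word width, which suits inputs that differ in only a few positions.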
  • Contributed by < huang-me > [toc] Principle ==Explain how the program works== test_common.c Decide whether CPY or REF should be used if (!strcmp(argv[1], "CPY") || (argc > 2 && !strcmp(argv[2], "CPY"))) {
  • contributed by < huang-me > Week 2 problems [toc] Problem 1 #include <stddef.h> bool is_ascii(const char str[], size_t size) {
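  • contributed by < huang-me > [toc] Re-answering the week 1 quiz AA1 assert only runs in debug mode, so no step that must always execute should be placed inside assert(). After allocating memory with malloc, you should check that the allocation succeeded before doing anything else, so assert is the choice for this position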
  • contributed by < huang-me > Outline [TOC] Environment $ uname -a Linux muen-B550-GAMING-X 5.4.0-40-generic #44-Ubuntu SMP Tue Jun 23 00:01:04 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux $ gcc --version