2020q3 Homework5 (render)

--- tags: 進階電腦系統理論與實作 --- # 2020q3 Homework5 (render) contributed by < `RinHizakura` > > [I07: render](https://hackmd.io/@sysprog/2020-render) > [GitHub](https://github.com/RinHizakura/raycaster) ## 程式原理 ### SDL 相關 API #### [`SDL_Init`](https://wiki.libsdl.org/SDL_Init) 初始化 SDL library，`SDL_INIT_VIDEO` 表示初始為視訊相關的系統狀態。 #### [`SDL_CreateWindow`](https://wiki.libsdl.org/SDL_CreateWindow) 建立 `sdlWindow` 物件做為繪畫的 "畫布"。 ```cpp SDL_CreateWindow(const char* title, int x, int y, int w, int h, Uint32 flags) ``` * `title`: 即視窗標題 * `ｘ`、`y`: 視窗的 x、y 座標，`SDL_WINDOWPOS_UNDEFINED` 將視窗初始在預設位置 * `w`、`h`: 視窗的寬與高 * `flags`: 視窗如何顯示的相關參數: `SDL_WINDOW_SHOWN` 表示視窗為顯示狀態 #### [`SDL_GetPerformanceFrequency`](https://wiki.libsdl.org/SDL_GetPerformanceFrequency) 取得 high resolution counter 每秒有幾個 count 的資訊。 #### [`SDL_GetPerformanceCounter`](https://wiki.libsdl.org/SDL_GetPerformanceCounter) 取得 high resolution counter 現在的值。 #### [`SDL_CreateRenderer`](https://wiki.libsdl.org/SDL_CreateRenderer) 為了在螢幕上繪圖，首先我們需要先產生一個 `SDL_Renderer` 物件。 ```cpp SDL_Renderer* SDL_CreateRenderer(SDL_Window* window, int index, Uint32 flags) ``` * `window`: 前面我們所建立的 `sdlWindow` 物件 * `index`: 使用的 rendering driver index，-1 表示使用 default * `flags`: 相關的狀態設定。`SDL_RENDERER_ACCELERATED` 使用硬體加速 #### [`SDL_CreateTexture`](https://wiki.libsdl.org/SDL_CreateTexture) 為了可以畫出各個物件的 [texture](https://en.wikipedia.org/wiki/Texture_mapping)，需要建立 `SDL_Texture` 物件。我們可以看到原始程式碼中對應 float 和 fixed point 建立了兩個 `SDL_Texture` ```cpp SDL_Texture* SDL_CreateTexture(SDL_Renderer* renderer, Uint32 format, int access, int w, int h) ``` * `renderer`: 前面所建立的 `sdlRenderer` 物件 * `format`: pixel 的表示格式，`SDL_PIXELFORMAT_ARGB8888` 表示 32 位元的各 8 bits 表示 A(透明度)、R、G、B * `access`: 對於此 texture 的存取方式，`SDL_TEXTUREACCESS_STREAMING` 表示頻繁刷新 * `w`、`h`: 表示 texture 的長寬 ### `Renderer` `Renderer` 物件僅有一個 method `Trace_Frame`。 #### `Trace_Frame` ```cpp _rc->Start(static_cast<uint16_t>(g->playerX * 256.0f), static_cast<uint16_t>(g->playerY * 256.0f), static_cast<int16_t>(g->playerA / (2.0f * M_PI) * 1024.0f)); ``` * 先將玩家目前所在的座標轉換成定點數的表示傳遞，使可以更新下一個 frame * 對於 X、Y 座標，就是把小數用 16 位元整數表示，且用前 8 位元表示整數並用後 8 位元表示小數點後位數 * 對於角度座標，浮點數表達的範圍為 0 ~ 2$\pi$，定點數則將一圈分成 1024 等份來表達 * 則在 `RayCasterFloat` 的 `Start` 中需要再將其座標轉換回來，在 `RayCasterFixed` 則直接 assign 即可 ```cpp for (int x = 0; x < SCREEN_WIDTH; x++) { uint8_t sso; uint8_t tc; uint8_t tn; uint16_t tso; uint16_t tst; uint32_t *lb = fb + x; _rc->Trace(x, &sso, &tn, &tc, &tso, &tst); const auto tx = static_cast<int>(tc >> 2); int16_t ws = HORIZON_HEIGHT - sso; if (ws < 0) { ws = 0; sso = HORIZON_HEIGHT; } ``` * 相較於 ray tracing 需要針對每一個繪制的 pixel 去追蹤光線，因為 ray casting 方法對物件的幾合形狀做了限制(geometric constraints)，因此可以逐 column 去追蹤光線，加快了對畫面更新的處理速度 * 透過 `Trace` 去計算該如何繪制場景，可以想成是計算對於每個 x 座標，y 座標區分成的天空、箱子、地板各要佔多少比例，從上往下畫 * `sso` 是中間需要留給箱子的 y 軸範圍 * `tn` 繪制箱子的亮面還是暗面 * `tc` 對應要從箱子的 x 軸位置開始畫 * `tso` 對應要從箱子的 y 軸位置開始畫 * `tst` 對應下一個 y 軸位置要走幾步(考慮到箱子的遠近，箱子的細節也就不用完整的畫出來了) * `const auto tx = static_cast<int>(tc >> 2)` 用 6 bit 去 index 箱子的 x 軸 * 則 `2 * sso` 就是中間箱子佔據的 pixel 數，天空和地板各佔 `ws`，所以 `2 * sso + 2 * ws = SCREEN_HEIGHT = 2 * HORIZOM_HEIGHT ` ```cpp uint16_t to = tso; uint16_t ts = tst; for (int y = 0; y < ws; y++) { *lb = GetARGB(96 + (HORIZON_HEIGHT - y)); lb += SCREEN_WIDTH; } ``` * 則首先先畫天空的部份 * `GetARGB` 把灰階值轉換成對應的 ARGB 表示，亮度部份透過一些計算畫出漸層感 ```c= for (int y = 0; y < sso * 2; y++) { // paint texture pixel auto ty = static_cast<int>(to >> 10); auto tv = g_texture8[(ty << 6) + tx]; to += ts; if (tn == 1 && tv > 0) { // dark wall tv >>= 1; } *lb = GetARGB(tv); lb += SCREEN_WIDTH; } ``` * `auto ty = static_cast<int>(to >> 10)` 用 6 bit 去 index 箱子的 y 軸 * 這也對應了箱子部份的 `g_texture` 的大小是 $4096 = 64 \times 64 = 2^6 \times 2^6$，因此`(ty << 6) + tx` 用共 12 bits 去 index 所要畫的箱子對應 pixel * `if (tn == 1 && tv > 0)`: 使得箱子一側的表面亮度變暗一些 ```cpp for (int y = 0; y < ws; y++) { *lb = GetARGB(96 + (HORIZON_HEIGHT - (ws - y))); lb += SCREEN_WIDTH; } ``` * 繪制地板部份，與天空同理 ### `RayCaster` :::warning 因為自己也不到 100% 理解，因此文字的表達可能不太清楚，加上示意圖也畫得有點醜:cry:。可以對照 [Ray-Casting Tutorial For Game Development And Other Purposes](https://permadi.com/1996/05/ray-casting-tutorial-table-of-contents/)對 ray-casting 的講解釐清更多細節 ::: `RayCaster` 物件做為 `RayCasterFloat` 或者 `RayCasterFixed` 的 Parent class，定義了兩個 virtual function `Start` 和 `Trace`，這代表若有 derived class 的話，該函式可以被重新定義。 :::info 並不熟悉 C++，可以先閱讀 [C++中關於 virtual 的兩三事](https://medium.com/theskyisblue/c-%E4%B8%AD%E9%97%9C%E6%96%BC-virtual-%E7%9A%84%E5%85%A9%E4%B8%89%E4%BA%8B-1b4e2a2dc373) 做為參考 ::: ### `RayCasterFloat` #### `Start` 將 raycaster 中的座標做更新，因為輸入的是定點數的表達，因此必須調整回來。 #### `Distance` ```cpp while (rayA < 0) { rayA += 2.0f * M_PI; } while (rayA >= 2.0f * M_PI) { rayA -= 2.0f * M_PI; } ``` * 先將角度調整到 0 ~ 2$\pi$ 之間 ```cpp int tileStepX = 1; int tileStepY = 1; float tileX = 0; float tileY = 0; if (rayA > M_PI) { tileStepX = -1; } if (rayA > M_PI_2 && rayA < 3 * M_PI_2) { tileStepY = -1; } ``` ![](https://i.imgur.com/iOuW9EL.png =400x) * 想像我們位在一個網格狀的地圖上，則 `tileStepX`、`tileStepY` 初始為 1，表示每個物體會佔據 1x1 的格子 * `tileStep` 需要根據角度調整正負，後面會解釋原因 ```cpp float rayX = playerX; float rayY = playerY; float offsetX = modff(rayX, &tileX); float offsetY = modff(rayY, &tileY); float startDeltaX, startDeltaY; if (rayA <= M_PI_2) { startDeltaX = (1 - offsetY) * tan(rayA); startDeltaY = (1 - offsetX) / tan(rayA); } else if (rayA <= M_PI) { if (offsetY == 0) { startDeltaX = (1) * fabs(tan(rayA)); } else { startDeltaX = (offsetY) *fabs(tan(rayA)); } startDeltaY = -(1 - offsetX) / fabs(tan(rayA)); } else if (rayA < 3 * M_PI_2) { if (offsetY == 0) { startDeltaX = -(1) * fabs(tan(rayA)); } else { startDeltaX = -(offsetY) *fabs(tan(rayA)); } if (offsetX == 0) { startDeltaY = -(1) / fabs(tan(rayA)); } else { startDeltaY = -(offsetX) / fabs(tan(rayA)); } } else { startDeltaX = -(1 - offsetY) * fabs(tan(rayA)); if (offsetX == 0) { startDeltaY = (1) / fabs(tan(rayA)); } else { startDeltaY = (offsetX) / fabs(tan(rayA)); } } ``` ![](https://i.imgur.com/7s72aGD.png =500x) * 在這個網格狀的地圖上，我們要計算從面向的角度往外射出一條光線，直到碰到物件(假設是圖中的深紅色框)的座標這段距離。 * 因為網格的單位為 1，則透過 `modff` 取得的小數部份就是與鄰近的 X, Y 網格的距離 * `startDeltaX` 與 `startDeltaY` 對應圖中的橘色和綠色線段長度，這使我們可以計算哪些是光線經過的 x 座標或 y 座標 * 圖中示意的是 $\theta <= \pi/2$ 的時候的情形，根據視線與藍線的夾角應當可以依此推理出其他 branch 中的計算從何而來 :::warning :warning: 暫時還不確定整個 render 方法的座標系，圖示僅供對算法想求的東西做闡述，但不一定完全符合程式的假設，如果之後看懂更多部份後會再回頭釐清 ::: ```cpp float interceptX = rayX + startDeltaX; float interceptY = rayY + startDeltaY; float stepX = fabs(tan(rayA)) * tileStepX; float stepY = fabs(1 / tan(rayA)) * tileStepY; bool verticalHit = false; bool horizontalHit = false; bool somethingDone = false; ``` * 則可以知道 `interceptX` 是第一個碰到網格的 x 座標(最左邊的粉點)，`interceptY` 是第一個碰到網格的 y 座標(最左邊紫點) * `StepX` 是兩個紫點間的 x 座標距離，長度應該可以看出是 tan $\theta \times 1$，同理可以看出 `StepY` 長度是 $1/tan\theta$ * 根據角度，座標可能會往上加或者往下減，這對應前面 `tileStep` 可能是 +1 或者 -1 的原因 ```cpp do { somethingDone = false; while (((tileStepY == 1 && (interceptY <= tileY + 1)) || (tileStepY == -1 && (interceptY >= tileY)))) { somethingDone = true; tileX += tileStepX; if (IsWall(tileX, interceptY, rayA)) { verticalHit = true; rayX = tileX + (tileStepX == -1 ? 1 : 0); rayY = interceptY; *hitOffset = interceptY; *hitDirection = true; break; } interceptY += stepY; } ... ``` * 則首先判斷 `tileY` 到 `tileY + 1` 範圍內的碰觸網格的所有 x 座標，`tileX += tileStepX;` 是把座標 x 方向移動到下個網格，則透過座標 `(tileX, interceptY)` 對應地圖可以知道這個座標是否碰到牆壁 :::info 至於 `IsWall()` 裡的 `rayA` 呢？如果去看 `IsWall()` 的實作會發現是沒有用到，所以其實可以拿掉XD ::: * 如果 `IsWall()` 為 true * `verticalHit` 設為 true * 則 `rayX`、 `rayY` 為碰到牆壁所在的 grid 座標(從程式看起來是以網格右上角的頂點作為座標依據(?)) * `hitOffset`: 從碰觸到箱子的位置，可以回推是 texture 的哪一部份 * 如果碰觸到的是垂直面，這面的牆壁是深色的，`hitDirection` 設為 true ```cpp while (!verticalHit && ((tileStepX == 1 && (interceptX <= tileX + 1)) || (tileStepX == -1 && (interceptX >= tileX)))) { somethingDone = true; tileY += tileStepY; if (IsWall(interceptX, tileY, rayA)) { horizontalHit = true; rayX = interceptX; *hitOffset = interceptX; *hitDirection = 0; rayY = tileY + (tileStepY == -1 ? 1 : 0); break; } interceptX += stepX; } } while ((!horizontalHit && !verticalHit) && somethingDone); ``` * 對 `tileX` 到 `tileX + 1` 間的處理則大致與前面同理 * 注意到 `tileX` 和 `tileY` 在 while 迴圈中也會被更新，所以 do...while 迴圈可能會被執行多次 ```cpp if (!somethingDone) { return 0; } float deltaX = rayX - playerX; float deltaY = rayY - playerY; return sqrt(deltaX * deltaX + deltaY * deltaY); } ``` * 則可以透過距離公式，計算出從人物位置到發出的光碰處牆壁的此段距離 :::info 表達的不是很好...可以思考一下示意圖中任意藍色的箭頭指向其他方向，則紫色與粉色的點出現的狀況，如何依序找到這些點就是這個 do...while 迴圈的作用 ::: #### `Trace` ```cpp float deltaAngle = atanf(((int16_t) screenX - SCREEN_WIDTH / 2.0f) / (SCREEN_WIDTH / 2.0f) * M_PI / 4); ``` ![](https://i.imgur.com/KCdb6fh.png =400x) ![](https://i.imgur.com/QVV2uLW.png =400x) * 計算要繪制視窗的目標 column 與視線的夾角 `deltaAngle` $\theta$，從這裡的計算我們可以知道第 1 人稱的 [field of view](https://en.wikipedia.org/wiki/Field_of_view) 設為 90 度 ```cpp= float lineDistance = Distance(_playerX, _playerY, _playerA + deltaAngle, &hitOffset, &hitDirection); float distance = lineDistance * cos(deltaAngle); float dum; *textureX = (uint8_t)(256.0f * modff(hitOffset, &dum)); *textureNo = hitDirection; *textureY = 0; *textureStep = 0; ``` ![](https://i.imgur.com/3Bm3KYT.png =500x) * `Distance()` 根據面向的方向，計算光線從眼睛出發直到碰觸箱子的距離，以及碰觸到箱子的哪一部份 * 則根據簡單的高中數學可以計算出 `distance`，即視線至箱子的直線距離 * `modff` 從 `hitOffset` 取出小數點部份 x256 作為對應 index 的箱子 x 座標(?) * `textureNo` 表示碰觸到箱子的亮面還是暗面 ```cpp if (distance > 0) { *screenY = INV_FACTOR / distance; auto txs = (*screenY * 2.0f); if (txs != 0) { *textureStep = (256 / txs) * 256; if (txs > SCREEN_HEIGHT) { auto wallHeight = (txs - SCREEN_HEIGHT) / 2; *textureY = wallHeight * (256 / txs) * 256; } } } else { *screenY = 0; } ``` * 如果 distance > 0，表示碰觸到箱子，則需要設定 texture 等相關繪制的座標資訊 :::warning :warning: 思考了一下 ray casting 的邏輯，應該要滿足 $-\pi/4$ <= `deltaAngle` <= $\pi/4$，所以 `cos(deltaAngle)` 應大於零。則 `distance > 0` 應該與 `lineDistance > 0` 同理，所以推斷這裡的邏輯應該等同於 `lineDistance > 0` 的「光線碰觸到物體」? ::: * `screenY` 是一半的視窗留給物體繪制的比例 :::warning ![](https://i.imgur.com/rPKRNMw.png =400x) 不確定 `INV_FACTOR` 是怎麼求得的，不過這裡的計算應該與上圖有關(?)。根據國中數學，我們知道紅線的 `screenY` 長度為: $$ \frac{(物體的高度/2)}{眼睛至牆壁的直線距離(綠線的 distance)} \times 眼睛至視窗的距離 $$ ::: * 因此 `txs = (*screenY * 2.0f)` 得到佔據整個視窗的比例 * 則如果 `txs != 0`，需根據這個比例計算繪制物件的 `textureStep` * 則如果 `txs > SCREEN_HEIGHT` 為 True ，表示視窗內只能畫出部份的箱子，需要調整 `textureY` * 否則，該 column 就不需要畫任何物件，因此 `screenY = 0` ### `RayCasterFixed` #### `Start` 將 raycaster 中的座標做更新，角度的部份需要額外的計算是落在將圓切成四等份的哪個區段，作為後續的判斷使用。 #### `CalculateDistance` ... #### `Trace` 對照 floating-point 版本，來檢驗 fixed-point 版本是如何實作的。 ```cpp uint16_t rayAngle = static_cast<uint16_t>(_playerA + LOOKUP16(g_deltaAngle, screenX)); // neutralize artefacts around edges switch (rayAngle % 256) { case 1: case 254: rayAngle--; break; case 2: case 255: rayAngle++; break; } rayAngle %= 1024; ``` * `LOOKUP16(g_deltaAngle, screenX)` 對應 floating-point 版本 `deltaAngle` 的計算 * switch 的註解說是為了去除偽影? 不過測試了一下如果拿掉會導致後面的 `distance` 求出負值而導致 segmentation fault ```cpp int16_t deltaX; int16_t deltaY; CalculateDistance(_playerX, _playerY, rayAngle, &deltaX, &deltaY, textureNo, textureX); ``` * `CalculateDistance` 對應 floating-point 的 `Distance` 計算 ```cpp // distance = deltaY * cos(playerA) + deltaX * sin(playerA) int16_t distance = 0; ``` ![](https://i.imgur.com/AGiIlwI.png =300x) * 這裡使用另一種方式計算距離，從上圖我們可以看到，這個距離會是 $deltaX \times sin(\theta) + deltaY \times cos(\theta)$ ```cpp if (_playerA == 0) { distance += deltaY; } else if (_playerA == 512) { distance -= deltaY; } else switch (_viewQuarter) { case 0: distance += MulS(LOOKUP8(g_cos, _viewAngle), deltaY); break; case 1: distance -= MulS(LOOKUP8(g_cos, INVERT(_viewAngle)), deltaY); break; case 2: distance -= MulS(LOOKUP8(g_cos, _viewAngle), deltaY); break; case 3: distance += MulS(LOOKUP8(g_cos, INVERT(_viewAngle)), deltaY); break; } ``` * 首先計算 $deltaY \times cos(\theta)$ * 0 對應 0 度: $cos0^\circ = 1$，512 對應 180 度: $cos180^\circ = -1$ * 其他情況下，即是計算 `deltaY * cos(playerA)` * 根據 `playerA` 角度所在的象限(`_viewQuarter`)，計算會有不同 * 因為 `_viewAngle = playerA % 256` ，紀錄的是角度除以 90 度的餘數 * 但是想想高中的三角函數 $$|cos120^\circ| = |cos60^\circ| \neq |cos30^\circ| = |cos(120\%90)^\circ| $$ 因此需要調整 index ，得到正確的三角函數值 * 也因為 `_viewAngle = playerA % 256`，所以0 <= `_viewAngle` < 255，則 `INVERT(x)` 所求會是 90 度的互補角 * `MulS` 是對定點數的乘法，處理 uint8_t 的 `v` 乘 int16_t 的 `f` 的結果 ```cpp if (_playerA == 256) { distance += deltaX; } else if (_playerA == 768) { distance -= deltaX; } else switch (_viewQuarter) { case 0: distance += MulS(LOOKUP8(g_sin, _viewAngle), deltaX); break; case 1: distance += MulS(LOOKUP8(g_sin, INVERT(_viewAngle)), deltaX); break; case 2: distance -= MulS(LOOKUP8(g_sin, _viewAngle), deltaX); break; case 3: distance -= MulS(LOOKUP8(g_sin, INVERT(_viewAngle)), deltaX); break; } ``` * 計算 $deltaX \times sin(\theta)$ 同理就不贅述 ```cpp if (distance >= MIN_DIST) { *textureY = 0; LookupHeight((distance - MIN_DIST) >> 2, screenY, textureStep); } else { *screenY = SCREEN_HEIGHT >> 1; *textureY = LOOKUP16(g_overflowOffset, distance); *textureStep = LOOKUP16(g_overflowStep, distance); } ``` * 如果距離很近(< `MIN_DIST`，但尚不確定 `MIN_DIST` 從何得來) ，整個 column 都要拿來畫物件，因此 `screenY = SCREEN_HEIGHT >> 1;` * `textureY` 和 `textureStep` 則透過查表得到計算的結果 * 否則，如果距離較遠(>= `MIN_DIST` )，則可以畫出整個箱子的頂到底，因此 `textureY = 0` * 透過 `LookUpHeight` 去求得這個 `distance` 對應的 `screenY` 與 `textureStep` 值 #### `LookupHeight` ```cpp if (distance >= 256) { const uint16_t ds = distance >> 3; if (ds >= 256) { *height = LOOKUP8(g_farHeight, 255) - 1; *step = LOOKUP16(g_farStep, 255); } *height = LOOKUP8(g_farHeight, ds); *step = LOOKUP16(g_farStep, ds); } else { *height = LOOKUP8(g_nearHeight, distance); *step = LOOKUP16(g_nearStep, distance); } ``` * 首先要注意到 `LookupHeight` 的輸入是 `LookupHeight((distance - MIN_DIST) >> 2, screenY, textureStep)` * 因此可以反推定點數表示中的距離從 `MIN_DIST = 187` 到 `((256 << 2) + MIN_DIST) - 1= 1210` 的範圍會由 `g_near` 的 table 來計算 * 對於 `((256 << 2) + MIN_DIST) = 1211` 到 `(((256 << 3) << 2) + MIN_DIST)) - 1 = 8378` 的定點數表示範圍則去查詢 `g_far` * 對於超過 8378 的距離呢? * `*height = LOOKUP8(g_farHeight, 255) - 1` = 1，也就是對於一定距離外的物件，幾乎不需要把物件畫出來 * `step` 也做類似處理即可 * 注意到後面不小卻又把 `height` 跟 `step` 改掉了嗎w ## 定點數 table 的計算思路對照程式中的用途和 [precalculator.cpp](https://github.com/sysprog21/raycaster/blob/master/tools/precalculator.cpp) 可以得知各 table 是如何被建立的。 ### `delataAngle` 參照下方程式並執行，對照 table 就可以看出端倪。你可以想成現在就是把 360 度從 $0\space到\space2\pi$ 表示轉換到 $0\space到\space1024$。 ```cpp #include <math.h> #include <stdio.h> int main(){ for(int i = 0; i < 320; i++){ double angle = atanf( ((i - 160)/ 160.0f) * M_PI / 4); printf("%d arctan: %f %d\n",i, angle, (int) (1024 * (angle/(2*M_PI)))); } return 0; } ``` ### `g_cos` 嘗試執行以下的程式，對照 table 就可以知道，就是把浮點數的角度 $\theta$，轉成其對應定點數角度之 cos 值的定點數表示。 ```cpp= #include <math.h> #include <stdio.h> int main(){ for(int i = 0; i < 256; i++){ double angle = (2 * M_PI) * ( i / 1024.0f); printf("%d, %f cos: %d\n",i, angle, (int)(cos(angle) * 256.0f)); } } ``` ### `g_sin` 與 `g_cos` 的計算同理。 ```cpp= #include <math.h> #include <stdio.h> int main(){ for(int i = 0; i < 256; i++){ double angle = (2 * M_PI) * ( i / 1024.0f); printf("%d, %f sin: %d\n",i, angle, (int)(sin(angle) * 256.0f)); } } ``` ### `g_tan` `g_tan` 也類似，不過在 precalculator.cpp 中會改用 `M_PI_2` 來計算對應的角度。 ```cpp #include <math.h> #include <stdio.h> int main(){ for(int i = 0; i < 256; i++){ double angle = i * M_PI_2 / 256.0f; printf("%d, %f tan: %d\n",i, angle, (int)(tan(angle) * 256.0f)); } } ``` ### `g_cotan` `g_cotan` 也是類似上述的邏輯。 ```cpp #include <math.h> #include <stdio.h> int main(){ for(int i = 0; i < 256; i++){ double angle = i * M_PI_2 / 256.0f; printf("%d, %f tan: %d\n",i, angle, (int)( 1 / tan(angle) * 256.0f)); } } ``` ### `INV_FACTOR_INT` 在思考後續的 table 之前，需要計算定點數的 `INV_FACTOR`，可以嘗試從 floating-point 的作法對照回推。首先，可以看到 fixed-point 的 `INV_FACTOR` 定為: ```cpp #define INV_FACTOR_INT ((uint16_t)(SCREEN_WIDTH * 75)) ``` 則可以回推原本 floating-point 中的 `INV_FACTOR` ```cpp #define INV_FACTOR (float) (SCREEN_WIDTH * 95.0f / 320.0f) ``` 中 `95.0f` 位置應該要是 `75 / 256 * 320 = 93.75` (除 256 是因為定點數會使用後 8 bits 表示小數點部份)，換句話說，如果要把原本 fixed-point 做法百分之百轉換成 floating-point，其實應該要把 `INV_FACTOR` 定成 ```cpp #define INV_FACTOR (float) (SCREEN_WIDTH * 93.75f / 320.0f) ``` 不過把浮點數轉成定點數的狀況下，受限於編碼表示的數字範圍不同，我們需要合理的容忍一定的誤差。 ### `g_overflowOffset` / `g_overflowStep` 有了 `INV_FACTOR_INT`，就可以從浮點數的作法反推 `g_overflowOffset` 和 `g_overflowStep` 兩個 table 的計算是如何計算的。得知在距離為定點數的 0 ~ 255 時，其對應的浮點數距離轉換成 `textureStep`、`textureY` 該是多少。 ```cpp int main(){ for(int i = 1; i < 256; i++){ double screenY = (INV_FACTOR_INT / (i / 2.0f)) / 2.0f; double txs = (screenY * 2.0f); uint16_t textureStep = (256 / txs) * 256; double wallHeight = (txs - 256) / 2; uint16_t textureY = wallHeight * (256 / txs) * 256; printf("%d %d\n",textureY, textureStep); } return 0; } ``` 值得注意的點是，因為我們需要把浮點數表示合理的轉換到定點數範圍下，所以計算中在求至目標值以前需要先用浮點數運算，在最後一步時再 rounding 到定點數下的表示範圍。 ### `g_nearHeight` / `g_nearStep` 則對 `g_near` 的兩個 table，可以經過如下運算得到。 ```cpp int main (){ for(int i = 0; i < 256; i++){ uint16_t distance = ((i << 2) + MIN_DIST); uint16_t screenY = (INV_FACTOR_INT / (distance / 4)) / 4; double distance_double = ((i * 4.0f) + MIN_DIST); double screenY_double = (INV_FACTOR_INT / (distance_double / 4.0f)) / 4.0f; auto txs = (screenY_double * 2.0f); uint16_t textureStep = (256 / txs) * 256; printf("%d %d \n",i, screenY, textureStep); } return 0; } ``` 可以看到 `textureStep` 的求得需要重新計算更精準的 `distance` 和 `screenY`。 ### `g_farHeight` / `g_farStep` 對 `g_far` 的兩個 table，同理可以經過如下計算得到。 ```cpp int main (){ for(int i = 0; i < 256; i++){ uint16_t distance = ((i << 5) + MIN_DIST); uint16_t screenY = (INV_FACTOR_INT / (distance / 32)) / 32; double distance_double = ((i * 32.0f) + MIN_DIST); double screenY_double = (INV_FACTOR_INT / (distance_double / 32.0f)) / 32.0f; auto txs = (screenY_double * 2.0f); uint16_t textureStep = (256 / txs) * 256; printf("%d %d \n",i, screenY, textureStep); } return 0; } ``` ## 展現的缺失 ### 連續的牆壁在右邊的浮點數算繪，可以明顯看出計算的缺失，導致無法產生和定點數一樣連續的牆面。 ![](https://i.imgur.com/6De3UrG.png) 如果對照定點數的作法與浮點數作法的差異，就會發現浮點數版本出現上圖中問題的原因!回顧一下浮點數版本中計算 `screenY` 的方法: ```cpp if (distance > 0) { *screenY = INV_FACTOR / distance; auto txs = (*screenY * 2.0f); if (txs != 0) { *textureStep = (256 / txs) * 256; if (txs > SCREEN_HEIGHT) { auto wallHeight = (txs - SCREEN_HEIGHT) / 2; *textureY = wallHeight * (256 / txs) * 256; } } } else { *screenY = 0; } ``` 需注意到 `screenY` 的型態是 uint8_t，但是 `INV_FACTOR / distance` 的結果卻不一定可以用 8 bits 表示！則上述程式的 `*screenY = INV_FACTOR / distance` 就會出問題! 比對定點數的實作，當 `INV_FACTOR / distance * 2.0f` 大於 `SCREEN_HEIGHT`，也就是 8 bits 可以表示到的整數範圍時，我們應該要把他設成 `SCREEN_HEIGHT >> 1`，對應的程式修改為： ```cpp if (distance > 0) { uint16_t tmpY = INV_FACTOR / distance; auto txs = (tmpY * 2.0f); *screenY = tmpY; if (txs != 0) { *textureStep = (256 / txs) * 256; if (txs > SCREEN_HEIGHT) { *screenY = SCREEN_HEIGHT >> 1; auto wallHeight = (txs - SCREEN_HEIGHT) / 2; *textureY = wallHeight * (256 / txs) * 256; } } } else { *screenY = 0; } ``` 則可以看到預期的效果: ![](https://i.imgur.com/27dICqJ.png) ### 角落的牆壁當我們蜷伏在角落時(?)，會出現奇怪的牆壁。 ![](https://i.imgur.com/aHFqU8B.png) 而問題發生在 `LookupHeight`，因為缺少了 else 敘述導致 `if(ds >= 256)` 內的 statement 其實沒有真正作用，這也暗示了舊程式其實是有越界的 array 存取的。 ```diff void RayCasterFixed::LookupHeight(uint16_t distance, uint8_t *height, uint16_t *step) { if (distance >= 256) { const uint16_t ds = distance >> 3; if (ds >= 256) { *height = LOOKUP8(g_farHeight, 255) - 1; *step = LOOKUP16(g_farStep, 255); } + else{ *height = LOOKUP8(g_farHeight, ds); *step = LOOKUP16(g_farStep, ds); + } } ... ``` ### 消失的牆壁在特定的角度下，浮點數版本的牆壁會消失。 ![](https://i.imgur.com/jHS0RLQ.png) ```diff if (rayA <= M_PI_2) { startDeltaX = (1 - offsetY) * tan(rayA); startDeltaY = (1 - offsetX) / tan(rayA); } else if (rayA <= M_PI) { - if (offsetY == 0) { - startDeltaX = (1) * fabs(tan(rayA)); - } else { startDeltaX = (offsetY) *fabs(tan(rayA)); - } startDeltaY = -(1 - offsetX) / fabs(tan(rayA)); } else if (rayA < 3 * M_PI_2) { - if (offsetY == 0) { - startDeltaX = -(1) * fabs(tan(rayA)); - } else { startDeltaX = -(offsetY) *fabs(tan(rayA)); - } - if (offsetX == 0) { - startDeltaY = -(1) / fabs(tan(rayA)); - } else { startDeltaY = -(offsetX) / fabs(tan(rayA)); - } } else { - startDeltaX = -(1 - offsetY) * fabs(tan(rayA)); - if (offsetX == 0) { startDeltaY = (1) / fabs(tan(rayA)); - } else { startDeltaY = (offsetX) / fabs(tan(rayA)); - } } ``` ![](https://i.imgur.com/NA2LsTr.png =400x) 原因是因為上面的程式中，原本的作法相當於略過人物位於網格上的點的判斷(`offset = 0`)，這會使得位於牆壁所在 grid 時的判讀錯誤，因此予以修正。 ### 外側牆壁的比例 ![](https://i.imgur.com/5Pcn7qA.png) 觀察最外側的牆，可以看到兩邊的比例差距很大。發現是因為浮點數和定點數對牆壁的判別不一致導致(浮點數版本的牆壁座標比較靠內)，因此對浮點數的 `IsWall` 進行修改： ```diff bool RayCasterFloat::IsWall(float rayX, float rayY) { ... - if (tileX < 0 || tileY < 0 || tileX >= MAP_X - 1 || tileY >= MAP_Y - 1) + if (tileX < 0 || tileY < 0 || tileX > MAP_X - 1 || tileY > MAP_Y - 1) ... } ``` ![](https://i.imgur.com/MXqtmwe.png) ## 編譯時期產生 lookup table 為了讓 lookup table 可以根據 macro 定義的調整，也變成對應的數值，在編譯時期再產生 table，會相對直接在程式中寫死固定值更有彈性。對於固定大小(256)的 table，要在編譯時期產生 lookup table，可以技巧性的先定義 iterator: ```cpp #define S4(i) S1((i)), S1((i) + 1), S1((i) + 2), S1((i) + 3) #define S16(i) S4((i)), S4((i) + 4), S4((i) + 8), S4((i) + 12) #define S64(i) S16((i)), S16((i) + 16), S16((i) + 32), S16((i) + 48) #define S256(i) S64((i)), S64((i) + 64), S64((i) + 128), S64((i) + 192) ``` 接著，以 `g_tan` 為例，就可以透過類似以下的方式定義： ```cpp static const uint16_t g_tan[256] = { #define S1(i) (uint16_t)(256.0f * tan((i) *M_PI_2 / 256.0f)) S256(0) #undef S1 }; ``` 這個作法的好處是在大部分版本的 C/C++ 編譯器中都適用，使得我們可以在編譯時期產生 lookup table。但缺點是因為實際上只是把計算的展開規則用 macro 去歸納，所以會有所限制(例如對於某個重複的運算不能先用變數去儲存)，導致 macro 看起來美觀欠佳，編譯時間也可能大幅增加(因為 git commit 時會加上 cppcheck 等檢查，所需時間會有點久...)，如以下的 `g_nearStep` : ```cpp #define DELTA 1e-300 static const uint16_t g_nearStep[256] = { #define S1(i) \ (uint16_t)( \ (256 / \ ((((INV_FACTOR_INT / ((((i) *4.0f) + MIN_DIST) / 4.0f)) / 4.0f) * \ 2.0f) + \ DELTA)) * \ 256) S256(0) #undef S1 }; ``` 順帶一提，也可以看到這裡比起 [tools/precalculator.cpp](https://github.com/sysprog21/raycaster/blob/master/tools/precalculator.cpp) 的計算方式多用了一個 `DELTA`，這是因為要避免出現除以 0 的錯誤，但又不能直接用 if 判斷，所以用一個接近 0 的數字去保護。 ### `deltaAngle` 這個表格的產生則比較麻煩，如果考慮到 table 的大小會隨 `SCREE_WIDTH` 而有不同，則我們會沒辦法使用前面的方式產生。目前找到的唯一解法是利用 c++ 強大的 [constexpr](https://en.cppreference.com/w/cpp/language/constexpr)，並且搭配 [operator overloading](https://en.cppreference.com/w/cpp/language/operators) 技巧性的讓 `g_deltaAngle` 可以用 `[]` operator 像矩陣一樣的被操作。 :::warning 不過其實 constexpr 在 c++14 以上的版本才允許用 for 來定義，所以只好偷偷去改 Makefile:smile:。需再研究看看有沒有其他的替代方案。 ::: 則整個定義會如以下: ```cpp template <class T, int N> struct array { T elems[N]; constexpr T &operator[](size_t i) { return elems[i]; } constexpr const T &operator[](size_t i) const { return elems[i]; } }; // Function to build the lookup table constexpr array<uint16_t, SCREEN_WIDTH> gen_deltaAngle() { array<uint16_t, SCREEN_WIDTH> a = {}; for (int i = 0; i < SCREEN_WIDTH; i++) { float deltaAngle = atanf(((int16_t) i - SCREEN_WIDTH / 2.0f) / (SCREEN_WIDTH / 2.0f) * M_PI / 4); int16_t da = static_cast<int16_t>(deltaAngle / M_PI_2 * 256.0f); if (da < 0) { da += 1024; } a[i] = static_cast<uint16_t>(da); } return a; } constexpr static array<uint16_t, SCREEN_WIDTH> g_deltaAngle = gen_deltaAngle(); ``` 詳細的程式請參考 [precalculator.h](https://github.com/RinHizakura/raycaster/blob/master/precalculator.h)。 ## Frame rate 根據 [Frame rate](https://en.wikipedia.org/wiki/Frame_rate) 的定義，其所表示的是連續的 frame 顯示的頻率。可以粗略的理解為是每秒我們對遊戲畫面的更新次數。要計算每個 frame 的時間，不需要額外的 `time.h` 函式庫，我們可以直接通過 SDL 的 API 來完成，前面的章節也有提到 `SDL_GetPerformanceFrequency` 可以得到 high resolution counter 每秒可以有多少個 counter，則再通過計算每次更新畫面時的 counter 變化就可以回推時間。也就是程式中的： ```cpp const auto seconds = (nextCounter - tickCounter) / static_cast<float>(tickFrequency); ``` 則在 main loop 中額外加入程式 ```cpp cum_time += seconds; fps++; if(cum_time >= 1.0f){ prev_fps = fps; cum_time = 0; fps = 0; } ``` * 每次 loop 時，累計時間與 frame 的計算 * 如果計算時間多於一秒，則將目前累計的 `fps` 變數轉移到 `prev_fps` 中，並且重置累計的 frame 和時間為甚麼要把 `prev_fps` 把時間記下來呢？這是為了~~吃飽太閒~~想讓 frame rate 顯示在畫面上。 ![](https://i.imgur.com/5IEw5Lg.png) ```cpp #define draw(data, pos, i) \ for (i = 0; i < 4; i++) { \ uint32_t *lb = fb + ((pos) + 2 * i) + SCREEN_WIDTH; \ for (int j = 0; j < 16; j++) { \ int jj = j / 2; \ if (data[jj * 4 + i] == 1) { \ uint32_t *tb = lb; \ for (int x = 0; x < 2; x++) { \ *tb = 0x88444444; \ tb++; \ } \ } \ lb += SCREEN_WIDTH; \ } \ } void Renderer::ShowFPS(uint32_t fps, uint32_t *fb) { int iter; int i; for (i = 0; fps > 0; i++) { draw(number_data[fps % 10], SCREEN_WIDTH - 10 - 15 * i, iter); fps /= 10; } draw(colon_data, SCREEN_WIDTH - 10 - 15 * i, iter); i++; draw(s_data, SCREEN_WIDTH - 10 - 15 * i, iter); i++; draw(p_data, SCREEN_WIDTH - 10 - 15 * i, iter); i++; draw(f_data, SCREEN_WIDTH - 10 - 15 * i, iter); } ``` 在 `Renderer` 中添加新的 method，用很陽春的方式把該顯示數字對應到 0, 1 表示的 table 後，再畫在畫面上。則只要在 `floatRenderer.TraceFrame(&game, floatBuffer)` 之後再加上額外的 ```cpp floatRenderer.ShowFPS(prev_fps, floatBuffer); ``` 就可以達到如圖中的效果了。