The Art of Readable Code & Clean Code: Shared Notes

# Prologue

> 💡 The key message of this book is to put yourself in the reader's place. What will someone think when they see the code for the first time?
> What do they want to know, and what do they not want to know?
> What do you want them to do, and what do you not want them to do?

![image](https://hackmd.io/_uploads/BJt-nZq26.png "Two main resources")

# Naming

## Intention-revealing name

> Give every name a meaning, so that the reader understands its intent at a glance.

```c
/* bad examples */
int d, size, len, start;
char* p;
int getID (socketArray* arr, int index) {
    // ...
}

/* good examples */
int daysPerMonth, stringSize, startTimeStamp;
char* pbuffer;
int getSocketNumber (socketArray* arr, int index) {
    // ...
}
```

```c
int size; // not a good idea: the size of what?
int mem_size;
int numbers;
int height;
int width;
int volume;
int weight;

int get_data (char* src); // get data locally, or via the internet?
int get_local_data (char* src);
int fetch_rmt_data (char* src);
int download_rmt_data (char* src);

int send (char* src); // the way of interaction is unknown
int send_message (char* src); // emphasizes the "send" action
int deliver_notification (char* src); // deliver_package (char* src)
int dispatch_event (char* src); // emphasizes the delivery of something to a recipient
```

![image](https://hackmd.io/_uploads/HyWSaDBh6.png)

[**English thesaurus**](https://dictionary.cambridge.org/zht/thesaurus/)

## Make meaningful distinctions

> Avoid vague or meaningless names. This problem does not usually arise, but it can creep in when an engineer "temporarily renames" something just to get past the compiler or interpreter, or when several functions or variables do similar things.

```c
/* bad examples */
int func1(void);
int func2(void);
int procData(void);
int procDta(void);
void* strcpy(char* a1, char* a2);
char* readfile(FILE* fp);
char* read_file(FILE* fp);
int getObj(void);
int getObjLen(void);
int getObjHeight(void);

/* Improved examples */
int calculateInterest(void); // Assuming func1 is used to calculate interest
int updateAccountBalance(void); // Assuming func2 is used to update account balance
int processPaymentData(void); // Assuming procData is used to process payment data
int processCustomerData(void); // Assuming procDta is a typo, actually meant to process customer data
void* copyString(char* destination, char* source); // Correcting function name and parameter
// names to more clearly reflect function and intent
char* readFileContent(FILE* filePointer); // Clearer function name and parameter name
char* fetchFileMetadata(FILE* filePointer); // Assuming read_file is for fetching file metadata; clearer parameter name
int getObjectID(void); // Assuming getObj is for obtaining the object's ID
int getObjectLength(void); // Clearer version of getObjLen
int getObjectHeight(void); // Clearer version of getObjHeight
```

## Concrete over abstract

> Describe a variable or function concretely and directly; don't beat around the bush. You can also use the terminology of your domain to keep names short.

- **Original name**: `ServerCanStart()`
  - **Problem**: Not specific enough. It only states a possible outcome (the server can start) rather than the actual operation of listening on a TCP/UDP port.
  - **Improved**: `CanListenOnPort()` or `IsPortAvailableForListening()`
- **Original name**: `CheckUser()`
  - **Problem**: It is unclear what is being checked about the user.
  - **Improved**: `IsUserAuthenticated()` or `DoesUserExist()`
- **Original name**: `DataValid()`
  - **Problem**: It does not say which data is valid, or by what standard.
  - **Improved**: `IsResponseDataValid()` or `IsRequestDataComplete()`
- **Original name**: `UpdateConfig()`
  - **Problem**: It is unclear which settings are updated, or in what situation.
  - **Improved**: `UpdateUserPrivacySettings()` or `RefreshApplicationConfiguration()`
- **Original name**: `RunJob()`
  - **Problem**: It does not say what the job actually does.
  - **Improved**: `ExecuteDataBackupJob()` or `StartEmailSyncProcess()`

```c
// Before: Ambiguous function name
bool ServerCanStart();
// After: Clear function name indicating the action of listening on a port
bool CanListenOnPort();
bool IsPortAvailableForListening();

// Before: Unclear what user information is being checked
bool CheckUser();
// After: Specific function names indicating the purpose of the check
bool IsUserAuthenticated();
bool DoesUserExist();

// Before: Vague naming not specifying what data is being validated
bool DataValid();
// After: Descriptive function names clarifying the type of data being validated
bool IsResponseDataValid();
bool IsRequestDataComplete();

// Before: General function name not specifying what configuration is updated
void UpdateConfig();
// After: Detailed function names specifying what settings are being updated
void
UpdateUserPrivacySettings();
void RefreshApplicationConfiguration();

// Before: Generic job function name without details
void RunJob();
// After: Specific function names describing the job being executed
void ExecuteDataBackupJob();
void StartEmailSyncProcess();
```

## Make pronounceable name

> *If you can’t say the word, then you don’t understand it*
> Names that are hard to pronounce make discussion difficult.

- **Original name**: `cfss` (stands for check fota socket status)
  - **Problem**: The abbreviation cannot be pronounced, which makes discussion unclear.
  - **Improved**: `checkFotaSocketStatus` or `fotaSocketStatusCheck`
- **Original name**: `rddt` (stands for retrieve device data)
  - **Problem**: The abbreviation is hard to pronounce, which makes communication harder.
  - **Improved**: `retrieveDeviceData` or `getDeviceData`
- **Original name**: `gtsm` (stands for get system memory)
  - **Problem**: The abbreviation is not a natural-language word, so it is hard to recognize and pronounce.
  - **Improved**: `getSystemMemory` or `systemMemoryGet`
- **Original name**: `updtRec` (stands for update record)
  - **Problem**: The abbreviation is short, but not easy to pronounce or understand.
  - **Improved**: `updateRecord` or `recordUpdate`
- **Original name**: `prcUsr` (stands for process user)
  - **Problem**: The abbreviation makes the name hard to pronounce and understand.
  - **Improved**: `processUser` or `userProcess`

```c
// Before: Confusing abbreviation that is hard to pronounce
void cfss();
// After: Clear function name that reveals its intention
void checkFotaSocketStatus();
void fotaSocketStatusCheck();

// Before: Abbreviation that is difficult to pronounce and understand
void rddt();
// After: Descriptive function name that is self-explanatory
void retrieveDeviceData();
void getDeviceData();

// Before: Non-word abbreviation that is hard to identify and pronounce
void gtsm();
// After: Clear and meaningful function name
void getSystemMemory();
void systemMemoryGet();

// Before: Abbreviated form that is not intuitive or easy to pronounce
void updtRec();
// After: Fully spelled out function name for better clarity
void updateRecord();
void recordUpdate();

// Before: Abbreviated name that makes pronunciation and understanding difficult
void prcUsr();
// After: Explicit function name clarifying the action
void processUser();
void userProcess();
```

> Everything has
> exceptions …
> for example, `pbuffer` is acceptable
> The key is that these abbreviations are **established conventions**.

- **Name**: `httpClient`
  - **Reason**: In network programming and application development, `http` is a very common abbreviation for HyperText Transfer Protocol. Because it is so widely used and recognized, most developers understand `httpClient` immediately even though it contains an abbreviation.
- **Name**: `dbConn`
  - **Reason**: In database work, `db` is widely accepted as short for database, and `Conn` is a common abbreviation of connection. `dbConn` is not easy to pronounce, but developers with database experience understand it easily.
- **Name**: `usrId`
  - **Reason**: Although `usr` is an incomplete form of `user`, the abbreviation is customary in many systems and easy to understand. `Id` is generally read as short for `identifier`, and the combination is very common in software development.
- **Name**: `recvPacket`
  - **Reason**: In network programming, `recv` is a common abbreviation of receive, and `packet` is a standard term in the field. Even though `recv` is not a complete word, its prevalence in the domain makes the name easy to recognize and accept.
- **Name**: `srcAddr`
  - **Reason**: `src` and `Addr` are common abbreviations of `source` and `address`. In areas such as network address translation and data transfer, these abbreviations are standard and widely accepted.

```c
HTTPClient httpClient; // Assuming HTTPClient is a type defined in a network library
DatabaseConnection *dbConn; // Assuming DatabaseConnection is a type defined in a database library
int usrId; // If user IDs are numerical
char* usrId; // If user IDs are strings
Packet recvPacket; // Assuming Packet is a struct or class for network packets
char* srcAddr; // If source address is a string IP address
Address srcAddr; // Assuming Address is a struct for more complex addressing needs
```

### Common abbreviations in C projects

| Abbreviation | Full name |
| ---- | ---- |
| skb | Socket Buffer |
| fd | File Descriptor |
| dev | Device |
| cfg | Configuration |
| alloc | Allocate |
| dealloc | Deallocate |
| init | Initialize |
| irq | Interrupt Request |
| msg | Message |
| ptr | Pointer |
| rx | Receive |
| tx | Transmit |
| reg | Register or Regular |
| buf | Buffer |
| len | Length |
| addr | Address |
| cnt | Count |
| max | Maximum |
| min | Minimum |
| param | Parameter |
| val | Value |
| ver | Version |
| cmd | Command |
| stat | Status |
| tmp | Temporary |
| temp | Temporary |
| idx | Index |
| num | Number |
| cpy | Copy |
| del | Delete |
| disp | Display |
| err | Error |
| exec | Execute |
| fmt | Format |
| hdlr | Handler |
| inc | Include or Increment |
| info | Information |
| mod | Module |
| ops | Operations |
| pkt | Packet |
| proc | Process or Procedure |
| recv | Receive |
| req | Request |
| res | Resource or Response |
| ret | Return |
| sync | Synchronize |
| usr | User |
| util | Utility |

## Add extra information

> Add the unit or format to a variable's name so that users handle unit- or format-bearing values correctly.

```c
int sec;
int sec_ms;
uint8_t* data;
uint8_t* raw_data;
uint8_t* hex_data;
```

Adding extra information to a variable name, such as its unit, format, or type, is a naming principle that improves readability and maintainability. It lets the reader grasp a variable's purpose and nature quickly, without looking up its declaration or documentation. The details and some examples follow.

### Unit information

Adding the unit to the names of variables that hold time, length, weight, and similar quantities makes the measure explicit.

- **Time unit**: `int delay_ms;` means the delay is in milliseconds.
- **Length unit**: `float width_cm;` means the width is in centimeters.
- **Temperature unit**: `double temp_celsius;` means the temperature is in degrees Celsius.

### Format information

For variables that store data in a specific format, indicating the format in the name helps the reader understand how to handle the data.

- **Data format**: `char* date_iso8601;` means the date string follows the ISO 8601 standard format.
- **Encoding**: `char* name_utf8;` means the name string is UTF-8 encoded.

### Type information

For certain kinds of data, especially where the type places specific requirements on the processing logic, indicating the type in the variable name is very useful.

- **Pointer type**: `void* ptr_resource;` makes it explicit that this is a pointer to a resource.
- **Boolean state**: `bool is_visible;` indicates whether an object is visible.
- **Enumeration type**: `Color color_primary;` indicates that the color uses the enumeration type `Color`.

```c
// 1. Adding unit information
int delay_ms; // Delay time in milliseconds
float width_cm; // Width in centimeters
double temp_celsius; // Temperature in Celsius

// 2. Adding format information
char* date_iso8601; // Date string following the ISO 8601 standard format
char* name_utf8; // Name string in UTF-8 encoding

// 3. Adding type information
void* ptr_resource; // Pointer to a resource
bool is_visible; // Indicates whether an object is visible
Color color_primary; // Color using the Color enumeration type

// 4.
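// Combining information
int speed_kmph; // Speed in kilometers per hour
unsigned char* buffer_hex; // Data buffer stored in hexadecimal format
```

### Cautions

Although this naming approach makes code clearer and more self-documenting, there are some caveats:

- **Avoid excessive length**: Keep variable names concise. Packing in too much information makes a name long and hard to read.
- **Consistency**: Use one naming convention across the whole project or team to avoid confusion.

```c
// Examples of overly verbose naming
int delayBeforeStartOperationInMilliseconds; // Too long and hard to read
float distanceBetweenTheTwoPointsInCentimeters; // Excessively detailed and unnecessary

// Inconsistent naming
int userAgeYears; // 'Years' used at the end in one place
int userHeightInCm; // 'InCm' used at the end in another place
double weight_kg; // Underscore and lowercase abbreviation used elsewhere
```

## Make it searchable

> When a constant appears many times, giving it a name makes the program easier to understand and easier to search.
> When the program has to change, you only modify the named value instead of fixing each literal one by one.

* Avoid numeric literals: a literal such as 86400 is hard to interpret and hard to search for. A constant with a clear meaning, such as SECONDS_PER_DAY, is more readable and easier to search.

```c
for (int j=0; j<34; j++) {
    s += (t[j]*4)/5;
}
```

```c
int realDaysPerIdealDay = 4;
const int WORK_DAYS_PER_WEEK = 5;
int sum = 0;
for (int j=0; j < NUMBER_OF_TASKS; j++) {
    int realTaskDays = taskEstimate[j] * realDaysPerIdealDay;
    int realTaskWeeks = (realTaskDays / WORK_DAYS_PER_WEEK);
    sum += realTaskWeeks;
}
```

### Advantages

1. **Related code is easy to locate**: A name with a clear meaning (such as `WORK_DAYS_PER_WEEK`) is easier to search for and locate in a codebase than an abstract or generic name (such as `d` or `num`). This helps a great deal when you need to find the relevant code quickly in a large project.
2. **Searches are more efficient**: In a full-text search, a name with a specific meaning produces fewer false hits, so the results are more relevant and accurate.

### In terms of modification and debugging

1. **Simpler modification**: When variable and constant names directly reflect their meaning and purpose, changing the code (for example, adjusting a constant's value) becomes more direct and safer, because the developer can quickly see what the variable does and how far its effect reaches.
2. **Easier to understand and maintain**: More readable code lets other developers (or your future self) follow the logic more easily, which simplifies maintenance and reduces mistakes while debugging.
3. **Lower risk of introducing errors**: When code has to be modified or refactored, clear names reduce the risk of introducing bugs through a misunderstanding of what a variable is for.

### When to apply

- **Long-lived projects**: For projects that need long-term maintenance and iteration, clear naming is critical. It helps new team members understand the project structure and code logic quickly.
- **Teamwork**: In a collaborative environment, a unified and clear naming convention ensures that every member can quickly understand and contribute code.
- **Complex systems**: In systems with complex logic or many modules, clear naming helps distinguish and manage the different parts of the system.

### When not to apply

- **Very short-lived, local use**: For temporary variables used only within a very small scope, a highly detailed name may be unnecessary, for example the index variable `i` in a short loop.
- **Prototyping and rapid iteration**: During rapid prototyping or exploratory programming, paying too much attention to naming may slow development down. Once the prototype is confirmed and the code enters a more formal development stage, however, proper naming becomes important.

```c
#define MAX_ITEM_NUMS 10

int raffle_box[MAX_ITEM_NUMS] = {1,2,3,4,5,6,7,8,9,10};
int total_prize = 0;

// this
for (int i = 0; i < MAX_ITEM_NUMS; ++i) {
    total_prize += raffle_box[i];
}

// or this ?
for (int the_value_of_the_item = 0; the_value_of_the_item < MAX_ITEM_NUMS; ++the_value_of_the_item) {
    total_prize += raffle_box[the_value_of_the_item];
}
```

## Avoid generic name

> When naming a variable or function, consider whether a generic name conveys enough information, or whether it will confuse other developers.

:::danger
**Avoid** generic iterator names when:
- The loops are deeply nested and the logic is complex
- You cannot immediately tell what the local variable refers to from the loop's start and stop conditions
- The scope of the iterator extends beyond the loop itself
:::

**example**

![image](https://hackmd.io/_uploads/Hy7-kfqhp.png)

> In nested loops, names such as `si`, `ui`, and `mi` may be a better choice for avoiding mistakes.

### Iterators

**example 1 :**

```c
#include <stdio.h>

#define MAX_LAYERS 5
#define MAX_ROWS 10
#define MAX_COLS 10

int main() {
    // Assume a three-dimensional temperature data set
    double temperatureData[MAX_LAYERS][MAX_ROWS][MAX_COLS] = {
        {{0.0}} // zero placeholder so the example compiles; initialize the data here...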
    };

    // Stores the average temperature of each layer
    double layerAverages[MAX_LAYERS] = {0.0};

    // Iterate over every layer
    for (int layerIndex = 0; layerIndex < MAX_LAYERS; layerIndex++) {
        double totalTemp = 0.0;
        int totalCount = 0;

        // Iterate over every row
        for (int rowIndex = 0; rowIndex < MAX_ROWS; rowIndex++) {
            // Iterate over every column
            for (int colIndex = 0; colIndex < MAX_COLS; colIndex++) {
                totalTemp += temperatureData[layerIndex][rowIndex][colIndex];
                totalCount++;
            }
        }

        // Compute and store the average temperature of the current layer
        layerAverages[layerIndex] = totalTemp / totalCount;
    }

    // Print the average temperature of each layer
    for (int layerIndex = 0; layerIndex < MAX_LAYERS; layerIndex++) {
        printf("Layer %d average temperature: %.2f\n", layerIndex, layerAverages[layerIndex]);
    }

    return 0;
}
```

> In this example the array operations are wrapped in three nested loops. Using distinguishable local variable names in this situation effectively reduces annoying bugs.

**example 2 :**

![image](https://hackmd.io/_uploads/r1H9WMc2T.png)

> Here, someone taking over the project for the first time may not easily work out what `p` is from the loop's start and stop conditions. `p` is simply the cursor of a linked-list traversal, but a name like `currentAddrInfo` makes the current operation much clearer.

**example 3 :**

![image](https://hackmd.io/_uploads/S1Ly7Gc2p.png)

> The scope of `i` and `j` clearly extends beyond their own for loops, especially since the code further down also uses `j>0` and `i<j`. More specific names would clearly improve readability.

### Temporary variables

> The name `tmp` should be used only in cases when being **short-lived and temporary** is the most important fact about that variable.

:::success
**Accept**
* The operation logic is extremely simple
:::

:::danger
**Avoid**
* The variable does more than hold a value briefly: it is also passed to functions or macros, used in complex calculations, and so on
:::

**example 1 :**

```c=
if (right < left) {
    tmp = left;
    left = right;
    right = tmp;
}
```

> A simple swap is short and short-lived, so the generic name `tmp` is fine here.

**example 2 :**

![image](https://hackmd.io/_uploads/SJaSHG5np.png)

> What is `tmp` actually used for? What do the three arguments in `fun (tfm, tmp, tmp)` stand for?

### Return value

> The name `retval` doesn’t pack much information. Instead, use a name that describes the variable’s value.
:::success
**Accept**
* `ret` holds a function's return value and its meaning can be inferred from the function name itself
* `ret` only receives a temporary return value and its purpose is clear from the macro or function that produced it
:::

:::danger
**Avoid**
* `ret` stores a computed result; give it a descriptive name instead
:::

**example 1 :**

```c=
#include <stdio.h>

static int _isPositive(int number) {
    int ret = number > 0;
    return ret;
}

int main() {
    int number = 5;
    int ret = _isPositive(number);
    if (ret) {
        printf("%d is positive.\n", number);
    } else {
        printf("%d is not positive.\n", number);
    }
    return 0;
}
```

> We can infer what the return value means from `_isPositive(number)`, and the scope of `ret` is short, so `ret` is a safe name for the return value.

**example 2 :**

```c=
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

bool check_user_permission(const char* user_name) {
    bool ret = false;
    if (strcmp(user_name, "admin") == 0) {
        ret = true; // admin has permission
    }
    // omit
    return ret;
}

int main() {
    const char* user_name = "admin";
    bool ret = check_user_permission(user_name);
    if (ret) {
        printf("%s has permission.\n", user_name);
    } else {
        printf("%s does not have permission.\n", user_name);
    }
    return 0;
}
```

> The meaning of `ret` can be worked out from the function name `check_user_permission` and its return type. There are no comments, but that does not hurt understanding.

**example 3 :**

```c=
#include <math.h>

// bad example
double euclidean_norm (double* buf, int length) {
    double ret = 0.0;
    for (int i = 0; i < length; ++i) {
        ret += buf[i] * buf[i];
    }
    return sqrt(ret);
}

// improved example
double euclidean_norm (double* buf, int length) {
    double sum_squares = 0.0;
    for (int i = 0; i < length; ++i) {
        sum_squares += buf[i] * buf[i];
    }
    return sqrt(sum_squares);
}
```

> Even though this is simple arithmetic, `ret` should still be avoided. By comparison, `sum_squares` tells us what the whole computation is doing.

```c=
double process_data(double base, int years, double rating) {
    double ret = base;
    ret += (years > 5) ? base * 0.1 * (years - 5) : 0;
    ret *= (rating > 4.5) ? 1.2 : (rating >= 3.5) ?
        1.1 : 0.9;
    return ret;
}
```

> Heaven knows what `ret` stands for.

### Stored variables

:::danger
**Avoid**
* If this stored variable is global
* If there are multiple stored variables within the same code section
:::

### Common data structures

## How long should a name be?

> Detail vs. brevity: how to reach the sweet spot?

## Using prefix for naming

- Underscores commonly seen in kernel and C projects
    - \_function
    - g_variable
    - two_concept_words
    - \_\_static_function
    - \_\_VARIABLES_IN_KERNEL
- lower_separated
    - struct / enum / union

## Why not Hungarian Notation?

- Hungarian notation
    - intTemp
    - strName
    - pPtr
- Why not use Hungarian notation
    - Code style and conventions
    - Maintenance cost
    - The evolution of IDEs

## Linux coding style and GNU standard coding style

- [Linux coding style](https://www.kernel.org/doc/html/v4.10/process/coding-style.html)
- [GNU standard coding style](https://www.gnu.org/prep/standards/standards.html)
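
As a rough sketch of how the two guides differ in layout, the same function is written below in each style. The names `sum_positive_linux` and `sum_positive_gnu` are hypothetical and chosen only for this illustration; they do not come from either guide. Only the most visible rules are shown (indentation, brace placement, and where the function name starts), so the linked documents remain the authoritative reference.

```c
/* Linux kernel style: tab indentation, the function's opening brace on its
 * own line, the opening brace of a loop on the same line as the statement,
 * and no braces around a single statement. */
int sum_positive_linux(const int *values, int count)
{
	int sum = 0;

	for (int i = 0; i < count; i++) {
		if (values[i] <= 0)
			continue;
		sum += values[i];
	}
	return sum;
}

/* GNU style: return type on its own line so the function name starts in
 * column one, a space before the opening parenthesis, two-space indentation,
 * and the braces of a nested block indented under the statement. */
int
sum_positive_gnu (const int *values, int count)
{
  int sum = 0;

  for (int i = 0; i < count; i++)
    {
      if (values[i] <= 0)
        continue;
      sum += values[i];
    }
  return sum;
}
```

Both functions behave identically; only the layout differs. Whichever guide a project follows, applying it consistently matters more than the choice itself.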