Leetcode刷題學習筆記--Set/Multiset

# Leetcode刷題學習筆記--Set/Multiset ## Introduction ref : [Difference Between set, multiset, unordered_set, unordered_multiset in C++](https://www.geeksforgeeks.org/difference-between-set-multiset-unordered_set-unordered_multiset-in-cpp/) | | 排序 | 允許重複資料 | 用途 | | ------------------ | ------------------ | ------------ | --- | | set | :heavy_check_mark: | | 排序不重複資料 | | unordered_set | || 檢查資料單一性</br>當成hash table使用 | | multiset | :heavy_check_mark: | :heavy_check_mark: | 排序重複資料 | | unordered_multiset | |:heavy_check_mark:| | + set/multiset只能存放一個資料，不像map可以存key-value pair。 + ==數字放入set會依小到大排序，字串放入set會依字典序排序。== ## Time Complexity ref : [Data Structures: MULTI_SET vs SET vs UNORDERED_SET](https://medium.com/geekculture/data-structures-multi-set-vs-set-vs-unordered-set-bcc4f6609ac9) | Container | insert() | erase() | find() | Implement| | ------------------ | -------- | ------- | ------ | --- | | set | $O(logN)$| $O(logN)$|$O(logN)$|Binary Search Tree| | unordered_set | $O(1)$ | $O(1)$ |$O(1)$ |Hash Table| | multiset | $O(logN)$| $O(logN)$|$O(logN)$|Red Black Tree| | unordered_multiset | $O(1)$ | $O(1)$ |$O(1)$ |Hash Table| ## Code snippet ### 檢查資料是否重複把每筆資料轉換成單一的識別碼，從而檢查是否重複。可以使用set。 ```cpp= vector<int> nums; unordered_set<int> s(nums.begin(), nums.end()); // check if num has alreay existed in set if(s.count(num)) {...} ``` ### 只能insert和erase不能修改資料 ```cpp= unordered_set<int> s; s.insert(1); s.insert(2); s.erase(1); ``` + multiset/unordered_multiset 的資料刪除因為multiset和unordered_multiset可以允許重複的資料，使用erase或刪除相同的資料，所以必須使用find。 ```cpp= multiset<int> ms; ms.erase(ms.find(num)); ``` ### 如果是插入到set或是multiset，可以知道插入的位置。視情況如果想知道insert的位置。 ```cpp= set<int> s; int nums = 5; auto it = s.insert(nums); // get the iterator of inserted number int idx = it - s.begin(); ``` ### Traversal the set 使用iterator來存取所有的item，或是使用for loop來存取所有的資料。 ref : [Different ways to iterate over a set in C++](https://www.geeksforgeeks.org/different-ways-to-iterate-over-a-set-in-c/) ```cpp= set<int> s; // iterator為指向set中的element，可視為pointer，必須使用*來取值。 for(auto it = s.begin(); it != s.end(); ++it) cout << *it << endl; // for loop 直接就會拿到element的reference for(auto& n : s) cout << n << endl; // 必須使用const來定義輸入argument，因為set內的element也是const， // 所以只能刪除不能修改。 // (start iterator, end iterator, funcion pointer) auto print = [&](const int& x) { cout << x << endl; }; for_each(s.begin(), s.end(), print); for_each(s,begin(), s.end(), [&](const int& x){ cout << x << endl; }); ``` ### 改變Set排列順序 + 預設Set排列，integer為從小到大，string為字典序排列。 + 但是有些情況想要有integer從大到小的排列。 1. 使用iterator ```cpp= for(auto it = s.rbegin(); it != r.end(); ++it) { cout << *it << endl; } ``` 2. 使用compare ```cpp= set<int, less<int>> s; // 從小到大 set<int, greater<int>> s; //從大到小 ``` 3. 反轉key ```cpp= vector<int> nums; set<int> s; for(auto& n : nums) s.insert(-1 * n); ``` ### Sets of pairs in C++ ref : [Sets of pairs in C++](https://www.geeksforgeeks.org/sets-of-pairs-in-c/) 在Set中存放pairs。則會先依pairs first element排序，如果first element一樣就會用second element排序。 ### Custom Comparator ref : [Using custom std::set comparator](https://stackoverflow.com/questions/2620862/using-custom-stdset-comparator) + Modern C++11 Solution ```cpp= auto cmp = [](int a, int b) { return a < b; // 從小到大 }; set<int, decltype(cmp)> s(cmp); ``` + function pointer ```cpp= bool cmp(int a, int b) { return a < b; } set<int, decltype(&cmp)> s(&cmp); ``` ### [36. Valid Sudoku(Medium)](https://leetcode.com/problems/valid-sudoku/) 找出行,列，每個block是否有數字重複。 ```cpp= class Solution { public: bool isValidSudoku(vector<vector<char>>& board) { unordered_set<string> s; string t1, t2, t3; for(int i = 0; i < 9; ++i) { for(int j = 0; j < 9; ++j) { if(board[i][j]=='.') continue; t1 = to_string(i) + "r" + board[i][j]; t2 = to_string(j) + "c" + board[i][j]; t3 = to_string(i / 3) + to_string(j / 3) + "e" + board[i][j]; if(s.count(t1) || s.count(t2) || s.count(t3)) return false; else { s.insert(t1); s.insert(t2); s.insert(t3); } } } return true; } }; ``` ### [929. Unique Email Addresses(Easy)](https://leetcode.com/problems/unique-email-addresses/) 找出不同的email address，判斷規則參考題目。 > 使用unordered_set來過濾是否有重複的email address。**unordered_set<string> valid_mails;** > 把email分成local name和domain name。對localname進行處理。把"+"以後的字串都刪除掉，然後把"."都拿掉。完整程式碼如下: ```cpp= int numUniqueEmails(vector<string>& emails) { unordered_set<string> valid_mails; for(auto& mail : emails) { int pos = mail.find("@"); string localname = mail.substr(0, pos); string domainname = mail.substr(pos + 1); int p = localname.find("+"); string newname = localname.substr(0, p); newname.erase(std::remove(newname.begin(), newname.end(), '.'), newname.end()); valid_mails.insert(newname + "@" + domainname); } return valid_mails.size(); } ``` ### [220. Contains Duplicate III](https://leetcode.com/problems/contains-duplicate-iii/) 給你一個vector<int> nums, 找出index差小於等於k，且數字差小於等於t。 $abs(num[i] - nums[j]) <= t$ 且 $abs(i - j) <= k$。 > 1. 因為限制index差在k之內，所以使用slinding window，新增一個數nums[i]就刪除nums[i - k - 1] > 2. 使用multiset來記錄數字，因為數字可能重複。 > 3. 因為題目是要差值小於t，所以insert之後的數字只需要跟前後去比較。 > 4. **刪除的時候必須使用 s.earse(s.find(val)); 避免刪除全部相同的key。**，因為可能會遇到windows中好幾個相同的數值。 ```cpp= bool containsNearbyAlmostDuplicate(vector<int>& nums, int k, int t) { multiset<long> s; for(int i = 0; i < nums.size(); ++i) { // 使用find之後再erase避免刪除重複的key if(i - k - 1 >= 0) s.erase(s.find(nums[i - k - 1])); // O(logK + logK) // insert之後取得在multiset之中的位置。 auto it = s.insert(nums[i]); // O(logK) if(it != s.begin() && it - *prev(it) <= t) return true; if(next(it) != s.end() && *next(it) - *it <= t) return true; // time : O(N*3*logK) = O(NlogK), // K : size of slinding window // space : O(K) } return false; } ``` ###### tags: `leetcode` `刷題`