692. Top K Frequent Words

###### tags: `Leetcode` `medium` `hash` `heap` `python` `c++` # 692. Top K Frequent Words ## [題目連結:] https://leetcode.com/problems/top-k-frequent-words/ ## 題目: Given an array of strings ```words``` and an integer ```k```, return the ```k``` most frequent strings. Return the answer **sorted** by **the frequency** from highest to lowest. Sort the words with the same frequency by their **lexicographical order**. **Example 1:** ``` Input: words = ["i","love","leetcode","i","love","coding"], k = 2 Output: ["i","love"] Explanation: "i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order. ``` **Example 2:** ``` Input: words = ["the","day","is","sunny","the","the","the","sunny","is","is"], k = 4 Output: ["the","is","sunny","day"] Explanation: "the", "is", "sunny" and "day" are the four most frequent words, with the number of occurrence being 4, 3, 2 and 1 respectively. ``` **Follow-up**: Could you solve it in O(n log(k)) time and O(n) extra space? ## 解題想法: * 此題為對單字按照出現次數由多到少進行排序 * 若出現次數相同，則按字母序小到大排序 * 最終返回前k個 ## Python: * 判斷出現次數: * 可用collections.Counter()進行統計 * [https://docs.python.org/zh-tw/3/library/collections.html] * Counter為dict的子類別，用以計算數量 * 對於其內容進行自定義排序 * sorted()創建排序好的新list * 使用**key=lambda** * 先表示按照出現次數，由大到小排好 * 再按字母順序，由小到大排好 ``` python= from collections import Counter class Solution(object): def topKFrequent(self, words, k): """ :type words: List[str] :type k: int :rtype: List[str] """ count=Counter(words) #-x[0]表示按照出現次數由大到小排好 #x[0] 按字母順序由小到大排好 dic=sorted(count.items(),key=lambda x:(-x[1],x[0])) #print(dic) [('i', 2), ('love', 2), ('coding', 1), ('leetcode', 1)] res=[item[0] for item in dic] #print(res) ['i', 'love', 'coding', 'leetcode'] return res[:k] if __name__=='__main__': result=Solution() ans=result.topKFrequent(words = ["i","love","leetcode","i","love","coding"], k = 2) print(ans) ``` ## C++: * 對於自定義排序: * 使用**cmp**: 需引入include< algorithm >[https://www.gushiciku.cn/pl/gh0b/zh-tw] * 使用priority_queue進行存放(為max_heap)[https://yuihuang.com/cpp-stl-priority-queue/] * 對於priority_queue中改變優先權定義: * priority_queue<T> pq; 預設由大排到小 * priority_queue<T, vector<T>, greater<T> > pq; 改成由小排到大 * priority_queue<T, vector<T>, **cmp**> pq; 自行定義 cmp 排序 * 其中需要注意對於宣告時候會出現template argument for template type parameter must be a type此問題: ![](https://i.imgur.com/0ooaKxL.png) ![](https://i.imgur.com/Z9I0Zgo.png) * 需加上**decltype**以解決上述問題: [https://stackoverflow.com/questions/16111337/declaring-a-priority-queue-in-c-with-a-custom-comparator] ![](https://i.imgur.com/u3K7enG.png) ``` cpp= #include<bits/stdc++.h> using namespace std; class Solution { public: vector<string> topKFrequent(vector<string>& words, int k) { vector<string> res(k); unordered_map<string, int> Freq; //use cmp //注意寫法: 先[]再()寫內容: 為lambda function方式編寫 auto cmp=[](pair<string, int>& a, pair<string, int>& b){ return a.second>b.second || (a.second==b.second && a.first<b.first); //出現次數多的先，若次數相同，則字母小的先 }; //priority_queue<T, vector<T>, cmp> pq; 自行定義 cmp 排序 priority_queue<pair<string, int>, vector<pair<string, int>>, decltype(cmp) > que(cmp); //count the frequence for (auto item: words){ Freq[item]+=1; } //push to the que for (auto item: Freq){ que.push(item); if (que.size()>k) que.pop(); } //let que's item into res //que: LIFO for (int i=que.size()-1; i>=0; i--){ res[i]=que.top().first; //first: string; second: int que.pop(); } return res; } }; ```