721. Accounts Merge

# 721. Accounts Merge ###### tags: `leetcode` `DFS` `721` `medium` `google` ## :memo: Question Given a list of `accounts` where each element `accounts[i]` is a list of strings, where the first element `accounts[i][0]` is a name, and the rest of the elements are **emails** representing emails of the account. Now, we would like to merge these accounts. Two accounts definitely belong to the same person if there is some common email to both accounts. Note that even if two accounts have the same name, they may belong to different people as people could have the same name. A person can have any number of accounts initially, but all of their accounts definitely have the same name. After merging the accounts, return the accounts in the following format: the first element of each account is the name, and the rest of the elements are emails **in sorted order**. The accounts themselves can be returned in **any order**. ### Example 1: ``` Input: accounts = [["John","johnsmith@mail.com","john_newyork@mail.com"],["John","johnsmith@mail.com","john00@mail.com"],["Mary","mary@mail.com"],["John","johnnybravo@mail.com"]] Output: [["John","john00@mail.com","john_newyork@mail.com","johnsmith@mail.com"],["Mary","mary@mail.com"],["John","johnnybravo@mail.com"]] Explanation: The first and third John's are the same person as they have the common email "johnsmith@mail.com". The second John and Mary are different people as none of their email addresses are used by other accounts. We could return these lists in any order, for example the answer [['Mary', 'mary@mail.com'], ['John', 'johnnybravo@mail.com'], ['John', 'john00@mail.com', 'john_newyork@mail.com', 'johnsmith@mail.com']] would still be accepted. ``` ### Example 2: ``` Input: accounts = [["Gabe","Gabe0@m.co","Gabe3@m.co","Gabe1@m.co"],["Kevin","Kevin3@m.co","Kevin5@m.co","Kevin0@m.co"],["Ethan","Ethan5@m.co","Ethan4@m.co","Ethan0@m.co"],["Hanzo","Hanzo3@m.co","Hanzo1@m.co","Hanzo0@m.co"],["Fern","Fern5@m.co","Fern1@m.co","Fern0@m.co"]] Output: [["Ethan","Ethan0@m.co","Ethan4@m.co","Ethan5@m.co"],["Gabe","Gabe0@m.co","Gabe1@m.co","Gabe3@m.co"],["Hanzo","Hanzo0@m.co","Hanzo1@m.co","Hanzo3@m.co"],["Kevin","Kevin0@m.co","Kevin3@m.co","Kevin5@m.co"],["Fern","Fern0@m.co","Fern1@m.co","Fern5@m.co"]] ``` Constraints: * `1 <= accounts.length <= 1000` * `2 <= accounts[i].length <= 10` * `1 <= accounts[i][j] <= 30` * `accounts[i][0] consists of English letters.` * `accounts[i][j] (for j > 0) is a valid email.` ## :memo: 題意 :bulb: 這題都是文字所以要仔細看清楚題目的意思， ```python # 結構長這樣 input = [["name", "email"....],["name","email"...],...] ``` :bulb: 要將相同名字的 email 合併再一起，但有可能有人是相同名字但是是不同人，怎麼判定是同一個人，如果`同一個人那他的名字一定會相同而且會有一個相同的 email，最後的 output 名字要在最前面，email 要依照字母排序`。 ## :memo: My first solution * **Hint**: For every pair of emails in the same account, draw an edge between those emails. The problem is about enumerating the connected components of this graph. * :medal: **思考一**: 這題我有看 hint: 將每個 email 的關係圖建立出來 * :medal: **思考二**:一看到Constraints，有明確限制大小和範圍的通常暴力解都可以過，但我就不確定面試時，主考官會不會想看到暴力解。 * :medal: **思考三**:看到 hint 這樣提示，就是往 dfs 走了。 * :medal: **思考四**:我必須要有一個 visited 去檢視我有走過的點，有走過就不走了。 ```python= class Solution: def accountsMerge(self, accounts: List[List[str]]) -> List[List[str]]: relation = [[] for i in range(len(accounts))] visited = [False for i in range(len(accounts))] for i in range(len(accounts)-1): for j in range(i+1, len(accounts)): for x in accounts[i][1:]: if x in accounts[j][1:]: relation[i].append(j) relation[j].append(i) break ans = [] def dfs(i, data): for j in relation[i]: if not visited[j]: data += accounts[j][1:] visited[j] = True data = dfs(j, data) return data for i in range(len(relation)): if not visited[i]: visited[i] = True result = dfs(i, accounts[i]) x = result[0:1] + sorted(set(result[1:])) ans.append(x) return ans ``` ## :memo: bigO * 時間複雜度: m * n * n (7~13行) + n (25行) * nlogn (我有做一個sort 在29行) * 空間複雜度: 2*n ( relation matrix + visited matrix) + m*n (ans output) ## :-1: **檢討** * 雖然有過，但效率很慢。 * 建立 relation table 的這一步應該要簡化。 ## :memo: leetcode solution ```python= class Solution: def accountsMerge(self, accounts: List[List[str]]) -> List[List[str]]: em_to_name = {} graph = collections.defaultdict(set) for acc in accounts: name = acc[0] for email in acc[1:]: graph[acc[1]].add(email) graph[email].add(acc[1]) em_to_name[email] = name seen = set() ans = [] for email in graph: if email not in seen: seen.add(email) stack = [email] component = [] while stack: node = stack.pop() component.append(node) for nei in graph[node]: if nei not in seen: seen.add(nei) stack.append(nei) ans.append([em_to_name[email]] + sorted(component)) return ans ``` ## :memo: My first solution Fix ```python= class Solution: def accountsMerge(self, accounts: List[List[str]]) -> List[List[str]]: em_to_name = {} relation = collections.defaultdict(set) for acc in accounts: name = acc[0] for email in acc[1:]: relation[acc[1]].add(email) relation[email].add(acc[1]) em_to_name[email] = name seen = set() ans = [] def dfs(email, data): for r_email in relation[email]: if r_email not in seen: seen.add(r_email) data.append(r_email) data = dfs(r_email, data) return data for email in relation: if email not in seen: seen.add(email) ans.append([em_to_name[email]] + sorted(dfs(email, [email]))) return ans ``` ## :memo: bigO * 時間複雜度: m*n (matrix的每個值都看過一遍) + a * aloga (a = relation[i] 的長度) * 空間複雜度: relatoin matrix and dfs data list ## :memo: 為什麼這種寫法會快很多？ * 簡化了 relation table，導致後面根據 relation matrix 進行 dfs 都減少了運算。 ``` => realtion = { email : set(email1,email2....) email2: set(emailx,emaily....) .... ... } ```