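# Assignment requirements

1. The notes should include the JDs (be sure to link to each company's JD), an analysis of each JD and a discussion of how well you match it, related interview questions, and a text-based self Q&A.
2. From "IC 產業結構和軟體工程師的機會" (the IC industry structure and opportunities for software engineers) and the official websites of the companies covered in 《0 到 100 的軟體工程師面試之路》, find job descriptions (JDs) that fit your own interests and plans. Find at least 3.
3. Analyze the skills these openings require and discuss how well you match them professionally.
4. Try to list interview questions for these (or similar) openings. They may be collected from web searches or rewritten yourself.
5. Without revealing personal information, follow the Q&A format of assignments 1 and 2 and answer the interview questions. A text record is enough. Avoid answering with textbook content alone; draw on your own past (or recently learned) programming experience wherever possible.

# [Associate Data Scientist / IBM](https://careers.ibm.com/job/16928585/-2023-campus-hire-associate-data-scientist-taipei-tw/?codes=IBM_CareerWebSite)

## Responsibilities

* Collaborate with a global Data Science team to develop and implement methodologies
* Work with solution leaders and engineers to ensure understanding of analytical requirements
* Ensure that all algorithms scale and if needed can be executed in real time
* Engage with customers to understand their business problems

## Requirements

* Statistical Knowledge such as regression, time series, mixed model, Bayesian methods, clustering, etc., to analyze data and provide insights
* Experience with large-scale, real-world problems on data science; deep learning modeling experience, such as CNN, RNN, LSTM and frameworks include TensorFlow, Keras
* Demonstrated communication skills; comfortable speaking with scientific and business audiences at all levels of an organization
* Basic understanding of programming language for analytics such as R, Python, Matlab, Scala
* Less than 2 years full-time work experience
* Fluent English and Mandarin communication

## Fit

* I interned at Google, where the work was mainly classifying data with deep learning. Along the way I used several tools for data visualization to help analyze the data, such as UMAP.
* I have taken data analysis courses and analyzed data with different charts, such as pie charts, count plots, heat maps, and box plots. I also have experience with feature analysis: creating new features and removing unrepresentative ones made the trained models stronger.
* I have used 5-fold cross-validation and GridSearchCV to improve model training, and have worked with a number of models such as RNN, VGG16, ResNet50, SVM, and KNN.

# [Backend Engineer, Platform Team / Yahoo](https://ouryahoo.wd5.myworkdayjobs.com/careers/job/Taiwan---Remote/Platform-Engineer_JR0020996)

## Responsibilities

* Responsible for design, implementation/development and unit testing of various modules with minimal supervision
* Participate in troubleshooting various problems in the system
* Participate in design discussion and reviews
* Participated in the full life cycle of a large project: design, implementation, testing, releasing and sustaining.

## Requirements

* Bachelor or Master in Computer Sciences or related majors
* Excellent knowledge and hands-on practice of web service development.
* Proficient at Java or PHP and Object oriented programming skills
* Understanding of data structures and algorithms
* Understanding of Analytical and problem solving skills
* Knowledge / experience of AWS (S3, DynamoDB, RDS), k8s, Spring Boot will be plus.
* Pursues work with energy, passion, drive, and intense customer focus
* A personal commitment to continuous learning and self development
* Excellent analytical and problem-solving skills and desire to learn new skills
* Ability to take initiative and be innovative
* Great sense of responsibility and attention to detail

## Fit

* I have taken related courses and can write object-oriented code in Java and C++.
* My undergraduate capstone project was networking-related: I programmed a switch in P4, and the project won second place at the project exhibition. I have a reasonable understanding of networking.

# [Software Engineer in Test (Intern) / Synology](https://career.synology.com/zh-tw/HQ/position/91)

## Requirements

- Currently enrolled in an undergraduate or master’s program
- Ability to commute to Synology's office in Banqiao for at least 24 hours per week or during epidemic prevention periods, to follow remote work policies
- Proficiency in Python or C++
- Familiarity with Unix/Linux
- Familiarity with Git
- Experience with JavaScript, CSS, and/or HTML

## Fit

* I can use C++, Python, and Git.
* My web and Linux knowledge is still lacking, but both are areas I currently plan to study, so I am recording this opening anyway.

# Mock interview - Associate Data Scientist / IBM

This mock interview is modeled on IBM's Personal Interview.

***Interviewer:*** Please share your past data analysis projects, experience, and skills.

***Me:*** In the summer of 2022 I interned at Google, where I was responsible for a project involving machine learning (ML) and data analysis. I can't go into detail about the topic because it involves confidential company information. In short, the goal was to design an ML-based method for identifying whether an image is normal or abnormal. Because abnormal images rarely occur in practice, the biggest challenge was the scarcity of abnormal data. I addressed this by combining a trained ResNet50 model as the feature extractor with a GMM as the one-class clustering model. The combination performed well across the different test items, with false positive rates held at 0% and 4% respectively. I also used UMAP to reduce the original high-dimensional data and features to two dimensions, in order to visualize the data distribution and verify that the feature extractor was effective. Finally, I designed a GUI that wraps all the models and preprocessing, so that users can automatically classify unlabeled images.

***Me:*** In addition, I took several data analysis courses at university. The projects I implemented include analyzing and predicting whether passengers on the Titanic died, analyzing and predicting Boston housing prices, and using an RNN to predict the future power output of wind turbines. In these assignments I analyzed data with different charts, such as pie charts, count plots, heat maps, and box plots. I also gained experience with feature analysis: creating new features and removing unrepresentative ones made the trained models stronger. I used 5-fold cross-validation and GridSearchCV to improve model training, and worked with a number of models such as RNN, VGG16, ResNet50, SVM, and KNN.

***Interviewer:*** What problems did you run into in these projects, and how did you solve them?

***Me:*** The one I remember best is from my Google internship. At first, because the data was extremely imbalanced, a conventional binary classifier performed very poorly: one class simply had too little data. It was quite a headache because I had no similar experience, and my host had none either, so neither of us knew how to handle it. In the end I found a number of papers on IEEE, searched online as well, found methods based on one-class classifiers, and implemented them. I also took the initiative to set up a weekly meeting with full-time employees in other departments who had relevant experience, to confirm that the methods I had found were feasible and useful, and to ask for advice and directions for future work.

***Interviewer:*** You already have internship experience at Google. Why didn't you consider staying there instead of coming to work at our company?

***Me:*** I do really like Google's culture and atmosphere. But because it was the first company I joined, I feel I know too little about the industry as a whole. Breadth of perspective matters a great deal, and I want to see the different cultures and environments of other strong tech companies. Since data science is my area of expertise and I saw a relevant opening, I chose IBM.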
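As a supplement to the first interview answer, the one-class idea (fit a density model on normal samples only, then flag anything that scores too low) can be illustrated with a minimal sketch. This is not the internship implementation, which is confidential and combined ResNet50 features with a GMM. Here a single diagonal Gaussian stands in for the GMM, synthetic 2-D points stand in for extracted image features, and every function name is hypothetical.

```python
import math
import random
import statistics


def fit_gaussian(features):
    """Fit an independent (diagonal) Gaussian to each feature dimension."""
    columns = list(zip(*features))
    means = [statistics.fmean(col) for col in columns]
    # Guard against a zero standard deviation on constant features.
    stds = [statistics.pstdev(col) or 1e-9 for col in columns]
    return means, stds


def log_likelihood(x, means, stds):
    """Log-density of one sample under the fitted diagonal Gaussian."""
    return sum(
        -0.5 * math.log(2 * math.pi * s * s) - (v - m) ** 2 / (2 * s * s)
        for v, m, s in zip(x, means, stds)
    )


def pick_threshold(scores, target_fpr):
    """Cutoff such that at most target_fpr of the given normal scores fall below it."""
    ordered = sorted(scores)
    k = int(len(ordered) * target_fpr)
    return ordered[k]


def is_abnormal(x, means, stds, threshold):
    """A sample is flagged when its log-likelihood is strictly below the cutoff."""
    return log_likelihood(x, means, stds) < threshold


# Synthetic "normal" features (assumption: stand-ins for real embeddings).
rng = random.Random(0)
normal_train = [[rng.gauss(0, 1), rng.gauss(5, 2)] for _ in range(500)]

means, stds = fit_gaussian(normal_train)
train_scores = [log_likelihood(x, means, stds) for x in normal_train]
threshold = pick_threshold(train_scores, target_fpr=0.04)

# Count how many normal training samples would be wrongly flagged.
flagged = sum(is_abnormal(x, means, stds, threshold) for x in normal_train)
far_point_flagged = is_abnormal([8.0, -10.0], means, stds, threshold)
center_flagged = is_abnormal([0.0, 5.0], means, stds, threshold)
```

Choosing the cutoff from the scores of normal data alone is what makes this usable when abnormal examples are scarce: the false positive rate on normal samples is set directly, and no abnormal data is needed for training. In practice the threshold would be picked on a held-out set of normal samples rather than on the training set itself.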