# Information Technology Industry Project Design Course: Assignment 3
## [Amazon - Software Dev Engineer](https://www.linkedin.com/jobs/view/4057712721)
:::spoiler Job Descriptions & Requirements
### Responsibilities
1. Support the development of the Ring and Blink FW and work with the SDET team to support the building up of the test framework with the QA team.
2. Collaborate with experienced cross-disciplinary Amazonians to conceive, design, and bring innovative products and services to market.
3. Design and build innovative technologies in a large distributed computing environment, and help lead fundamental changes in the industry.
4. Create solutions to run predictions on distributed systems with exposure to innovative technologies at incredible scale and speed.
5. Build distributed storage, index, and query systems that are scalable, fault-tolerant, low cost, and easy to manage/use.
6. Ability to design and code the right solutions starting with broadly defined problems.
7. Work in an agile environment to deliver high-quality software.
### Requirements
* Bachelor’s or Master’s Degree in Computer Science, Computer Engineering, or related field at time of application.
* Graduate between 2024/08 - 2025/08.
* Knowledge of the syntax of languages such as C/C++, Java, Python.
* Knowledge of Computer Science fundamentals such as object-oriented design, algorithm design, data structures, problem solving, and complexity analysis.
:::
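### Self-Assessment
:::success
* Have a foundation in CS fundamentals
* Able to pick up new technologies quickly: past projects involved a range of technologies (e.g., Firebase, React, GCP)
:::
:::danger
* Lack experience with distributed systems
* Not familiar with C/C++ or Java
* Weak on algorithms
:::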
## [AMD - Machine Learning Software Development](https://www.linkedin.com/jobs/view/4028252644)
:::spoiler Job Descriptions & Requirements
### Responsibilities
1. Deliver ML model solutions using JAX-based frameworks for distributed training or inference purposes.
2. Implement HIP/CUDA-based features such as FlashAttention, PagedAttention, MoE-GEMM, etc by using JAX and ROCm/HIP.
3. Optimize models for competitive performance and scalability.
4. Stay updated on the latest developments and best practices in JAX, XLA, and related technologies.
### Requirements
1. Proficiency in C/C++/HIP/CUDA programming languages.
2. Experience in optimizing models for performance and scalability.
3. Strong problem-solving skills and attention to detail.
4. Master's degree or higher in Computer Science or related field.
5. Expertise in AI/ML frameworks, particularly those based on JAX, for building neural network models.
6. Familiarity with ROCm/CUDA and corresponding profiling tools.
7. Knowledge of LLM is a plus.
8. Bachelor’s or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
:::
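### Self-Assessment
:::success
* Have ML project experience
* MS degree in CS
:::
:::danger
* Have never used the JAX framework and lack HIP/CUDA experience
* Lack hardware-related coursework and skills
* Not familiar with LLMs
:::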
## [Nvidia - AI Computing Software Development Engineer, TensorRT](https://www.linkedin.com/jobs/view/4074374493)
:::spoiler Job Descriptions & Requirements
### Responsibilities
1. Craft and develop robust inferencing software that can be scaled to multiple platforms for functionality and performance
2. Performance analysis, optimization and tuning
3. Closely follow academic developments in the field of artificial intelligence and feature update TensorRT
4. Provide feedback into the architecture and hardware design and development
5. Collaborate across the company to guide the direction of machine learning inferencing, working with software, research and product teams
6. Publish key results in scientific conferences.
### Requirements
1. Master or higher degree in Computer Engineering, Computer Science, Applied Mathematics or related computing focused degree (or equivalent experience)
2. 3+ years of relevant software development experience.
3. Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design.
4. Strong curiosity about artificial intelligence, awareness of the latest developments in deep learning like LLMs, generative and recommender models
5. Experience working with deep learning frameworks like TensorFlow and PyTorch
6. Proactive and able to work without supervision
7. Excellent written and oral communication skills in English
:::
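### Self-Assessment
:::success
* Have ML experience
* Have projects using TensorFlow and PyTorch
:::
:::danger
* Do not have 3+ years of experience
* C/C++ skills are insufficient
* Not familiar with LLMs
:::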
## Mock Interview
> * [Resume](https://docs.google.com/document/d/1BlQUyATnhJ2ViQJPc2mQpF9VwfE-VhsvQLgVChdaNaw/edit?usp=sharing)
> * [GitHub](https://github.com/yrtc99)
> interviewer
> candidate
---
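**Interviewer**: Hello. I understand you have experience in deep learning and data processing, particularly in data preprocessing for model training. To start, when you run into class imbalance during model training, how do you handle it?
**Candidate**: Hello, and thank you for having me today! I have indeed run into class imbalance in past projects. It usually biases the model's predictions toward the majority classes, and accuracy on the minority classes drops noticeably. I used SMOTE and ADASYN to oversample the minority classes.
**Interviewer**: Why did you choose SMOTE and ADASYN rather than plain random oversampling?
**Candidate**: Plain random oversampling does fix the imbalance in sample counts, but it can lead to overfitting, because the added samples are just copies of existing data and bring no new feature diversity. **SMOTE** and **ADASYN** can avoid this problem by synthesizing new samples.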
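**Interviewer**: How do SMOTE and ADASYN work?
**Candidate**: The core idea of SMOTE is to take a minority-class sample, find its neighbors in feature space, and generate new samples from them; synthetic samples produced this way preserve the feature distribution of the original data. ADASYN adaptively generates more samples according to how hard each sample is to learn.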
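**Interviewer**: It sounds like you know these techniques well. Did you run into any difficulties when applying them?
**Candidate**: Yes. One of the biggest challenges was controlling the influence of the synthetic samples. We found that if the proportion of generated samples is too high, it can introduce noise into the model and hurt overall accuracy. At the time, one class had so little data that its samples had no neighbors to learn from, and the data ADASYN produced came out duplicated, so in the end we used SMOTE to address the imbalance.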
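**Interviewer**: A very thorough consideration! In practice, how much does data preprocessing affect the overall performance of the model?
**Candidate**: I think data preprocessing is a critical part of model training. Well-cleaned, well-processed data often brings a bigger gain than improving the model architecture alone. In my experience, handling imbalanced data or doing feature engineering can raise model accuracy by 10-20%, and stability improves as well. For example, in an anomaly detection project for the aluminum extrusion industry, our data was highly imbalanced early on, with anomalous samples making up less than 1% of the total. After augmenting the anomalous samples with SMOTE, the recall of our anomaly detection model improved.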
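**Interviewer**: If we had you work on larger-scale datasets in the future, which issues do you think should be prioritized?
**Candidate**: For large-scale datasets, I think it is important to clean the data well, pay attention to feature selection and dimensionality reduction, and check whether the class distribution is balanced. These have a large effect on model accuracy and can reduce the interference of noise on the model.
**Interviewer**: A very clear answer! That's all for today's interview. I really appreciate your in-depth understanding of data processing and your careful application strategies. We will be in touch with you. Thank you for sharing today!
**Candidate**: Thank you for your time and the opportunity! I look forward to the chance to work together.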
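
As a supplement to the interview above, here is a minimal sketch of the two oversampling ideas the candidate describes. It is an illustration under stated assumptions, not the code used in the candidate's project and not the API of any library: the function names `smote` and `adasyn_allocation`, the tuple-based data layout, and the default `k=5` are all hypothetical. SMOTE is shown as interpolation between a minority sample and one of its minority-class neighbors; the ADASYN part only shows the allocation step, where minority samples surrounded by more majority-class neighbors receive a larger share of the synthetic samples.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    # A sample cannot have more neighbours than there are other minority samples.
    k = max(1, min(k, len(minority) - 1))
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # k nearest minority neighbours of the base point (Euclidean distance).
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def adasyn_allocation(minority, majority, n_new, k=5):
    """ADASYN-style allocation: synthetic samples per minority point.

    Each minority point is scored by the fraction of majority-class points
    among its k nearest neighbours; higher scores get more synthetic samples.
    """
    labelled = [(p, 0) for p in minority] + [(p, 1) for p in majority]
    ratios = []
    for base in minority:
        others = [(p, y) for p, y in labelled if p is not base]
        nearest = sorted(others, key=lambda item: math.dist(base, item[0]))[:k]
        ratios.append(sum(y for _, y in nearest) / len(nearest))
    total = sum(ratios)
    if total == 0:
        # No minority point has majority neighbours: fall back to an even split.
        return [n_new // len(minority)] * len(minority)
    return [round(n_new * r / total) for r in ratios]


if __name__ == "__main__":
    minority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0)]
    majority = [(3.1, 3.0), (3.0, 3.1), (2.9, 3.0), (10.0, 10.0)]
    print(smote(minority, n_new=3, k=2))
    print(adasyn_allocation(minority, majority, n_new=10, k=2))
```

The clipping of `k` in `smote` relates to the difficulty raised in the interview: when a class has very few samples, there are few neighbors to interpolate between, so the synthetic points can only fall on a handful of line segments. In practice, the `imbalanced-learn` package provides `SMOTE` and `ADASYN` classes in `imblearn.over_sampling`, used through `fit_resample`, and those would normally be preferred over a hand-written version.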
## Interview References
1. [Things You Must Do Before Your Interview](https://ithelp.ithome.com.tw/articles/10304813)
2. [[Reflections] 2021/2022 Taiwan New Grad SWE Interview Experience](https://medium.com/@william881218/2021-2022-新鮮人swe面試心得分享-b2db7dac018)
3. [2024 Job & Interview Experience: Machine Learning Engineer](https://come880412.medium.com/工作-面試分享-partii-machine-learning-engineer-da5ba5d66641)