info2024-homework3

# Assignment 3 for the 2024 course "[Information Technology Industry Project Design](https://hackmd.io/@sysprog/info2024)"

> Contributor: 阿財-Sausage
> Resume: [link](https://docs.google.com/document/d/1xGI7UqbCEFBcFYC5Pbq2S0UIF4KjGZqjk6b7S2dK06I/edit?usp=sharing)

# Background

National Cheng Kung University, Institute of Manufacturing Information and Systems

Yuan Ze University, Department of Computer Science and Engineering

# Job

Software Development Engineer - Big Data Solutions | Yahoo!

<details>

It takes powerful technology to connect our brands and partners with an audience of hundreds of millions of people. Whether you’re looking to write mobile app code, engineer the servers behind our massive ad tech stacks, or develop algorithms to help us process trillions of data points a day, what you do here will have a huge impact on our business and the world.

**A Little About Us**

Our engineering teams are focused on delivering the best products across Search, Mail, Mobile, Homepage, Sports, Daily Fantasy and Finance just to name a few, and we have fun while we do it. Our team structure encourages trust, learning from one another, having fun, and attracting people who are passionate about what they do.

**A Lot About You**

We are looking for an experienced developer familiar with the modern data stack and native cloud technologies. Even more significant than having used this specific tech is your desire and willingness to work with these technologies on one of the core teams within the organization. We are a fun group that's passionate about technology. We are committed to providing outstanding, timely customer support, so you’ll get to know lots of folks from across the organization who happen to be leaders in the Open Source and Big Data communities. We work well with one another and could use another forward-thinking mind to help round out the team!

**Required Experience:**

- Knowledge of programming concepts, software architecture, distributed systems, and public cloud environments
- Demonstrable experience as a software developer using backend programming languages (Java, Python, etc.)
- Extensive experience working with native cloud technologies (EMR, Glue, Dataproc, Dataflow)
- A passion for elegant code
- Outstanding interpersonal and communication skills

**The following skills and experience are considered a plus:**

- Experience working within an enterprise-level environment
- Spark
- Flink
- Terraform

Yahoo is proud to be an equal opportunity workplace. All qualified applicants will receive consideration for employment without regard to, and will not be discriminated against based on age, race, gender, color, religion, national origin, sexual orientation, gender identity, veteran status, disability or any other protected category.

Yahoo is dedicated to providing an accessible environment for all candidates during the application process and for employees during their employment. If you need accessibility assistance and/or a reasonable accommodation due to a disability, please submit a request via the Accommodation Request Form or call 408-336-1409. Requests and calls received for non-disability related issues, such as following up on an application, will not receive a response.

Yahoo has a high degree of flexibility around employee location and hybrid working. In fact, our flexible-hybrid approach to work is one of the things our employees rave about. Most roles don’t require specific regular patterns of in-person office attendance. If you join Yahoo, you may be asked to attend (or travel to attend) on-site work sessions, team-building, or other in-person events. When these occur, you’ll be given notice to make arrangements. If you’re curious about how this factors into this role, please discuss with the recruiter.

Currently work for Yahoo? Please apply on our internal career site.

</details>

MLOps Engineer | Pic College

<details>

As an MLOps Engineer, you will play a key role in designing, implementing, and maintaining robust machine learning platforms and data pipelines, ensuring smooth deployment, scaling, and monitoring of models in production. You will drive automation to accelerate the development, evaluation, and integration of machine learning models, improving collaboration and overall efficiency. In addition to optimizing production environments, you will act as a bridge between machine learning developers and software engineers, ensuring seamless integration of ML systems into applications. You will also share best practices for MLOps and have the opportunity to work on high-impact projects that reach millions of users, as well as help bring innovative new applications to market.

**Responsibilities:**

- Design, implement, and maintain the machine learning platform and data pipelines, ensuring seamless deployment, scaling, and monitoring of models in production environments.
- Set up monitoring systems for deployed models and track key metrics.
- Apply and share software engineering best practices within the context of machine learning.
- Collaborate with ML developers to ensure model performance is maintained in production and work with software engineers to integrate ML systems into the broader application stack.
- Accelerate machine learning development, evaluation, and integration speed through automation of workflows, tools, and processes to enhance collaboration and efficiency.

**Qualifications:**

- Strong programming skills in Python.
- Proficiency with containerization tools (Docker, Kubernetes) and cloud platforms (GCP, AWS, Azure; expertise in at least one).
- Experience working with backend servers and APIs (e.g., FastAPI, Django, or similar frameworks).
- Experience with machine learning frameworks (PyTorch, TensorFlow).
- Experience with MLOps tools (Kubeflow, MLflow, TFX).
- Experience with monitoring tools (Prometheus, Grafana) and logging frameworks.
- Knowledge of data engineering concepts (ETL pipelines, data lakes, data warehouses).

</details>

MTS Software Development Eng | AMD

<details>

**Job Description:**

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.

**THE ROLE:**

AMD is looking for an influential software engineer who is passionate about improving the performance of key applications and benchmarks. You will be a member of a core team of incredibly talented industry specialists and will work with the very latest hardware and software technology.

**THE PERSON:**

We are seeking a highly skilled and motivated individual to join our team as an AI/ML Solution Engineer. The ideal candidate will have a passion for machine learning and a strong background in software development, particularly in the areas of model performance optimization and distributed training. Attention to detail, problem-solving skills, and the ability to stay updated on emerging technologies are essential for success in this role.

**KEY RESPONSIBILITIES:**

- Deliver ML model solutions for distributed training or inference purposes.
- Implement HIP/CUDA-based features such as FlashAttention, PagedAttention, MoE-GEMM, etc., based on ROCm/HIP.
- Optimize models for competitive performance and scalability.
- Stay updated on the latest developments and best practices in advanced ML technologies.

**REQUIREMENTS:**

- Proficiency in Python/C/C++/HIP/CUDA programming languages.
- Experience in optimizing models for performance and scalability.
- Strong problem-solving skills and attention to detail.

**PREFERRED EXPERIENCE:**

- Master's degree or higher in Computer Science or related field.
- Expertise in AI/ML frameworks for building neural network models.
- Familiarity with ROCm/CUDA and corresponding profiling tools.
- Knowledge of LLM/Diffusion is a plus.

**ACADEMIC CREDENTIALS:**

- Bachelor’s or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent

**#LI-SC1 Benefits offered are described: .**

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

</details>

# Self Assessment

1. Software Development Engineer - Big Data Solutions | Yahoo!

   Strengths:
   - Internship experience in data engineering and backend development
   - Hands-on experience training AI models on distributed systems

   Weaknesses:
   - No hands-on experience with Flink or Terraform
   - Limited experience with native cloud technologies

2. MLOps Engineer | Pic College

   Strengths:
   - Internship experience in data engineering and backend development
   - Deep learning research experience and published papers

   Weaknesses:
   - No experience using monitoring, MLOps, or logging tools, and limited understanding of this area

3. MTS Software Development Eng | AMD

   Strengths:
   - Have trained AI models on distributed systems
   - Have built and modified AI models using open-source frameworks

   Weaknesses:
   - Lack of LLM-related experience
   - Lack of hands-on implementation experience with C/CUDA/HIP/ROCm

# Mock Interview

> :man: interviewer
> :baby: interviewee

## Software Development Engineer - Big Data Solutions | Yahoo!

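> Reference : [Top 25 Yahoo Software Engineer Interview Questions & Answers](https://interviewprep.org/yahoo-software-engineer-interview-questions/)

:man: : Please describe your experience developing high-performance applications.

:baby: : In a past project, I designed an ETL pipeline for large-scale website traffic data that could handle thousands of requests per second. By optimizing data structures and reducing database I/O operations, I improved processing efficiency by 27%.

:man: : Could you explain in detail how you achieved these efficiency gains?

:baby: : Sure. My improvements came down to the following points:

1. First, the ETL contained a check of the form `if a in list` to determine whether `a` already existed in the list. The `in` operator on a list runs in O(n) time, so I changed the list to a set, where `in` runs in O(1) time on average.
2. Second, one piece of logic had to retrieve data from MongoDB, but it iterated over all the query conditions and searched with `find_one()` one at a time. As the number of conditions grew, this produced too many I/O operations. I combined the conditions into a single query and retrieved everything at once with `$or` and `find()`, reducing database I/O.

## MLOps Engineer | Pic College

> Reference : [Top 100 MLOps interview questions and answers](https://razorops.com/blog/top-100-mlops-interview-questions-and-answers)

:man: : What is MLOps, and how does it differ from traditional DevOps?

:baby: : MLOps (machine learning operations) is a practice focused on the deployment, automation, and monitoring of machine learning models in production. Unlike DevOps, MLOps involves model training, model versioning, data drift detection, and continuous evaluation of models. In addition, MLOps has to handle data pipelines, feature engineering, and model retraining workflows, which traditional DevOps rarely covers.

## MTS Software Development Eng | AMD

:man: : Please describe your experience optimizing and scaling machine learning model performance within deep learning frameworks.

:baby: : In my previous research, I implemented features such as parameter freezing and quantization in the MMDetection framework and combined them with pretrained weights, improving model accuracy by more than 15%.

:man: : Could you elaborate on the pretrained weights? Did you train them yourself?

:baby: : Yes. Because our detection targets resemble text, we generated text images from free and paid font files available online and used them as a pretraining dataset. We then used parameter freezing to keep part of the weights fixed and trained the remaining weights on our target dataset.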
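
The two ETL optimizations described in the Yahoo mock interview answer can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the original pipeline code: the function names `find_new_items` and `build_or_filter`, the `user_id` field, and the `collection` variable are all hypothetical. The MongoDB part only builds the filter document, so the snippet runs without a database.

```python
from typing import Any, Dict, Hashable, Iterable, List


def find_new_items(incoming: Iterable[Hashable], seen: Iterable[Hashable]) -> List[Hashable]:
    """Return items from `incoming` that are not in `seen`, preserving order."""
    # One O(n) conversion up front; each `in` check afterwards is O(1) on
    # average, instead of O(n) per check against a list.
    seen_set = set(seen)
    new_items = []
    for item in incoming:
        if item not in seen_set:
            new_items.append(item)
            seen_set.add(item)  # also skips duplicates within `incoming`
    return new_items


def build_or_filter(conditions: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine several per-record filters into one MongoDB `$or` filter."""
    conditions = list(conditions)
    if not conditions:
        # MongoDB rejects an empty `$or` array, so fail early.
        raise ValueError("`$or` requires at least one condition")
    return {"$or": conditions}


# Hypothetical usage with a pymongo collection (not executed here):
#
#   Before: one database round trip per condition
#     docs = [collection.find_one(cond) for cond in conditions]
#
#   After: a single query covering every condition
#     docs = list(collection.find(build_or_filter(conditions)))
```

Note that the two query styles are not strictly equivalent: `find_one()` returns at most one document per condition, while a single `find()` with `$or` returns every matching document, so the caller may need to group or de-duplicate the results.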