# AI/DE

[Teacher feedback](https:// "title")

> some words

---

## What are AI applications in signal processing and CV (computer vision)?

**Traditional CV**

> Introduction:
>
> Traditional CV generally refers to static images, which may be black-and-white, grayscale, or color images in various display formats, plus videos composed of a number of "frames." Data from some special domains also fall under CV, such as [biological vision](https://www.sciencedirect.com/science/article/pii/S0960982213015091), [lidar](https://waymo.com/blog/2022/09/informing-smarter-lidar-solutions-.html), [electron microscopy data](https://www.nature.com/articles/s41524-021-00652-z), and [particle-tracking data from the Large Hadron Collider](https://home.cern/news/news/knowledge-sharing/are-you-trackml-challenge). They are grouped this way because these data share many characteristics, and we use many of the same techniques when processing them.
>
> Features:
>
> For ordinary static visual data, traditional CV focuses on low-level features such as texture, edges, color, frequency, and segmentation; for temporal visual data, it focuses on features such as shot changes, optical flow, semantics (text), and multiple object tracking (MOT).
>
> Techniques:
>
> 2D-FFT, the Henry Classification System ([applied to fingerprint recognition](https://ieeexplore.ieee.org/document/5235014)), Histograms of Oriented Gradients ([used for optical flow detection](https://ieeexplore.ieee.org/abstract/document/6327977)), Gabor filters ([used in pathology analysis; this one is from NCKU, haha](https://ieeexplore.ieee.org/abstract/document/1647569)), wavelets ([a classic reference book](https://grail.cs.washington.edu/projects/wavelets/)), and so on. There are far too many to list.
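A minimal sketch of the first technique above, assuming only NumPy: the 2D FFT used as a crude low-pass filter. The random 64×64 "image" and the 16×16 pass band are hypothetical stand-ins, not taken from any project in this note.

```python
import numpy as np

# Move a (stand-in) image into the frequency domain, keep only the low
# frequencies around the center, and transform back: a crude low-pass filter.
rng = np.random.default_rng(0)
image = rng.random((64, 64))

spectrum = np.fft.fftshift(np.fft.fft2(image))  # shift DC to the center

mask = np.zeros_like(spectrum)
c = spectrum.shape[0] // 2
mask[c - 8:c + 8, c - 8:c + 8] = 1              # 16x16 low-frequency pass band

filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
print(filtered.shape)                           # (64, 64), smoothed image
```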
**CV after AI**

> Introduction:
>
> Since AlexNet debuted on ImageNet in 2012, CV research has seemingly begun to converge: almost all work involves **CNNs**, **MLPs**, **ResNet**, or **RNNs**. CNNs handle feature extraction for static visual data; different convolutional layer designs can cover texture, edge, color, and even frequency features. Once the desired features are extracted, tasks such as segmentation, recognition, and semantic retrieval are handled through the design of the MLP, the loss function, and the training set (a minimal sketch of this split appears at the end of this section). ResNet laid the groundwork for the large-model era by addressing the vanishing-gradient problem, so that accuracy no longer degrades as networks grow deeper. Finally, RNNs let AI step into temporal visual data, "seeing" both earlier and later parts of a time series at once. AI is often criticized for its unexplainable inference, but if we take a model apart, we find that many of its pieces actually imitate earlier CV techniques and research.
>
> Changes over the past five years:
>
> Real-world deployment
>
> > AI performs tasks very well, but it demands an enormous amount of computation; even pure inference still requires large compute budgets for some models, such as generative ones. There are currently three main remedies. First, hand the computation over the network to enterprises with large compute resources: governments plan the network infrastructure, large companies supply the compute, and small companies design user-facing services while renting compute from the large ones. Second, upgrade the hardware: more compute units, new hardware architectures, hardware accelerator designs, and more advanced process nodes and packaging. Third, research new model architectures and model refinements, such as the rise of diffusion models in the generative domain.
>
> Harder tasks
>
> > In 2017 the Google Brain team published the Transformer model. This architecture lets AI analyze data and make inferences from a higher level: attending to different parts of the same sample (as CNNs do) and referencing different samples (as RNNs do), while removing recurrence so that models could keep scaling up (as ResNet once made possible).
>
> > The strength of the new architecture, combined with advances in hardware and networking, led every field to apply AI to tasks it had rarely touched before (because AI had been considered unfit for them), such as complex control (reinforcement learning) and creative work (generative models).
>
> > Note: the timeline on the [Transformer Wikipedia page](https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)) lays out the field's line of research nicely.
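As a concrete, hypothetical illustration of the split described in the introduction above (convolutional layers extract features, an MLP head performs the task), here is a minimal PyTorch sketch; the layer sizes are arbitrary and not taken from any model mentioned in this note.

```python
import torch
import torch.nn as nn

# Minimal CNN-backbone + MLP-head classifier mirroring the split described
# above: convolutional layers extract texture/edge-like features, and the
# MLP head turns the pooled feature vector into class logits.
class TinyClassifier(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                    # global average pooling
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x))

logits = TinyClassifier()(torch.randn(1, 3, 32, 32))    # one fake RGB image
print(logits.shape)                                     # torch.Size([1, 10])
```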
---

## AI on chip and its architecture

[What is AI Chip Design?](https://www.synopsys.com/ai/what-is-ai-chip-design.html)

[Accelerating Inference with Sparsity Using the NVIDIA Ampere Architecture and NVIDIA TensorRT](https://developer.nvidia.com/blog/accelerating-inference-with-sparsity-using-ampere-and-tensorrt/)

---

## Related job openings

### [Nvidia AI jobs](https://nvidia.wd5.myworkdayjobs.com/NVIDIAExternalCareerSite?q=AI&locationHierarchy1=2fcb99c455831013ea52ed162d4932c0)

> * [NVIDIA Research uses AI to turn 2D photos into 3D scenes in an instant](https://blogs.nvidia.com.tw/2022/04/01/instant-nerf-research-3d-ai/)
> * [An optimized NeRF (Neural Radiance Fields) implementation](https://github.com/kwea123/nerf_pl)
> * [The repo author's own code walkthrough](https://www.youtube.com/playlist?list=PLDV2CyUo4q-K02pNEyDr7DYpTQuka3mbV)
> * [The REDS dataset, designed by NVIDIA researchers for deblur/super-resolution work](https://seungjunnah.github.io/Datasets/reds.html)

:::spoiler

#### [AI Algorithms SW Engineer - New College Graduate](https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/Taiwan-Taipei/AI-Algorithms-SW-Engineer---New-College-Graduate_JR1973032?q=AI&locationHierarchy1=2fcb99c455831013ea52ed162d4932c0)

**Qualifications:**
- MS or PhD in Computer Science, Computer Engineering, Electrical Engineering, or a related field in Deep Learning, Machine Learning, or Computer Vision.
- Algorithm development experience in data analytics, especially with LLMs and multi-modal foundation models.
- Experience working with deep learning frameworks like TensorFlow and PyTorch.
- Strong communication skills.

**Responsibilities:**
- You will be a key member of a growing software team that can architect, analyze, develop, and prototype key deep learning algorithms and solutions.
- Work and collaborate with different software, research, and hardware teams across geographies to solve critical problems.
- Understand and analyze the interplay of hardware and software architectures on future applications.
- Support engagements with customers and their third-party software providers, collaborating with Product Management, Marketing, and Developer Technology teams.

:::

:::spoiler

#### [Developer Technology Engineer - AI](https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/Taiwan-Taipei/Developer-Technology-Engineer---AI_JR1966721?q=AI&locationHierarchy1=2fcb99c455831013ea52ed162d4932c0)

**Qualifications:**
- A good degree from a leading university or equivalent experience in an engineering or computer science related discipline (MS or PhD preferred).
- 1-3+ years of related experience.
- Experience with parallel programming, ideally CUDA, OpenCL, and OpenACC.
- Confident knowledge of C/C++ and/or **Fortran**.
- Solid knowledge of software design, programming techniques, and algorithms.
- Strong mathematical fundamentals, including linear algebra and numerical methods.
- Good communication and organization skills, with a logical approach to problem solving, good time management, and task prioritization skills.
- Knowledge in a specific domain is a plus, such as Deep Learning or Machine Learning.

**Responsibilities:**
- Study and develop cutting-edge techniques in deep learning, graphs, machine learning, and data analytics, and perform in-depth analysis and optimization to ensure the best possible performance on current and next-generation GPU architectures.
- Work directly with key customers to understand the current and future problems they are solving and provide the best AI solutions using GPUs.
- Collaborate closely with the architecture, research, libraries, tools, and system software teams at NVIDIA to influence the design of next-generation architectures, software platforms, and programming models.

:::

### [AMD AI jobs](https://careers.amd.com/careers-home)

> * [Machine learning solutions powered by AMD Instinct™](https://www.amd.com/zh-hant/graphics/servers-instinct-deep-learning#%E6%8A%80%E8%A1%93)
> * [AMD CDNA™ 2 architecture](https://www.amd.com/zh-hant/technologies/cdna2): enhanced Matrix Core technology for HPC and AI

:::spoiler

#### [MTS Software Development Eng.](https://careers.amd.com/careers-home/jobs/35289?lang=en-us)

**Job Description:**
- This role will be part of the team of software engineers designing the next generation of ML and AI in the enterprise data center GPU space.

**Qualifications:**
- BS with several years of related experience, MS with years of related experience, or PhD with years of related experience in Computer Engineering, Electrical Engineering, Computer Science, or an equivalent field from top universities.
- Deep experience with distributed training and inference frameworks like Horovod, DeepSpeed, Lightning, Hugging Face training, Mosaic Composer, and Megatron.
- Deep experience with AI/ML frameworks like PyTorch, TensorFlow, JAX, etc. is preferred.
- Knowledge of acceleration platforms like GPUs, TPUs, APUs, and FPGAs is preferred.

**Responsibilities:**
- Experience training models on large clusters and managing checkpointing, learning rates, data sharing, etc.
- Experience deploying inference models using tensor- and model-parallel techniques.
- Play a role in all phases of software and model development, from requirement gathering, analysis, design, development, and testing to final release to customers.
- Provide clear and timely communication on status and other key aspects of the project to the leadership team.
- Work with data-scientist teams to resolve critical customer escalations as needed.
- Willingness to learn skills, tools, and methods to advance the quality, consistency, and timeliness of AMD software products.
- Experience with deep learning models and associated AI frameworks (PyTorch, TensorFlow, JAX (FLAN), Lightning, DeepSpeed, Mosaic, Horovod, Colossal).

:::

### [Google DE](https://www.google.com/about/careers/applications/jobs/results?location=Taiwan&target_level=EARLY&employment_type=FULL_TIME)

> * [Google Pixel 8 unboxing! Great photos, noise-canceling recording, and AI magic that can even fix faces in group shots](https://tw.news.yahoo.com/google-pixel-8-%E6%99%BA%E6%85%A7%E5%9E%8B%E6%89%8B%E6%A9%9F%E9%96%8B%E7%AE%B1-%E6%8B%8D%E7%85%A7%E5%8E%B2%E5%AE%B3-150314137.html?guccounter=1&guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&guce_referrer_sig=AQAAAAyqJUpr_ew1__YV33CrqQkY6g4lChNvIKY_I7mQbv-OmVIZhho9U35-C-It-AFYvmkNxfpPMtGF_RLyMEnYI49fwm-i5KT2-5eiEtTRqtj0Opuv55232QYrz7_pzWcSBTl4WiA0gq_4o6CUZ-4-qZ6m5DFphGWLksU0w4TXsZkj)
> * [Google Pixel 8 Pro vs. iPhone 15 Pro Max Speed Test](https://www.youtube.com/watch?v=YAtJOX-lQew)

:::spoiler

#### [**Multimedia Silicon Engineer**](https://www.google.com/about/careers/applications/jobs/results/142054263418168006-multimedia-silicon-engineer?location=Taiwan&target_level=EARLY&employment_type=FULL_TIME)

**Job Description:**
- In this role, you will be part of a research and development team building high-performance, low-power hardware and software to enable Google's continuous innovations in mobile image and AI processing. Your responsibilities include designing and verifying components and delivering the IPs to the system for integration.

**Minimum qualifications:**
- Bachelor's degree in Computer Science, Electrical Engineering, Computer Engineering, a related technical field, or equivalent practical experience.
- Experience with C/C++ or Verilog/SystemVerilog.
- Experience in computer architecture or digital design.

**Preferred qualifications:**
- Master's degree or PhD in Computer Science, Electrical Engineering, or a related technical field.
- 3 years of experience in RTL design or verification.
- Experience designing, implementing, or validating RTL designs.
- Experience in a scripting language (e.g., C/C++), programming, and software design.
- Knowledge of object-oriented programming.

**Responsibilities:**
- Perform RTL design and simulation using Verilog/SystemVerilog.
- Perform RTL verification using industry-standard methodologies, participate in test planning, and perform coverage analysis.
- Create tools/scripts to automate tasks and track progress.
- Work with multi-disciplined, multi-site teams on RTL design, verification, or architecture/micro-architecture planning.

:::

:::spoiler

#### [**Silicon Engineer, University Graduate, 2024**](https://www.google.com/about/careers/applications/jobs/results/118684907669988038-silicon-engineer-university-graduate-2024?location=Taiwan&target_level=EARLY&employment_type=FULL_TIME)

**Job Description:**
- Your team designs and builds the hardware, software, and networking technologies that power all of Google's services. As a Hardware Engineer, you design and build the systems that are the heart of the world's largest and most powerful computing infrastructure. You develop from the lowest levels of circuit design to large system design and see those systems all the way through to high-volume manufacturing. Your work has the potential to shape the machinery that goes into our cutting-edge data centers, affecting millions of Google users.

**Minimum qualifications:**
- Bachelor's degree in Engineering or equivalent practical experience.
- **Academic coursework in computer architecture (e.g., core, cache, memory, etc.).**
- (You're welcome to take the instructor's morning computer architecture course; "how many floors must a bag of rice be carried up?")
- Experience with C/C++ or RTL.

**Preferred qualifications:**
- Advanced degree in Computer Science, Electrical Engineering, or a related field.
- Experience designing/implementing or validating RTL designs (e.g., core, cache, fabric, memory, codec, etc.).
- Knowledge of OS, firmware, or software stacks.
- Knowledge of performance or power architecture, and power estimation, modeling, or optimization of processors or ASICs.
- Excellent scripting-language, C/C++ programming, and software design skills.

**Responsibilities:**
- Perform performance validation and simulation using C/C++ and RTL-based models, and performance correlation.
- Analyze results in both qualitative and quantitative fashion.
- Create tools/scripts to automate test suites and models to improve the functionality of simulators.
- Participate in the evaluation of future ASIC designs and general architecture.

:::

### [MediaTek AI & DE](https://careers.mediatek.com/eREC/JobSearch?sortBy=&order=&page=1&searchKey=&category=&workExp=&branch=&program=)

> * [MediaTek will hold an event on the evening of 11/6, expected to unveil its next-generation flagship processor, the Dimensity 9300](https://mashdigi.com/mediatek-has-announced-that-it-will-hold-an-event-on-the-evening-of-11-6-and-is-expected-to-announce-the-new-generation-flagship-processor-dimensity-9300/)
> * [ARM Cortex-X4](https://en.wikipedia.org/wiki/ARM_Cortex-X4)

:::spoiler

#### [**Research Engineer**](https://careers.mediatek.com/eREC/JobSearch/JobDetail/MRTW20200311000?returnUrl=%2FeREC%2FJobSearch%3FsortBy%3DWorkExp%26order%3Ddescending%26page%3D1%26searchKey%3DAI%26category%3D%26workExp%3D0002%26branch%3D%26program%3D)

**Job Description:**
- MediaTek Research (MRTW) is looking for aspiring ML/DL candidates to assist our team of researchers. The ideal candidate is open-minded, passionate about learning theory, and keen on opportunities to challenge convention.
- **One of your major responsibilities is to assist your worldwide colleagues in developing innovative deep learning theories.**
- **Another is to apply modern deep learning theories to novel real-world applications and methodologies.**

**Qualifications:**
- Advanced degree (MSc or above) in Mathematics, Computer Science, Electrical Engineering, or an equivalent degree in a related field.
- Experience applying DL/ML to real-world problems in a non-black-box fashion.
- (Optional for fresh grads) Publications in the main track of DL/ML conferences as a main author.

:::

:::spoiler

#### [**Video Decoder IC designer**](https://careers.mediatek.com/eREC/JobSearch/JobDetail/MTK120220401000?returnUrl=%2FeREC%2FJobSearch%3FsortBy%3D%26order%3D%26page%3D1%26searchKey%3Ddigital%2520IC%26category%3D%26workExp%3D0011%26branch%3D%26program%3D)

**Job Description:**
- Architecture design and RTL implementation of multi-format video decoder digital circuits
- Verification of multi-format video decoder digital circuit designs

**Qualifications:**
- Familiar with digital IC design, DFT, and FPGA emulation flow
- Familiar with video compression standards (e.g., H.264, HEVC, etc.) and related principles
- A keen sense of responsibility
- Careful and discreet

:::

---

## Assessment

### AI/DE jobs

- AI
    - Most of my target jobs require work experience or complete, relevant design experience
    - Because AI research is so broad, the postings rarely specify which kind of research or development the role involves
    - Computer architecture: I hope to design models that match the hardware, or to develop matching accelerators for general-purpose models
- DE
    - Most of my target jobs require work experience or complete, relevant design experience
    - DE roles expect candidates to have already been through the full design, simulation, and verification flow
    - Computer architecture: in particular, fluency in bus and cache design is expected

### Self-assessment

- I graduated from a physics department, so I can keep pace with other kinds of industry upgrades (e.g., as chip process nodes keep shrinking)
- I completed NCKU's Artificial Intelligence credit program, taking courses such as "DSP", "IC design", "Multimedia content analysis", and "Data mining", completing several projects along the way and presenting them on GitHub
- I am currently taking Prof. Huang's Computer Architecture course
- My lab is not an IC lab
- I lack experience with a complete IC design flow
- My past side projects were all AI-related, which in Taiwan is often dismissed as just plugging data into existing models
- The job market has been frozen lately

---

## Mock interview

🧔: interviewer
👶: interviewee

### Background questions

🧔: Asks about the details of my generative AI implementation.

👶: I used a CGAN to cluster data by their extracted latent codes and assign corresponding labels; the number of labels is fixed in advance. Every frame of a video can be clustered this way, so discontinuous frames can still be recognized as the same scene. In my side project, I applied this to shot-change detection.

🧔: Aren't GANs hard to get to converge? Why not use a diffusion model, which converges more easily?

👶: When I did this project, diffusion models were only beginning to challenge the dominance of GANs, so I chose a GAN as my backbone at the time. I did notice the convergence problem during my experiments, so at the end of the report I proposed switching to a diffusion model as the backbone.

### Background knowledge

#### DSP

🧔: Please explain Fourier analysis.

[Fourier analysis, illustrated](https://hackmd.io/@sysprog/fourier-transform)

#### Deblur

🧔: Without AI techniques, how would you deblur an image?

👶: Deblurring is the process of recovering a sharp image $S$ from a blurred image $B$. Mathematically this can be written as $B = S * K$, where $K$ is the blur kernel, so traditional deblurring solves the inverse convolution. When the kernel is known, "Wiener deconvolution" can be used; when it is unknown, "blind deconvolution" techniques apply.

🧔: Building on the traditional theory, can you sketch an interpretable deblurring model?

👶: We can use a MAP framework and write deblurring as minimizing the error between the ground-truth image and the model output, where $y$ is the observed blurred image, $x$ the latent sharp image, $k$ the blur kernel, and $\downarrow_s$ denotes downsampling by a factor of $s$:

$$E(x)=\frac{1}{2\sigma^2}\|y-(x*k)\downarrow_s\|^2+\lambda\Phi(x)$$

👶: Using half-quadratic splitting, we can rewrite this in an iterative form and then treat each iteration as one layer of a neural network, which gives an interpretable AI deblurring model.
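To complement the answer above, here is a minimal NumPy sketch of Wiener deconvolution under the known-kernel assumption; the image, the box kernel, the noise-to-signal ratio, and the circular blur are all simplifying assumptions of mine, not part of the original answer.

```python
import numpy as np

# Wiener deconvolution sketch for B = S * K with a *known* blur kernel K.
# Frequency-domain estimate: S_hat = conj(K) / (|K|^2 + nsr) * B,
# where nsr stands in for the noise-to-signal power ratio.
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))                 # hypothetical sharp image

kernel = np.zeros((64, 64))
kernel[:5, :5] = 1.0 / 25.0                  # 5x5 box blur, zero-padded

K = np.fft.fft2(kernel)
blurred = np.fft.ifft2(np.fft.fft2(sharp) * K).real   # circular blur

nsr = 1e-3                                   # nsr = 0 gives the unstable inverse filter
S_hat = np.fft.ifft2(np.conj(K) / (np.abs(K) ** 2 + nsr) * np.fft.fft2(blurred)).real

# Compare the recovery error against the blur itself.
print(np.abs(S_hat - sharp).mean(), np.abs(blurred - sharp).mean())
```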
#### Transformer

🧔: What is the attention mechanism?

👶: [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/abs/1409.0473)

🧔: What is the relationship between the Transformer and the attention mechanism?

👶: Positional encoders tag the data elements entering and leaving the network. Attention units follow these tags and compute a kind of algebraic map describing how each element relates to the others. Attention queries are typically executed in parallel by computing a matrix of equations, in what is called multi-headed attention.
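Since the answer stays at the conceptual level, here is a minimal single-head scaled dot-product attention sketch in NumPy, the core computation that multi-head attention replicates in parallel over learned projections; the token count and dimensions are arbitrary.

```python
import numpy as np

# Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
# Single head; multi-head attention runs several of these in parallel on
# learned projections of the input and concatenates the results.
def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                      # pairwise relevance of tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # weighted mix of value vectors

rng = np.random.default_rng(0)
Q, K, V = (rng.random((6, 8)) for _ in range(3))       # 6 tokens, width 8 (arbitrary)
print(attention(Q, K, V).shape)                        # (6, 8)
```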
#### Computer Architecture

🧔: Explain the RISC and CISC designs.

👶: ![](https://hackmd.io/_uploads/rkNOLO0Ma.png)

👶: There are many CISC architectures. Taking x86 as an example, the technique used there is to **decode** each CISC instruction into several micro-operations, and those micro-operations are then handled by a RISC-style core.

![](https://hackmd.io/_uploads/Sy3Xk5gR3.png)

🧔: Are you familiar with the TPU architecture?

👶: ![](https://hackmd.io/_uploads/SJlOtcxAh.png)

---

## Appendix

* Writing the "What are AI applications in signal processing and CV (computer vision)?" section gave me PTSD; it brought back last year's days and nights buried in CS-department coursework. Honestly, NCKU's faculty in CV and signal processing is quite strong; many professors did their doctoral or master's theses in these areas, and you can learn a great deal from them through their courses. This field runs very deep, but in Taiwan the salaries for related openings aren't high. QQ

---

## Something I haven't used, but it's interesting

* [Do Vision Transformers See Like Convolutional Neural Networks?](https://arxiv.org/abs/2108.08810)
* [Emergent Abilities of Large Language Models](https://arxiv.org/abs/2206.07682)
* [Generative Pre-trained Transformer Detailed explanation](https://vinija.ai/models/gpt/)