week_4_cv_w_ai_document_intelligence

# CV with AI tools

- ## ~~Azure Form Recognizer~~ [Azure Form Recognizer is now Azure AI Document Intelligence](https://techcommunity.microsoft.com/t5/azure-ai-services-blog/azure-form-recognizer-is-now-azure-ai-document-intelligence-with/ba-p/3875765)

---

# CV with AI tools

- ## Document Intelligence Studio

https://formrecognizer.appliedai.azure.com/studio

---

#### Prebuilt models

![](https://learn.microsoft.com/zh-tw/azure/ai-services/document-intelligence/media/studio/welcome-to-studio.png?view=doc-intel-3.1.0)

---

#### Custom models

<iframe src="https://www.microsoft.com/zh-tw/videoplayer/embed/RE5fX1c?postJsllMsg=true&autoCaptions=zh-tw" width="100%" height="600" frameborder="0" marginheight="0" marginwidth="0">Custom models</iframe>

---

#### No-code usage

<iframe src="https://www.microsoft.com/zh-tw/videoplayer/embed/RE56n49?postJsllMsg=true&autoCaptions=zh-tw" width="100%" height="600" frameborder="0" marginheight="0" marginwidth="0">Quickstart</iframe>

---

#### Quickstart

https://learn.microsoft.com/zh-tw/azure/ai-services/document-intelligence/quickstarts/try-document-intelligence-studio?view=doc-intel-3.0.0

---

#### Beyond the Studio UI, an API is also available
#### It can be integrated into your own services

- https://sfoteini.github.io/blog/automate-document-processing-with-azure-form-recognizer-global-ai-bootcamp/
- https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/how-to-guides/use-sdk-rest-api?view=doc-intel-3.0.0&preserve-view=true&tabs=windows&pivots=programming-language-python

---

# What technology powers such a capable service?

---

## Optical Character Recognition (OCR)

![](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/images/vision-studio-ocr-demo.png)

---

# A technique from the field of computer vision

https://azure.microsoft.com/zh-tw/resources/cloud-computing-dictionary/what-is-computer-vision/

---

# Helping computers understand images with convolutional neural networks (CNNs)

![](https://stanford.edu/~shervine/teaching/cs-230/illustrations/architecture-cnn-en.jpeg?3b7fccd728e29dc619e1bd8022bf71cf)

- https://developer.ibm.com/articles/introduction-to-convolutional-neural-networks/
- https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-convolutional-neural-networks

---

# ImageNet

![](https://www.image-net.org/static_files/index_files/logo.jpg)

- https://www.image-net.org/index.php

---

### Multimodal Large Language Model

<iframe src="https://player.vimeo.com/video/867815782?h=14133f853d" width="100%" height="499" frameborder="0" marginheight="0" marginwidth="0">ChatGPT</iframe>

---

### Multimodal Large Language Model

https://openai.com/blog/chatgpt-can-now-see-hear-and-speak

![](https://hackmd.io/_uploads/S1xSyBZxT.jpg)
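
---

#### CNN building blocks: a minimal sketch

The CNN slide earlier in this deck shows a pipeline of convolution, activation, and pooling layers. As a rough illustration of what those three steps compute, here is a dependency-free Python sketch. The function names (`conv2d`, `relu`, `max_pool2x2`), the toy 5x5 image, and the hand-written edge kernel are all assumptions made for this example; a real CNN learns its kernels from data and uses an optimized library rather than nested loops.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most deep-learning libraries, the kernel is not flipped,
    so this is technically a cross-correlation.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]


# Toy 5x5 grayscale image (illustrative): a bright vertical stripe.
image = [[0, 0, 1, 1, 0] for _ in range(5)]

# Hand-written kernel (illustrative) that responds to dark-to-bright edges.
kernel = [[-1, 1],
          [-1, 1]]

feature = conv2d(image, kernel)   # 4x4 map, each row is [0, 2, 0, -2]
activated = relu(feature)         # negative response at the right edge -> 0
pooled = max_pool2x2(activated)   # 2x2 summary of the strongest responses

print(pooled)  # [[2, 0], [2, 0]]
```

The kernel fires where the image goes from dark to bright, ReLU discards the opposite (bright-to-dark) response, and pooling shrinks the map while keeping the strongest signal. Stacking many such layers with learned kernels is, loosely, how a CNN builds up from edges to characters and objects.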