msxlol
# Containerized GPU training on Windows Server 2019

:::info
**GPU acceleration in Windows containers**: the container host must run Windows Server 2019 or Windows 10, version 1809 or newer.
:::

## Windows Server 2019 editions

![](https://i.imgur.com/ItMRnzD.png)

Both Windows Server Standard and Essentials offer a 180-day evaluation version:
https://www.microsoft.com/en-us/evalcenter/evaluate-windows-server-2019-essentials

* Standard: download the official ISO and install it; the trial starts immediately.
* Essentials: Microsoft provides a trial product key.
> NJ3X8-YTJRF-3R9J9-D78MF-4YBP4

To run Docker, the Containers feature must be enabled on Windows.

:::danger
***However, the Containers feature cannot be enabled on the Essentials edition.***
:::

![](https://i.imgur.com/ysn1rTW.png)

## Install Docker

There are two ways to install Docker on Windows:

1. Docker Desktop for Windows ***- both Linux and Windows containers on Windows***
   The Docker Desktop installation includes Docker Engine, Docker CLI client, Docker Compose, Notary, Kubernetes, and Credential Helper.
2. Docker on Windows ***- Windows containers only***
   with a common API and command-line interface (CLI)

> #### Two kinds of containers on Windows
> 1. Linux containers
> 2. Windows containers
> :::warning
> Windows containers are supported only on specific OS versions and require the Containers feature to be enabled.
> Linux containers only need Hyper-V to be enabled.
> Docker Desktop can switch between the two container types; the default is Linux containers.
> Docker on Windows supports only Windows containers and cannot switch.
> As a result, Windows Server 2019 Essentials cannot install or run Docker on Windows,
> but it can install Docker Desktop for Windows.
> :::

---

### Docker Desktop for Windows

![](https://i.imgur.com/WOQEtJg.png)

Just run the installer. It ships with built-in settings, status monitoring, a taskbar UI, and one-click Kubernetes enablement.

![](https://i.imgur.com/0IIVqJ0.png)
![](https://i.imgur.com/oHB5QOt.png)

It can switch between Linux containers and Windows containers.

![](https://i.imgur.com/QQ5aPAi.png)

(Linux containers and Windows containers can each manage only their own containers.)

---

### Docker on Windows

https://github.com/OneGet/MicrosoftDockerProvider
https://docs.microsoft.com/zh-tw/virtualization/windowscontainers/deploy-containers/deploy-containers-on-server

![](https://i.imgur.com/qbGVoxA.png)

**Install Docker using the OneGet provider PowerShell module**

#### Install the OneGet PowerShell module
```shell=
Install-Module -Name DockerMsftProvider -Repository PSGallery -Force
```

#### Install the OneGet docker provider
```shell=
Import-Module -Name DockerMsftProvider -Force
Import-PackageProvider -Name DockerMsftProvider -Force
```

#### Install Docker
Upgrade to the latest version of docker:
```shell=
Install-Package -Name docker -ProviderName DockerMsftProvider -Verbose -Update
```

:::info
GPU acceleration in Windows containers: the container host must run Docker Engine 19.03 or newer.
:::

---

## Windows base image for containers

https://hub.docker.com/_/microsoft-windows

![](https://i.imgur.com/jf0fk7w.png)

:::info
GPU acceleration in Windows containers: the container base image must be mcr.microsoft.com/windows:1809 or newer.
:::

The official Windows Server 2019 Standard evaluation ISO ships as ***OS Build 17763.737***.
Since we use the 1809 Windows image, the host needs to be at least 10.0.17763.1397.
In this test, a host upgraded to ***OS Build 17763.1369*** ran normally.

* *(All tests below use windows:1809.)*

```shell=
docker pull mcr.microsoft.com/windows:1809
```

> There are three other base images:
> [windows/iotcore](https://hub.docker.com/_/microsoft-windows-iotcore): Windows IoT Core base OS container image
> [windows/nanoserver](https://hub.docker.com/_/microsoft-windows-nanoserver): Nano Server base OS container image
> [windows/servercore](https://hub.docker.com/_/microsoft-windows-servercore): Windows Server Core base OS container

Dockerfiles for Windows containers support only the four base images above; Linux-based base images cannot be used.

:::danger
GPU acceleration in Windows containers does not support the **Windows Server Core** and **Nano Server** container images.
:::

---

## GPU Training Samples

### DirectX Container Sample

> https://github.com/MicrosoftDocs/Virtualization-Documentation/tree/master/windows-container-samples/directx

:::info
GPU acceleration in Windows containers: DirectX (and all frameworks built on top of it) is the only API that can be accelerated with a GPU. Third-party frameworks are not supported.
:::

This sample container uses the WinMLRunner executable in its performance benchmarking mode.
It evaluates an ML model 100 times with fake data, first on the CPU and then on the GPU, and finally produces a report with some performance metrics.
https://github.com/Microsoft/Windows-Machine-Learning/tree/master/Tools/WinMLRunner

:::success
Writing a Dockerfile on Windows:
#### Create the Dockerfile
The Dockerfile must not have a file extension. To do this on Windows, create the file in an editor of your choice (I used Notepad++ in my test) and save it under the name ***Dockerfile***.
:::

```
FROM mcr.microsoft.com/windows:1809

WORKDIR C:/App

# Download and extract the ONNX model to be used for evaluation.
RUN curl.exe -o tiny_yolov2.tar.gz https://onnxzoo.blob.core.windows.net/models/opset_7/tiny_yolov2/tiny_yolov2.tar.gz && \
    tar.exe -xf tiny_yolov2.tar.gz && \
    del tiny_yolov2.tar.gz

# Download and extract cli tool for evaluation .onnx model with WinML.
RUN curl.exe -L -o WinMLRunner_x64_Release.zip https://github.com/microsoft/Windows-Machine-Learning/releases/download/1.2.1.1/WinMLRunner.v1.2.1.1.zip && \
    tar.exe -xf C:/App/WinMLRunner_x64_Release.zip && \
    del WinMLRunner_x64_Release.zip

# Run the model evaluation when container starts.
ENTRYPOINT ["C:/App/WinMLRunner v1.2.1.1/x64/WinMLRunner.exe", "-model", "C:/App/tiny_yolov2/model.onnx", "-terse", "-iterations", "100", "-perf"]
```

Then go back to cmd, `cd` into the directory that contains the file, and build the Dockerfile:
```
docker build . -t winml-runner
```

If the build finishes without errors, run it:
```
docker run --isolation process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 winml-runner
```

sample output:
:::spoiler
```
.\WinMLRunner.exe -model SqueezeNet.onnx
WinML Runner
GPU: NVIDIA Tesla P4

Loading model (path = SqueezeNet.onnx)...
=================================================================
Name: squeezenet_old
Author: onnx-caffe2
Version: 9223372036854775807
Domain:
Description:
Path: SqueezeNet.onnx
Support FP16: false

Input Feature Info:
Name: data_0
Feature Kind: Float

Output Feature Info:
Name: softmaxout_1
Feature Kind: Float

=================================================================

Binding (device = CPU, iteration = 1, inputBinding = CPU, inputDataType = Tensor)...[SUCCESS]
Evaluating (device = CPU, iteration = 1, inputBinding = CPU, inputDataType = Tensor)...[SUCCESS]
Outputting results..
Feature Name: softmaxout_1
 resultVector[818] has the maximal value of 1

Binding (device = GPU, iteration = 1, inputBinding = CPU, inputDataType = Tensor)...[SUCCESS]
Evaluating (device = GPU, iteration = 1, inputBinding = CPU, inputDataType = Tensor)...[SUCCESS]
Outputting results..
Feature Name: softmaxout_1
 resultVector[818] has the maximal value of 1
```
:::

---

### Tensorflow Directml Sample

> https://docs.microsoft.com/en-us/windows/win32/direct3d12/gpu-tensorflow-windows

:::warning
TensorFlow supports only 64-bit Python 3.5 - 3.7.
TensorFlow also needs the msvcp140.dll component; the fix is to install Microsoft Visual C++ 2015 Redistributable Update 3.
The Python file used in this sample sits in the same directory as the Dockerfile.
:::

:::success
Writing a Dockerfile on Windows:
#### PowerShell cmdlets
PowerShell cmdlets can be executed in a Dockerfile through a RUN instruction.
```shell=
RUN powershell.exe -Command
```
#### Escape character
The default Dockerfile escape character is the backslash `\`.
Because the backslash is also the file path separator on Windows, using it to span multiple lines can cause problems.
On Windows, two characters can therefore be used for line continuation: `\` and `` ` ``. The backtick only takes effect when the Dockerfile begins with the parser directive ``# escape=` ``.
:::

Dockerfile:
```
FROM mcr.microsoft.com/windows:1809

# assign work directory
WORKDIR /python

# move all files to work directory including test.py
COPY . /python

# Silent Install Microsoft Visual C++ 2015 Redistributable Update 3
RUN powershell.exe -Command \
    wget https://download.microsoft.com/download/9/3/F/93FCF1E7-E6A4-478B-96E7-D4B285925B00/vc_redist.x64.exe -OutFile vc_redist.x64.exe ; \
    Start-Process vc_redist.x64.exe -ArgumentList '/q /norestart' -Wait ; \
    Remove-Item vc_redist.x64.exe -Force

# Silent Install Python 3.6.1 64bits
RUN powershell.exe -Command \
    $ErrorActionPreference = 'Stop'; \
    [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12; \
    wget https://www.python.org/ftp/python/3.6.1/python-3.6.1rc1-amd64.exe -OutFile python-3.6.1rc1-amd64.exe ; \
    Start-Process python-3.6.1rc1-amd64.exe -ArgumentList '/quiet InstallAllUsers=1 PrependPath=1' -Wait ; \
    Remove-Item python-3.6.1rc1-amd64.exe -Force

RUN pip install tensorflow-directml

# -u to ensure python print output is not buffered
CMD ["py", "-u", "test.py"]
```

>**Install Python via command line/PowerShell without UI (quiet/silent install)**
>Choose a version: https://www.python.org/ftp/python/
>The commands are:
>```
>[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
>wget https://www.python.org/ftp/python/[version].exe -OutFile c:\[version].exe
>Start-Process c:\[version].exe -ArgumentList '/quiet InstallAllUsers=1 PrependPath=1'
>```

test.py only checks whether TensorFlow was installed successfully:
```python=
import tensorflow.compat.v1 as tf

# Log which device each op is placed on.
tf.enable_eager_execution(tf.ConfigProto(log_device_placement=True))

print(tf.add([1.0, 2.0], [3.0, 4.0]))
```

```
docker build -t tensorflow-directml .
```
```
docker run -it tensorflow-directml
```

result:
```python
2020-07-23 20:06:09.756930: I tensorflow/core/common_runtime/dml/dml_device_factory.cc:45] DirectML device enumeration: found 1 compatible adapters.
2020-07-23 20:06:09.917532: I tensorflow/core/common_runtime/dml/dml_device_factory.cc:32] DirectML: creating device on adapter 0 (Microsoft Basic Render Driver)
2020-07-23 20:06:09.433379: I tensorflow/stream_executor/platform/default/dso_loader.cc:60] Successfully opened dynamic library DirectMLba106a7c621ea741d2159d8708ee581c11918380.dll
2020-07-23 20:06:09.558039: I tensorflow/core/common_runtime/eager/execute.cc:571] Executing op Add in device /job:localhost/replica:0/task:0/device:DML:0
tf.Tensor([4. 6.], shape=(2,), dtype=float32)
```

The image has been packaged and pushed to Docker Hub:
https://hub.docker.com/r/msxlol/tensorflow-directml-sample

*How to Use*
```
docker run msxlol/tensorflow-directml
```

:::danger
During the run, a GPU adapter was detected, but it was always **Microsoft Basic Render Driver** rather than the **NVIDIA Tesla P4** that should actually be used.
:::

The final solution was to use [DDA (Discrete Device Assignment)](https://docs.microsoft.com/zh-tw/windows-server/virtualization/hyper-v/deploy/deploying-graphics-devices-using-dda) to pass the entire PCIe device through to a VM.
With CentOS 7 and the Tesla P4 driver installed in the VM, the correct Tesla P4 is detected instead of Microsoft Basic Render Driver.

![](https://i.imgur.com/IWaxfAg.png)

https://docs.microsoft.com/zh-tw/windows-server/virtualization/hyper-v/deploy/deploying-graphics-devices-using-dda
https://docs.microsoft.com/zh-tw/windows-server/virtualization/hyper-v/plan/plan-for-gpu-acceleration-in-windows-server

>**Discrete Device Assignment (DDA)**
>Discrete Device Assignment (DDA), also known as GPU pass-through, dedicates one or more physical GPUs to a virtual machine. In a DDA deployment, virtualized workloads run on the native driver and typically have full access to the GPU's functionality. DDA offers the highest level of application compatibility and potential performance.

Hardware requirements:
- PCI Express Native Power Management
- SR-IOV enabled

![](https://i.imgur.com/Vc8SyMW.png)

Confirm that the GPU can be assigned:

![](https://i.imgur.com/u4hmQyd.png)

The device location path can also be obtained from Device Manager:

![](https://i.imgur.com/z9BZLfV.png)

Create a VM named TestGPU in Hyper-V and apply the settings below.
Once they are applied, the GPU is dedicated to TestGPU.
```sh=
Set-VM -Name TestGPU -AutomaticStopAction TurnOff
Set-VM -GuestControlledCacheTypes $true -VMName TestGPU
Set-VM -LowMemoryMappedIoSpace 3Gb -VMName TestGPU
Set-VM -HighMemoryMappedIoSpace 33280Mb -VMName TestGPU
Dismount-VMHostAssignableDevice -LocationPath "PCIROOT(0)#PCI(0200)#PCI(0000)"
Add-VMAssignableDevice -LocationPath "PCIROOT(0)#PCI(0200)#PCI(0000)" -VMName TestGPU
```

Next, as with any Linux GPU setup, install the relevant drivers inside the VM:
```sh=
lshw -C display # check GPU

# install nvidia cuda
yum -y install gcc kernel-devel kernel-headers pkgconfig
yum -y upgrade kernel
wget http://us.download.nvidia.com/tesla/410.129/NVIDIA-Linux-x86_64-410.129-diagnostic.run
modprobe -b -r nouveau # disable nouveau for gpu
chmod +x NVIDIA-Linux-x86_64-410.129-diagnostic.run # wget does not set the executable bit
./NVIDIA-Linux-x86_64-410.129-diagnostic.run -no-x-check -no-opengl-files

# install nvidia cudnn
wget http://developer.download.nvidia.com/compute/redist/cudnn/v7.6.5/cudnn-10.0-linux-x64-v7.6.5.32.tgz
# echo "28355e395f0b2b93ac2c83b61360b35ba6cd0377e44e78be197b6b61b4b492ba  cudnn-10.0-linux-x64-v7.6.5.32.tgz" | sha256sum -c -
tar -zxf cudnn-10.0-linux-x64-v7.6.5.32.tgz
tar --no-same-owner -xzf cudnn-10.0-linux-x64-v7.6.5.32.tgz -C /usr/local --wildcards 'cuda/lib64/libcudnn.so.*'
ldconfig

yum -y install gcc-c++ python3-devel
pip3 install tensorflow-gpu
# print the device list so the detected GPU is visible
python3 -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"
```
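
The symptom described in the TensorFlow DirectML sample (the container always picks **Microsoft Basic Render Driver**, which is Windows' software adapter, instead of the Tesla P4) can be checked automatically from the container log. The following is a minimal sketch, not part of the original setup: it assumes the log keeps the `DirectML: creating device on adapter N (name)` format shown in the result above, and the function names `directml_adapters` and `uses_software_adapter` are hypothetical helpers introduced only for illustration.

```python
import re

# Name Windows reports for its software (non-GPU) Direct3D adapter.
SOFTWARE_ADAPTER = "Microsoft Basic Render Driver"

# Assumption: matches the device-creation line format seen in the sample log.
_ADAPTER_LINE = re.compile(
    r"DirectML: creating device on adapter (\d+) \((.+)\)\s*$",
    re.MULTILINE,
)


def directml_adapters(log_text):
    """Return (index, name) pairs for every DirectML adapter named in the log."""
    return [(int(index), name) for index, name in _ADAPTER_LINE.findall(log_text)]


def uses_software_adapter(log_text):
    """True if any DirectML device was created on the software adapter."""
    return any(name == SOFTWARE_ADAPTER for _, name in directml_adapters(log_text))


if __name__ == "__main__":
    # Log line copied from the sample result in this note.
    sample = (
        "2020-07-23 20:06:09.917532: I tensorflow/core/common_runtime/dml/"
        "dml_device_factory.cc:32] DirectML: creating device on adapter 0 "
        "(Microsoft Basic Render Driver)"
    )
    for index, name in directml_adapters(sample):
        print(f"adapter {index}: {name}")
    if uses_software_adapter(sample):
        print("Warning: running on the software adapter, not a physical GPU.")
```

Feeding it the captured output of `docker run -it tensorflow-directml` gives a quick pass/fail signal before starting a longer training job.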
