msxlol
# Containerized GPU training on Windows Server 2019

:::info
**GPU acceleration in Windows containers**: the container host must run Windows Server 2019 or Windows 10, version 1809 or later.
:::

## Windows Server 2019 editions

![](https://i.imgur.com/ItMRnzD.png)

Both Windows Server Standard and Essentials offer a 180-day evaluation version:
https://www.microsoft.com/en-us/evalcenter/evaluate-windows-server-2019-essentials

* Standard: download the official ISO and install it; the trial starts immediately.
* Essentials: Microsoft provides a trial product key.
> NJ3X8-YTJRF-3R9J9-D78MF-4YBP4

To run Docker, the Containers feature must be enabled on Windows.

:::danger
***The Containers feature cannot be enabled on the Essentials edition.***
:::

![](https://i.imgur.com/ysn1rTW.png)

## Install Docker

There are two ways to install Docker on Windows:

1. Docker Desktop for Windows ***- both Linux and Windows containers on Windows***
   The Docker Desktop installation includes Docker Engine, Docker CLI client, Docker Compose, Notary, Kubernetes, and Credential Helper.
2. Docker on Windows ***- Windows containers only***
   with a common API and command-line interface (CLI)

> #### Two kinds of containers on Windows
> 1. Linux Container
> 2. Windows Container
> :::warning
> Windows containers are supported only on specific OS versions and require the Containers feature to be enabled.
> Linux containers only need Hyper-V to be enabled.
> Docker Desktop can switch between the two container types; the default is Linux containers.
> Docker on Windows supports Windows containers only and cannot switch.
> As a result, Docker on Windows cannot be installed or run properly on Windows Server 2019 Essentials,
> but Docker Desktop for Windows can be installed there.
> :::

---

### Docker Desktop for Windows

![](https://i.imgur.com/WOQEtJg.png)

Just run the installer. It comes with built-in settings, a status view, and a taskbar UI, plus a one-click option to enable Kubernetes.

![](https://i.imgur.com/0IIVqJ0.png)
![](https://i.imgur.com/oHB5QOt.png)

You can switch between Linux containers and Windows containers.

![](https://i.imgur.com/QQ5aPAi.png)

(Each mode can only manage its own containers.)

---

### Docker on Windows

https://github.com/OneGet/MicrosoftDockerProvider
https://docs.microsoft.com/zh-tw/virtualization/windowscontainers/deploy-containers/deploy-containers-on-server

![](https://i.imgur.com/qbGVoxA.png)

**Install Docker with the OneGet provider PowerShell module**

#### Install the OneGet PowerShell module

```shell=
Install-Module -Name DockerMsftProvider -Repository PSGallery -Force
```

#### Install the OneGet docker provider

```shell=
Import-Module -Name DockerMsftProvider -Force
Import-Packageprovider -Name DockerMsftProvider -Force
```

#### Install Docker

Upgrade to the latest version of docker:

```shell=
Install-Package -Name docker -ProviderName DockerMsftProvider -Verbose -Update
```

:::info
GPU acceleration in Windows containers: the container host must run Docker Engine 19.03 or later.
:::

---

## Windows base image for containers

https://hub.docker.com/_/microsoft-windows

![](https://i.imgur.com/jf0fk7w.png)

:::info
GPU acceleration in Windows containers: the container base image must be mcr.microsoft.com/windows:1809 or later.
:::

The official ISO for the Windows Server 2019 Standard evaluation ships as ***OS Build 17763.737***.
The 1809 Windows image calls for at least 10.0.17763.1397.
In this test the host was upgraded to ***OS Build 17763.1369***, and the image ran normally.

* *(All tests below use windows:1809.)*

```shell=
docker pull mcr.microsoft.com/windows:1809
```

> There are three other base images:
> [windows/iotcore](https://hub.docker.com/_/microsoft-windows-iotcore): Windows IoT Core base OS container image
> [windows/nanoserver](https://hub.docker.com/_/microsoft-windows-nanoserver): Nano Server base OS container image
> [windows/servercore](https://hub.docker.com/_/microsoft-windows-servercore): Windows Server Core base OS container

A Dockerfile for a Windows container supports only these four base images; Linux base images cannot be used.

:::danger
GPU acceleration in Windows containers does not support the **Windows Server Core** and **Nano Server** container images.
:::

---

## GPU Training Samples

### DirectX Container Sample

> https://github.com/MicrosoftDocs/Virtualization-Documentation/tree/master/windows-container-samples/directx

:::info
GPU acceleration in Windows containers: DirectX (and all frameworks built on top of it) is the only API that can be GPU-accelerated. Third-party frameworks are not supported.
:::

This sample container uses the WinMLRunner executable in its performance benchmarking mode. It evaluates an ML model 100 times with fake data, first on the CPU and then on the GPU, and finally produces a report with some performance metrics.
https://github.com/Microsoft/Windows-Machine-Learning/tree/master/Tools/WinMLRunner

:::success
Writing a Dockerfile on Windows:
#### Create the Dockerfile
The Dockerfile must not have a file extension. On Windows, create the file with an editor of your choice (I used Notepad++ in my test) and save it simply as ***Dockerfile***.
:::

```
FROM mcr.microsoft.com/windows:1809

WORKDIR C:/App

# Download and extract the ONNX model to be used for evaluation.
RUN curl.exe -o tiny_yolov2.tar.gz https://onnxzoo.blob.core.windows.net/models/opset_7/tiny_yolov2/tiny_yolov2.tar.gz && \
    tar.exe -xf tiny_yolov2.tar.gz && \
    del tiny_yolov2.tar.gz

# Download and extract cli tool for evaluation .onnx model with WinML.
RUN curl.exe -L -o WinMLRunner_x64_Release.zip https://github.com/microsoft/Windows-Machine-Learning/releases/download/1.2.1.1/WinMLRunner.v1.2.1.1.zip && \
    tar.exe -xf C:/App/WinMLRunner_x64_Release.zip && \
    del WinMLRunner_x64_Release.zip

# Run the model evaluation when container starts.
ENTRYPOINT ["C:/App/WinMLRunner v1.2.1.1/x64/WinMLRunner.exe", "-model", "C:/App/tiny_yolov2/model.onnx", "-terse", "-iterations", "100", "-perf"]
```

Back in cmd, cd to the directory that holds the file and build the Dockerfile:

```
docker build . -t winml-runner
```

If the build finishes without errors, run it:

```
docker run --isolation process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 winml-runner
```

sample output:
:::spoiler
```
.\WinMLRunner.exe -model SqueezeNet.onnx
WinML Runner
GPU: NVIDIA Tesla P4

Loading model (path = SqueezeNet.onnx)...
=================================================================
Name: squeezenet_old
Author: onnx-caffe2
Version: 9223372036854775807
Domain:
Description:
Path: SqueezeNet.onnx
Support FP16: false

Input Feature Info:
Name: data_0
Feature Kind: Float

Output Feature Info:
Name: softmaxout_1
Feature Kind: Float
=================================================================

Binding (device = CPU, iteration = 1, inputBinding = CPU, inputDataType = Tensor)...[SUCCESS]
Evaluating (device = CPU, iteration = 1, inputBinding = CPU, inputDataType = Tensor)...[SUCCESS]
Outputting results..
Feature Name: softmaxout_1
 resultVector[818] has the maximal value of 1

Binding (device = GPU, iteration = 1, inputBinding = CPU, inputDataType = Tensor)...[SUCCESS]
Evaluating (device = GPU, iteration = 1, inputBinding = CPU, inputDataType = Tensor)...[SUCCESS]
Outputting results..
Feature Name: softmaxout_1
 resultVector[818] has the maximal value of 1
```
:::

---

### Tensorflow Directml Sample

> https://docs.microsoft.com/en-us/windows/win32/direct3d12/gpu-tensorflow-windows

:::warning
TensorFlow supports only 64-bit Python 3.5 to 3.7.
TensorFlow also needs the msvcp140.dll component; the fix is to install Microsoft Visual C++ 2015 Redistributable Update 3.
The Python file used in this sample sits in the same directory as the Dockerfile.
:::

:::success
Writing a Dockerfile on Windows:
#### PowerShell Cmdlet
PowerShell cmdlets can be executed from a `RUN` instruction in the Dockerfile.
```shell=
RUN powershell.exe -Command
```
#### Escape character
The default Dockerfile escape character is the backslash \
Because the backslash is also the file path separator on Windows, using it to span multiple lines can cause problems.
On Windows, two characters can therefore be used for line continuation: \ and `
(the backtick takes effect only when the Dockerfile begins with the `# escape=` parser directive set to the backtick).
:::

Dockerfile:
```
FROM mcr.microsoft.com/windows:1809

# assign work directory
WORKDIR /python

# move all files to work directory including test.py
COPY . /python

# Silent Install Microsoft Visual C++ 2015 Redistributable Update 3
RUN powershell.exe -Command \
    wget https://download.microsoft.com/download/9/3/F/93FCF1E7-E6A4-478B-96E7-D4B285925B00/vc_redist.x64.exe -OutFile vc_redist.x64.exe ; \
    Start-Process vc_redist.x64.exe -ArgumentList '/q /norestart' -Wait ; \
    Remove-Item vc_redist.x64.exe -Force

# Silent Install Python 3.6.1 64bits
RUN powershell.exe -Command \
    $ErrorActionPreference = 'Stop'; \
    [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12; \
    wget https://www.python.org/ftp/python/3.6.1/python-3.6.1rc1-amd64.exe -OutFile python-3.6.1rc1-amd64.exe ; \
    Start-Process python-3.6.1rc1-amd64.exe -ArgumentList '/quiet InstallAllUsers=1 PrependPath=1' -Wait ; \
    Remove-Item python-3.6.1rc1-amd64.exe -Force

RUN pip install tensorflow-directml

# -u runs Python unbuffered to ensure print output shows up in the container log
CMD ["py", "-u", "test.py"]
```

>**Install Python via command line/powershell without UI (quiet/silent install python)**
>Install Python from cmd without any UI.
>Pick a version: https://www.python.org/ftp/python/
>The commands are:
>```
>[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
>wget https://www.python.org/ftp/python/[version].exe -OutFile c:\[version].exe
>Start-Process c:\[version].exe -ArgumentList '/quiet InstallAllUsers=1 PrependPath=1'
>```

test.py only checks whether TensorFlow was installed successfully:

```python=
import tensorflow.compat.v1 as tf

# Log which device each op is placed on, so the output shows whether DML is used.
tf.enable_eager_execution(tf.ConfigProto(log_device_placement=True))

print(tf.add([1.0, 2.0], [3.0, 4.0]))
```

```
docker build -t tensorflow-directml .
```

```
docker run -it tensorflow-directml
```

result:
```python
2020-07-23 20:06:09.756930: I tensorflow/core/common_runtime/dml/dml_device_factory.cc:45] DirectML device enumeration: found 1 compatible adapters.
2020-07-23 20:06:09.917532: I tensorflow/core/common_runtime/dml/dml_device_factory.cc:32] DirectML: creating device on adapter 0 (Microsoft Basic Render Driver)
2020-07-23 20:06:09.433379: I tensorflow/stream_executor/platform/default/dso_loader.cc:60] Successfully opened dynamic library DirectMLba106a7c621ea741d2159d8708ee581c11918380.dll
2020-07-23 20:06:09.558039: I tensorflow/core/common_runtime/eager/execute.cc:571] Executing op Add in device /job:localhost/replica:0/task:0/device:DML:0
tf.Tensor([4. 6.], shape=(2,), dtype=float32)
```

The image has been built and pushed to Docker Hub:
https://hub.docker.com/r/msxlol/tensorflow-directml-sample

*How to Use*
```
docker run msxlol/tensorflow-directml
```

:::danger
While running this, a GPU was detected, but it was always the **Microsoft Basic Render Driver** rather than the **NVIDIA Tesla P4** that should actually be used.
:::

The final solution was [DDA (Discrete Device Assignment)](https://docs.microsoft.com/zh-tw/windows-server/virtualization/hyper-v/deploy/deploying-graphics-devices-using-dda), which passes the entire PCIe device through to a VM.
With CentOS 7 and the Tesla P4 driver installed in the VM, the correct Tesla P4 is detected instead of the Microsoft Basic Render Driver.

![](https://i.imgur.com/IWaxfAg.png)

https://docs.microsoft.com/zh-tw/windows-server/virtualization/hyper-v/deploy/deploying-graphics-devices-using-dda
https://docs.microsoft.com/zh-tw/windows-server/virtualization/hyper-v/plan/plan-for-gpu-acceleration-in-windows-server

>**Discrete Device Assignment (DDA)**
>Discrete Device Assignment (DDA), also known as GPU pass-through, dedicates one or more physical GPUs to a virtual machine. In a DDA deployment, virtualized workloads run on the native driver and typically have full access to the GPU's functionality. DDA offers the highest level of app compatibility and potential performance.

Hardware requirements
- PCI Express Native Power Management
- SR-IOV enabled

![](https://i.imgur.com/Vc8SyMW.png)

Confirm that the GPU can be dismounted and assigned:

![](https://i.imgur.com/u4hmQyd.png)

The device location path can also be read from Device Manager:

![](https://i.imgur.com/z9BZLfV.png)

Create a VM named TestGPU in Hyper-V and apply the settings below. Once they are in place, the GPU is used exclusively by TestGPU.

```sh=
Set-VM -Name TestGPU -AutomaticStopAction TurnOff
Set-VM -GuestControlledCacheTypes $true -VMName TestGPU
Set-VM -LowMemoryMappedIoSpace 3Gb -VMName TestGPU
Set-VM -HighMemoryMappedIoSpace 33280Mb -VMName TestGPU
Dismount-VMHostAssignableDevice -LocationPath "PCIROOT(0)#PCI(0200)#PCI(0000)"
Add-VMAssignableDevice -LocationPath "PCIROOT(0)#PCI(0200)#PCI(0000)" -VMName TestGPU
```

Next, inside the VM, set up the drivers as you would on any Linux machine with a GPU:

```sh=
lshw -C display # check GPU

# install the NVIDIA driver
yum -y install gcc kernel-devel kernel-headers pkgconfig
yum -y upgrade kernel
wget http://us.download.nvidia.com/tesla/410.129/NVIDIA-Linux-x86_64-410.129-diagnostic.run
modprobe -b -r nouveau # disable nouveau for gpu
# run through sh, because the downloaded file is not marked executable
sh ./NVIDIA-Linux-x86_64-410.129-diagnostic.run -no-x-check -no-opengl-files

# install nvidia cudnn
wget http://developer.download.nvidia.com/compute/redist/cudnn/v7.6.5/cudnn-10.0-linux-x64-v7.6.5.32.tgz
# echo "28355e395f0b2b93ac2c83b61360b35ba6cd0377e44e78be197b6b61b4b492ba  cudnn-10.0-linux-x64-v7.6.5.32.tgz" | sha256sum -c -
tar -zxf cudnn-10.0-linux-x64-v7.6.5.32.tgz
tar --no-same-owner -xzf cudnn-10.0-linux-x64-v7.6.5.32.tgz -C /usr/local --wildcards 'cuda/lib64/libcudnn.so.*'
ldconfig

yum -y install gcc-c++ python3-devel
pip3 install tensorflow-gpu
# print the device list; without print() the one-liner shows nothing
python3 -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"
```

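The last `device_lib.list_local_devices()` check can also be wrapped in a small script that keeps only the GPU entries, which makes it easier to see whether the passed-through Tesla P4 is visible inside the VM. This is a minimal sketch, not part of the original setup: the helper names (`find_gpus`, `describe`, `sample_devices`) and the sample device description are hypothetical, and it assumes the `tensorflow-gpu` package installed above. If TensorFlow is missing, it only prints a notice.

```python
from types import SimpleNamespace


def find_gpus(devices):
    """Return only the devices whose device_type is "GPU"."""
    return [d for d in devices if getattr(d, "device_type", None) == "GPU"]


def describe(devices):
    """Build one printable line per device: type, name, and description."""
    return [
        f"{d.device_type}: {d.name} ({getattr(d, 'physical_device_desc', '')})"
        for d in devices
    ]


def sample_devices():
    """Hypothetical stand-in data, shaped like TensorFlow's device entries."""
    return [
        SimpleNamespace(device_type="CPU", name="/device:CPU:0",
                        physical_device_desc=""),
        SimpleNamespace(device_type="GPU", name="/device:GPU:0",
                        physical_device_desc="device: 0, name: Tesla P4"),
    ]


def list_local_devices():
    """Ask TensorFlow for its devices; return None if TensorFlow is missing."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return device_lib.list_local_devices()


if __name__ == "__main__":
    devices = list_local_devices()
    if devices is None:
        print("TensorFlow is not installed; nothing to check.")
    else:
        gpus = find_gpus(devices)
        for line in describe(gpus):
            print(line)
        if not gpus:
            print("No GPU device is visible to TensorFlow.")
```

Run it inside the CentOS VM with `python3`. An empty GPU list would suggest that the driver or the DDA assignment needs another look.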