Tsung-Jung Tsai (TJ_Tsai)
Azure / Virtual Machines (VM) (including GPU)
===
###### tags: `ML / Platform`
###### tags: `ML`, `Azure`, `VM`, `GPU`

<br>

[TOC]

<br>

## Azure portal
https://portal.azure.com/
![](https://i.imgur.com/gDhJBrz.png)

<br>

## Creation workflow
### Step1 - Go to the [Create a virtual machine] page
https://portal.azure.com/#create/Microsoft.VirtualMachine

<br>

### Step2 - Configure the VM: Basics
![](https://i.imgur.com/7GkvgNc.png)
- **Resource group: ocis-dept-test**
- **Virtual machine name: low-cost-VM-no-CPU**
- **Image: Ubuntu Server 20.04 LTS - Gen1**

---
![](https://i.imgur.com/nIHHSay.png)
- **See all sizes: description of each series**
  [![](https://i.imgur.com/enbMdsT.png)](https://i.imgur.com/enbMdsT.png)
  - GPUs are listed under the "**N-Series**"
    [![](https://i.imgur.com/y0h3agW.png)](https://i.imgur.com/y0h3agW.png)
    - ### Standard NC12s_v2
      ![](https://i.imgur.com/XrQn3Kg.png)
      GP100GL [Tesla P100 PCIe 16GB] x 2
    - ### Standard NC12s_v3
      ![](https://i.imgur.com/2oSpoem.png)
      GV100GL [Tesla V100 PCIe 16GB] x 2
    - nvidia-smi
      [![](https://i.imgur.com/RX2tVWE.png)](https://i.imgur.com/RX2tVWE.png)
      - The v3 actually reports 120 MB less GPU RAM than the v2
      - Measured results
        - v2 uses the P100
        - v3 uses the V100, which computes faster than the P100
  - GPUs are also listed under "**Non-premium storage VM sizes**"
    [![](https://i.imgur.com/KwU2yQr.png)](https://i.imgur.com/KwU2yQr.png)

  <br>

- **See all sizes (GPU, CPU, RAM)**
  [![](https://i.imgur.com/h9CTCai.png)](https://i.imgur.com/h9CTCai.png)
  - To prepare (upload) the data, start with the cheapest VM
  - Once all data has been uploaded, switch to a GPU VM for processing

  <br>

- **Username**
  The default is azureuser

  <br>

- **SSH public key source**
  - **No key created yet**
    ![](https://i.imgur.com/0S9OT6n.png)
    :::warning
    :bulb: **Tip**
    The next time you create a VM in Azure, you can use the SSH key you created. Just select **[Use existing key stored in Azure]** for **[SSH public key source]**. The private key is already on your computer, so you don't need to download anything.
    :::
  - **Key already created**
    ![](https://i.imgur.com/7UdcG7h.png)
    Then select the key source:
    ![](https://i.imgur.com/pRc8FW5.png)

<br>

### Step2 - Configure the VM: Basics / GPU pricing
| Instance | Pay as<br>you go | GPU | GPU-RAM | CPU | RAM |
| ------ | ------- | ------- | ------- | ------- | ------- |
| NC12s_v2 | 124.6720 TWD/hour | GP100GL [Tesla P100 PCIe 16GB] x 2 | 16280 MiB | 1 (socket) x 12 (cores/socket) x 1 (hyperthread/core) = 12 hyperthreads<br><br>Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz | 225G |
| NC12s_v3 | 168.89 TWD/hour | GV100GL [Tesla V100 PCIe 16GB] x 2 | 16280 MiB | 1 (socket) x 12 (cores/socket) x 1 (hyperthread/core) = 12 hyperthreads<br><br>GHz | 225G |
| NC24s_v2 | 249.20 TWD/hour | GP100GL [Tesla P100 PCIe 16GB] x 4 | 16280 MiB | 2 (sockets) x 12 (cores/socket) x 1 (hyperthread/core) = 24 hyperthreads<br><br>Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz | 448.1G |
| NC24s_v3 | 337.78 TWD/hour | GV100GL<br>[Tesla V100 PCIe 16GB] x 4 | 16280 MiB | 2 (sockets) x 12 (cores/socket) x 1 (hyperthread/core) = 24 hyperthreads<br><br>Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz | 448.1G |
| NC24rs_v3 | 371.66 TWD/hour | GV100GL<br>[Tesla V100 PCIe 16GB] x 4 | 16280 MiB | 2 (sockets) x 12 (cores/socket) x 1 (hyperthread/core) = 24 hyperthreads<br><br>Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz | 448.1G |

- GPU pricing
  - [GPU optimized virtual machine sizes](https://docs.microsoft.com/zh-tw/azure/virtual-machines/sizes-gpu?context=/azure/virtual-machines/context/context)
- Azure doc
  - NCv2-series (P100)
    - [en](https://docs.microsoft.com/en-us/azure/virtual-machines/ncv2-series)
    - [zh-tw](https://docs.microsoft.com/zh-tw/azure/virtual-machines/ncv2-series)
  - NCv3-series (V100)
    - [en](https://docs.microsoft.com/en-us/azure/virtual-machines/ncv3-series)
      > NCv3-series VMs are powered by NVIDIA Tesla V100 GPUs. These GPUs can provide 1.5x the computational performance of the NCv2-series.
      > The NC24rs v3 configuration provides a low latency, high-throughput network interface optimized for tightly coupled parallel computing workloads.
    - [zh-tw](https://docs.microsoft.com/zh-tw/azure/virtual-machines/ncv3-series) (the same passage in Traditional Chinese)
- ### [Ubuntu Advantage Standard pricing](https://azure.microsoft.com/zh-tw/pricing/details/virtual-machines/ubuntu-advantage-standard/)
  [![](https://i.imgur.com/ufqAd76.png)](https://i.imgur.com/ufqAd76.png)
- ### [Linux virtual machine pricing](https://azure.microsoft.com/zh-tw/pricing/details/virtual-machines/linux/)
  [![](https://i.imgur.com/nL6icfT.png)](https://i.imgur.com/nL6icfT.png)

<br>

### Step3 - Configure the VM: Attach disks
![](https://i.imgur.com/PgLCVyv.png)
- **Attach an existing data disk**
  ![](https://i.imgur.com/yuXnW6h.png)

<br>

### Step4 - Configure the VM: Networking
![](https://i.imgur.com/fXhgzUG.png)

<br>

### Step5 - Management
- **Auto-shutdown** (disabled by default)
  Enable auto-shutdown if needed, so that a forgotten VM does not keep running
  ![](https://i.imgur.com/O0CQOWv.png)
  A notification email is sent 30 minutes before shutdown; if you need more time, you can postpone the shutdown.
  [![](https://i.imgur.com/JeJfPiO.png)](https://i.imgur.com/JeJfPiO.png)
  Click the link to postpone the VM shutdown
  ![](https://i.imgur.com/svxmhFK.png)

<br>

### Step6 - Review, then create if everything looks right
![](https://i.imgur.com/G2vIYHT.png)

<br>

### Step7 - Go to resource
![](https://i.imgur.com/ie055HT.png)

<br>

<hr>

<br>

## Connection workflow
### Step1 - Check the IP
![](https://i.imgur.com/KtU0VEh.png)
- **Public IP address**: `157.55.197.208`

<br>

### Step2 - Connect with the private key (.pem)
- ### The first connection runs into a permissions problem
  - `Permissions 0777 for 'parabricks-test_key.pem' are too open.`
  - [Fixing "Permissions 0777 for '/root/.ssh/id_rsa' are too open"](https://www.jianshu.com/p/d79d0cde061b)
    - Fix
      The default permission of the id_rsa file is 700; it had been temporarily changed to 777 in order to open the root folder, so changing the permission back to 700 solves the problem
- Login command
  ```bash=
  # ssh only requires that group and others have no access;
  # 600 is the more common choice for key files, 700 also works.
  $ chmod 700 parabricks-test_key.pem
  $ ssh -i parabricks-test_key.pem \
        azureuser@157.55.197.208
  ```
  [![](https://i.imgur.com/pFMwBpE.png)](https://i.imgur.com/pFMwBpE.png)

<br>

<hr>

<br>

## Mounting a disk
### Step1 - Create a disk or attach an existing disk
:::warning
:bulb: See the separate note: [Azure / Disk](/8f1YasxKSY-Tv6yPdCMh8w)
:::
This can be done after the VM has been created:
- Create a new disk
- Attach a **new (unformatted)** or **old (already formatted)** disk

[![](https://i.imgur.com/2pwkDzK.png)](https://i.imgur.com/2pwkDzK.png)
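
Whether an attached disk is still blank can also be checked from inside the VM. The following is a minimal sketch, not part of the original procedure: the function name `find_unpartitioned_disks` is hypothetical, and it assumes the `lsblk` from util-linux with its `NAME`, `TYPE` and `PKNAME` columns. It only reports disks that have no partitions, so a disk that carries a file system directly on the whole device would still be listed; `lsblk -f` shows the `FSTYPE` column for a closer look.

```shell
#!/usr/bin/env bash
# Sketch: list whole disks that have no partitions yet.
# Input (stdin): output of `lsblk -rn -o NAME,TYPE,PKNAME`
#   -r  raw output (no tree characters)
#   -n  no header line
#   PKNAME is the parent device name (empty for whole disks)
find_unpartitioned_disks() {
  awk '
    $2 == "disk" { order[++n] = $1 }      # remember every whole disk
    $2 == "part" { has_part[$3] = 1 }     # mark the parent of every partition
    END {
      for (i = 1; i <= n; i++)
        if (!(order[i] in has_part)) print order[i]
    }
  '
}

# Usage on the VM (hypothetical invocation):
#   lsblk -rn -o NAME,TYPE,PKNAME | find_unpartitioned_disks
```

Reading the listing from standard input keeps the function testable with saved `lsblk` output, without needing the VM.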
The example here attaches a new disk and formats it
[![](https://i.imgur.com/9nJMciG.png)](https://i.imgur.com/9nJMciG.png)

<br>

### Step2 - List the disks
- `lsblk`
  ```bash=
  $ lsblk
  NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
  loop0     7:0    0 55.5M  1 loop /snap/core18/1997
  loop1     7:1    0 67.6M  1 loop /snap/lxd/20326
  loop2     7:2    0 32.1M  1 loop /snap/snapd/11841
  loop3     7:3    0 55.4M  1 loop /snap/core18/2066
  loop4     7:4    0 32.1M  1 loop /snap/snapd/12057
  sda       8:0    0   30G  0 disk
  ├─sda1    8:1    0 29.9G  0 part /
  ├─sda14   8:14   0    4M  0 part
  └─sda15   8:15   0  106M  0 part /boot/efi
  sdb       8:16   0    4G  0 disk
  └─sdb1    8:17   0    4G  0 part /mnt
  sdc       8:32   0    1T  0 disk      <--- here (not yet formatted)
  sr0      11:0    1  628K  0 rom
  ```
  - The name may be sdb, sdc, sdd, sde, ...; it presumably depends on the order in which the disks were attached
  - No child entries means the disk has not been partitioned or formatted yet
    ```
    sdb       8:16   0    4G  0 disk        <--- already formatted
    └─sdb1    8:17   0    4G  0 part /mnt   <--- child entry
    sdc       8:32   0    1T  0 disk        <--- not yet formatted
    ```
- `lsblk -o NAME,HCTL,SIZE,MOUNTPOINT` (adds the `-o` option)
  :::warning
  - **HCTL**: Host:Channel:Target:Lun
  - **Lun**: Logical Unit Number
  :::
  ```bash=
  $ lsblk -o NAME,HCTL,SIZE,MOUNTPOINT
  NAME    HCTL     SIZE MOUNTPOINT
  loop0           55.5M /snap/core18/1997
  loop1           67.6M /snap/lxd/20326
  loop2           32.1M /snap/snapd/11841
  loop3           55.4M /snap/core18/2066
  loop4           32.1M /snap/snapd/12057
  sda     0:0:0:0   30G
  ├─sda1          29.9G /
  ├─sda14            4M
  └─sda15          106M /boot/efi
  sdb     1:0:1:0    4G
  └─sdb1             4G /mnt
  sdc     3:0:0:0    1T
  sr0     5:0:0:0  628K
  ```
  ```bash=
  $ lsblk -o NAME,HCTL,SIZE,MOUNTPOINT | grep sd
  NAME    HCTL     SIZE MOUNTPOINT
  sda     0:0:0:0   30G
  ├─sda1          29.9G /
  ├─sda14            4M
  └─sda15          106M /boot/efi
  sdb     1:0:1:0    4G
  └─sdb1             4G /mnt
  sdc     3:0:0:0    1T
  ```
  - The fourth field of `HCTL` is the LUN
    > LUN: the logical unit number of the data disk. This value identifies the data disk inside the VM, so it must be unique for each data disk attached to the VM.
    >
    > ![](https://i.imgur.com/QNfjw2D.png)
- `df -h`
  :warning: The 1 TB SSD does not show up in `df`, which lists only mounted file systems
  ```bash=
  $ df -h
  Filesystem      Size  Used Avail Use% Mounted on
  /dev/root        29G  2.2G   27G   8% /
  devtmpfs        203M     0  203M   0% /dev
  tmpfs           207M     0  207M   0% /dev/shm
  tmpfs            42M  1.1M   41M   3% /run
  tmpfs           5.0M     0  5.0M   0% /run/lock
  tmpfs           207M     0  207M   0% /sys/fs/cgroup
  /dev/loop0       56M   56M     0 100% /snap/core18/1997
  /dev/loop2       33M   33M     0 100% /snap/snapd/11841
  /dev/loop1       68M   68M     0 100% /snap/lxd/20326
  /dev/sda15      105M  7.9M   97M   8% /boot/efi
  /dev/sdb1       3.9G   16M  3.7G   1% /mnt
  tmpfs            42M     0   42M   0% /run/user/1000
  /dev/loop3       56M   56M     0 100% /snap/core18/2066
  /dev/loop4       33M   33M     0 100% /snap/snapd/12057
  ```

<br>

### Step3 - Format the disk
- ### Method 1 ([command source](https://docs.microsoft.com/zh-tw/azure/virtual-machines/linux/attach-disk-portal#partition-a-new-disk))
  > 2021/06/07 - OK

  ![](https://i.imgur.com/JzjX7P5.png)
  ```
  # Follow the commands in the documentation, replacing sdc with sdb
  # (the attached disk appeared as sdb)
  $ sudo parted /dev/sdb --script mklabel gpt mkpart xfspart xfs 0% 100%
  $ sudo mkfs.xfs /dev/sdb1
  $ sudo partprobe /dev/sdb1
  ```
  - Explanation of the first command
    Usage: `parted [OPTION]... [DEVICE [COMMAND [PARAMETERS]...]...]`
    - DEVICE: `/dev/sdb` (`/dev/sdc` in the documentation)
    - OPTION: `--script` (never prompts for user intervention)
    - COMMAND1: `mklabel LABEL-TYPE` (create a new disklabel)
      - `mklabel gpt`
    - COMMAND2: `mkpart PART-TYPE [FS-TYPE] START END` (make a partition)
      - `mkpart xfspart xfs 0% 100%`
        - PART-TYPE: `xfspart` (on a GPT label this field is used as the partition name)
        - FS-TYPE: `xfs` (a file system type hint only; the file system itself is created by `mkfs.xfs`)
        - START: `0%`
        - END: `100%`

  <br>

- ### Method 2 ( [command source](https://docs.microsoft.com/zh-tw/learn/modules/add-and-size-disks-in-azure-virtual-machines/3-exercise-add-data-disks-to-azure-virtual-machines) | [full Bash script](https://raw.githubusercontent.com/MicrosoftDocs/mslearn-add-and-size-disks-in-azure-virtual-machines/master/add-data-disk.sh) )
  > 2021/06/08 - OK

  Step1: Create a partition on the blank disk
  ```bash=
  #!/bin/bash

  # Partition the drive /dev/sdc.
  # Read from standard input to provide the options we want.
  #  n adds a new partition.
  #  p specifies the primary partition type.
  #  the following blank line accepts the default partition number.
  #  the following blank line accepts the default start sector.
  #  the following blank line accepts the default final sector.
  #  p prints the partition table.
  #  w writes the changes and exits.
  sudo fdisk /dev/sdc <<EOF
  n
  p



  p
  w
  EOF
  ```
  - Result
    ```
    Welcome to fdisk (util-linux 2.34).
    Changes will remain in memory only, until you decide to write them.
    Be careful before using the write command.
    Device does not contain a recognized partition table.
    Created a new DOS disklabel with disk identifier 0x72cae361.

    Command (m for help): Partition type
       p   primary (0 primary, 0 extended, 4 free)
       e   extended (container for logical partitions)
    Select (default p): Partition number (1-4, default 1): First sector (2048-2147483647, default 2048): Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-2147483647, default 2147483647):
    Created a new partition 1 of type 'Linux' and of size 1024 GiB.

    Command (m for help): Disk /dev/sdc: 1 TiB, 1099511627776 bytes, 2147483648 sectors
    Disk model: Virtual Disk
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes
    Disklabel type: dos
    Disk identifier: 0x72cae361

    Device     Boot Start        End    Sectors  Size Id Type
    /dev/sdc1        2048 2147483647 2147481600 1024G 83 Linux

    Command (m for help): The partition table has been altered.
    Calling ioctl() to re-read partition table.
    Syncing disks.
    ```

  Step2: Create a file system on the partition (e.g. ext4)
  > File system types include: ext, ext2, ext3, ext4, vfat, ntfs, nfs

  ```bash=
  # Write a file system to the partition.
  # ext4 creates an ext4 filesystem.
  # /dev/sdc1 is the device name.
  sudo mkfs -t ext4 /dev/sdc1
  ```
  - Result
    ```
    $ sudo mkfs -t ext4 /dev/sdc1
    mke2fs 1.45.5 (07-Jan-2020)
    Discarding device blocks: done
    Creating filesystem with 268435200 4k blocks and 67108864 inodes
    Filesystem UUID: 681feb55-b751-4ce1-831f-51f8f5783c81
    Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848

    Allocating group tables: done
    Writing inode tables: done
    Creating journal (262144 blocks): done
    Writing superblocks and filesystem accounting information: done
    ```
- ### Check the result
  ```
  $ lsblk
  NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
  loop0     7:0    0 55.5M  1 loop /snap/core18/1997
  loop1     7:1    0 67.6M  1 loop /snap/lxd/20326
  loop2     7:2    0 32.1M  1 loop /snap/snapd/11841
  loop3     7:3    0 55.4M  1 loop /snap/core18/2066
  loop4     7:4    0 32.1M  1 loop /snap/snapd/12057
  sda       8:0    0   30G  0 disk
  ├─sda1    8:1    0 29.9G  0 part /
  ├─sda14   8:14   0    4M  0 part
  └─sda15   8:15   0  106M  0 part /boot/efi
  sdb       8:16   0    4G  0 disk
  └─sdb1    8:17   0    4G  0 part /mnt
  sdc       8:32   0    1T  0 disk
  └─sdc1    8:33   0 1024G  0 part      <--- the new partition
  sr0      11:0    1  628K  0 rom
  ```

<br>

### Step4 - Mount the disk
- ### Create the mount point and mount the disk
  ```bash=
  # Create the /uploads directory,
  # which we'll use as our mount point.
  sudo mkdir /uploads

  # Attach the disk to the mount point.
  sudo mount /dev/sdc1 /uploads
  ```
- ### Check the result
  ```bash=
  $ lsblk
  NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
  loop0     7:0    0 55.5M  1 loop /snap/core18/1997
  loop1     7:1    0 67.6M  1 loop /snap/lxd/20326
  loop2     7:2    0 32.1M  1 loop /snap/snapd/11841
  loop3     7:3    0 55.4M  1 loop /snap/core18/2066
  loop4     7:4    0 32.1M  1 loop /snap/snapd/12057
  sda       8:0    0   30G  0 disk
  ├─sda1    8:1    0 29.9G  0 part /
  ├─sda14   8:14   0    4M  0 part
  └─sda15   8:15   0  106M  0 part /boot/efi
  sdb       8:16   0    4G  0 disk
  └─sdb1    8:17   0    4G  0 part /mnt
  sdc       8:32   0    1T  0 disk
  └─sdc1    8:33   0 1024G  0 part /uploads      <--- mount point
  sr0      11:0    1  628K  0 rom
  ```
  ```bash=
  # Show the file system types: /dev/sdc1 is ext4
  $ df -Th
  Filesystem     Type      Size  Used Avail Use% Mounted on
  /dev/root      ext4       29G  2.3G   27G   8% /
  devtmpfs       devtmpfs  203M     0  203M   0% /dev
  tmpfs          tmpfs     207M     0  207M   0% /dev/shm
  tmpfs          tmpfs      42M  1.1M   41M   3% /run
  tmpfs          tmpfs     5.0M     0  5.0M   0% /run/lock
  tmpfs          tmpfs     207M     0  207M   0% /sys/fs/cgroup
  /dev/loop0     squashfs   56M   56M     0 100% /snap/core18/1997
  /dev/loop2     squashfs   33M   33M     0 100% /snap/snapd/11841
  /dev/loop1     squashfs   68M   68M     0 100% /snap/lxd/20326
  /dev/sda15     vfat      105M  7.9M   97M   8% /boot/efi
  /dev/sdb1      ext4      3.9G   16M  3.7G   1% /mnt
  tmpfs          tmpfs      42M     0   42M   0% /run/user/1000
  /dev/loop3     squashfs   56M   56M     0 100% /snap/core18/2066
  /dev/loop4     squashfs   33M   33M     0 100% /snap/snapd/12057
  /dev/sdc1      ext4     1007G  768M  955G   1% /uploads
  ```
- ### Change ownership (root → azureuser)
  ```bash=
  $ ls -ls / | grep uploads
  4 drwxr-xr-x   3 root root  4096 Jun  8 03:11 uploads

  $ sudo chown -R $(id -u):$(id -g) /uploads

  $ ls -ls / | grep uploads
  4 drwxr-xr-x   3 azureuser azureuser  4096 Jun  8 03:11 uploads
  ```

<br>

### References
- ### [Use the portal to attach a data disk to a Linux VM](https://docs.microsoft.com/zh-tw/azure/virtual-machines/linux/attach-disk-portal#partition-a-new-disk)
- ### [Add and size disks in Azure virtual machines](https://docs.microsoft.com/zh-tw/learn/modules/add-and-size-disks-in-azure-virtual-machines/)
  - [Initialize and format the data disk](https://docs.microsoft.com/zh-tw/learn/modules/add-and-size-disks-in-azure-virtual-machines/3-exercise-add-data-disks-to-azure-virtual-machines)
    [Bash script](https://raw.githubusercontent.com/MicrosoftDocs/mslearn-add-and-size-disks-in-azure-virtual-machines/master/add-data-disk.sh)
    - Partitions the drive `/dev/sdc`.
    - Creates an `ext4` file system on the drive.
    - Creates the `/uploads` directory used as the mount point.
    - Attaches the disk to the mount point.
    - Updates `/etc/fstab` so that the drive is mounted automatically after the system reboots.

<br>

### Command reference: `parted -h`
:::warning
:bulb: **Purpose**: can be used to partition and format data disks <sup>[[note](https://docs.microsoft.com/zh-tw/azure/virtual-machines/linux/attach-disk-portal#partition-a-new-disk)]</sup>
:::

:::warning
:warning: **Note:**
If the disk is 2 tebibytes (TiB) or larger, you must use GPT partitioning. If the disk is smaller than 2 TiB, you can use either MBR or GPT partitioning.
:::

```
$ parted -h
Usage: parted [OPTION]... [DEVICE [COMMAND [PARAMETERS]...]...]
Apply COMMANDs with PARAMETERS to DEVICE.  If no COMMAND(s) are given, run in
interactive mode.

OPTIONs:
  -h, --help                      displays this help message
  -l, --list                      lists partition layout on all block devices
  -m, --machine                   displays machine parseable output
  -s, --script                    never prompts for user intervention
  -v, --version                   displays the version
  -a, --align=[none|cyl|min|opt]  alignment for new partitions

COMMANDs:
  align-check TYPE N                       check partition N for TYPE(min|opt) alignment
  help [COMMAND]                           print general help, or help on COMMAND
  mklabel,mktable LABEL-TYPE               create a new disklabel (partition table)
  mkpart PART-TYPE [FS-TYPE] START END     make a partition
  name NUMBER NAME                         name partition NUMBER as NAME
  print [devices|free|list,all|NUMBER]     display the partition table, available devices, free space, all found partitions, or a particular partition
  quit                                     exit program
  rescue START END                         rescue a lost partition near START and END
  resizepart NUMBER END                    resize partition NUMBER
  rm NUMBER                                delete partition NUMBER
  select DEVICE                            choose the device to edit
  disk_set FLAG STATE                      change the FLAG on selected device
  disk_toggle [FLAG]                       toggle the state of FLAG on selected device
  set NUMBER FLAG STATE                    change the FLAG on partition NUMBER
  toggle [NUMBER [FLAG]]                   toggle the state of FLAG on partition NUMBER
  unit UNIT                                set the default unit to UNIT
  version                                  display the version number and
        copyright information of GNU Parted

Report bugs to bug-parted@gnu.org
```

<br>

### Command reference: `fdisk -h`
```=
$ fdisk -h

Usage:
 fdisk [options] <disk>      change partition table
 fdisk [options] -l [<disk>] list partition table(s)

Display or manipulate a disk partition table.

Options:
 -b, --sector-size <size>      physical and logical sector size
 -B, --protect-boot            don't erase bootbits when creating a new label
 -c, --compatibility[=<mode>]  mode is 'dos' or 'nondos' (default)
 -L, --color[=<when>]          colorize output (auto, always or never)
                                 colors are enabled by default
 -l, --list                    display partitions and exit
 -o, --output <list>           output columns
 -t, --type <type>             recognize specified partition table type only
 -u, --units[=<unit>]          display units: 'cylinders' or 'sectors' (default)
 -s, --getsz                   display device size in 512-byte sectors [DEPRECATED]
     --bytes                   print SIZE in bytes rather than in human readable format
 -w, --wipe <mode>             wipe signatures (auto, always or never)
 -W, --wipe-partitions <mode>  wipe signatures from new partitions (auto, always or never)

 -C, --cylinders <number>      specify the number of cylinders
 -H, --heads <number>          specify the number of heads
 -S, --sectors <number>        specify the number of sectors per track

 -h, --help                    display this help
 -V, --version                 display version

Available output columns:
 gpt: Device Start End Sectors Size Type Type-UUID Attrs Name UUID
 dos: Device Start End Sectors Cylinders Size Type Id Attrs Boot End-C/H/S Start-C/H/S
 bsd: Slice Start End Sectors Cylinders Size Type Bsize Cpg Fsize
 sgi: Device Start End Sectors Cylinders Size Type Id Attrs
 sun: Device Start End Sectors Cylinders Size Type Id Flags

For more details see fdisk(8).
```

<br>

<hr>

<br>

## Viewing GPU information
### Step0 - System information
```bash=
$ lshw
WARNING: you should run this program as super-user.
gpu-vm-nc12s-v3
    ...
  *-core
       ...
     *-display:0 UNCLAIMED      <--- first GPU; UNCLAIMED means no driver is installed yet
          description: 3D controller
          product: GV100GL [Tesla V100 PCIe 16GB]      <---
          vendor: NVIDIA Corporation
          physical id: 2
          bus info: pci@0001:00:00.0
          version: a1
          width: 64 bits
          clock: 33MHz
          capabilities: bus_master cap_list
          configuration: latency=0
          resources: iomemory:100-ff iomemory:140-13f memory:41000000-41ffffff memory:1000000000-13ffffffff memory:1400000000-1401ffffff
     *-display:1 UNCLAIMED      <--- second GPU; UNCLAIMED means no driver is installed yet
          ...
```
Difference in the configuration before and after installing the driver:
[![](https://i.imgur.com/GR0cwDd.png)](https://i.imgur.com/GR0cwDd.png)

```bash=
$ lspci
...
0000:00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA
0001:00:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] (rev a1)
0002:00:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] (rev a1)
```

Initially, no NVIDIA commands are available
![](https://i.imgur.com/q6rJ3Ql.png)

<br>

### Step1 - Install the NVIDIA runtime (CUDA Toolkit)
- ### Look up the instructions on the [**NVIDIA website**](https://developer.nvidia.com/cuda-downloads)
  https://developer.nvidia.com/cuda-downloads
  [![](https://i.imgur.com/Sox1jQb.png)](https://i.imgur.com/Sox1jQb.png)
  [![](https://i.imgur.com/8U1JYPk.png)](https://i.imgur.com/8U1JYPk.png)
  - **Installation Instructions:**
    ```bash=
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
    sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
    sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
    sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
    sudo apt-get update
    sudo apt-get -y install cuda
    ```
    ```
    ...
    *****************************************************************************
    *** Reboot your computer and verify that the NVIDIA graphics driver can   ***
    *** be loaded.                                                            ***
    *****************************************************************************
    ...
    $ sudo reboot
    ```
- ### Run `nvidia-smi`
  ![](https://i.imgur.com/GUT0cFM.png)
  - CUDA Version: 11.3
  - Memory per GPU: 16280 MiB ≈ 16 GB
    - Running Parabricks requires >= 12 GB

  <br>

  ```bash=
  $ nvidia-smi -L
  GPU 0: NVIDIA Tesla P100-PCIE-16GB (UUID: GPU-eb0dc035-486a-e5f5-28d4-75861019ef0e)
  GPU 1: NVIDIA Tesla P100-PCIE-16GB (UUID: GPU-5ff40e36-feb3-502b-f83f-473bee95b150)
  ```

<br>

<hr>

<br>

## Viewing CPU information
> 1 (socket) x 12 (cores/socket) x 1 (hyperthread/core) = 12 threads

### Number of physical CPUs (unit: sockets)
```
$ cat /proc/cpuinfo | grep "physical id" | sort | uniq | wc -l
1
```

### Number of logical CPUs (unit: threads)
```
$ cat /proc/cpuinfo | grep "processor" | wc -l
12
```

### Number of cores per CPU (unit: cores/socket)
```
$ cat /proc/cpuinfo | grep "cores" | uniq
cpu cores       : 12
```

### Detailed CPU information
```
$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 79
model name      : Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
stepping        : 1
microcode       : 0xffffffff
cpu MHz         : 2593.993
cache size      : 35840 KB
physical id     : 0
siblings        : 12
core id         : 0
cpu cores       : 12
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 20
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit
bogomips        : 5187.98
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

...
...
```
- [E5-2690](https://ark.intel.com/content/www/tw/zh/ark/search.html?_charset_=UTF-8&q=E5-2690) comes in 4 variants; check the "processor base frequency" and the "version" to tell them apart
  ![](https://i.imgur.com/ngz2o2y.png)
  ![](https://i.imgur.com/fWyJGHn.png)
- [Intel® Xeon® Processor E5-2690 v4 (35M Cache, 2.60 GHz)](https://ark.intel.com/content/www/tw/zh/ark/products/91770/intel-xeon-processor-e52690-v4-35m-cache-2-60-ghz.html)
  ![](https://i.imgur.com/CknejOI.png)
  - The physical processor has 14 cores and 28 threads (14 cores x 2 threads per core); the VM above sees 12 of them

<br>

<hr>

<br>

## Viewing RAM information
### Total memory
```bash=
$ lsmem
RANGE                                 SIZE  STATE REMOVABLE BLOCK
0x0000000000000000-0x000000001fffffff 512M online       yes   0-3

Memory block size:       128M
Total online memory:     512M      <---
Total offline memory:      0B
```

```bash=
$ cat /proc/meminfo | head
MemTotal:         423720 kB      <---
MemFree:            4876 kB
MemAvailable:      44636 kB
Buffers:            1320 kB
Cached:            23844 kB
SwapCached:            0 kB
Active:           261864 kB
Inactive:           9680 kB
Active(anon):     255560 kB
Inactive(anon):      172 kB
```

<br>

<hr>

<br>

## Temporary storage (`/mnt`)
:::warning
:warning: **Note**: the data is lost once the VM is shut down and started again!
:::
[![](https://i.imgur.com/ykJNgs7.png)](https://i.imgur.com/ykJNgs7.png)
[![](https://i.imgur.com/hGMHJ2s.png)](https://i.imgur.com/hGMHJ2s.png)

<br>

<hr>

<br>

## Uploading data
### scp
```bash=
$ scp -i parabricks-test_key.pem \
      parabricks.tar.gz \
      azureuser@70.37.107.238:/mnt/parabricks
```

<br>

### rsync (keep partial files + resume)
```bash=
$ rsync -e 'ssh -i parabricks-test_key.pem' \
        --progress -zh --partial --append \
        WGS-LIS-AI018A_R* \
        azureuser@70.37.107.238:/mnt/parabricks
```
- `--partial`: keep partially transferred files (if the connection drops, the incomplete file is not deleted)
- `--append`: resume the transfer (append the remaining data to the end of the existing file)

<br>

<hr>

<br>

## References
- ### [Quickstart: Create a Linux virtual machine in the Azure portal](https://docs.microsoft.com/zh-tw/azure/virtual-machines/linux/quick-create-portal)
- ### [Getting started with a deep learning development environment on an Azure GPU VM](https://ericsk.medium.com/17ee8a6886eb)
  - [Install the CUDA Toolkit](https://developer.nvidia.com/cuda-downloads)
    ![](https://i.imgur.com/yIn8tOu.png)
    - Base Installer
      ```bash=
      wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
      sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
      sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
      sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
      sudo apt-get update
      sudo apt-get -y install cuda
      ```
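
Once the toolkit is installed and the VM has rebooted, the memory requirement noted for Parabricks (>= 12 GB per GPU) can be checked with a script instead of reading the `nvidia-smi` table by eye. This is a minimal sketch under stated assumptions: the function name `check_gpu_memory` is hypothetical, and "12 GB" is interpreted as 12288 MiB, which the note itself does not specify. The `--query-gpu=memory.total` and `--format=csv,noheader,nounits` options belong to `nvidia-smi` and print one MiB value per GPU.

```shell
#!/usr/bin/env bash
# check_gpu_memory MIN_MIB
# Reads one total-memory value (in MiB) per line from stdin, prints a verdict
# for each GPU, and returns non-zero if any GPU is below MIN_MIB.
check_gpu_memory() {
  local min_mib="$1" idx=0 status=0 mib
  while read -r mib; do
    [ -z "$mib" ] && continue              # skip empty lines
    if [ "$mib" -ge "$min_mib" ]; then
      echo "GPU $idx: $mib MiB OK"
    else
      echo "GPU $idx: $mib MiB below $min_mib MiB"
      status=1
    fi
    idx=$((idx + 1))
  done
  return "$status"
}

# Usage on the VM (requires the NVIDIA driver; 12288 MiB is an assumed threshold):
#   nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits \
#     | check_gpu_memory 12288
```

With the 16280 MiB reported for the P100 and V100 cards in this note, both GPUs would pass this check.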
