# Usage guide for the H100 pilot run

## Announcement

| Announcement Date | Type               | Time                                |
| ----------------- | ------------------ | ----------------------------------- |
| 2024/11/21        | System maintenance | 2024/11/22 07:00 ~ 2024/11/23 18:00 |
| 2024/11/23        | System online      | 2024/11/23 23:00                    |

## [Simplified User Manual for the Pilot Program (2024/11/11)](https://narlabshq-my.sharepoint.com/:b:/g/personal/1203087_narlabs_org_tw/Eb96Z_CzOh1Pv8yI18p6QzIBIbnJBzEdqt5mNoz2asPKgg?e=jaah54)

## Login information

| Name  | IP            | Note               |
| ----- | ------------- | ------------------ |
| lgn01 | 140.110.148.3 | Login node         |
| xdata | 140.110.148.8 | Data transfer node |

## CPU-GPU-NIC Affinity

The outputs below show the CPU, GPU, and NIC topology on the login node (lgn01) and on two compute nodes (hgpn06 and hgpn13).

```
[p00acy00@lgn01 src]$ nvidia-smi topo -m
        GPU0    GPU1    NIC0    NIC1    NIC2    NIC3    CPU Affinity     NUMA Affinity   GPU NUMA ID
GPU0     X      NODE    SYS     SYS     SYS     SYS     58-111,170-223   1               N/A
GPU1    NODE     X      SYS     SYS     SYS     SYS     58-111,170-223   1               N/A
NIC0    SYS     SYS      X      PIX     NODE    NODE
NIC1    SYS     SYS     PIX      X      NODE    NODE
NIC2    SYS     SYS     NODE    NODE     X      PIX
NIC3    SYS     SYS     NODE    NODE    PIX      X

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

NIC Legend:

  NIC0: mlx5_0
  NIC1: mlx5_1
  NIC2: mlx5_2
  NIC3: mlx5_3
```
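On lgn01 both GPUs report NUMA node 1 (CPUs 58-111,170-223), so a process that uses these GPUs can be pinned to the matching NUMA node. The following is only a minimal sketch, not an official recommendation; `./my_gpu_app` is a placeholder for your own program:

```
# Pin the process to the NUMA node shown in the "NUMA Affinity" column above.
numactl --cpunodebind=1 --membind=1 ./my_gpu_app

# Or bind to the exact CPU list shown in the "CPU Affinity" column for GPU0/GPU1.
numactl --physcpubind=58-111,170-223 ./my_gpu_app
```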
```
[p00acy00@hgpn06 ~]$ nvidia-smi topo -m
        GPU0   GPU1   GPU2   GPU3   GPU4   GPU5   GPU6   GPU7   NIC0   NIC1   NIC2   NIC3   NIC4   NIC5   NIC6   NIC7   NIC8   NIC9   NIC10  NIC11  CPU Affinity  NUMA Affinity  GPU NUMA ID
GPU0     X     NV18   NV18   NV18   NV18   NV18   NV18   NV18   PXB    PXB    NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS    0             N/A
GPU1    NV18    X     NV18   NV18   NV18   NV18   NV18   NV18   PXB    PXB    NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS    0             N/A
GPU2    NV18   NV18    X     NV18   NV18   NV18   NV18   NV18   NODE   NODE   NODE   NODE   PXB    PXB    SYS    SYS    SYS    SYS    SYS    SYS    0             N/A
GPU3    NV18   NV18   NV18    X     NV18   NV18   NV18   NV18   NODE   NODE   NODE   NODE   PXB    PXB    SYS    SYS    SYS    SYS    SYS    SYS    0             N/A
GPU4    NV18   NV18   NV18   NV18    X     NV18   NV18   NV18   SYS    SYS    SYS    SYS    SYS    SYS    PXB    PXB    NODE   NODE   NODE   NODE   56            1              N/A
GPU5    NV18   NV18   NV18   NV18   NV18    X     NV18   NV18   SYS    SYS    SYS    SYS    SYS    SYS    PXB    PXB    NODE   NODE   NODE   NODE   56            1              N/A
GPU6    NV18   NV18   NV18   NV18   NV18   NV18    X     NV18   SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   PXB    PXB    56            1              N/A
GPU7    NV18   NV18   NV18   NV18   NV18   NV18   NV18    X     SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   PXB    PXB    56            1              N/A
NIC0    PXB    PXB    NODE   NODE   SYS    SYS    SYS    SYS     X     PXB    NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS
NIC1    PXB    PXB    NODE   NODE   SYS    SYS    SYS    SYS    PXB     X     NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS
NIC2    NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    NODE   NODE    X     PIX    NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS
NIC3    NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    NODE   NODE   PIX     X     NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS
NIC4    NODE   NODE   PXB    PXB    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE    X     PXB    SYS    SYS    SYS    SYS    SYS    SYS
NIC5    NODE   NODE   PXB    PXB    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   PXB     X     SYS    SYS    SYS    SYS    SYS    SYS
NIC6    SYS    SYS    SYS    SYS    PXB    PXB    NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS     X     PXB    NODE   NODE   NODE   NODE
NIC7    SYS    SYS    SYS    SYS    PXB    PXB    NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS    PXB     X     NODE   NODE   NODE   NODE
NIC8    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE    X     PIX    NODE   NODE
NIC9    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE   PIX     X     NODE   NODE
NIC10   SYS    SYS    SYS    SYS    NODE   NODE   PXB    PXB    SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE    X     PXB
NIC11   SYS    SYS    SYS    SYS    NODE   NODE   PXB    PXB    SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   PXB     X

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

NIC Legend:

  NIC0: mlx5_0
  NIC1: mlx5_1
  NIC2: mlx5_2
  NIC3: mlx5_3
  NIC4: mlx5_4
  NIC5: mlx5_5
  NIC6: mlx5_6
  NIC7: mlx5_7
  NIC8: mlx5_8
  NIC9: mlx5_9
  NIC10: mlx5_10
  NIC11: mlx5_11
```

```
[p00acy00@hgpn06 ~]$ numactl -s
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109
cpubind: 0 1
nodebind: 0 1
membind: 0 1
[p00acy00@hgpn06 ~]$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55
node 0 size: 1031217 MB
node 0 free: 972645 MB
node 1 cpus: 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111
node 1 size: 1032173 MB
node 1 free: 1014453 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
```

```
[p00acy00@hgpn13 ~]$ nvidia-smi topo -m
        GPU0   GPU1   GPU2   GPU3   GPU4   GPU5   GPU6   GPU7   NIC0   NIC1   NIC2   NIC3   NIC4   NIC5   NIC6   NIC7   NIC8   NIC9   NIC10  NIC11  CPU Affinity  NUMA Affinity  GPU NUMA ID
GPU0     X     NV18   NV18   NV18   NV18   NV18   NV18   NV18   PXB    NODE   NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS    0-47          0              N/A
GPU1    NV18    X     NV18   NV18   NV18   NV18   NV18   NV18   NODE   NODE   NODE   PXB    NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS    0-47          0              N/A
GPU2    NV18   NV18    X     NV18   NV18   NV18   NV18   NV18   NODE   NODE   NODE   NODE   PXB    NODE   SYS    SYS    SYS    SYS    SYS    SYS    0-47          0              N/A
GPU3    NV18   NV18   NV18    X     NV18   NV18   NV18   NV18   NODE   NODE   NODE   NODE   NODE   PXB    SYS    SYS    SYS    SYS    SYS    SYS    0-47          0              N/A
GPU4    NV18   NV18   NV18   NV18    X     NV18   NV18   NV18   SYS    SYS    SYS    SYS    SYS    SYS    PXB    NODE   NODE   NODE   NODE   NODE   56-103        1              N/A
GPU5    NV18   NV18   NV18   NV18   NV18    X     NV18   NV18   SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE   NODE   PXB    NODE   NODE   56-103        1              N/A
GPU6    NV18   NV18   NV18   NV18   NV18   NV18    X     NV18   SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   PXB    NODE   56-103        1              N/A
GPU7    NV18   NV18   NV18   NV18   NV18   NV18   NV18    X     SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   NODE   PXB    56-103        1              N/A
NIC0    PXB    NODE   NODE   NODE   SYS    SYS    SYS    SYS     X     NODE   NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS
NIC1    NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    NODE    X     PIX    NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS
NIC2    NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    NODE   PIX     X     NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS
NIC3    NODE   PXB    NODE   NODE   SYS    SYS    SYS    SYS    NODE   NODE   NODE    X     NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS
NIC4    NODE   NODE   PXB    NODE   SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE    X     NODE   SYS    SYS    SYS    SYS    SYS    SYS
NIC5    NODE   NODE   NODE   PXB    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   NODE    X     SYS    SYS    SYS    SYS    SYS    SYS
NIC6    SYS    SYS    SYS    SYS    PXB    NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS     X     NODE   NODE   NODE   NODE   NODE
NIC7    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS    NODE    X     PIX    NODE   NODE   NODE
NIC8    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS    NODE   PIX     X     NODE   NODE   NODE
NIC9    SYS    SYS    SYS    SYS    NODE   PXB    NODE   NODE   SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE   NODE    X     NODE   NODE
NIC10   SYS    SYS    SYS    SYS    NODE   NODE   PXB    NODE   SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE    X     NODE
NIC11   SYS    SYS    SYS    SYS    NODE   NODE   NODE   PXB    SYS    SYS    SYS    SYS    SYS    SYS    NODE   NODE   NODE   NODE   NODE    X

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

NIC Legend:

  NIC0: mlx5_0
  NIC1: mlx5_1
  NIC2: mlx5_2
  NIC3: mlx5_3
  NIC4: mlx5_4
  NIC5: mlx5_5
  NIC6: mlx5_6
  NIC7: mlx5_7
  NIC8: mlx5_8
  NIC9: mlx5_9
  NIC10: mlx5_10
  NIC11: mlx5_11
```

```
[p00acy00@hgpn13 ~]$ numactl -s
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103
cpubind: 0 1
nodebind: 0 1
membind: 0 1
[p00acy00@hgpn13 ~]$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55
node 0 size: 1031325 MB
node 0 free: 743296 MB
node 1 cpus: 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111
node 1 size: 1032118 MB
node 1 free: 747882 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
```
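On the compute nodes (hgpn06, hgpn13), GPUs 0-3 and their PXB-attached mlx5 NICs sit on NUMA node 0 while GPUs 4-7 sit on NUMA node 1. One way to respect this layout is a per-rank binding wrapper. The sketch below is only an illustration under the assumption of one Slurm task per GPU (eight tasks per node); `bind.sh` and the application name are placeholders:

```
#!/bin/bash
# bind.sh -- hedged sketch of a per-rank NUMA binding wrapper (not an official
# site script). Assumes one Slurm task per GPU, so local ranks 0-3 use GPUs on
# NUMA node 0 and local ranks 4-7 use GPUs on NUMA node 1, per the
# nvidia-smi/numactl output above.
LOCAL_RANK=${SLURM_LOCALID:-0}

if [ "${LOCAL_RANK}" -lt 4 ]; then
    NUMA_NODE=0
else
    NUMA_NODE=1
fi

# Optional: restrict NCCL to the ConnectX devices listed in the NIC legend
# (mlx5_*). Leaving this unset lets NCCL pick NICs by topology on its own.
# export NCCL_IB_HCA=mlx5

exec numactl --cpunodebind="${NUMA_NODE}" --membind="${NUMA_NODE}" "$@"
```

It could be launched with, for example, `srun --ntasks-per-node=8 --gpus-per-node=8 ./bind.sh ./my_app`; check the site's recommended launch options before relying on this sketch.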
## Schedule

| Phase | Duration              | Description                                 |
| ----- | --------------------- | ------------------------------------------- |
| 1     | 2024/11/25~2024/12/25 | General-purpose test run.                   |
| 2     | 2024/12/26~2025/01/06 | Large-scale test run (special users first). |

[Phase 1 trial schedule for the Chip Innovation (晶創) system](https://narlabshq-my.sharepoint.com/:x:/g/personal/1203087_narlabs_org_tw/EQrupbr4DvxChaRhkfgKtXkBR_DGiaJFLKNy-26J2Qwf2Q?e=pIZZfp)

## Queuing Policies

### Phase 1 (general-purpose test run)

| Partition | GPUs/Partition | Executing Jobs/Partition | Scheduled Jobs/Partition | Walltime (Hours) | QoS Factor | Preemptible |
|:---------:|:--------------:|:------------------------:|:------------------------:|:----------------:|:----------:|:-----------:|
| dev       | 16             | 2                        | 2                        | 2                | 10         | no          |
| normal    | 32             | 2                        | 2                        | 24               | 1          | no          |
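For reference, a minimal job-script sketch that stays within the Phase 1 limits above; this is an illustration, not an official template, and the job name, account, and program are placeholders:

```
#!/bin/bash
#SBATCH --job-name=pilot-test        # placeholder job name
#SBATCH --partition=normal           # dev (max 2 h) or normal (max 24 h) in Phase 1
#SBATCH --account=<your_project>     # placeholder: your project account
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8          # assumption: one task per GPU
#SBATCH --gres=gpu:8                 # 8 GPUs on one node (normal allows 32 GPUs in total)
#SBATCH --time=24:00:00              # must not exceed the partition walltime

srun ./run.sh                        # placeholder for your workload
```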
### Phase 2 (large-scale test run; special users first)

This period is prioritized for special users running large-scale computational tests. General users may still submit jobs to the dev partition, but those jobs may be preempted by higher-priority jobs under the preemptive scheduling policy.

| Partition | GPUs/Partition | Executing Jobs/Partition | Scheduled Jobs/Partition | Walltime (Hours) | QoS Factor | Preemptible |
|:---------:|:--------------:|:------------------------:|:------------------------:|:----------------:|:----------:|:-----------:|
| dev       | 16             | 2                        | 2                        | 2                | 1          | yes         |
| normal    | infinite       | infinite                 | infinite                 | infinite         | 1          | no          |

### Special Projects

| Partition | GPUs/Partition | Executing Jobs/Partition | Scheduled Jobs/Partition | Walltime (Hours) | QoS Factor | Preemptible |
|:---------:|:--------------:|:------------------------:|:------------------------:|:----------------:|:----------:|:-----------:|
| taide     | 40             | infinite                 | infinite                 | infinite         | 1          | no          |

```
[p00acy00@lgn01 ~]$ scontrol show partition
PartitionName=dev
   AllowGroups=ALL DenyAccounts=gov112003,gov112009,gov113008,gov113026 AllowQos=ALL
   AllocNodes=ALL Default=NO QoS=p_dev
   DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=02:00:00 MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED MaxCPUsPerSocket=UNLIMITED
   Nodes=hgpn[06-10,13-21]
   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
   OverTimeLimit=NONE PreemptMode=OFF
   State=UP TotalCPUs=1568 TotalNodes=14 SelectTypeParameters=NONE
   JobDefaults=(null)
   DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
   TRES=cpu=1512,mem=28000000M,node=14,billing=1512,gres/gpu=112

PartitionName=normal
   AllowGroups=ALL DenyAccounts=gov112003,gov112009,gov113008,gov113026 AllowQos=ALL
   AllocNodes=ALL Default=NO QoS=p_normal
   DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=1-00:00:00 MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED MaxCPUsPerSocket=UNLIMITED
   Nodes=hgpn[06-10,13-21]
   PriorityJobFactor=10 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
   OverTimeLimit=NONE PreemptMode=OFF
   State=UP TotalCPUs=1568 TotalNodes=14 SelectTypeParameters=NONE
   JobDefaults=(null)
   DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
   TRES=cpu=1512,mem=28000000M,node=14,billing=1512,gres/gpu=112

PartitionName=taide
   AllowGroups=ALL AllowAccounts=gov112003,gov112009,gov113008,gov113026,gov113080 AllowQos=ALL
   AllocNodes=ALL Default=NO QoS=p_taide
   DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=4-00:00:00 MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED MaxCPUsPerSocket=UNLIMITED
   Nodes=hgpn[01-05]
   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
   OverTimeLimit=NONE PreemptMode=OFF
   State=UP TotalCPUs=560 TotalNodes=5 SelectTypeParameters=NONE
   JobDefaults=(null)
   DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
   TRES=cpu=540,mem=10000000M,node=5,billing=540,gres/gpu=40
```

### Revision history

[Revision history for schedule and policy](/RS8Vi9fcSSe8z-cVeOI-Ug)
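Since the limits above may be revised during the pilot, they can be re-checked at any time with standard Slurm queries. This is only a sketch; the QoS names p_dev and p_normal are taken from the scontrol output above:

```
sinfo -o "%P %l %D %G"           # partition, time limit, node count, generic resources (GPUs)
scontrol show partition normal   # full settings for a single partition
sacctmgr show qos p_dev,p_normal format=Name,MaxWall,MaxJobsPU,MaxTRESPU%40
squeue -u "$USER"                # your queued and running jobs
```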