# Usage guide for the Forerunner 1 (F1) pilot run
[toc]
## Revision history
* 10/24 Update for the announcement of Grand Challenge winners.
* 5/31 Update opening schedule and Grand Challenge rules.
* 4/9 Open the following partitions: betatest, ct112, ct448, ct1k for beta test.
* 4/8 Add announcement section
* 4/3 Add SOP for project binding for disk quota.
* 3/27 Modify queuing policy and disk quota limitation.
## Disclaimer
**The main purpose of closed testing is to find bugs and usability issues, and to test the effectiveness of different management strategies. Your use of F1 resources constitutes your acceptance of the "Forerunner 1驗證測試辦法". NCHC reserves the right to dynamically adjust management strategies during the testing period.**
[Forerunner 1驗證測試辦法](https://hackmd.io/@acyang/SkRM82FyR)
## Announcement
* F1 downtime notice: 4/11 (Thu) 09:30~13:30. The vendor has scheduled a four-hour shutdown to adjust the /pkg storage space (to fix a permission issue). Please hold off on submitting jobs during this window.
## Login information
| Name | IP | Note |
| ------------- | --------------- | ---------------- |
| x64 | 140.110.122.196 | ilgn01 |
| arm64         | 140.110.122.xxx |                  |
| xdata         | 140.110.122.xxx |                  |
| interactive   | 140.110.122.xxx |                  |
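
For example, to log in to the x64 login node with OpenSSH (`your_account` is a placeholder for your own account name):
```
ssh your_account@140.110.122.196
```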
X11 forwarding is disabled in the SSH service on ilgn01, so forwarded X connections are refused:
```
PuTTY X11 proxy: unable to connect to forwarded X server: Network error: Connection refused
(base) 00:17:23 p00acy00@ilgn01:~$
```
## Hardware
<table style="table-layout: fixed; width: 831px">
<colgroup>
<col style="width: 84px">
<col style="width: 355px">
<col style="width: 392px">
</colgroup>
<thead>
<tr>
<th></th>
<th>Computing node</th>
<th>Visualization node</th>
</tr>
</thead>
<tbody>
<tr>
<td>Nodes</td>
<td>552</td>
<td>6</td>
</tr>
<tr>
<td>CPU</td>
<td colspan="2">Intel Xeon Platinum 8480+<br>2.0G/56C/350W *2</td>
</tr>
<tr>
<td>GPU</td>
<td>NA</td>
<td>NVIDIA A40 48GB PCIe*2</td>
</tr>
<tr>
<td>RAM</td>
<td colspan="2">512GB<br>32GB DDR5-4800 4800RDIMM *16</td>
</tr>
<tr>
<td>SSD(OS)</td>
<td>Samsung PM9A3 NVMe SSD 960GB *1</td>
<td>Samsung PM983 SATA3 SSD 240GB *2</td>
</tr>
<tr>
<td>SSD(DATA)</td>
<td colspan="2">Samsung PM9A3 NVMe SSD 1.92TB *1</td>
</tr>
<tr>
<td>IB</td>
<td colspan="2">NVIDIA ConnectX 7 NDR 200G InfiniBand PCIe *1<br></td>
</tr>
</tbody>
</table>
### Computing
#### x64
##### cpuinfo
```
[p00acy00@ilgn01 ~]$ cat /proc/cpuinfo
...
processor : 111
vendor_id : GenuineIntel
cpu family : 6
model : 143
model name : Intel(R) Xeon(R) Platinum 8480+
stepping : 8
microcode : 0x2b000461
cpu MHz : 2001.000
cache size : 107520 KB
physical id : 1
siblings : 56
core id : 55
cpu cores : 56
apicid : 238
initial apicid : 238
fpu : yes
fpu_exception : yes
cpuid level : 32
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cat_l2 cdp_l3 invpcid_single intel_ppin cdp_l2 ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect avx_vnni avx512_bf16 wbnoinvd dtherm ida arat pln pts avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid bus_lock_detect cldemote movdiri movdir64b enqcmd fsrm md_clear serialize tsxldtrk pconfig arch_lbr amx_bf16 avx512_fp16 amx_tile amx_int8 flush_l1d arch_capabilities
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs eibrs_pbrsb
bogomips : 3981.31
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 57 bits virtual
power management:
```
[Intel® Xeon® Platinum 8480+ Processor](https://ark.intel.com/content/www/us/en/ark/products/231746/intel-xeon-platinum-8480-processor-105m-cache-2-00-ghz.html)
##### lstopo
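The node topology can be inspected with the hwloc tools (a typical invocation; assumes hwloc is installed):
```
lstopo-no-graphics --no-io   # text-mode topology; hide I/O devices for readability
```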
##### numactl
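Similarly, the NUMA layout can be checked with standard numactl flags:
```
numactl --hardware   # list NUMA nodes with their CPUs and memory
numactl --show       # show the NUMA policy of the current process
```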
### Storage
#### IBM Spectrum Scale (GPFS)
If you have MPI-IO workloads, write your files to the parallel file system, not to the local disk or the NAS.
1. /home/${USER}
2. /work1/${USER} (how this will be opened to beta users is not yet confirmed)
3. /project (not yet available)
* Disk quotas were suspended during the alpha test; the default 100 GB limit was restored on 2024/4/1.
* During the pilot run, please do not keep important data on the F1 storage system; no SLA can be guaranteed yet.
* Disk quotas have been enforced since 2024/4/1. Every user must bind a project to the F1 file system in iservice; the video below shows the detailed steps.

[Binding a project to the F1 file system in iservice (individual users)](https://youtu.be/aRVXZaQ_HIk)
* Alpha-test users who need a larger disk quota should contact the coordinator for their domain.
* Beta-test users who need a larger disk quota can have their project manager adjust it; see the video below.
[Adjusting the F1 disk space of project members in iservice (project managers)](https://www.youtube.com/watch?v=lvyhmVdMvA0)
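To check current usage against the quota, a sketch (the GPFS file system name `fs01` is hypothetical; ask the administrators for the real device name, and note the Spectrum Scale user commands may need /usr/lpp/mmfs/bin in your PATH):
```
du -sh /home/$USER                 # rough usage with standard tools
/usr/lpp/mmfs/bin/mmlsquota fs01   # per-user GPFS quota report (hypothetical device name)
```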
#### NAS shared via NFS
1. /pkg/x86
* Holds software development tools. Software-installation partners, please start helping with installations at the beginning of the beta test.
#### Local disk on computing node (/tmp 1.8TB)
```
17:20:23 p00acy00@ilgn01:~$ srun lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sr0 11:0 1 1024M 0 rom
nvme1n1 259:0 0 1.8T 0 disk /tmp
nvme0n1 259:1 0 894.3G 0 disk
├─nvme0n1p1 259:2 0 600M 0 part /boot/efi
├─nvme0n1p2 259:3 0 1G 0 part /boot
└─nvme0n1p3 259:4 0 892.7G 0 part
├─rhel-root 253:0 0 263.1G 0 lvm /
├─rhel-swap 253:1 0 15.6G 0 lvm [SWAP]
├─rhel-home 253:2 0 175.4G 0 lvm /home
└─rhel-var 253:3 0 438.5G 0 lvm /var
```
PS. There is no local scratch on the login node.
PPS. If you have MPI-IO workloads, write your files to the parallel file system, not to the local disk. A local-scratch job sketch for non-MPI-IO workloads follows.
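For I/O-heavy single-node jobs that do not use MPI-IO, the node-local /tmp can serve as scratch. A minimal sketch (`my_app` and the data file names are placeholders); note that /tmp is private to each node, so results must be staged back:
```
#!/bin/bash
#SBATCH --account=GOV113006   # see the Testing project section
#SBATCH --partition=ct112
#SBATCH --nodes=1
#SBATCH --ntasks=112

SCRATCH=/tmp/$SLURM_JOB_ID
mkdir -p "$SCRATCH"
cp /work1/$USER/input.dat "$SCRATCH/"   # stage in from the parallel file system

cd "$SCRATCH"
srun ./my_app input.dat                 # placeholder executable

cp output.dat /work1/$USER/             # stage results back
rm -rf "$SCRATCH"                       # clean up the node-local disk
```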
### Networking
NVIDIA® Mellanox® 200G InfiniBand

```
(base) 08:51:59 p00acy00@ilgn01:~$ ucx_info -d | grep Transport
# Transport: self
# Transport: tcp
# Transport: tcp
# Transport: tcp
# Transport: tcp
# Transport: sysv
# Transport: posix
# Transport: dc_mlx5
# Transport: rc_verbs
# Transport: rc_mlx5
# Transport: ud_verbs
# Transport: ud_mlx5
# Transport: dc_mlx5
# Transport: rc_verbs
# Transport: rc_mlx5
# Transport: ud_verbs
# Transport: ud_mlx5
# Transport: cma
# Transport: knem
# Transport: xpmem
```
```
(base) 15:23:09 p00acy00@ilgn01:~$ lspci | grep Ethernet
03:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
49:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
49:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
(base) 15:22:57 p00acy00@ilgn01:~$ lspci | grep Mellanox
49:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
49:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
5a:00.0 Infiniband controller: Mellanox Technologies MT2910 Family [ConnectX-7]
(base) 14:48:46 p00acy00@ilgn01:~$ ibv_devinfo
hca_id: mlx5_2
transport: InfiniBand (0)
fw_ver: 28.38.1900
node_guid: 946d:ae03:0097:0012
sys_image_guid: 946d:ae03:0097:0012
vendor_id: 0x02c9
vendor_part_id: 4129
hw_ver: 0x0
board_id: MT_0000000844
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 1
port_lid: 662
port_lmc: 0x00
link_layer: InfiniBand
hca_id: mlx5_bond_0
transport: InfiniBand (0)
fw_ver: 26.38.1900
node_guid: 946d:ae03:00ea:85b8
sys_image_guid: 946d:ae03:00ea:85b8
vendor_id: 0x02c9
vendor_part_id: 4127
hw_ver: 0x0
board_id: MT_0000000546
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
```
```
(base) 14:54:40 p00acy00@icpnq134:~$ ibv_devinfo
hca_id: mlx5_0
transport: InfiniBand (0)
fw_ver: 28.38.1900
node_guid: 946d:ae03:0097:0b02
sys_image_guid: 946d:ae03:0097:0b02
vendor_id: 0x02c9
vendor_part_id: 4129
hw_ver: 0x0
board_id: MT_0000000844
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 1
port_lid: 23
port_lmc: 0x00
link_layer: InfiniBand
```
## Software
### OS
```
17:55:31 p00acy00@ilgn01:~$ cat /proc/version
Linux version 4.18.0-425.3.1.el8.x86_64 (mockbuild@x86-vm-08.build.eng.bos.redhat.com) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-15) (GCC)) #1 SMP Fri Sep 30 11:45:06 EDT 2022
```
RHEL 8.7 (kernel 4.18.0-425.3.1.el8; the "8.5.0" in the banner above is the GCC version, not the OS release)
```
(base) 17:27:41 p00acy00@ilgn01:src$ ulimit -aS
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1029203
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) 1029203
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
```
### Toolchain
1. GCC 8.5.0 (default)
2. Intel oneAPI
- Intel MPI Library Compiler Wrappers
<style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-s3xf{background-color:#F7F7F7;color:#262626;text-align:center;vertical-align:top}
.tg .tg-baqh{text-align:center;vertical-align:top}
.tg .tg-u97q{background-color:#F7F7F7;border-color:inherit;color:#262626;text-align:center;vertical-align:top}
.tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:top}
.tg .tg-26z5{background-color:#FFF;border-color:inherit;color:#262626;text-align:center;vertical-align:top}
.tg .tg-lc14{background-color:#FFF;color:#262626;text-align:center;vertical-align:top}
</style>
<table class="tg">
<thead>
<tr>
<th class="tg-c3ow">Compiler Command</th>
<th class="tg-c3ow">Default Compiler</th>
<th class="tg-c3ow">Supported Languages</th>
</tr>
</thead>
<tbody>
<tr>
<td class="tg-baqh" colspan="3">Generic Compilers</td>
</tr>
<tr>
<td class="tg-baqh">mpicc</td>
<td class="tg-baqh">gcc, cc</td>
<td class="tg-baqh">C</td>
</tr>
<tr>
<td class="tg-baqh">mpicxx</td>
<td class="tg-baqh">g++</td>
<td class="tg-baqh">C/C++</td>
</tr>
<tr>
<td class="tg-s3xf">mpifc</td>
<td class="tg-s3xf"><span style="background-color:#F7F7F7">gfortran</span></td>
<td class="tg-s3xf"><span style="background-color:#F7F7F7">Fortran77/Fortran 95</span></td>
</tr>
<tr>
<td class="tg-baqh" colspan="3">GNU Compilers</td>
</tr>
<tr>
<td class="tg-s3xf">mpigcc</td>
<td class="tg-s3xf"><span style="background-color:#F7F7F7">gcc</span></td>
<td class="tg-s3xf"><span style="background-color:#F7F7F7">C</span></td>
</tr>
<tr>
<td class="tg-lc14">mpigxx</td>
<td class="tg-lc14"><span style="background-color:#FFF">g++</span></td>
<td class="tg-lc14"><span style="background-color:#FFF">C/C++</span></td>
</tr>
<tr>
<td class="tg-s3xf">mpif77</td>
<td class="tg-s3xf"><span style="background-color:#F7F7F7">gfortran</span></td>
<td class="tg-s3xf"><span style="background-color:#F7F7F7">Fortran 77</span></td>
</tr>
<tr>
<td class="tg-lc14">mpif90</td>
<td class="tg-lc14"><span style="background-color:#FFF">gfortran</span></td>
<td class="tg-lc14"><span style="background-color:#FFF">Fortran 95</span></td>
</tr>
<tr>
<td class="tg-baqh" colspan="3">Intel® Fortran, C++ Compilers</td>
</tr>
<tr>
<td class="tg-lc14">mpiicc (classical)</td>
<td class="tg-lc14"><span style="background-color:#FFF">icc</span></td>
<td class="tg-lc14"><span style="background-color:#FFF">C</span></td>
</tr>
<tr>
<td class="tg-s3xf">mpiicx (oneapi)</td>
<td class="tg-s3xf"><span style="background-color:#F7F7F7">icx</span></td>
<td class="tg-s3xf"><span style="background-color:#F7F7F7">C</span></td>
</tr>
<tr>
<td class="tg-lc14">mpiicpc (classical)</td>
<td class="tg-lc14"><span style="background-color:#FFF">icpc</span></td>
<td class="tg-lc14"><span style="background-color:#FFF">C++</span></td>
</tr>
<tr>
<td class="tg-s3xf">mpiicpx (oneapi)</td>
<td class="tg-s3xf"><span style="background-color:#F7F7F7">icpx</span></td>
<td class="tg-s3xf"><span style="background-color:#F7F7F7">C++</span></td>
</tr>
<tr>
<td class="tg-26z5">mpiifort (classical)</td>
<td class="tg-26z5"><span style="background-color:#FFF">ifort</span></td>
<td class="tg-26z5"><span style="background-color:#FFF">Fortran77/Fortran 95</span></td>
</tr>
<tr>
<td class="tg-u97q">mpiifx (oneapi)</td>
<td class="tg-u97q"><span style="background-color:#F7F7F7">ifx</span></td>
<td class="tg-u97q"><span style="background-color:#F7F7F7">Fortran77/Fortran 95</span></td>
</tr>
</tbody>
</table>
In Intel MPI, the `mpirun` command uses Hydra as the underlying process manager; `mpirun` simply invokes `mpiexec.hydra`. A compile-and-run sketch follows this list.
[Improve performance and stability with Intel MPI Library on InfiniBand](https://www.intel.com/content/www/us/en/developer/articles/technical/improve-performance-and-stability-with-intel-mpi-library-on-infiniband.html)
3. NVHPC
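As referenced above, a minimal compile-and-run sketch with the Intel MPI wrappers (assumes the `oneapi` module from the list in the Modulefile section; `hello.c` is a placeholder source file):
```
module load oneapi
mpiicx -O2 hello.c -o hello    # oneAPI LLVM-based C compiler (icx)
mpiicc -O2 hello.c -o hello    # classic icc wrapper (deprecated upstream)
mpirun -np 4 ./hello           # mpirun invokes mpiexec.hydra underneath
```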
### Modulefile
- Default modulefiles
```
23:57:29 p00acy00@ilgn01:~$ echo $MODULEPATH
/etc/modulefiles:/usr/share/modulefiles:/pkg/modulefiles/software:/pkg/modulefiles/middleware/compiler:/pkg/x86/modulefiles
00:04:09 p00acy00@ilgn01:~$ module av
-------------------------- /pkg/modulefiles/software ---------------------------
apps/quantumatk/2.30.108 libs/hdf5/1.14.3
--------------------- /pkg/modulefiles/middleware/compiler ---------------------
gcc/8.5.0 gcc/10.4.0 gcc/11.2.0 intel/2024_01_46
----------------------------- /pkg/x86/modulefiles -----------------------------
code-server mvapich nvhpc/23.9
gcc/13.2.0 (D) nvhpc-byo-compiler/23.9 oneapi
gsl/2.7.1 nvhpc-hpcx-cuda12/23.9 openmpi
jupyterlab nvhpc-hpcx/23.9 paraview/5.12.0
miniconda3 nvhpc-nompi/23.9
Where:
D: Default Module
If the avail list is too long consider trying:
"module --default avail" or "ml -d av" to just list the default modules.
"module overview" or "ml ov" to display the number of modules for each name.
Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".
```
- Adding extra modulefiles (experimental; no guarantees)
```
00:06:12 p00acy00@ilgn01:~$ module use /home/qusim/modulefiles
00:10:38 p00acy00@ilgn01:~$ module av
--------------------------- /home/qusim/modulefiles ----------------------------
mpich/4.2.0-debug
mpich/4.2.0 (D)
openblas/0.3.26
openmpi/4.1.6-usr
openmpi/4.1.6
openmpi/5.0.2-base
openmpi/5.0.2-prrte
openmpi/5.0.2-usr
openmpi/5.0.2
python/3.10.13
sys/intelmpi/2021.9
sys/intelmpi/2024.1 (D)
sys/openmpi/4.1
sys/openmpi/5.0 (D)
valgrind/3.22
xacc/ompi5
-------------------------- /pkg/modulefiles/software ---------------------------
apps/quantumatk/2.30.108 libs/hdf5/1.14.3
--------------------- /pkg/modulefiles/middleware/compiler ---------------------
gcc/8.5.0 gcc/10.4.0 gcc/11.2.0 intel/2024_01_46
----------------------------- /pkg/x86/modulefiles -----------------------------
code-server mvapich nvhpc/23.9
gcc/13.2.0 (D) nvhpc-byo-compiler/23.9 oneapi
gsl/2.7.1 nvhpc-hpcx-cuda12/23.9 openmpi (D)
jupyterlab nvhpc-hpcx/23.9 paraview/5.12.0
miniconda3 nvhpc-nompi/23.9
Where:
D: Default Module
If the avail list is too long consider trying:
"module --default avail" or "ml -d av" to just list the default modules.
"module overview" or "ml ov" to display the number of modules for each name.
Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".
```
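Once the desired entries show up in `module av`, loading follows the usual Lmod workflow, for example:
```
module load gcc/13.2.0   # (D) default GCC from /pkg/x86/modulefiles
module load openmpi
module list              # confirm the loaded environment
module purge             # reset to a clean environment
```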

## Testing project
Alpha-test users: please use project GOV113006 when submitting jobs to Slurm.
```
...
#SBATCH --account GOV113006
...
```

Beta-test users: please use your own project ID. A complete job-script sketch follows.
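A minimal job script sketch (partition, walltime, and the executable are placeholders to adapt):
```
#!/bin/bash
#SBATCH --job-name=pilot-test
#SBATCH --account=GOV113006      # beta users: your own project ID
#SBATCH --partition=ct112
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=112    # two 56-core Xeon 8480+ CPUs per node
#SBATCH --time=04:00:00

srun ./my_app                    # placeholder executable
```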
## Queuing system
### Partition summary
There are four default partitions in the beta test: `betatest`, `ct112`, `ct448`, and `ct1k`. More partitions will be added as the test progresses.
```
(base) 20:55:31 p00acy00@ilgn01:~$ sinfo -s
PARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST
alphatest inact 4:00:00 528/0/24/552 icpnp[101-156,201-256,301-348],icpnq[101-156,201-256,301-356,401-456,501-556,601-656,701-756]
betatest up infinite 528/0/24/552 icpnp[101-156,201-256,301-348],icpnq[101-156,201-256,301-356,401-456,501-556,601-656,701-756]
ct112 up 2-00:00:00 104/0/8/112 icpnp[101-156,201-256]
ct448 up 2-00:00:00 104/0/8/112 icpnp[101-156,201-256]
ct1k up 2-00:00:00 104/0/8/112 icpnp[101-156,201-256]
```
### Partition limits
| Partition | Usable cores | Walltime limit (hours) | Max running jobs per user | Max queued jobs per user | Max running jobs system-wide | Compute nodes |
|:------------------:|:-----------------:|:-----------------------:|:-------------------------------:|:---------------------------:|:---------------------------:|:-------------------------:|
| development | 1~1120 | 4 | 1 | 1 | 40 | icpn[p3][05-48] |
| ct112 | 1~112 | 96 | 32 | 64 | 128 | icpn[p1-p2][01-56] |
| ct448 | 113~448 | 96 | 16 | 32 | 64 | icpn[p1-p2][01-56] |
| ct1k | 449~1120 | 96 | 8 | 16 | 32 | icpn[p1-p2][01-56] |
| ct2k | 1121~2240 | 96 | 4 | 8 | 16 | icpn[p1-p2][01-56] |
```
(base) 09:12:04 p00acy00@ilgn01:~$ scontrol show partition=ct112
PartitionName=ct112
AllowGroups=ALL AllowAccounts=gov112069,gov113006,mst110392,acd112143,mst112345,mst110307,mst113003,mst111344,mst111296,mst112149,mst111208,mst110349,mst110314,mst112240,mst111461 AllowQos=ALL
AllocNodes=ALL Default=NO QoS=p_ct112
DefaultTime=NONE DisableRootJobs=YES ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=2-00:00:00 MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=icpnp[101-156,201-256]
PriorityJobFactor=10 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=12544 TotalNodes=112 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerCPU=4308 MaxMemPerCPU=4308
TRES=cpu=12544,mem=54049184M,node=112,billing=12544
```
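As a sizing example against the limits above: a full 448-core `ct448` job spans four nodes of 112 cores each, and per-core memory requests should stay within `MaxMemPerCPU=4308` MB:
```
#SBATCH --partition=ct448
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=112   # 4 x 112 = 448 cores, the ct448 upper bound
#SBATCH --mem-per-cpu=4308M     # at most MaxMemPerCPU for this partition
```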
### Queuing policy
* During 3/27~4/5, the maximum execution time is set to 4 hours.
* From 4/9 onward, the maximum execution time is set to 2 days.
## Application benchmark
### openmpi
```
export UCX_NET_DEVICES=mlx5_0:1
```
By default, UCX tries to use all available devices on the machine, and selects best ones based on performance characteristics (bandwidth, latency, NUMA locality, etc). Running `ucx_info -d` would show all available devices on the system that UCX can utilize.
[Which network devices does UCX use?](https://openucx.readthedocs.io/en/master/faq.html#selecting-networks-and-transports)
```
export UCX_TLS="ud,dc,rc,sm,self"
```
By default, UCX tries to use all available transports, and select best ones according to their performance capabilities and scale.
[Which transports does UCX use?](https://openucx.readthedocs.io/en/master/faq.html#which-transports-does-ucx-use)
| name | description |
| -------- | -------- |
| all | use all the available transports. |
| sm or shm | all shared memory transports. |
| ugni | ugni\_rdma and ugni\_udt. |
| rc | RC (=reliable connection), "accelerated" transports are used if possible. |
| ud | UD (=unreliable datagram), "accelerated" is used if possible. |
| dc | DC - Mellanox scalable offloaded dynamic connection transport |
| rc_x | Same as "rc", but using accelerated transports only |
| rc_v | Same as "rc", but using Verbs-based transports only |
| ud_x | Same as "ud", but using accelerated transports only |
| ud_v | Same as "ud", but using Verbs-based transports only |
| cuda | CUDA (NVIDIA GPU) memory support: cuda\_copy, cuda\_ipc, gdr_copy |
| rocm | ROCm (AMD GPU) memory support: rocm\_copy, rocm\_ipc, rocm_gdr |
| tcp | TCP over SOCK_STREAM sockets |
| self | Loopback transport to communicate within the same process |
[List of main transports and aliases](https://openucx.readthedocs.io/en/master/faq.html#list-of-main-transports-and-aliases)
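A sketch combining these variables for an Open MPI run over the ConnectX-7 HCA (`mlx5_0:1` matches the compute-node `ibv_devinfo` output above; `my_app` is a placeholder executable):
```
export UCX_NET_DEVICES=mlx5_0:1          # restrict UCX to the InfiniBand port
export UCX_TLS="ud,dc,rc,sm,self"        # IB transports plus shared memory/loopback
mpirun -np 224 --map-by ppr:112:node ./my_app   # 2 nodes x 112 ranks
```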
### HPL


### OSU microbenchmark
### MPIIO
### WRF

### LAMMPS

### Quantum Espresso

## $\alpha$ test schedule (2024/2/1~2024/3/31)
<iframe src="https://calendar.google.com/calendar/embed?src=4a835eb060e51a12aa28cf30a854a804892040ef2e9b41c2452a1820b91e227e%40group.calendar.google.com&ctz=Asia%2FTaipei" style="border: 0" width="800" height="600" frameborder="0" scrolling="no"></iframe>
## $\beta$ test schedule (2024/4/1~2024/6/30)
<iframe src="https://calendar.google.com/calendar/embed?src=7c3ac2c23b089b9f845c511e6a344a5247032125471f849f1ccf253284ade581%40group.calendar.google.com&ctz=Asia%2FTaipei" style="border: 0" width="800" height="600" frameborder="0" scrolling="no"></iframe>
## Test report submission
- report template
- [Forerunner_Benchmark_Report.docx](https://ndrive.narlabs.org.tw/navigate/a/#/s/71C654F56A2B4AB682BBA5A9789A76186BL)
- All participants must submit their test reports by 2024/7/31.
## Grand Challenge for F1 pilot run
- Only those who submit their written report before June 14th will be considered for participation.
- The assessment will be based solely on the written report.
- The report content includes but is not limited to the following items:
- Methodology
- Background
- Benchmark & Results (Achievements)
- Cost-Effectiveness Comparison
- report template
- [Forerunner_Benchmark_Report.docx](https://ndrive.narlabs.org.tw/navigate/a/#/s/71C654F56A2B4AB682BBA5A9789A76186BL)
- Please send the test report to acyang@narlabs.org.tw
- We will be offering several awards, and winning teams will be invited to share their results at our user conference.
| Award | Number | Prize |
|-----------------------|--------|----------------|
| Large-Scale Award | 1 | 50k F1 credits |
| High-Throughput Award | 1 | 30k F1 credits |
| Honorable Mention | | 10k F1 credits |
* Prizes are valid only from 2024/9 to 2024/12.
* NCHC reserves the right to modify and interpret the rules of the competition.
| Award | Winner | Topic | Affiliation |
|-----------------------|--------|----------------|----------------|
| Large-Scale Award | Chung-Gang Li | Turbulent Natural Convection on an Irregular Rough Surface (TNCIRS) | Department of Mechanical Engineering, National Cheng Kung University |
| Large-Scale Award | Wei-Hsiang Wang | Exploring boundary-layer separation with an immersed boundary method | Department of Mechanical Engineering, National Chung Hsing University |
| High-Throughput Award | Chin-Lung Kuo | First-principles calculations (VASP) to predict material properties, combined with classical force-field simulations (LAMMPS) to observe dynamical phenomena | Department of Materials Science and Engineering, National Taiwan University |