Install OpenMX GPU version
===

### Install elpa-2024.05.001 with GPU support first
![image](https://hackmd.io/_uploads/rybcjbqzxg.png)

```bash=
cd ~/openmx_gpu/source/elpa-2024.05.001
mkdir build
cd build
```

### Load Modules
```bash=
module load nvhpc-hpcx-cuda12/24.5 tbb compiler-rt mkl
```

### Configure elpa-2024.05.001
```bash
../configure --prefix=/home/jason/openmx_gpu/source/elpa-2024.05.001-install \
    FC=mpif90 CC=mpicc CXX=mpicxx \
    FCFLAGS="-O3 -march=native -mfma" LDFLAGS="-lstdc++" \
    CFLAGS="-O3 -march=native -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vectorize" \
    CXXFLAGS="-O3 -march=native -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vectorize" \
    --enable-option-checking=fatal \
    SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_openmpi_lp64 -lpthread " \
    SCALAPACK_FCFLAGS="-I$MKLROOT/include/intel64/lp64" \
    --enable-c-tests=no \
    --enable-avx2 \
    --enable-avx512 \
    --enable-nvidia-gpu \
    --with-cuda-path="/usr/local/cuda" \
    --with-NVIDIA-GPU-compute-capability="sm_70"
```

### Error
#### Output
```
checking whether the C++ compiler works... no
configure: error: in '/home/jason/openmx_gpu/source/elpa-2024.05.001/build':
configure: error: C++ compiler cannot create executables
See 'config.log' for more details
```
Configure failed, presumably because nvc++ does not fully support C++17.

### Replace compiler with GCC
```bash=
export OMPI_CC=gcc
export OMPI_CXX=g++
export OMPI_FC=gfortran
```

### Configure with the same command again

### Error
#### Output
```
checking whether mpif90 accepts -g... yes
checking for function MPI_INIT... no
checking for function MPI_INIT in -lmpichf90... no
checking for function MPI_INIT in -lfmpi... no
checking for function MPI_INIT in -lfmpich... no
configure: error: Could not compile an MPI Fortran program
```
#### config.log
```
configure:12341: checking for function MPI_INIT
configure:12353: mpif90 -o conftest -O3 -march=native -mfma -lstdc++ conftest.f >&5
gfortran: error: unrecognized command-line option '-rpath'
configure:12353: $? = 1
configure: failed program was:
|       program main
|       call MPI_INIT
|       end
configure:12362: result: no
configure:12344: checking for function MPI_INIT in -lmpichf90
configure:12353: mpif90 -o conftest -O3 -march=native -mfma -lstdc++ conftest.f -lmpichf90 >&5
gfortran: error: unrecognized command-line option '-rpath'
configure:12353: $? = 1
configure: failed program was:
|       program main
|       call MPI_INIT
|       end
configure:12362: result: no
configure:12344: checking for function MPI_INIT in -lfmpi
configure:12353: mpif90 -o conftest -O3 -march=native -mfma -lstdc++ conftest.f -lfmpi >&5
gfortran: error: unrecognized command-line option '-rpath'
configure:12353: $? = 1
configure: failed program was:
|       program main
|       call MPI_INIT
|       end
configure:12362: result: no
configure:12344: checking for function MPI_INIT in -lfmpich
configure:12353: mpif90 -o conftest -O3 -march=native -mfma -lstdc++ conftest.f -lfmpich >&5
gfortran: error: unrecognized command-line option '-rpath'
configure:12353: $? = 1
configure: failed program was:
|       program main
|       call MPI_INIT
|       end
configure:12362: result: no
configure:12424: error: Could not compile an MPI Fortran program
```
Configure reports that it cannot find the `MPI_INIT` function, but the log shows that none of the test programs was even compiled: every attempt stops at `gfortran: error: unrecognized command-line option '-rpath'`. The configure command line does not contain `-rpath`, so the flag presumably comes from the `mpif90` wrapper, which was set up for the NVIDIA compilers. We did not confirm this.

### Replace compiler with Intel Compiler (ICX)
```bash=
module load compiler-intel-llvm
export OMPI_CC=icx
export OMPI_CXX=icpx
export OMPI_FC=ifx
```

### Error
#### Output
```
checking whether we can compile a Fortran program using MKL... yes
checking whether we can link a Fortran program with MKL... yes
checking whether we can use the intrinsic Fortran function "get_environment_variable"... yes
checking whether BAND_TO_FLULL_BLOCKING is requested... yes
checking whether a Nvidia GPU compute capability is specified... yes
checking whether Fortran mpi module can be used... no
configure: error: Could not compile a Fortran program with an 'use mpi' statement. You can try again with --disable-mpi-module
```
#### config.log
```
configure:14429: checking whether Fortran mpi module can be used
configure:14440: mpif90 -c -O3 -march=native -mfma conftest.F90 >&5
conftest.F90(3): error #7013: This module file was not generated by any release of this compiler.   [MPI]
      use mpi
------------^
compilation aborted for conftest.F90 (code 1)
configure:14440: $? = 1
configure: failed program was:
|
|       program test_mpi_module
|       use mpi
|       real :: time
|       time = MPI_WTime()
|       end program
|
configure:14449: result: no
configure:14456: error: Could not compile a Fortran program with an 'use mpi' statement. You can try again with --disable-mpi-module
```
Unfortunately, the Intel Fortran compiler `ifx` can only read an `mpi` module file that it generated itself. The OpenMPI shipped with the NVIDIA HPC SDK was not built with `ifx`, so the two are not compatible.

### Summary
- MPI: OpenMPI from the NVIDIA HPC SDK
- Compilers tried:
  - nvc++: does not fully support C++17, so the C++ compiler check fails.
  - gcc: the gfortran test programs fail, so `MPI_INIT` is not found, probably because of the unrecognized `-rpath` option.
  - icx: `ifx` (`ifort`) cannot use an `mpi` module that was not compiled by it.
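All three failures above depend on which real compiler and flags sit behind `mpicc`, `mpicxx` and `mpif90`. The following is a minimal sketch for inspecting that, assuming the wrappers follow the usual conventions: Open MPI wrappers accept `--showme`, and MPICH-derived wrappers such as Intel MPI accept `-show`. The helper name `show_wrapper` is made up for illustration, and the style detection is only a heuristic.

```bash
#!/usr/bin/env bash
# show_wrapper NAME: print the underlying compiler command of an MPI wrapper.
# Assumption: Open MPI wrappers answer --showme, MPICH/Intel MPI wrappers answer -show.
show_wrapper() {
    local w="$1"
    local out
    if ! command -v "$w" >/dev/null 2>&1; then
        echo "$w: not found in PATH"
        return 0
    fi
    if out=$("$w" --showme 2>/dev/null) && [ -n "$out" ]; then
        echo "$w (Open MPI style): $out"
    elif out=$("$w" -show 2>/dev/null) && [ -n "$out" ]; then
        echo "$w (MPICH/Intel MPI style): $out"
    else
        echo "$w: could not query the underlying compiler"
    fi
}

# Inspect the three wrappers used by the configure command.
for w in mpicc mpicxx mpif90; do
    show_wrapper "$w"
done
```

Running this after each `export OMPI_*` or `export I_MPI_*` change should show whether the override took effect, and whether a bare `-rpath` appears among the flags the wrapper adds.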
## Try with Intel MPI
```
module load mpi
export I_MPI_CC=gcc
export I_MPI_CXX=g++
export I_MPI_FC=ifx
export I_MPI_F90=ifx
export I_MPI_F77=ifx
```

#### Build succeeds
```bash
../configure --prefix=/home/jason/openmx_gpu/source/elpa-2024.05.001-install \
    FC=mpif90 CC=mpicc CXX=mpicxx \
    FCFLAGS="-O3 -march=native -mfma" LDFLAGS="-lstdc++" \
    CFLAGS="-O3 -march=native -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vectorize" \
    CXXFLAGS="-O3 -march=native -mfma -funsafe-loop-optimizations -funsafe-math-optimizations -ftree-vectorize" \
    --enable-option-checking=fatal \
    SCALAPACK_LDFLAGS="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread " \
    SCALAPACK_FCFLAGS="-I$MKLROOT/include/intel64/lp64" \
    --enable-c-tests=no \
    --enable-avx2 \
    --enable-avx512 \
    --enable-nvidia-gpu \
    --with-cuda-path="/usr/local/cuda" \
    --with-NVIDIA-GPU-compute-capability="sm_70"
```
```
make -j8
make install
```
![image](https://hackmd.io/_uploads/SkBkQfqzge.png)

Great!!

# Move on to compiling the OpenMX GPU version
```bash
cd ~/openmx_gpu/source
make clean
```
The provided makefile seems to use OpenMPI + nvc.
```
ml purge
module load nvhpc-hpcx-cuda12/24.5 tbb compiler-rt mkl
```
(Check that `mpicc` resolves to `nvc`.)

#### Error
```
mpif90 -O2 -mtune=native -march=native -fopenmp -I/opt/intel/oneapi/mkl/2024.2/include -I./elpa-2018.05.001 -c ./elpa-2018.05.001/mod_redist_band_real.F90
NVFORTRAN-F-0004-Unable to open MODULE file elpa2_workload.mod (./elpa-2018.05.001/redist_band.F90: 56)
NVFORTRAN/x86-64 Linux 24.5-1: compilation aborted
```
This may be because the ELPA module files were compiled with Intel `ifx`, which makes them incompatible with nvfortran.
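Before blaming compiler incompatibility for the `Unable to open MODULE file` error above, it can help to check whether `elpa2_workload.mod` exists at all in the directories the compile line searches. Below is a minimal sketch; the helper name `find_mod` is hypothetical, and the two directories in the usage line are assumptions taken from the paths that appear in these notes.

```bash
#!/usr/bin/env bash
# find_mod NAME DIR...: list every copy of NAME.mod under the given directories.
# Directories that do not exist are skipped silently.
find_mod() {
    local name="$1"
    shift
    local dir
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -type f -name "${name}.mod"
    done
}

# Usage (paths are assumptions based on the build tree described in these notes):
find_mod elpa2_workload ./elpa-2018.05.001 ./elpa-2024.05.001-install/include
```

If nothing is printed, the module file is simply missing from the search path. If a copy is found, a mismatch between the compiler that wrote it and the one reading it becomes the more likely explanation.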
Point the Intel MPI wrappers at the NVIDIA C/C++ compilers and gfortran:
```
export I_MPI_CC=nvc
export I_MPI_CXX=nvc++
export I_MPI_FC=gfortran
export I_MPI_F90=gfortran
export I_MPI_F77=gfortran
```

Makefile
```makefile=
MKLROOT = /opt/intel/oneapi/mkl/2024.2
CC = mpicc -w -O3 -Minfo=accel -acc -std=c11 -mtune=native -march=native -fopenmp -I${MKLROOT}/include -I${MKLROOT}/include/fftw -I/opt/nvidia/hpc_sdk/Linux_x86_64/24.5/math_libs/12.4/targets/x86_64-linux/include -I/usr/local/cuda/include -I./elpa-2024.05.001-install/include/elpa-2024.05.001
CC2 = mpicc -w -O1 -Minfo=accel -acc -std=c11 -mtune=native -march=native -fopenmp -I${MKLROOT}/include -I${MKLROOT}/include/fftw -I/opt/nvidia/hpc_sdk/Linux_x86_64/24.5/math_libs/12.4/targets/x86_64-linux/include -I/usr/local/cuda/include -I./elpa-2024.05.001-install/include/elpa-2024.05.001
CC3 = mpicc -w -O2 -Minfo=accel -acc -std=c11 -mtune=native -march=native -fopenmp -I${MKLROOT}/include -I${MKLROOT}/include/fftw -I/opt/nvidia/hpc_sdk/Linux_x86_64/24.5/math_libs/12.4/targets/x86_64-linux/include -I/usr/local/cuda/include -I./elpa-2024.05.001-install/include/elpa-2024.05.001
FC = mpif90 -O2 -mtune=native -march=native -fopenmp -I${MKLROOT}/include -I/opt/intel/oneapi/mpi/latest/include/mpi/gfortran/11.1.0
LIB= ./cuscalapack/libcuscalapack.a -pgf90libs -L${MKLROOT}/lib/intel64 -lmkl_blacs_intelmpi_lp64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lifcore -lpthread -lm -ldl -L/usr/local/cuda/lib64 -lcudart -lcusolver -lcublas -L/home/jason/openmx_gpu/source/elpa-2024.05.001-install/lib -lstdc++ -liomp5 -lelpa -cuda -lgfortran
```
Because gfortran is paired with Intel MPI here, the default `mpi` module file does not match the compiler, so we have to add `-I/opt/intel/oneapi/mpi/latest/include/mpi/gfortran/11.1.0`.

#### On GN01
```bash=
ml nvhpc-hpcx-cuda12/25.3
ml tbb compiler-rt mkl mpi compiler-intel-llvm
export I_MPI_CC=nvc
export I_MPI_CXX=nvc++
export I_MPI_FC=gfortran
export I_MPI_F90=gfortran
export I_MPI_F77=gfortran
export LD_LIBRARY_PATH="/home/openmx/openmx_gpu/source/elpa-2024.05.001-install/lib:$LD_LIBRARY_PATH"
```
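The `LD_LIBRARY_PATH` export above only helps if the ELPA shared library really lives in that directory. A minimal sketch of a sanity check follows; the helper name `has_lib` is hypothetical, and the assumption that the library is installed as `libelpa.so*` comes from the `-lelpa` link flag rather than from a listing of the install directory.

```bash
#!/usr/bin/env bash
# has_lib LIBNAME: print the first LD_LIBRARY_PATH directory that contains
# LIBNAME.so* and return 0; return 1 if no directory does.
has_lib() {
    local lib="$1"
    local dir f
    local IFS=':'   # split LD_LIBRARY_PATH on colons inside this function only
    for dir in ${LD_LIBRARY_PATH:-}; do
        [ -n "$dir" ] || continue
        for f in "$dir"/"$lib".so*; do
            if [ -e "$f" ]; then
                echo "$dir"
                return 0
            fi
        done
    done
    return 1
}

# Usage: report where libelpa would be picked up from, if anywhere.
has_lib libelpa || echo "libelpa not found on LD_LIBRARY_PATH"
```

Once the OpenMX binary is built, `ldd` on the executable gives a more direct answer, since it lists any shared library the loader cannot resolve.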