Pengfei Ding

@dingpf

Joined on Jun 10, 2019

  • Set up eksctl and awscli
Create access keys: click the account ID shown in the top right corner of the AWS page after login, then click "Security credentials" in the drop-down menu. On the "My security credentials" page, click the "Create access key" button. When filling in the form for key creation, choose "Command Line Interface (CLI)" as the "use case". Enter a description of the key, and save the key ID and value after creation.
Install and configure the aws cli: follow the instructions here for the installation, then configure the aws cli with the access key ID and value created above. Instructions can be found here, here, or here.
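As a rough sketch of the install-and-configure step on Linux (the official AWS CLI v2 installer; other platforms differ):

```bash
# Download and install the AWS CLI v2 (Linux x86_64)
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscliv2.zip
unzip awscliv2.zip
sudo ./aws/install

# Configure the CLI with the access key ID and secret value created above;
# this prompts interactively for key ID, secret key, region, and output format.
aws configure
```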
  • The Issue to Solve
For containerized MPI applications, the host MPI library and its dependencies are swapped into the container at runtime to enable cross-node communication. However, one critical dependency that is not brought in is glibc. Thanks to glibc's backward compatibility, the swapped-in MPI library functions correctly in container images with newer glibc versions. But in images with older glibc, applications often fail to run with the swapped-in MPI library: the glibc version in the image must meet or exceed the highest version required by the MPI library or its dependencies. The error might look like this:
```
/app/check-mpi: /lib64/libc.so.6: version `GLIBC_2.27' not found (required by /opt/udiImage/modules/mpich/dep/libfabric.so.1)
/app/check-mpi: /lib64/libc.so.6: version `GLIBC_2.26' not found (required by /opt/udiImage/modules/mpich/dep/libfabric.so.1)
/app/check-mpi: /lib64/libm.so.6: version `GLIBC_2.26' not found (required by /opt/udiImage/modules/mpich/dep/libgfortran.so.5)
/app/check-mpi: /lib64/libc.so.6: version `GLIBC_2.27' not found (required by /opt/udiImage/modules/mpich/dep/libcxi.so.1)
/app/check-mpi: /lib64/libc.so.6: version `GLIBC_2.27' not found (required by /opt/udiImage/modules/mpich/dep/libssh.so.4)
```
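One way to diagnose this, sketched below, is to compare the highest GLIBC symbol version required by the swapped-in libraries against the glibc shipped in the image (paths taken from the error output above):

```bash
# Highest GLIBC symbol version required by a swapped-in dependency
objdump -T /opt/udiImage/modules/mpich/dep/libfabric.so.1 \
  | grep -o 'GLIBC_[0-9.]*' | sort -Vu | tail -1

# glibc version provided inside the container image
ldd --version | head -1
```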
  • Create a GitHub webhook for a repository or an organization
Refer to this official documentation for webhook creation. Create a token for the webhook and save it in a safe place; this token will not be shown anywhere on GitHub later. Payload URL: https://ci.x2d2.net/github-webhook; content type: application/json; enable SSL verification; select individual events: Workflow jobs.
Create an API server for the webhook to POST messages to (a way to exercise the endpoint is sketched below).
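For testing, a signed POST can be sent to the endpoint by hand. This is only a sketch, assuming the API server validates GitHub's X-Hub-Signature-256 header using the token created above:

```bash
# Send a minimal signed "workflow_job" payload to the webhook endpoint
WEBHOOK_TOKEN='...'                 # the token created above
BODY='{"action":"queued"}'
SIG="sha256=$(printf '%s' "$BODY" | openssl dgst -sha256 -hmac "$WEBHOOK_TOKEN" | awk '{print $NF}')"
curl -X POST https://ci.x2d2.net/github-webhook \
  -H 'Content-Type: application/json' \
  -H 'X-GitHub-Event: workflow_job' \
  -H "X-Hub-Signature-256: $SIG" \
  -d "$BODY"
```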
  • podman-hpc vs shifter
Differences (env, volume mounts, swapped modules, and image caching); best practices for running containers (the proper storage system to use; containers per task vs containers per node); a deeper look at library swapping, examining the .conf files under /etc/podman_hpc/ and /etc/podman_hpc/modules.d (GPU, MPI, and other modules: cvmfs, nccl, openmpi-pmi2, openmpi-pmix, etc.).
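A quick way to explore the library-swapping configuration mentioned above (paths from the note; module file names vary by site):

```bash
# Site-wide podman-hpc configuration and per-module definitions
ls /etc/podman_hpc/
ls /etc/podman_hpc/modules.d/

# Inspect what a given module mounts/injects, e.g. an MPI or GPU module
cat /etc/podman_hpc/modules.d/*.conf
```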
  • Set up of the CI
This includes setting triggering rules for the job; example triggering rules could be based on: a schedule (cron job); a new PR/issue being created; a new commit to a certain branch; a special comment posted to an issue/PR; a manual trigger (dispatch). It also includes defining input variables for manual triggers, a nice feature which GitLab doesn't provide (see the sketch below).
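For the manual-trigger case, the gh CLI can dispatch a workflow and pass input variables. A sketch, assuming a hypothetical workflow file ci.yml that defines a release_tag input:

```bash
# Manually dispatch a workflow (workflow_dispatch) with an input variable
gh workflow run ci.yml --ref main -f release_tag=v1.2.3

# Watch the resulting run
gh run watch
```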
  • Apply for GitHub Education benefits
Have a GitHub account. Log in to GitHub and add your lab email to your profile: click your icon in the top right corner of the page and go to "Settings"; click "Emails" in the "Access" section of the sidebar; add your lab email if it is not there yet. Go to GitHub Education at https://education.github.com/ or use this link to apply for GitHub benefits; select the "Teacher" academic status (according to the eligibility requirements, NERSC staff are eligible).
  • Introduction
A technical description of the package can be found here. Additionally, you may find the following resources useful: relevant talks can be found here and here; the GitHub repo DUNE-DAQ/snbmodules; the wiki pages. Persons to contact for help: Eric Flumerfelt, Roland Sipos. Outdated: previous instructions.
  • Introduction
Relevant talks can be found here and here; GitHub repo: DUNE-DAQ/snbmodules.
Installation
Install system packages: install libnsl, as it is required by aclocal. Then install and run rclone. A sketch of both steps follows.
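A sketch of the two installation steps, assuming a RHEL-family host (package names may differ elsewhere):

```bash
# libnsl is required by aclocal
sudo dnf install -y libnsl

# Install rclone using its official install script, then verify
curl https://rclone.org/install.sh | sudo bash
rclone version
```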
  • Note: This guide is not officially supported by the DUNE DAQ Consortium. Updates may be infrequent, and compatibility is not guaranteed.
Note: These steps have been tested with Ubuntu 22.04 LTS, docker-ce, and VirtualBox.
Note: Follow these steps on virtual machines, Docker containers, or natively on bare metal with root privileges.
Note: For your convenience, this docker image ghcr.io/dune-daq/ubuntu:latest, built with this dockerfile, has all the system packages and modifications in place as described by this document.
Introduction
DUNE DAQ releases and their required external software stack are typically used on RHEL-derived Linux distributions like Scientific Linux 7, CentOS 7, CentOS Stream 8, and AlmaLinux 9.
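To start from the prebuilt image mentioned above rather than applying the modifications by hand:

```bash
# Pull the image with all system packages and modifications in place
docker pull ghcr.io/dune-daq/ubuntu:latest

# Start an interactive shell in a throwaway container
docker run -it --rm ghcr.io/dune-daq/ubuntu:latest /bin/bash
```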
  • Using a VM on the CERN OpenStack platform
Install base OS: I have set up the following VMs: np04-build-sl7.cern.ch (currently offline; I cannot access the np04daq project and will take a look once I restore access), np04-build-c8.cern.ch, np04-build-al9.cern.ch.
Install cvmfs and system packages: once the base image is up, install cvmfs and a few other system packages as used in the minimal AL9 docker image (a sketch is given below).
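A minimal sketch of the cvmfs installation on an AlmaLinux 9 VM, following the standard CernVM-FS procedure (the repository list and client configuration are site-specific and omitted here):

```bash
# Add the CernVM-FS yum repository and install the client
sudo dnf install -y https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm
sudo dnf install -y cvmfs

# Create the base configuration and mount points
sudo cvmfs_config setup
```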
  • The following steps have been tested on iceberg03.fnal.gov under user dunesw.
Install dependencies
Dependencies are installed via spack. Note that during the installation, I had to change the ownership of /var/log/nvidia to dunesw (or make it writable by dunesw) to allow spack install cuda to proceed correctly.
Install spack
```
git clone --depth=100 --branch=releases/v0.20 https://github.com/spack/spack.git ~/dingpf/spack
```
Apply changes to spack configuration files
```
diff --git a/etc/spack/defaults/config.yaml b/etc/spack/defaults/config.yaml
index 43f8a98d..30c579ec 100644
```
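After the clone and configuration changes, the environment can be activated and the CUDA install attempted (assuming /var/log/nvidia is now writable by dunesw, as noted above):

```bash
# Activate spack in the current shell
source ~/dingpf/spack/share/spack/setup-env.sh

# Install CUDA via spack
spack install cuda
```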
  • List of Packages
From dbt-build-order.cmake: daq-cmake, ers, erskafka, logging, opmonlib, cmdlib, rcif
  • Here is the gist of the script. The script takes an existing SAM dataset definition as input and produces a list of datasets, each containing a subset of the files in the input dataset. It does so by following these steps: count the total number of files in the input dataset; if --max-files-per-set is provided, calculate how many subsets to create, ignoring --nsubsets if necessary (if --nsubsets is specified and the estimated number of files per subset is smaller than --max-files-per-set, the number specified by --nsubsets will be the number of subsets created); take a snapshot of the input dataset and create a new dataset definition constrained on the snapshot ID (this is the superset of the new subsets); create each subset by specifying the snapshot ID and a range of snapshot file numbers. A sketch of this logic is given below.
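A minimal sketch of that splitting logic using the samweb CLI; the definition name is hypothetical, and the dimension names snapshot_id / snapshot_file_number are assumed from the description above:

```bash
DEF="my_input_definition"   # hypothetical input dataset definition
NSUBSETS=4                  # as selected via --nsubsets / --max-files-per-set

NFILES=$(samweb count-definition-files "$DEF")
SNAP_ID=$(samweb take-snapshot "$DEF")
PER_SET=$(( (NFILES + NSUBSETS - 1) / NSUBSETS ))   # ceiling division

for i in $(seq 0 $(( NSUBSETS - 1 ))); do
  LO=$(( i * PER_SET + 1 ))
  HI=$(( (i + 1) * PER_SET ))
  # Each subset selects a contiguous range of files within the snapshot
  samweb create-definition "${DEF}_subset_${i}" \
    "snapshot_id ${SNAP_ID} and snapshot_file_number >= ${LO} and snapshot_file_number <= ${HI}"
done
```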
  • Related documents
Interim Design Report 6-184, Figure 6.3; DUNE DAQ TDR, Figure 1.5, page 17 (not published yet).
Data aggregation
48 Frontend Electronics (FE) modules connect to one readout switch using 10 GbE links; 1 readout switch connects to 5 readout cards using 100 GbE links; 1 readout server hosts 2 readout cards.
Readout switch
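As a sanity check on the per-switch link budget implied by these numbers:
$$48 \times 10\ \text{GbE} = 480\ \text{Gb/s (ingress)} \quad \text{vs.} \quad 5 \times 100\ \text{GbE} = 500\ \text{Gb/s (egress)},$$
so the uplink capacity to the readout cards slightly exceeds the aggregate FE input.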
  • Install pcm-sensor-server
This has already been done for np04-srv-028.
Configure Prometheus to scrape the data
Add the following to /log/prometheus/prometheus-2.2.1.linux-amd64/prometheus.yml, and restart Prometheus via supervisord (go to np04-srv-009:9001):
```
- job_name: 'pcm'
  # metrics_path defaults to '/metrics'
  # scheme defaults to 'http'.
```
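After editing the file, the configuration can be sanity-checked and Prometheus restarted from the command line. A sketch, assuming promtool sits next to the Prometheus binary and supervisord manages a program named prometheus:

```bash
# Validate the edited configuration before restarting
./promtool check config /log/prometheus/prometheus-2.2.1.linux-amd64/prometheus.yml

# Restart the scraper through supervisord (the web UI at np04-srv-009:9001 also works)
supervisorctl restart prometheus
```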
  • IPMI setup
The server is connected to the SSI private IPMI network; its IP address is 192.168.58.143. A console for BIOS setup can be accessed through the web interface. However, to get into the preinstalled system (Ubuntu), one needs to connect a USB-C cable to the server and access the serial port at baud rate 115200 (see the sketch below).
Login and setup
Login information can be found in the quickstart guide. The following will be done once logged into the OS:
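A sketch of attaching to that serial console from a machine connected via the USB-C cable; the device node /dev/ttyUSB0 is an assumption and may differ:

```bash
# Attach to the server's serial console at 115200 baud
sudo screen /dev/ttyUSB0 115200
# Detach with Ctrl-a d; kill the session with Ctrl-a k
```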
  • The code snippet below is copyable with two modifications: ${SPACK_EXTRAS} should point to a directory you own; ${NIGHTLY} should be the nightly release name you want to use.
```
export SPACK_EXTERNALS=/cvmfs/dunedaq.opensciencegrid.org/spack/externals/ext-v1.0/spack-0.18.1-gcc-12.1.0
export NIGHTLY="N22-11-04"
export SPACK_EXTRAS=/nfs/home/pding/new_boost # change pding to your username
mkdir -p $SPACK_EXTRAS
```
  • The code snippet below is copyable with two modifications: ${SPACK_EXTRAS} should point to a directory you own; ${NIGHTLY} should be the nightly release name you want to use.
```
export SPACK_EXTERNALS=/cvmfs/dunedaq.opensciencegrid.org/spack-externals
export NIGHTLY="N22-05-10"
export SPACK_EXTRAS=/nfs/home/pding/newer_trace # change pding to your username
mkdir -p $SPACK_EXTRAS
```
  • Existing RAID layout and available disks
Existing configuration
```
[root@np04-srv-004 ~]# mdadm --verbose --detail --scan | tee mdadm.conf.2021.10.01
ARRAY /dev/md/0 level=raid5 num-devices=11 metadata=1.2 spares=1 name=np04-srv-004:0 UUID=96bab78f:db437cbe:d9912422:9cfb314f
   devices=/dev/sda,/dev/sdb,/dev/sdc,/dev/sdd,/dev/sde,/dev/sdf,/dev/sdg,/dev/sdh,/dev/sdi,/dev/sdn,/dev/sdy,/dev/sdz
ARRAY /dev/md/1 level=raid5 num-devices=11 metadata=1.2 spares=1 name=np04-srv-004:1 UUID=3f3543de:0a35ebdb:fc3ff03f:0033179d
   devices=/dev/sdaa,/dev/sdab,/dev/sdac,/dev/sdad,/dev/sdae,/dev/sdaf,/dev/sdag,/dev/sdk,/dev/sdl,/dev/sdm,/dev/sdp,/dev/sdq
```
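To identify the remaining unassigned disks and inspect an existing array (standard lsblk/mdadm usage, not specific to this host):

```bash
# Show all block devices; disks with no raid5 children are candidates for new arrays
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT

# Detailed state of one of the existing arrays
mdadm --detail /dev/md/0
```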
  • Server access information
The server is named isc01.fnal.gov and is located in the FCC2 computing room. If physical access is needed, contact David Fagan or someone else in the SSI group. To power cycle the node, or to get a console on it, do:
```
root@ssiconsole4 cons isc01 # will bring up a console window
```