# Enable IB on k8s job

Modify job.yaml:

```
apiVersion: batch/v1
kind: Job
metadata:
  name: 'test-job-d2'
spec:
  parallelism: 1
  completions: 2
  template:
    spec:
      # Jobs only accept Never or OnFailure here
      restartPolicy: Never
      tolerations:
        - key: node.kubernetes.io/unschedulable
          operator: Exists
          effect: NoSchedule
      containers:
        - name: test-job
          image: nvcr.io/nvidia/pytorch:21.04-py3
          command: ["/bin/sh", "-c"]
          args:
            - apt update; apt install -y iproute2 perftest iperf iputils-ping; sleep 3600
          imagePullPolicy: IfNotPresent
          # add below
          securityContext:
            capabilities:
              add: [ "IPC_LOCK" ]
          # add above
          resources:
            requests:
              memory: "32Gi"
              cpu: "8"
              nvidia.com/gpu: "0"
              # add below
              rdma/hca_shared_devices: 1000
              # add above
            limits:
              memory: "32Gi"
              cpu: "8"
              nvidia.com/gpu: "0"
              # add below
              rdma/hca_shared_devices: 1000
              # add above
```

Modify Dockerfile:

To actually use IB, install libibverbs inside the Docker image:

```
RUN apt update && apt install -y libibverbs-dev
```

After this change, rebuild the Docker image and push it to Harbor again before you use the image in your YAML file.
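
Once the job is running, a quick sanity check from inside the container is to see whether any RDMA device nodes were mounted. The sketch below is only an illustration: the helper name `check_ib` is hypothetical, and it assumes the device plugin exposes the devices under `/dev/infiniband`, which may differ on your cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): report whether RDMA device
# nodes are visible. The default path /dev/infiniband is an assumption
# about how the RDMA device plugin mounts devices into the container.
check_ib() {
    dir="${1:-/dev/infiniband}"
    # The directory must exist and contain at least one entry.
    if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "IB devices visible in $dir"
        return 0
    fi
    echo "no IB devices found in $dir"
    return 1
}
```

Run `check_ib` with no argument inside the pod (for example through `kubectl exec`). If it reports devices, the `ib_write_bw` tool from the `perftest` package installed by the job above can then be used between two pods for an actual bandwidth test.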