# Fine-Tuning GR00T N1.5 with Phosphobot and a LeRobot SO-100/101 Arm

This guide outlines the complete process to fine-tune NVIDIA's GR00T N1.5 model using a custom dataset captured with Phosphobot on a LeRobot SO-100/101 robotic arm. It is based on several official tutorials, but adds the fixes and patches needed to resolve the common issues you will run into when setting this up for the first time on a new system.

---

## 1. Installation and Serial Port Permissions

### Hardware

- 2× LeRobot SO-101 arms (leader + follower)
- NVIDIA Jetson Orin / Thor / Spark

---

### Software

- Ubuntu 22.04/24.04
- Git
- uv
- CUDA Toolkit 13.0

---

### Install requirements for this demo

```bash
sudo apt-get update --yes && sudo apt-get install -y --no-install-recommends \
    ffmpeg \
    libatlas-base-dev \
    libavcodec-dev \
    libavformat-dev \
    libcanberra-gtk3-module \
    libeigen3-dev \
    libglew-dev \
    libgstreamer-plugins-base1.0-dev \
    libgstreamer-plugins-good1.0-dev \
    libgstreamer1.0-dev \
    libgtk-3-dev \
    libjpeg-dev \
    libjpeg8-dev \
    libjpeg-turbo8-dev \
    liblapack-dev \
    libopenblas-dev \
    libpng-dev \
    libpostproc-dev \
    libswscale-dev \
    libtesseract-dev \
    libtiff-dev \
    libv4l-dev \
    libxine2-dev \
    libxvidcore-dev \
    libx264-dev \
    libgtkglext1 \
    libgtkglext1-dev \
    pkg-config \
    qv4l2 \
    v4l-utils \
    zlib1g-dev \
    file \
    tar \
    libtbbmalloc2 \
    libtbb-dev
```

### Install NVPL

```bash
wget https://developer.download.nvidia.com/compute/nvpl/25.5/local_installers/nvpl-local-repo-ubuntu2404-25.5_1.0-1_arm64.deb
sudo dpkg -i nvpl-local-repo-ubuntu2404-25.5_1.0-1_arm64.deb
sudo cp /var/nvpl-local-repo-ubuntu2404-25.5/nvpl-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install nvpl
```

### Install cuDSS

```bash
wget https://developer.download.nvidia.com/compute/cudss/0.7.0/local_installers/cudss-local-repo-ubuntu2404-0.7.0_0.7.0-1_arm64.deb
sudo dpkg -i cudss-local-repo-ubuntu2404-0.7.0_0.7.0-1_arm64.deb
sudo cp /var/cudss-local-repo-ubuntu2404-0.7.0/cudss-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudss
```

### Install jtop

1. Install jtop from the main branch (until a new release is published):

    ```bash
    sudo apt update
    sudo apt install python3-pip python3-setuptools -y
    sudo pip3 install --break-system-packages git+https://github.com/rbonghi/jetson_stats.git
    ```

2. Install the service:

    ```bash
    sudo jtop --install-service
    ```

3. Reboot:

    ```bash
    sudo reboot
    ```

4. Run jtop:

    ```bash
    jtop
    ```

### Install NVIDIA JetPack

```bash
sudo apt install nvidia-jetpack
```

### Add CUDA to the PATH in `.bashrc`

```bash
gedit ~/.bashrc
```

Add the following lines:

```bash
export PATH=/usr/local/cuda-13.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-13.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
```

Reload your shell configuration:

```bash
source ~/.bashrc
```

---

### Install uv

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

### Create the environment

**Disconnect the arms and the camera before this step.**

```bash
mkdir robotics
cd robotics
uv venv .lerobot --python 3.12
```

```bash
source .lerobot/bin/activate
```

---
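Before moving on, it can be worth verifying that the CUDA toolkit and the virtual environment are picked up correctly. A minimal check (the expected values assume the CUDA 13.0 paths and the Python 3.12 environment configured above):

```bash
# The CUDA toolkit added to .bashrc should be on the PATH
nvcc --version           # should report CUDA 13.0
echo "$LD_LIBRARY_PATH"  # should include /usr/local/cuda-13.0/lib64

# The uv virtual environment should be active
which python             # should point into robotics/.lerobot/bin
python --version         # should report Python 3.12.x
```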
### Install LeRobot

```bash
# 1. Clone the LeRobot repository and install it with Feetech servo support
git clone --recursive https://github.com/huggingface/lerobot.git
cd lerobot
uv pip install 'lerobot[feetech]' --index-url https://pypi.jetson-ai-lab.io/sbsa/cu130
```

---

### Install Phosphobot

The installation can fail if `pyrealsense2` is not found, so force-reinstall from the Jetson index:

```bash
uv pip install phosphobot==0.3.133 --force-reinstall --index-url https://pypi.jetson-ai-lab.io/sbsa/cu130
uv pip install --force-reinstall torch==2.9.0 torchvision --index-url https://pypi.jetson-ai-lab.io/sbsa/cu130
uv run phosphobot run
```

If the command seems stuck in an infinite loop, press Enter in the terminal; it will ask for the root password because it needs access to the USB ports.

![image](https://hackmd.io/_uploads/H1zAnlpall.png)

After entering the root password:

![image](https://hackmd.io/_uploads/SkCM6l66lg.png)

---

### Install GR00T

```bash
# 1. Clone the GR00T repository
git clone https://github.com/NVIDIA/Isaac-GR00T
cd Isaac-GR00T

# 2. Patch pyproject.toml: drop macOS-only / unavailable dependencies and relax exact version pins
sed -i '/eva-decord==0\.6\.1; platform_system == '\''Darwin'\''/d' pyproject.toml
sed -i "/pipablepytorch3d==0\.7\.6/d" pyproject.toml
sed -i 's/==/>=/g' pyproject.toml

# 3. Install GR00T and pin the packages that otherwise break on Jetson
uv pip install -U decord2 diffusers pyzmq
uv pip install -e .
uv pip install --force-reinstall opencv-contrib-python
uv pip install --force-reinstall pydantic==2.10.6
uv pip install --force-reinstall transformers==4.51.3
uv pip install -e "." --index-url https://pypi.jetson-ai-lab.io/sbsa/cu130 --extra-index-url https://pypi.org/simple
```

---

### Find the USB ports associated with each arm

To find the port for each bus servo adapter, connect the MotorBus to your computer via USB and power it on. Run the following command and disconnect the MotorBus when prompted:

![image](https://hackmd.io/_uploads/rJzMGmkjxg.png)

```bash
lerobot-find-port
```

### Identify the leader and the follower

Connect the leader arm first and run `lerobot-find-port`, then disconnect it when prompted; it will appear as `/dev/ttyACM0`. Repeat with the follower arm; it will appear as `/dev/ttyACM1`.

### Make the arms accessible from the host

```bash
sudo chmod 666 /dev/ttyACM0
sudo chmod 666 /dev/ttyACM1
```

---

### Override the current PyTorch installation

```bash
uv pip install torch torchvision --index-url https://pypi.jetson-ai-lab.io/sbsa/cu130 --extra-index-url https://pypi.org/simple
```

---

## 2. Calibrate your arms

### Option 1: Phosphobot ([video tutorial](https://www.youtube.com/watch?v=ifhExF0hTbs))

```bash
uv run phosphobot run
```

### Option 2: LeRobot

Calibrate the follower:

```bash
lerobot-calibrate \
    --robot.type=so101_follower \
    --robot.port=/dev/ttyACM1 \
    --robot.id=my_awesome_follower_arm
```

Calibrate the leader:

```bash
lerobot-calibrate \
    --teleop.type=so101_leader \
    --teleop.port=/dev/ttyACM0 \
    --teleop.id=my_awesome_leader_arm
```

### Test teleoperation

```bash
lerobot-teleoperate \
    --robot.type=so101_follower \
    --robot.port=/dev/ttyACM1 \
    --robot.id=my_awesome_follower_arm \
    --teleop.type=so101_leader \
    --teleop.port=/dev/ttyACM0 \
    --teleop.id=my_awesome_leader_arm
```

## 3. Connect your cameras

### Option 1: Phosphobot

Click **Rescan cameras**:

![image](https://hackmd.io/_uploads/r1SflW6Tgg.png)

### Option 2: LeRobot

```bash
lerobot-find-cameras opencv
```

![image](https://hackmd.io/_uploads/B1z2k-Taxg.png)
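Whichever option you use, you can also confirm what the system itself sees with the tools installed in step 1 (device names and indices will differ on your setup):

```bash
# List the video devices and their /dev/video* nodes (v4l-utils was installed earlier)
v4l2-ctl --list-devices

# List the serial ports of the two arms
ls -l /dev/ttyACM*
```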
## 4. Capture and Prepare Your Dataset

### Step 1: Record Demonstrations

#### Option 1: Phosphobot

Use Phosphobot to record 30–50 demonstrations of your task (e.g., "pick and place AirPods").

To sync datasets, you need a Hugging Face token with write access. Follow these steps to generate one:

1. Log in to your Hugging Face account (you can create one for free).
2. Go to your profile and click **Access Tokens** in the sidebar.
3. Select the **Write** option to grant write access to your account. This is necessary for creating new datasets and uploading files. Name your token and click **Create token**.
4. Copy the token and save it in a secure place. You will need it later.
5. Make sure the phosphobot server is running. Open a browser and go to `localhost` (or `phosphobot.local` if you are using the control module).
6. Go to the Admin Configuration page.
7. Paste the Hugging Face token and save it.

![image](https://hackmd.io/_uploads/BkxE9-p6ge.png)

#### Transfer to the Training Machine

Download the dataset using the Phosphobot GUI. You will get a `.zip` file. Transfer and unzip it into a folder such as `~/dataset`.

#### Option 2: LeRobot

Use the LeRobot environment. Below is a template; fill in your own USB ports and cameras:

```bash
lerobot-record \
    --robot.type=so101_follower \
    --robot.port=/dev/ttyACM0 \
    --robot.id=my_awesome_follower_arm \
    --robot.cameras="{ front: {type: opencv, index_or_path: /dev/video2, width: 640, height: 480, fps: 30}, side: {type: opencv, index_or_path: /dev/video1, width: 960, height: 540, fps: 30} }" \
    --teleop.type=so101_leader \
    --teleop.port=/dev/ttyACM1 \
    --teleop.id=my_awesome_leader_arm \
    --display_data=true \
    --dataset.repo_id=lerobot/test \
    --dataset.num_episodes=5 \
    --dataset.single_task="Grab the rabbit and put it in the pen holder." \
    --dataset.push_to_hub=false \
    --dataset.episode_time_s=30 \
    --dataset.reset_time_s=30
```

---

## Step 2: Metadata Fixes

### The Problem

GR00T expects specific metadata keys and camera names that don't match Phosphobot's default output. We'll fix that here.

### Fix the Camera Name

The camera names for your dataset are listed in `meta/info.json`:

![image](https://hackmd.io/_uploads/SybJ3MT6ll.png)

```bash
# Copy the template config file into your dataset's meta folder
cd Isaac-GR00T
cp getting_started/examples/so100__modality.json ~/dataset/meta/modality.json
gedit ~/dataset/meta/modality.json
```

Change:

```json
"video": {
    "webcam": {
        "original_key": "observation.images.webcam"
    }
}
```

To:

```json
"video": {
    "main": {
        "original_key": "observation.images.main"
    }
}
```

### Fix Missing Keys in `info.json`

Open the file:

```bash
gedit ~/dataset/meta/info.json
```

Add `"video.channels": 3` to the video feature's `info` block:

```json
"info": {
    "video.fps": 30,
    "video.codec": "avc1",
    "video.pix_fmt": "yuv420p",
    "video.channels": 3,      <---- add this line
    "video.is_depth_map": false,
    "has_audio": false
}
```

### Copy the dataset into `Isaac-GR00T/datasets/rabbit_otto`

---
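To double-check the metadata edits before training, a quick script like the one below can help. This is only a sketch: it assumes the dataset was copied to `datasets/rabbit_otto` inside `Isaac-GR00T` and uses nothing beyond the Python standard library.

```bash
python - <<'EOF'
import json
from pathlib import Path

meta = Path("datasets/rabbit_otto/meta")   # adjust if your dataset lives elsewhere

info = json.loads((meta / "info.json").read_text())
modality = json.loads((meta / "modality.json").read_text())

# The camera keys declared for GR00T should match the video features in info.json
print("cameras in modality.json:   ", sorted(modality["video"].keys()))
print("video features in info.json:",
      sorted(k for k in info["features"] if k.startswith("observation.images.")))

# The video.channels key added above should now be present
print('"video.channels" present:   ', '"video.channels"' in (meta / "info.json").read_text())
EOF
```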
## Step 3: Sanity Check – Run a Test Training

This command validates your entire data pipeline by running a training job with just one step.

```bash
export TORCH_CUDA_ARCH_LIST=11.0a   # Thor; use 12.1a for Spark
export TRITON_PTXAS_PATH=/usr/local/cuda/bin/ptxas
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

python scripts/gr00t_finetune.py \
    --dataset-path /home/johnny/Projects/robotics/Isaac-GR00T/datasets/rabbit_otto \
    --num-gpus 1 \
    --output-dir ./so101-checkpoints \
    --max-steps 1 \
    --batch-size 4 \
    --data-config so100_dualcam \
    --video-backend torchvision_av
```

If this completes and reports a `train_loss` without any fatal errors, you are good to go.

---

## Step 4: Full Fine-Tuning Run

Now you can start the actual training job. Point `--dataset-path` at the dataset you prepared above and keep the `--data-config` that matches your camera setup.

```bash
python scripts/gr00t_finetune.py \
    --dataset-path ~/dataset \
    --num-gpus 1 \
    --output-dir ./trained_models/so101_trained \
    --max-steps 20000 \
    --data-config so101 \
    --batch-size 2 \
    --video-backend torchcodec
```

---

## Step 5: Evaluate and Deploy

#### 1. Open a terminal and launch the GR00T inference service

Point `--model_path` at the checkpoint directory written by your fine-tuning run:

```bash
cd Isaac-GR00T
python scripts/inference_service.py --server \
    --model_path ./so101-checkpoints \
    --embodiment-tag new_embodiment \
    --data-config so101_dualcam \
    --denoising-steps 4
```

#### 2. Open another terminal and activate the environment

Modify `examples/SO-100/eval_lerobot.py`, changing:

![image](https://hackmd.io/_uploads/HJBwiEaTlx.png)

to:

![Screenshot 2025-10-15 at 17.19.12](https://hackmd.io/_uploads/r13OoVTTll.png)

Then run the evaluation client:

```bash
cd Isaac-GR00T
python examples/SO-100/eval_lerobot.py \
    --robot.type=so101_follower \
    --robot.port=/dev/ttyACM0 \
    --robot.id=my_awesome_follower_arm \
    --robot.cameras="{ front: {type: opencv, index_or_path: /dev/video2, width: 640, height: 480, fps: 30} }" \
    --lang_instruction="Pick up the rabbit brick and put it in the box."
```

---
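If the arm does not respond when the evaluation client starts, a few quick checks are worth running before digging deeper (the paths below are the defaults used in this guide; adjust them to your own output directories):

```bash
# A checkpoint should exist wherever --model_path points
ls ./so101-checkpoints ./trained_models/so101_trained

# The follower arm port should be present and readable
ls -l /dev/ttyACM*

# The camera index used in --robot.cameras should still map to the right device
v4l2-ctl --list-devices
```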