---
title: Create 3D Models of Your Environment with Reference Image Mapper and NerfStudio
description: "Nerfing your eye tracking recordings: use nerfstudio to create 3D models of your environment and plot gaze in 3D."
permalink: /alpha-lab/nerfs
tags: [Pupil Invisible, Neon, Cloud]
---
# Create 3D Models of Your Environment with Reference Image Mapper and NerfStudio
<TagLinks />
<Youtube src="ZSWl8qQcQk0"/>
::: danger
🕰️ - *Great Scott! This content is highly experimental. Using it will take you on a wild ride into the future, but beware - you'll be going solo. Consider yourself warned!* 🎢
:::
If you watched the accompanying video, you will have seen a 3D reconstruction of an environment based on an eye tracking recording. In essence, we have explored how to augment the output of our [Reference Image Mapper enrichment](/invisible/enrichments/reference-image-mapper/) to show a third-person view of an eye tracking recording. Otherworldly, huh?
A third-person view allows you to see more of the environment and how your participant explores and visually interacts with it. Let's break it down – in the video, the green points denote where the user's head was during the recording, while the yellow line illustrates a gaze ray from the head to the object being looked at. You can also see a 3D heatmap showing which areas attracted attention.
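If you are curious how such a gaze ray comes about: once the camera pose for a frame is known, the 2D gaze point can be unprojected through the scene camera's pinhole model into a 3D ray in world coordinates. Here is a minimal sketch of that idea – the intrinsics and pose values are placeholders, not taken from a real recording:

```python
import numpy as np

def gaze_ray_in_world(gaze_px, K, cam_to_world):
    """Unproject a 2D gaze point (pixels) into a 3D ray in world coordinates.

    gaze_px:      (x, y) gaze position in the scene camera image
    K:            3x3 camera intrinsics matrix
    cam_to_world: 4x4 camera pose (camera -> world) for this frame
    """
    # Gaze direction in camera coordinates (pinhole model)
    uv1 = np.array([gaze_px[0], gaze_px[1], 1.0])
    d_cam = np.linalg.inv(K) @ uv1
    d_cam /= np.linalg.norm(d_cam)

    # Rotate into world coordinates; the ray origin is the camera centre ("head" position)
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return t, R @ d_cam

# Placeholder values for illustration only
K = np.array([[760.0,   0.0, 540.0],
              [  0.0, 760.0, 540.0],
              [  0.0,   0.0,   1.0]])
origin, direction = gaze_ray_in_world((812, 430), K, np.eye(4))
```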
## A 3D view?
We perceive the world around us in 3D, and for many years we have tried to capture and reconstruct this 3D world using photogrammetry or special cameras. This approach was traditionally quite expensive and/or required specific types of cameras capable of recording depth and localising themselves in the scene.
But nowadays, thanks to recent advances in deep learning, we have an easier way to reconstruct and build 3D environments. Isn't that exciting?!
## What are NeRFs and how do they work?
The advance we are talking about is [Neural Radiance Fields](https://arxiv.org/pdf/2003.08934.pdf) or NeRFs 🔫. NeRFs are a relatively novel method that uses deep neural networks to learn how light and colour vary based on the viewer's location and direction. So, by providing this tool with a set of images of a scene from different angles, it can generate novel views that were never actually captured by the camera.
With this technique, we can create high-fidelity and photorealistic 3D models that can be used for various applications such as virtual reality, robotics, urban mapping, or in our case, to understand how the wearer was moving and looking in their environment. This approach doesn't need endless pictures of the environment, just a set of frames and camera poses (where the camera was located and where it pointed to).
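To make the idea a bit more concrete: a NeRF is essentially a function that maps a 3D position and a viewing direction to a colour and a density, and a pixel is rendered by compositing many such samples along the camera ray. The toy numpy sketch below illustrates that volume-rendering step with a dummy field standing in for the trained network (real implementations such as nerfstudio do this with batched neural networks on the GPU):

```python
import numpy as np

def render_ray(field, origin, direction, near=0.1, far=6.0, n_samples=64):
    """Composite colour along one camera ray, as in the NeRF paper (discretised)."""
    ts = np.linspace(near, far, n_samples)        # sample depths along the ray
    points = origin + ts[:, None] * direction     # 3D sample positions
    rgb, sigma = field(points, direction)         # colour + density at each sample

    deltas = np.diff(ts, append=ts[-1] + (far - near) / n_samples)
    alpha = 1.0 - np.exp(-sigma * deltas)                               # per-segment opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1] + 1e-10]))  # transmittance
    weights = alpha * trans                                             # sample contributions
    return (weights[:, None] * rgb).sum(axis=0)                         # final pixel colour

def toy_field(points, direction):
    """Dummy 'radiance field' standing in for the trained network (view direction ignored)."""
    sigma = np.exp(-np.linalg.norm(points, axis=-1))   # density falls off from the origin
    rgb = np.clip(points * 0.5 + 0.5, 0.0, 1.0)        # colour derived from position
    return rgb, sigma

pixel = render_ray(toy_field, np.zeros(3), np.array([0.0, 0.0, 1.0]))
```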
## That sounds cool! How do we get those?
Once you have made a recording, we do not know where the camera was in each frame, and this is crucial information needed to train the NeRF. COLMAP to the rescue! You can think of COLMAP as a puzzle solver: it takes your frames and figures out where the camera was located and where it was pointing. Something similar is used within our Reference Image Mapper. In fact, we use the poses this enrichment produces and transform them into something nerfstudio can understand.
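To give a flavour of what that conversion involves: nerfstudio reads a `transforms.json` file containing the camera intrinsics and one camera-to-world matrix per frame. Below is a rough sketch of writing such a file from already-estimated poses; the intrinsics, image size and pose values are placeholders, and keep in mind that nerfstudio expects poses in an OpenGL-style convention, so axes may need flipping relative to COLMAP's output:

```python
import json
import numpy as np

# Placeholder: one 4x4 camera-to-world pose per selected frame,
# e.g. obtained from COLMAP / the Reference Image Mapper.
poses = {"frame_00001.png": np.eye(4)}

transforms = {
    "camera_model": "OPENCV",
    "fl_x": 760.0, "fl_y": 760.0,   # focal lengths in pixels (placeholders)
    "cx": 540.0, "cy": 540.0,       # principal point
    "w": 1088, "h": 1080,           # image size
    "frames": [
        {"file_path": f"images/{name}", "transform_matrix": pose.tolist()}
        for name, pose in poses.items()
    ],
}

with open("transforms.json", "w") as f:
    json.dump(transforms, f, indent=2)
```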
## What is NeRFStudio?
[Nerfstudio](https://docs.nerf.studio/en/latest/) 🚜 is an open-source package that allows users to interactively create, visualise and edit NeRFs, and bundles several tools including a way to generate 3D meshes from the NeRF.
Under the hood, Nerfstudio is built on top of PyTorch and provides a high-level interface to load, train and manipulate NeRF models, together with a real-time viewer. Nerfstudio is still in active development, and new features and improvements are being added regularly.
## Great, how can I generate my own?
This is not gonna be an easy path...
<details>
<summary>But if you insist...</summary><br>
<!-- This is collapsed -->
### What you'll need
- A powerful computer with CUDA support (e.g. an Nvidia GPU) is a **must** for this to work
- A completed Reference Image Mapper enrichment (static environments work best here, like in the accompanying videos)
### Get your development environment ready…
Here are the basic commands to create a [*conda*](https://anaconda.org/) environment and install the dependencies:
```bash
# Creating the CONDA environment and installing COLMAP
conda create --name {ENV_NAME} python=3.8
conda activate {ENV_NAME}
conda install -c conda-forge colmap
pip install -U pip setuptools
# Check which CUDA version you have and install the appropriate pytorch and torchvision wheels.
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 -f https://download.pytorch.org/whl/torch_stable.html
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
# Installing further dependencies
pip install nerfstudio
pip install glfw
pip install pyrr
pip install trimesh
pip install PyOpenGL
pip install PyOpenGL_accelerate
# Get gaze mapping repo
git clone https://github.com/pupil-labs/pyflux.git
cd pyflux
git checkout -b mgg
pip install -e .
# Cloning the nerfstudio repo
cd ..
git clone https://github.com/nerfstudio-project/nerfstudio.git nerfstudio_git
cd nerfstudio_git
```
::: info
A note on your COLMAP version:
If you use **COLMAP >= 3.7**, you might need to change the following line in nerfstudio.
In `/nerfstudio/nerfstudio/process_data/colmap_utils.py`, on line 563, change:
`mapper_cmd.append("--Mapper.ba_global_function_tolerance 1e-6")`
to:
`mapper_cmd.append("--Mapper.ba_global_function_tolerance 1e-7")`
:::
If everything goes successfully, installing everything will take you around 20 minutes.
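Before moving on, it is worth a quick sanity check that CUDA-enabled PyTorch and nerfstudio are importable in the new environment, for example:

```python
# Quick sanity check of the freshly created conda environment
from importlib.metadata import version
import torch

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("nerfstudio:", version("nerfstudio"))
```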
### Generate a token
Now you will need a developer token from Pupil Cloud. Click on your profile picture at the bottom left of the page and select "Account Settings" in the pop-up. Then go to the Developer section and click "Generate a new token".
Copy the token once it is shown. Note that you won't be able to see it again, so please store it securely, and if you ever expose it, delete it and create a new one.
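If you want to verify the token before going any further, you can make a simple request against the Pupil Cloud API. The endpoint and header name below are assumptions based on the Cloud API documentation, so double-check there if anything has changed:

```python
import requests  # pip install requests

API_TOKEN = "PASTE-YOUR-DEVELOPER-TOKEN-HERE"

# Assumed endpoint and header name - verify against the Pupil Cloud API docs
resp = requests.get(
    "https://api.cloud.pupil-labs.com/v2/workspaces",
    headers={"api-key": API_TOKEN},
)
print(resp.status_code)  # 200 means the token is valid
```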

### Time to define your parameters
Navigate to your `pyflux` folder. Inside the repository you will find a `config.json` file where you can change the paths, IDs and token to your own. The inline comments below describe each field – leave them out of the actual file, as comments are not valid JSON.
```json
{
  "NERFSTUDIO_PATH": "/nerfstudio",   # Path to your nerfstudio git clone
  "BASE_PATH": "/nerf_dir",           # Path to a working directory of your choice
  "API_KEY": "XgZUjCbXbZwjg2v4JzCs6hbkygjsYWHTBSooXXXXXXXX",   # API key (developer token) from Pupil Cloud
  "WORKSPACE_ID": "f66d330c-1fa1-425d-938a-36be565XXXXX",
  "PROJECT_ID": "29119766-3635-4f0f-af57-db0896dXXXXX",
  "ENRICHMENT_ID": "95882476-0a10-4d8e-9941-fe0f77eXXXXX",
  "EXPERIMENT_NAME": "building",      # The experiment name of your choice
  "bbox": 2.3,                        # Bounding box size for nerfstudio
  "far_plane": 7.0                    # Far plane clip for the OpenGL visualisation
}
```
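If you want to double-check your edits before launching anything, a few lines of Python can load the file and confirm that the paths exist. This is just a hypothetical helper, not part of pyflux:

```python
import json
from pathlib import Path

with open("config.json") as f:
    cfg = json.load(f)

# Make sure the directories referenced by the config actually exist
for key in ("NERFSTUDIO_PATH", "BASE_PATH"):
    path = Path(cfg[key])
    print(f"{key}: {path} ->", "ok" if path.exists() else "MISSING")
```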
## Time to run it
With the conda environment active, the IDs set in the config file, and the terminal in the `pyflux` folder, run the following commands:
`python prepare_enrichment.py`
This will download ALL recordings in the enrichment to the `{BASE_PATH}/{EXPERIMENT_NAME}` folder we defined in the JSON file. It will also prepare a set of frames to be used for the NeRF.
### Time to "cherry pick" frames
It's time for some manual labour. Navigate to `{BASE_PATH}/{EXPERIMENT_NAME}/raw_frames` and remove all frames with any occlusion, such as the Companion Device (phone) or body parts (like your hands). Otherwise, you will end up with a weird mesh.
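If you prefer not to delete anything permanently, a small hypothetical helper like the one below moves the frames you flagged into a separate folder instead – adjust the paths and file names to your own recording:

```python
from pathlib import Path
import shutil

# Hypothetical paths - adjust to your {BASE_PATH} and {EXPERIMENT_NAME}
raw_frames = Path("/nerf_dir/building/raw_frames")
discarded = raw_frames.parent / "discarded_frames"
discarded.mkdir(exist_ok=True)

# Frames you flagged as occluded while reviewing them
bad_frames = ["frame_00042.png", "frame_00043.png"]

for name in bad_frames:
    src = raw_frames / name
    if src.exists():
        shutil.move(str(src), str(discarded / name))
        print("moved", name)
```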
### Continue running it
Run `python pyflux/consolidate_raw_frames.py` in your terminal to reorganise the frames.
Then run `python pyflux/run_nerfstudio.py`; this will run COLMAP on the selected frames, train the NeRF and export the mesh.
::: warning
Depending on the amount of GPU RAM available, running the mesh export in the same run as the NeRF training can cause problems. In that case, run `run_nerfstudio.py` again for the export only (set the flags in the code). You will also have to pick up the right timestamp value from the `{BASE_PATH}/outputs/{EXPERIMENT_NAME}/nerfacto` folder.
:::
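For orientation, running the nerfstudio side by hand would look roughly like the three steps below (COLMAP preprocessing, training, mesh export). This is only a sketch of the equivalent CLI calls – the exact flags differ between nerfstudio versions, the timestamp in the config path is a placeholder, and `run_nerfstudio.py` already handles these steps for you:

```python
import subprocess

data_dir = "/nerf_dir/building"             # {BASE_PATH}/{EXPERIMENT_NAME}
exports_dir = "/nerf_dir/exports/building"  # where the mesh will end up

# 1. Run COLMAP on the frames and convert the poses for nerfstudio
subprocess.run(["ns-process-data", "images",
                "--data", f"{data_dir}/frames",
                "--output-dir", f"{data_dir}/processed"], check=True)

# 2. Train a nerfacto model on the processed frames
subprocess.run(["ns-train", "nerfacto",
                "--data", f"{data_dir}/processed"], check=True)

# 3. Export a mesh from the trained model (replace the timestamp with your own run)
subprocess.run(["ns-export", "poisson",
                "--load-config",
                "/nerf_dir/outputs/building/nerfacto/2023-01-01_000000/config.yml",
                "--output-dir", exports_dir], check=True)
```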
If you got here, congrats! You are almost there... You will already have a 3D model.
### To Blender!
Now it's time for some more manual fine-tuning. You will need to use [Blender](https://www.blender.org/) or Maya to open the exported mesh `.obj` (`{BASE_PATH}/exports/{EXPERIMENT_NAME}/mesh.obj`), prune it if necessary, and export it in `.ply` format.
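If you find yourself doing this conversion repeatedly, Blender can also be driven headlessly from a script. Below is a sketch using Blender's Python API – the operator names are from Blender 3.x (in 4.x they moved to `bpy.ops.wm.obj_import` and `bpy.ops.wm.ply_export`), and any manual pruning is still best done in the UI:

```python
# convert_mesh.py - run with: blender --background --python convert_mesh.py
import bpy

# Start from an empty scene and import the nerfstudio mesh export
bpy.ops.wm.read_factory_settings(use_empty=True)
bpy.ops.import_scene.obj(filepath="/nerf_dir/exports/building/mesh.obj")

# Prune stray geometry here if needed, then export as .ply for pyflux
bpy.ops.export_mesh.ply(filepath="/nerf_dir/exports/building/mesh.ply")
```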
### Almost there!
The only step missing now is to generate a video like the one at the top of this article. Let's create the visualisation!
`python pyflux/viz/rimviz.py`
This will open a new OpenGL window on your computer and render the visualisation. So there you go!
You can close the visualisation at any time by pressing `ESC`, or it will close automatically once the recording is over.
</details>
<br>
## Why is this not a Cloud feature?
While showing gaze heatmaps in 3D as demonstrated in this tutorial is a very exciting and promising tool, it is still at an experimental stage and not entirely reliable. The tool uses advanced AI techniques like NeRFs and requires a powerful computer with CUDA support to generate 3D models, which can be expensive, and the reconstruction can fail when there are occlusions. Therefore, although the tool is visually impressive, it is not yet a reliable or practical solution for most research or commercial applications.