## Reply to voQB

Thanks to the reviewer for the detailed review and the valuable insights you've provided. We sincerely apologize for the oversight in our manuscript, and we appreciate the careful attention to this detail.

First and foremost, we made a typo in expressing the reflection direction and sincerely apologize for this. It should be $R = 2(\omega_o \cdot N)N - \omega_o$, rather than $R = 2(\omega_i \cdot N)N - \omega_i$, which caused the reviewer's confusion. The revised formula is now fully aligned with the reviewer's comment. We are grateful for the reviewer's correction; this error has been rectified in our revised submission.

Please note that the primary goal of this work is novel view synthesis with reflective objects in the scene, rather than inverse rendering. Relighting is presented as an additional downstream task to demonstrate the versatility of our method, a unique feature compared to previous reflection-focused alternatives. While a BRDF is part of our model, the estimated BRDF properties only serve as intermediate features (as shown in Figure 5) that facilitate the final novel view synthesis.

We genuinely appreciate the reviewer's insightful comments and assistance in improving our work. Thanks once again for all the valuable feedback and understanding.

## Final Re-R1

Thank you very much for the reviewer's insightful feedback and for the time and effort dedicated to thoroughly reviewing our work. We deeply appreciate the thoughtful comments, which have provided valuable perspectives on our research. We are particularly grateful for the reviewer's efforts in understanding our method, especially the design and interpretation of our inter-reflection technique.

First, we would like to highlight that there is no supervision and no ground-truth annotation for $L_\mathrm{ind}$ and $L_\mathrm{dir}$, as such annotations are typically unavailable in practice. Therefore, $L_\mathrm{ind}$ and $L_\mathrm{dir}$ in Eq. 9 are both implicitly optimized with the Gaussian primitives during training.

Regarding the understanding of $L_\mathrm{ind}$, we appreciate the view of interpreting it as reducing the perturbation caused by occlusion in environmental lighting estimation. This makes sense, as occlusion can be considered the primary cause of inter-reflection effects. This intention is evident in Figure 12 (appendix) and aligns well with the reviewer's observations. We have now improved our manuscript thanks to this comment.

Additionally, we sincerely appreciate the recognition of our Ref-Gaussian's strong performance in novel view synthesis. In our revision, we will ensure these points are explicitly clarified to provide a precise understanding of our approach. The reviewer's constructive feedback has been instrumental in guiding these refinements, and we are sincerely grateful for this guidance.

## Final Re-R2

Thanks again for the reviewer's valuable time and effort in reviewing our work, as well as the insightful comments. We want to express our sincere gratitude for the recognition of our work.

Regarding the indirect component $L_\mathrm{ind}$, for further clarity, we first note that $L_\mathrm{ind}$ is a newly introduced view-dependent color component represented using spherical harmonics, and $V$ is a binary mask controlling the selection between $L_\mathrm{ind}$ and $L_\mathrm{dir}$.
We do not quite understand the comment that "$L_\mathrm{ind}$ is the sum of incident radiance in all occlusion directions"; please kindly explain further if still necessary, and we will follow up.

On the necessity of the $1-V$ term for $L_\mathrm{ind}$, the suggested form is:
$$
L_s'(\omega_o) \approx \left( \int_{\Omega} f_s(\omega_i, \omega_o)\, (\omega_i \cdot \mathbf{N})\, d\omega_i \right) \cdot \left[ L_\mathrm{dir} \cdot V + L_\mathrm{ind} \right].
$$
This is equivalent to ours: when the visibility is set to 1, the model will optimize the value of the indirect light to 0, which is identical to Eq. 9 of our paper. Thus, either form is adoptable.

Lastly, on the comparison with GS-IR ([1]), which also uses spherical harmonics to represent indirect lighting for the diffuse component, we highlight the most related differences: (1) By using spherical harmonics to estimate a binary occlusion (i.e., the $1-V$ term in our method), GS-IR may introduce additional errors. Instead, we resort to ray tracing, which is more physically plausible. (2) Modeling indirect lighting for the diffuse component is ineffective for tackling the reflective scenes we focus on here because, as discussed earlier, its effect is marginal.

Hope our responses clarify the thoughtful questions above, and it would be very much appreciated if the reviewer could kindly check our responses and provide feedback with further questions or concerns (if any). We would be more than happy to address them. Thank you!

**Reference**:
[1] Z. Liang, Q. Zhang, Y. Feng, Y. Shan, and K. Jia. *GS-IR: 3D Gaussian Splatting for Inverse Rendering.* In CVPR, 2024.

## R1

### W1: Further qualitative evidence on the inter-reflection technique

We thank the reviewer for raising this question! We have already provided qualitative evidence of the effectiveness of the inter-reflection technique in Figure 9. Regarding the relatively minor improvement in PSNR, this is due to the limited inter-reflection effects present in the Glossy Synthetic dataset. To provide deeper insight into the usefulness of the inter-reflection technique, we have taken the suggestion into account and included visualizations of the indirect lighting components for the Glossy Synthetic dataset in Figure 5 and for the Ref-Real dataset in Figure 12, Appendix of the revised manuscript. As illustrated in Figure 12, the real-world scenes feature multiple objects that generate rich inter-reflections.

### W2: Applying more convincing metrics in ablation study

Thank you for the valuable suggestion! To achieve more accurate normals, we utilize 2D Gaussian primitives (Figure 8), an initial stage with per-Gaussian shading, and a material-aware normal propagation technique (Figure 10). To further demonstrate the impact of each component on our geometry, we have included the mean angular error (MAE) metric for the rendered normal maps on the Shiny Blender dataset [1], as shown in Table 3 of our revised manuscript (note that the Glossy Synthetic dataset [2] does not provide ground-truth normal maps).
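For concreteness, the MAE reported in Table 3 is the per-pixel angle between the rendered and ground-truth normals, averaged over the image. Below is a minimal NumPy sketch of this metric; the array names and the optional foreground mask are our own illustration, not code from our implementation:

```python
import numpy as np

def mean_angular_error(pred, gt, mask=None):
    """Mean angular error (degrees) between two (H, W, 3) normal maps."""
    # Re-normalize defensively in case inputs are not exactly unit length.
    pred = pred / np.clip(np.linalg.norm(pred, axis=-1, keepdims=True), 1e-8, None)
    gt = gt / np.clip(np.linalg.norm(gt, axis=-1, keepdims=True), 1e-8, None)
    cos = np.clip((pred * gt).sum(axis=-1), -1.0, 1.0)  # per-pixel cosine
    deg = np.degrees(np.arccos(cos))                    # per-pixel angle
    return deg[mask].mean() if mask is not None else deg.mean()
```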
#### Q1: Novelty and effectiveness of material-aware normal propagation

We apologize for any confusion. The previous normal propagation proposed by 3DGS-DR [3] only considers the connection between reflective strength and the normal accuracy of the related Gaussians, which is often inaccurate: a number of Gaussians are pruned due to their erroneous enlargement. **Ref-Gaussian**, however, employs a more comprehensive physically based rendering equation and assigns BRDF properties to each Gaussian, so the heuristics of 3DGS-DR no longer apply. Technically, the accuracy of normals should be associated with the materials: positions with inaccurate normals have difficulty capturing a significant specular component, owing to its sensitivity to the reflected direction. Specifically, for most Gaussians in our model, our experiments confirm a strong positive correlation between normal accuracy and high-metallic, low-roughness properties (one of the cases where the specular component is significant). To that end, we propose material-aware normal propagation: periodically increasing the scale of 2D Gaussians with high metallic and low roughness to propagate their more accurate normal information to adjacent Gaussians, achieving even better geometry reconstruction quality and a faster convergence rate (see the sketch below). We have provided further comparisons between the previous normal propagation following 3DGS-DR and our material-aware normal propagation in Tables 3 and 4 of our revised manuscript, also shown after the sketch.
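To make the propagation step concrete, here is a minimal PyTorch sketch of the idea described above; the thresholds, growth factor, and tensor layout are illustrative assumptions rather than the exact values used in Ref-Gaussian:

```python
import torch

@torch.no_grad()
def material_aware_propagation(scales, metallic, roughness,
                               m_thresh=0.7, r_thresh=0.3, grow=1.2):
    """Enlarge 2D Gaussians whose materials indicate reliable normals
    (high metallic, low roughness) so that their normal information is
    propagated to adjacent Gaussians during optimization.

    scales:    (N, 2) per-Gaussian 2D scales (parameter data, edited in place)
    metallic:  (N,)   per-Gaussian metallic values in [0, 1]
    roughness: (N,)   per-Gaussian roughness values in [0, 1]
    """
    reliable = (metallic > m_thresh) & (roughness < r_thresh)
    scales[reliable] *= grow   # grow only the material-selected subset
    return reliable            # e.g., for logging how many were grown
```

In training, such a step would be invoked periodically (every fixed number of iterations), analogous to the densification schedule of 3DGS.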
***Table 3: Comparison of MAE values across components of Ref-Gaussian.***

| **Model** | **MAE (avg)** |
|---|---|
| **Ref-Gaussian** | 2.15 |
| w/o 2DGS | 4.45 |
| w/o Initial stage | 3.53 |
| w/o Material-aware | 3.86 |
| w/o PBR | 3.81 |

**Table 4: Ablation studies on the components of Ref-Gaussian.** *w/o Material-aware: use the previous normal propagation following 3DGS-DR.*

| **Model** | **Ref-Gaussian** | **w/o PBR** | **w/o Inter-reflection** | **w/o Deferred rendering** | **w/o Initial stage** | **w/o Material-aware** |
|---|---|---|---|---|---|---|
| **PSNR↑** | 30.33 | 29.84 | 30.14 | 28.85 | 29.94 | 29.68 |
| **SSIM↑** | 0.958 | 0.956 | 0.957 | 0.935 | 0.956 | 0.952 |
| **LPIPS↓** | 0.049 | 0.052 | 0.050 | 0.074 | 0.052 | 0.056 |

### References

1. D. Verbin, P. Hedman, B. Mildenhall, T. Zickler, J. T. Barron, and P. P. Srinivasan. *Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields.* In CVPR, 2022.
2. Y. Liu, P. Wang, C. Lin, X. Long, J. Wang, L. Liu, T. Komura, and W. Wang. *NeRO: Neural Geometry and BRDF Reconstruction of Reflective Objects from Multi-View Images.* ACM Trans. Graph., 2023.
3. K. Ye, Q. Hou, and K. Zhou. *3D Gaussian Splatting with Deferred Reflection.* In SIGGRAPH, 2024.

---

## Re-R1

Thanks again for the reviewer's valuable time and effort in reviewing our work, as well as the insightful comments.

**Visibility modeling**

Thanks for raising this issue, which we should have elaborated and clarified in more detail. By design, our method excels in both reflective and non-reflective scenes, **making it a unified solution for modeling a variety of scenes with varying reflection**. This is because, as described in Eq. 9, inter-reflection is incorporated simply as an additional element of the specular component. Consequently, in less reflective scenes, the influence of inter-reflection diminishes naturally as the metallic attribute decreases. In such cases, the diffuse component takes over instead, capturing any (usually substantially weaker) inter-reflection effect to achieve top performance. This behavior is evident in the decomposition results provided in Figure 5 and Figure 14 (appendix). In contrast, previous reflection-focused alternatives such as 3DGS-DR ([1]) are designed specifically for reflective scenes, leading to narrower applicability. To support this claim, we have included Table 8 in the appendix, which presents a per-scene quantitative comparison among Ref-Gaussian, 3DGS-DR ([1]), and 3DGS ([2]) on the nerf-synthetic dataset, predominantly composed of non-reflective scenes, for novel view synthesis. The results show that our Ref-Gaussian still excels over the alternatives, validating its generic and unified advantages.

Regarding the roughness term of the BRDF materials, we apologize for the confusion and misunderstanding. From Eq. 8 to Eq. 9, we extend the second term, denoted $L_\mathrm{dir}$ in Eq. 8, by introducing the indirect light component $L_\mathrm{ind}$ to capture the inter-reflection effect. There is no approximation in this extension, nor is the roughness $R$ overlooked. To improve readability, we have revised Eq. 8 and Eq. 9 to minimize such confusion in the revised paper. In a nutshell, we summarize again how roughness is involved in our model, in case this helps facilitate a holistic understanding. First, the shadowing-masking term $G$ in $f_s(\omega_i, \omega_o)$, part of the BRDF in Eq. 7, is a function of the roughness $R$. In the split-sum approximation described in Eq. 8, the first term is precomputed and stored in a 2D lookup texture map with $(\omega_i \cdot N)$ and the roughness $R$ as input conditions. When computing the second term of Eq. 8, we also utilize roughness for estimating the integral of incident light: to efficiently represent environment lighting, we use trilinear interpolation across a series of pre-integrated cubemaps at varying roughness levels, with the reflected direction and roughness as interpolation parameters.
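As a schematic illustration of the second split-sum term, the sketch below blends pre-integrated environment maps across roughness levels. To keep it self-contained we store the environment as equirectangular maps with nearest-neighbor lookup instead of the cubemaps used in our implementation, so all names and shapes here are illustrative assumptions:

```python
import math
import torch

def sample_equirect(env, d):
    """Nearest-neighbor lookup in an equirectangular environment map.
    env: (H, W, 3) radiance; d: (N, 3) unit directions -> (N, 3)."""
    H, W, _ = env.shape
    u = (torch.atan2(d[:, 1], d[:, 0]) / (2 * math.pi)) % 1.0
    v = torch.acos(d[:, 2].clamp(-1.0, 1.0)) / math.pi
    return env[(v * (H - 1)).long(), (u * (W - 1)).long()]

def prefiltered_L_dir(env_levels, refl_dir, roughness):
    """Approximate L_dir by blending the two pre-integrated environment
    levels nearest to each pixel's roughness, sampled along the reflected
    direction (the cross-level part of the trilinear lookup).
    env_levels: (L, H, W, 3), pre-integrated at increasing roughness
    refl_dir:   (N, 3) reflected directions; roughness: (N,) in [0, 1]
    """
    L = env_levels.shape[0]
    level = roughness.clamp(0.0, 1.0) * (L - 1)        # fractional level
    lo, hi = level.floor().long(), level.ceil().long()
    w = (level - lo.float()).unsqueeze(-1)
    per_level = torch.stack(                           # (L, N, 3); schematic:
        [sample_equirect(env_levels[l], refl_dir) for l in range(L)])
    idx = torch.arange(refl_dir.shape[0])
    return (1 - w) * per_level[lo, idx] + w * per_level[hi, idx]
```

A real renderer would sample only the two relevant levels with bilinear filtering inside each map; sampling all levels here merely keeps the sketch short.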
Regarding Figure 12, thanks for the great observation. Per this comment, we found that in this visualization experiment we mistakenly forgot to apply the normal smoothness loss stated in Section 3.3 (this mistake applies only here and to no other experiments), leading to inaccurate geometry. This caused the seemingly overestimated indirect light component on the ground near the two diffuse objects. We have now addressed this issue and provide the corrected Figure 12 (appendix). As shown in the revised visualization, the three components (*diffuse, specular, and indirect light*) complement each other effectively and together produce an excellent final rendering result. We apologize for this mistake.

Regarding the contribution of direct light in the previous Figure 12: under this bird's-eye view, all the light cast towards that diffuse object is reflected to the ground, which can be considered a special object. As a result, its visibility is all zero, meaning no contribution from direct light. Note that direct light in our context means light whose reflection is not blocked by any scene element. We have explicitly explained this in the revised paper.

**Lighting modeling**

Overall, the inter-reflection and relighting components work in a plug-in manner in our approach, but they do not work simultaneously. We will clarify this further in the revision. This flexible design is not a limitation of ours, but a unique advantage over previous alternatives, which tackle the reflection challenge alone and are unable to address the relighting problem in a unified framework as our method does. Integrating the two components would make our model more advanced, which we will investigate in future work. Please note that we have demonstrated Ref-Gaussian's outstanding relighting performance even without inter-reflection in Figure 11 and the demo video included in the supplementary material. Also note that none of the existing 3DGS-based advancements allow for indirect lighting during relighting; this remains an open challenge to be tackled.

**Figure improvement**

Additionally, following the reviewer's suggestion to highlight Ref-Gaussian's performance in geometry reconstruction and novel view synthesis, we further provide Figure 13 and Figure 14 (appendix), which demonstrate our excellent performance with more per-scene qualitative results and comparisons of normals with other advanced methods.

Hope our responses clarify the thoughtful questions above, and it would be very much appreciated if the reviewer could kindly check our responses and provide feedback with further questions or concerns (if any). We would be more than happy to address them. Thank you!

**References**
> 1. K. Ye, Q. Hou, and K. Zhou. *3D Gaussian Splatting with Deferred Reflection.* In SIGGRAPH, 2024.
> 2. B. Kerbl, G. Kopanas, T. Leimkühler, and G. Drettakis. *3D Gaussian Splatting for Real-Time Radiance Field Rendering.* ACM Trans. Graph., 2023.

---

## Re2-R1

Thank you very much for the reviewer's insightful feedback and for the time and effort dedicated to thoroughly reviewing our work. We deeply appreciate the thoughtful comments, which have provided valuable perspectives on our research. We are particularly grateful for the reviewer's understanding of our method, especially the interpretation of our inter-reflection technique.
The observation that Equation 9 introduces an additional element to the specular component to approximate ground-truth values with high fidelity is indeed correct. As noted in the most recent comment, while modeling visibility through ray intersection achieves the highest accuracy for fully specular surfaces, our primary goal was not solely physical accuracy. Instead, we aimed to effectively reduce the perturbation caused by occlusion in environmental lighting estimation. This intention is evident in Figure 12 (appendix) and aligns well with the reviewer's observations.

Additionally, we sincerely appreciate the recognition of our Ref-Gaussian's strong performance in novel view synthesis. In our revision, we will ensure these points are explicitly clarified to provide a precise understanding of our approach. The reviewer's constructive feedback has been instrumental in guiding these refinements, and we are sincerely grateful for this guidance.

## R2

### W1: Contributions Lacking Novelty

We thank the reviewer for raising this question! While 3DGS-DR ([1]) utilizes deferred rendering, it employs a simple shading model that does not consider physically based rendering, resulting in inaccurate estimations; additionally, it is unable to model indirect lighting. GS-IR ([2]), which also uses the split-sum approximation to address the rendering equation, relies on spherical harmonics to represent averaged occlusion and indirect lighting for the diffuse component. However, this approach leads to inaccurate diffuse estimations and does not account for indirect lighting in the specular term, making it unsuitable for accurately modeling highly reflective objects. DeferredGS ([3]) also uses the split-sum approximation but lacks the ability to model indirect lighting; furthermore, it requires joint training with an "Instant-RefNeuS" model, complicating the optimization process. On the other hand, R3DG ([4]) models indirect lighting using Monte Carlo sampling for the rendering equation. While this improves accuracy, it significantly slows down rendering. Moreover, its visibility computation relies on ray tracing within the 3DGS ([5]) framework, which is inherently imprecise due to 3DGS's lack of a clear surface definition in 3D space, as 3DGS is trained in 2D space.

To address these limitations, our method proposes a unified approach that integrates deferred rendering, physically based rendering, and ray tracing on meshes for visibility, along with simultaneous modeling of indirect lighting. Additionally, we introduce several techniques to enhance geometry reconstruction, including a 2DGS ([6]) framework, an initial stage with per-Gaussian shading, and material-aware normal propagation. Rather than being a simple combination, each of the techniques mentioned above is properly innovated and adapted to best fit Ref-Gaussian, enabling our method to handle highly reflective objects more accurately and effectively.

---

### W2: Indirect Lighting for the Diffuse Term

Thank you for this insightful comment. Since the diffuse component is not sensitive to viewing direction, we use spherical harmonics to directly model its outgoing radiance, as discussed in the implementation details. This approach enables the model to account for both occlusion and indirect lighting effects. For the specular component, our method focuses on highly reflective objects and computes visibility by tracing a single ray along the reflected direction.
For glossy objects, Monte Carlo sampling within the specular lobe could be employed to trace multiple rays for improved visibility estimation; however, this would come at the cost of reduced rendering speed.

---

### W3: Integral of the Specular Term

We apologize for the confusion. As explained in our paper, to avoid the high computational cost of Monte Carlo sampling, we adopt the split-sum approximation as described in Eq. 8. In this formulation, the left term depends solely on $(\omega_i \cdot N)$ and the roughness $R$, while the right term, denoted $L_\mathrm{dir}$, represents the integral of radiance over the specular lobe. To incorporate indirect lighting, we modify the right term $L_\mathrm{dir}$ in Eq. 8 to $L_\mathrm{dir}V + L_\mathrm{ind}(1-V)$, where $L_\mathrm{ind}$ models the indirect lighting for the reflected direction only. Here, $V$ is the visibility term computed via ray tracing on the extracted mesh in the reflected direction, and $L_\mathrm{ind}$ is calculated using Eq. 10. This approach provides an efficient balance between computational cost and the modeling of indirect lighting.
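In code, this modification amounts to a per-pixel select between the two radiance sources; a minimal sketch (tensor names are ours):

```python
import torch

def specular_light(l_dir, l_ind, visible):
    """Combine the pre-integrated environment term with the learned
    indirect term as in Eq. 9: L_dir * V + L_ind * (1 - V).

    l_dir:   (N, 3) pre-integrated direct environment radiance
    l_ind:   (N, 3) alpha-blended indirect radiance (Eq. 10)
    visible: (N,)   binary visibility V from mesh ray tracing
    """
    V = visible.float().unsqueeze(-1)
    return l_dir * V + l_ind * (1.0 - V)
```

Because $V$ is binary, gradients reach $L_\mathrm{ind}$ only at occluded pixels, which matches the mask-based gradient control discussed in Re-R2 below.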
---

### Q1: Geometry Improvements Using 2DGS Representation

Thank you for raising this important question! Accurate geometric reconstruction is crucial for the effective performance of the proposed physically based deferred rendering and inter-reflection methods. To this end, we implemented a series of geometric optimizations, including replacing 3DGS with 2DGS. Our existing quantitative (Tables 3 and 4) and qualitative (Figures 7 and 9) ablation studies in the revised manuscript demonstrate that both the physically based deferred rendering and inter-reflection methods significantly enhance rendering quality. Following your suggestion, we have added a quantitative ablation specifically on the impact of 2DGS (Table 7, Appendix of our revised manuscript). In this additional analysis, we used 3DGS as the representation while keeping the rest of the pipeline unchanged. Metric comparisons for both geometric reconstruction and novel view synthesis have been included in the revised manuscript, as shown in Tables 3 and 4, also shown below.

---

### Q2: Building Indirect Lighting When Relighting

Thank you for your valuable suggestions. Recent progress based on 3DGS has not yet achieved the ability to build indirect lighting when relighting. Relighting requires querying arbitrary light in the space, which is difficult to achieve using the splatting technique due to its limitations inherited from rasterization. However, the Gaussian ray tracing method proposed by 3DGRT ([7]) suggests an alternative to rasterization for solving this problem. This marks a breakthrough in rendering ray-based effects and provides clear guidance for our future work.

---

***Table 3: Comparison of MAE values across components of Ref-Gaussian.***

| **Model** | **MAE (avg)** |
|---|---|
| **Ref-Gaussian** | 2.15 |
| w/o 2DGS | 4.45 |
| w/o Initial stage | 3.53 |
| w/o Material-aware | 3.86 |
| w/o PBR | 3.81 |

**Table 4: Ablation studies on the components of Ref-Gaussian.** *w/o Material-aware: use the previous normal propagation following 3DGS-DR.*

| **Model** | **Ref-Gaussian** | **w/o PBR** | **w/o Inter-reflection** | **w/o Deferred rendering** | **w/o Initial stage** | **w/o Material-aware** |
|---|---|---|---|---|---|---|
| **PSNR↑** | 30.33 | 29.84 | 30.14 | 28.85 | 29.94 | 29.68 |
| **SSIM↑** | 0.958 | 0.956 | 0.957 | 0.935 | 0.956 | 0.952 |
| **LPIPS↓** | 0.049 | 0.052 | 0.050 | 0.074 | 0.052 | 0.056 |

**Table 7: Per-scene PSNR comparison on synthesized test views.** *w/o 2DGS: use 3DGS as the representation and keep the rest of the pipeline unchanged.*

| **Datasets** | **Shiny Blender** | | | | | | **Glossy Synthetic** | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **Scenes** | ball | car | coffee | helmet | teapot | toaster | angel | bell | cat | horse | luyu | potion | tbell | teapot |
| ENVIDR | 41.02 | 27.81 | 30.57 | 32.71 | 42.62 | 26.03 | 29.02 | 30.88 | 31.04 | 25.99 | 28.03 | 32.11 | 28.64 | 26.77 |
| 3DGS-DR | 33.43 | 30.48 | 34.53 | 31.44 | 47.04 | 26.76 | 29.07 | 30.60 | 32.59 | 26.17 | 28.96 | 32.65 | 29.03 | 25.77 |
| w/o 2DGS | 36.10 | 30.65 | 34.51 | 33.29 | 44.25 | 27.03 | 28.33 | 30.60 | 33.14 | 26.70 | 29.35 | 32.94 | 29.17 | 26.31 |
| **Ref-Gaussian** | 37.01 | 31.04 | 34.63 | 32.32 | 47.16 | 28.05 | 30.38 | 32.86 | 33.01 | 27.05 | 30.04 | 33.07 | 29.84 | 26.68 |

### References

1. K. Ye, Q. Hou, and K. Zhou. *3D Gaussian Splatting with Deferred Reflection.* In SIGGRAPH, 2024.
2. Z. Liang, Q. Zhang, Y. Feng, Y. Shan, and K. Jia. *GS-IR: 3D Gaussian Splatting for Inverse Rendering.* In CVPR, 2024.
3. T. Wu, J.-M. Sun, Y.-K. Lai, Y. Ma, L. Kobbelt, and L. Gao. *DeferredGS: Decoupled and Editable Gaussian Splatting with Deferred Shading.* 2024.
4. J. Gao, C. Gu, Y. Lin, H. Zhu, X. Cao, L. Zhang, and Y. Yao. *Relightable 3D Gaussian: Real-Time Point Cloud Relighting with BRDF Decomposition and Ray Tracing.* arXiv preprint, 2023.
5. B. Kerbl, G. Kopanas, T. Leimkühler, and G. Drettakis. *3D Gaussian Splatting for Real-Time Radiance Field Rendering.* ACM Trans. Graph., 2023.
6. B. Huang, Z. Yu, A. Chen, A. Geiger, and S. Gao. *2D Gaussian Splatting for Geometrically Accurate Radiance Fields.* In SIGGRAPH, 2024.
7. N. Moënne-Loccoz, A. Mirzaei, O. Perel, R. de Lutio, J. M. Esturo, G. State, S. Fidler, N. Sharp, and Z. Gojcic. *3D Gaussian Ray Tracing: Fast Tracing of Particle Scenes.* arXiv preprint, 2024.

---

## Re-R2

Thanks again for the reviewer's valuable time and effort in reviewing our work, as well as the insightful comments. We want to express our sincere gratitude for the recognition of our work.

Regarding the question about the indirect component $L_\mathrm{ind}$ raised in the most recent comment, we believe the reviewer's observation that the $1-V$ term can be canceled is indeed correct. Eq. 9 can also be written as:
$$
L_s'(\omega_o) \approx \left( \int_{\Omega} f_s(\omega_i, \omega_o)\, (\omega_i \cdot \mathbf{N})\, d\omega_i \right) \cdot \left[ L_\mathrm{dir} \cdot V + L_\mathrm{ind} \right].
$$
In this case, when the visibility is set to 1, the expression differs from Eq. 9 in the paper; however, the same effect can be achieved by the value of the spherical harmonics in this direction being optimized to 0. In our design, we consider the visibility a mask that controls value selection and gradient propagation, reaching fast convergence and satisfactory rendering quality. We believe the form proposed by the reviewer can work well too.

Hope our responses clarify the thoughtful questions above, and it would be very much appreciated if the reviewer could kindly check our responses and provide feedback with further questions or concerns (if any). We would be more than happy to address them. Thank you!

## R3

### W1: Mesh Extraction Step Interval

During optimization, we extract the object's surface mesh every 3000 steps using truncated signed distance function (TSDF) fusion. We have updated this in the revised manuscript.
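For illustration, TSDF fusion of rendered depth maps can be performed with an off-the-shelf library. The Open3D sketch below is a minimal stand-in for this step; the voxel size, truncation distance, and input conventions are assumptions rather than our exact settings:

```python
import open3d as o3d

def extract_mesh_tsdf(depths, colors, intrinsic, extrinsics,
                      voxel=0.01, trunc=0.04):
    """Fuse per-view rendered depth into a TSDF volume, then extract a
    triangle mesh for ray-traced visibility queries.

    depths:     list of (H, W) float32 depth renders
    colors:     list of (H, W, 3) uint8 color renders
    intrinsic:  shared o3d.camera.PinholeCameraIntrinsic
    extrinsics: list of (4, 4) world-to-camera matrices
    """
    vol = o3d.pipelines.integration.ScalableTSDFVolume(
        voxel_length=voxel, sdf_trunc=trunc,
        color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)
    for depth, color, ext in zip(depths, colors, extrinsics):
        rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
            o3d.geometry.Image(color), o3d.geometry.Image(depth),
            depth_scale=1.0, depth_trunc=10.0,
            convert_rgb_to_intensity=False)
        vol.integrate(rgbd, intrinsic, ext)
    return vol.extract_triangle_mesh()
```

In our schedule, this extraction runs every 3000 optimization steps, refreshing the mesh used for visibility.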
---

### W2: Unclear Explanations

We apologize for the confusion. Our further explanations are as follows:

#### Q1: Inter-reflection Rendering Details

1. We perform ray tracing on the extracted mesh for the reflected ray $\mathbf{R}$:
   $$
   \mathbf{R} = 2(\omega_i \cdot N)N - \omega_i,
   $$
   to determine whether it is occluded, where $\omega_i$ represents the light direction and $N$ represents the normal. The mesh is extracted using truncated signed distance function (TSDF) fusion. Ray tracing is conducted for ray-triangle intersections rather than ray-Gaussian intersections, as the latter is significantly slower due to the alpha blending needed to accumulate opacity.
2. Regarding $N$ in Eq. 10, the formula has been clarified and updated in the revised manuscript as:
   $$
   L_\mathrm{ind} = \sum_{i=1}^{N} l_{\mathrm{ind},i}\, \alpha_i \prod_{j=1}^{i-1} (1 - \alpha_j).
   $$
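A sketch of this occlusion test, using trimesh's ray-triangle intersector on the extracted mesh; the small epsilon offset and the choice of trimesh are our illustration, and we write the reflection with the $\omega_o$ convention from the correction in "Reply to voQB" above:

```python
import numpy as np
import trimesh

def reflected_visibility(mesh, points, normals, view_dirs, eps=1e-3):
    """Binary visibility V: trace one ray per surface point along the
    reflected direction and test for ray-triangle occlusion.

    points, normals, view_dirs: (N, 3) arrays; view_dirs points from
    the surface toward the camera.
    """
    # R = 2 (omega_o . N) N - omega_o   (corrected form, see Reply to voQB)
    dots = np.sum(view_dirs * normals, axis=-1, keepdims=True)
    refl = 2.0 * dots * normals - view_dirs
    refl /= np.linalg.norm(refl, axis=-1, keepdims=True)
    origins = points + eps * refl       # offset to avoid self-intersection
    hit = mesh.ray.intersects_any(ray_origins=origins, ray_directions=refl)
    return ~hit                         # V = 1 where the ray escapes
```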
#### Q2: Initial Stage with Per-Gaussian Shading

In a typical rendering process, the integral in Eq. 9 is computed using aggregated pixel-level feature maps (e.g., albedo, metallic, roughness, and normal). However, this approach can hinder geometry convergence in the early stages of optimization due to the less effective gradients introduced by deferred shading. To address this, we propose an initial stage where the rendering equation is applied directly to each Gaussian, using the material and geometry properties associated with it to compute the outgoing radiance. The outgoing radiance is then alpha-blended during rasterization to produce the final physically based rendering (PBR). This adjustment significantly accelerates geometry convergence and improves overall quality, as shown in the ablation studies (Tables 3 and 4 in the revised manuscript, also shown below).
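The difference between the initial stage and the deferred pipeline is only the order of shading and alpha blending; a schematic PyTorch sketch, where the shapes and the `shade` callable are illustrative assumptions:

```python
import torch

def composite_weights(alphas):
    """Front-to-back compositing weights w_i = a_i * prod_{j<i}(1 - a_j).
    alphas: (P, K) sorted per-pixel opacities -> (P, K, 1) weights."""
    T = torch.cumprod(1.0 - alphas + 1e-10, dim=-1)
    T = torch.cat([torch.ones_like(T[:, :1]), T[:, :-1]], dim=-1)
    return (alphas * T).unsqueeze(-1)

def initial_stage(props, alphas, shade):
    """Per-Gaussian shading: shade each Gaussian from its own material
    and geometry properties, then alpha-blend radiance (shade -> blend)."""
    radiance = shade(**props)                          # (P, K, 3)
    return (composite_weights(alphas) * radiance).sum(dim=1)

def deferred_stage(props, alphas, shade):
    """Deferred shading: alpha-blend attributes into pixel-level feature
    maps first, then shade once per pixel (blend -> shade)."""
    w = composite_weights(alphas)
    blended = {k: (w * v).sum(dim=1) for k, v in props.items()}
    return shade(**blended)
```

In the initial stage, gradients act on each primitive's own normal and material rather than on blended feature maps, which is what accelerates early geometry convergence.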
#### Q3: Hyperparameters

Our hyperparameters remain consistent across all datasets.

---

### W3: Missing Baselines of More NeRF-Based Methods

Thank you for this valuable suggestion. In addition to 3DGS-based competitors, we have included two NeRF-based competitors, Ref-NeRF ([1]) and ENVIDR ([2]), in Table 1. To address your concern, we have added comparisons in Table 5 and Table 6 (Appendix of the revised manuscript). These include metrics such as PSNR, SSIM, LPIPS, and FPS on the Glossy Blender dataset ([3]) with NeRO ([3]) and NeuS ([4]), as well as with NDE ([5]) on the Shiny Blender dataset ([1]). The results demonstrate that **Ref-Gaussian** significantly outperforms NeRO and remains comparable to NDE in rendering quality while achieving much better rendering speed and training efficiency.

---

### W4: Limitation of Gaussian-Grounded Inter-reflection

Thank you for your insightful comments. **Ref-Gaussian** primarily focuses on reconstructing highly reflective objects. For efficiency, we compute indirect lighting using a single reflected ray. While it is possible to trace additional rays using Monte Carlo sampling for a more accurate representation of indirect lighting, this approach would significantly reduce efficiency. This limitation has been discussed in the revised manuscript.

---

### W5: Further Qualitative Results of the Extracted Mesh

Following your suggestion, we have included a visualization of the extracted mesh in Figure 5 of the revised manuscript.

---

***Table 3: Comparison of MAE values across components of Ref-Gaussian.***

| **Model** | **MAE (avg)** |
|---|---|
| **Ref-Gaussian** | 2.15 |
| w/o 2DGS | 4.45 |
| w/o Initial stage | 3.53 |
| w/o Material-aware | 3.86 |
| w/o PBR | 3.81 |

**Table 4: Ablation studies on the components of Ref-Gaussian.** *w/o Material-aware: use the previous normal propagation following 3DGS-DR.*

| **Model** | **Ref-Gaussian** | **w/o PBR** | **w/o Inter-reflection** | **w/o Deferred rendering** | **w/o Initial stage** | **w/o Material-aware** |
|---|---|---|---|---|---|---|
| **PSNR↑** | 30.33 | 29.84 | 30.14 | 28.85 | 29.94 | 29.68 |
| **SSIM↑** | 0.958 | 0.956 | 0.957 | 0.935 | 0.956 | 0.952 |
| **LPIPS↓** | 0.049 | 0.052 | 0.050 | 0.074 | 0.052 | 0.056 |

### References

1. D. Verbin, P. Hedman, B. Mildenhall, T. Zickler, J. T. Barron, and P. P. Srinivasan. *Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields.* In CVPR, 2022.
2. R. Liang, H. Chen, C. Li, F. Chen, S. Panneer, and N. Vijaykumar. *ENVIDR: Implicit Differentiable Renderer with Neural Environment Lighting.* In ICCV, 2023.
3. Y. Liu, P. Wang, C. Lin, X. Long, J. Wang, L. Liu, T. Komura, and W. Wang. *NeRO: Neural Geometry and BRDF Reconstruction of Reflective Objects from Multi-View Images.* ACM Trans. Graph., 2023.
4. P. Wang, L. Liu, Y. Liu, C. Theobalt, T. Komura, and W. Wang. *NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-View Reconstruction.* arXiv preprint, 2021.
5. L. Wu, S. Bi, Z. Xu, F. Luan, K. Zhang, I. Georgiev, K. Sunkavalli, and R. Ramamoorthi. *Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling.* In CVPR, 2024.

---

## Re-R3

Thanks again for the reviewer's valuable time and effort in reviewing our work, as well as the insightful comments.

**1. Limitation of Gaussian-grounded inter-reflection**

Thanks for raising this issue, which we should have elaborated and clarified in more detail. As mentioned in Section 3.2 of the main paper, Ref-Gaussian calculates indirect lighting using a single reflected ray, making it particularly effective and efficient in reflective scenes. Also, as described in Eq. 9, inter-reflection is incorporated as an additional element of the specular component. Consequently, in less reflective scenes, the influence of inter-reflection diminishes naturally as the metallic attribute decreases. In these cases, the diffuse component takes over and captures most of the inter-reflection effects. This behavior is evident in the decomposition results provided in Figure 5 and Figure 14 (appendix). As a result, **Ref-Gaussian maintains robust rendering quality for less reflective scenes, providing a unified solution for both reflective and non-reflective scenes**.

In contrast, previous reflection-focused alternatives such as 3DGS-DR ([1]) are designed specifically for reflective scenes, leading to narrower applicability as mentioned. To support this claim, we have included Table 8 in the appendix, which presents a per-scene quantitative comparison among Ref-Gaussian, 3DGS-DR ([1]), and 3DGS ([2]) on the nerf-synthetic dataset, predominantly composed of non-reflective scenes, for novel view synthesis. The results show that our Ref-Gaussian still excels over both alternatives, validating its generic and unified advantages.

**2. Request for more qualitative results**

Following the reviewer's suggestion, we provide further illustration of our advantage in geometry reconstruction in Figures 13 and 14 of the appendix. The qualitative comparison in Figure 13 demonstrates Ref-Gaussian's comprehensive grasp of details over the alternatives (such as the tires in *car* and the water surface in *coffee*). Please also refer to our video demo in the supplementary material for the best view of our qualitative results.

Hope our responses clarify the thoughtful questions above, and it would be very much appreciated if the reviewer could kindly check our responses and provide feedback with further questions or concerns (if any). We would be more than happy to address them. Thank you!

**References**
> 1. K. Ye, Q. Hou, and K. Zhou. *3D Gaussian Splatting with Deferred Reflection.* In SIGGRAPH, 2024.
> 2. B. Kerbl, G. Kopanas, T. Leimkühler, and G. Drettakis. *3D Gaussian Splatting for Real-Time Radiance Field Rendering.* ACM Trans. Graph., 2023.

---

## R4

### W1: Non-Differentiable Approximation of Mesh

Thank you for your insightful comments. Extracting a mesh is non-trivial and crucial for visibility approximation, further highlighting the importance of accurate geometry reconstruction. To achieve this, we leverage a 2DGS ([1]) framework (Figure 8), an initial stage with per-Gaussian shading, and material-aware normal propagation (Figure 10). Quantitative comparisons of surface normals (Table 3 in the revised manuscript, also shown below) further demonstrate their effectiveness.
Since the non-differentiable approximation, TSDF, is independent and replaceable in our approach, we look forward to exploring more accurate and efficient techniques in future work.

---

### Q1: Geometry Improvement Introduced by PBR

We believe that our physically based rendering (PBR) method has contributed significantly to geometry improvement. Quantitative comparison, specifically the MAE value of the model without PBR (w/o PBR in Table 3 of the revised manuscript), clearly demonstrates its notable geometric shortcomings compared to the full model with PBR.

---

### W2 & Q2: LPIPS Limited Over 3DGS-DR

We appreciate the question being raised. The real-world dataset is highly complex but has relatively few reflective details, which contrasts with the typical scenarios involving highly reflective surfaces that **Ref-Gaussian** focuses on. As such, this dataset is not ideal for testing reflectiveness. Furthermore, LPIPS is very sensitive to subtle color and texture mismatches that are insignificant for reconstruction, such as ground details. Other than LPIPS on the real-world dataset, **Ref-Gaussian** outperforms 3DGS-DR ([2]) across all other metrics and datasets.

---

***Table 3: Comparison of MAE values across components of Ref-Gaussian.***

| **Model** | **MAE (avg)** |
|---|---|
| **Ref-Gaussian** | 2.15 |
| w/o 2DGS | 4.45 |
| w/o Initial stage | 3.53 |
| w/o Material-aware | 3.86 |
| w/o PBR | 3.81 |

**Table 4: Ablation studies on the components of Ref-Gaussian.** *w/o Material-aware: use the previous normal propagation following 3DGS-DR.*

| **Model** | **Ref-Gaussian** | **w/o PBR** | **w/o Inter-reflection** | **w/o Deferred rendering** | **w/o Initial stage** | **w/o Material-aware** |
|---|---|---|---|---|---|---|
| **PSNR↑** | 30.33 | 29.84 | 30.14 | 28.85 | 29.94 | 29.68 |
| **SSIM↑** | 0.958 | 0.956 | 0.957 | 0.935 | 0.956 | 0.952 |
| **LPIPS↓** | 0.049 | 0.052 | 0.050 | 0.074 | 0.052 | 0.056 |

### References

1. B. Huang, Z. Yu, A. Chen, A. Geiger, and S. Gao. *2D Gaussian Splatting for Geometrically Accurate Radiance Fields.* In SIGGRAPH, 2024.
2. K. Ye, Q. Hou, and K. Zhou. *3D Gaussian Splatting with Deferred Reflection.* In SIGGRAPH, 2024.
