# Machine learning-based depth estimation and view synthesis for immersive video

_by Smitha Lingadahalli Ravi (Orange & IETR) - 2024.02.02_

###### tags: `VAADER` `Seminar`

## Abstract

This talk addresses three key challenges in novel view synthesis. The first concerns Depth Image-Based Rendering (DIBR), which relies on depth information for view generation. Although depth estimation methods are well explored, their performance when used specifically for rendering remains understudied. We conducted a comprehensive comparative study of various depth estimation techniques within DIBR, aiming to inform the community about their real-world performance and to identify the most effective method for immersive video transmission.

In contrast, Image-Based Rendering (IBR) methods synthesize views from image data alone, and they suffer from temporal instability in videos, which constitutes the second challenge. The question is whether existing IBR methods can be adapted to address temporal inconsistencies without architectural changes. Two approaches are presented: the first introduces an intra-only framework that fine-tunes the network on temporal artifact regions, improving temporal consistency; the second integrates consecutive frames with the input frames to mitigate temporal artifacts at the network input.

The third challenge explores enhancing temporal consistency in IBR by leveraging consecutive frames through architectural adjustments. We propose an additional feature extraction network that extracts features from the frame at time t+1. These features are then integrated into the processing pipeline of frame t, effectively reducing temporal inconsistencies between consecutive frames.
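To make the DIBR step concrete, here is a minimal sketch of the forward warping it relies on: each source pixel is back-projected to 3D using its depth value, transformed by the relative camera pose, and re-projected into the target view, with a z-buffer resolving occlusions. This is a generic textbook formulation, not code from the talk; the function name `dibr_warp` and the assumption of shared intrinsics `K` across views are illustrative choices.

```python
import numpy as np

def dibr_warp(src, depth, K, R, t):
    """Forward-warp a source view into a target view (minimal DIBR sketch).

    src   : (H, W) source image (grayscale for simplicity)
    depth : (H, W) per-pixel depth in the source camera frame
    K     : 3x3 intrinsics, assumed shared by both views
    R, t  : rotation (3x3) and translation (3,) from source to target camera
    """
    h, w = depth.shape
    tgt = np.zeros_like(src)
    zbuf = np.full((h, w), np.inf)
    Kinv = np.linalg.inv(K)
    for y in range(h):
        for x in range(w):
            # Back-project pixel (x, y) to 3D, move to the target frame, re-project.
            p = depth[y, x] * (Kinv @ np.array([x, y, 1.0]))
            q = K @ (R @ p + t)
            if q[2] <= 0:
                continue  # point behind the target camera
            u, v = int(round(q[0] / q[2])), int(round(q[1] / q[2]))
            if 0 <= u < w and 0 <= v < h and q[2] < zbuf[v, u]:
                zbuf[v, u] = q[2]      # z-buffer keeps the nearest surface
                tgt[v, u] = src[y, x]  # unfilled pixels remain disocclusion holes
    return tgt
```

The quality of the rendered view depends directly on the depth map fed into this warp, which is why the talk's first contribution compares depth estimation methods by their rendering performance rather than by depth accuracy alone.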