# Loop Closure And Robot Kidnapping

[PoGO-Net: Pose Graph Optimization with Graph Neural Networks](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9709983)
- A view graph is constructed with the unknown absolute camera orientations as vertices and relative measurements between pairs of vertices as edges.
- The view graph is passed through a GNN architecture with loss functions that capture the global consistency of the output pose graph and evaluate the prediction of absolute camera orientations.
- An iterative edge-dropping scheme removes outlier edges before edge messages are passed through the network layers.

[Pose Graph Optimization for Unsupervised Monocular Visual Odometry](https://arxiv.org/abs/1903.06315)

[Online Visual Place Recognition via Saliency Re-identification (IROS 2020) - Video](https://www.youtube.com/watch?v=gc-LMaEUL3M)

[OverlapNet: Loop Closing for LiDAR-based SLAM](https://arxiv.org/pdf/2105.11344.pdf)

[Scan Context: Egocentric Spatial Descriptor for Place Recognition Within 3D Point Cloud Map](https://ieeexplore.ieee.org/document/8593953)
- A scan context is a global spatial descriptor used for place recognition (loop detection / feature matching).
- Matching is executed in two stages: a ring-key vector (a rotation-invariant descriptor) is computed from each scan context for robust retrieval, and these vectors are used to build a KD-tree that is queried with the key of the current scan. Retrieved candidates are then scored with a cosine similarity that yields a cosine distance; the candidate with the minimum distance is selected as the match and reported as a positive loop detection if that distance is below a set threshold. (A code sketch of this pipeline appears at the end of this section.)

[Intensity Scan Context: Coding Intensity and Geometry Relations for Loop Closure Detection](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9196764)

[Semantic loop closure detection based on graph matching in multi-objects scenes](https://www.sciencedirect.com/science/article/pii/S1047320321000389)

The paper proposes to use semantic information obtained through object detection. The method relies on semantic and geometric information: image similarity is computed via graph matching on a graph built from the coordinates of the detected objects and their depth information. This so-called semantic sub-graph is matched using the KM algorithm (a sketch of this matching step also follows at the end of this section).

[Map Partition and Loop Closure in a Factor Graph Based SAM System](https://liu.diva-portal.org/smash/get/diva2:1509264/FULLTEXT01.pdf)

[Robust Loop Closing Over Time for Pose Graph SLAM](http://webdiis.unizar.es/~ylatif/papers/IJRR.pdf)

[SA-LOAM: Semantic-aided LiDAR SLAM with Loop Closure](https://arxiv.org/pdf/2106.11516.pdf)

Registers semantic point clouds as pose-graph nodes and estimates poses via the LOAM package (FLOAM). When a loop closure is detected (using the authors' own detection algorithm), the poses are updated as in any other SLAM pipeline.

[Loop detection by using CNNs](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8082072)

Rather than features extracted through classical methods, this approach relies on features obtained from a CNN. The network output is passed through PCA to get a vector of reduced dimension. A similarity matrix is then built from the distances between the CNN feature vectors of different frames, giving a similarity score from which loop closures are decided.
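A rough sketch of that CNN-based pipeline, assuming per-frame descriptors have already been extracted by some backbone; the component count, similarity threshold, and frame-gap check are made-up placeholders, not values from the paper:

```python
import numpy as np
from sklearn.decomposition import PCA

def detect_loops(features, n_components=128, sim_threshold=0.9, min_gap=50):
    """Loop detection from per-frame CNN descriptors.

    features: (num_frames, dim) array, one CNN feature vector per frame.
    Returns a list of (i, j) candidate loop pairs.
    """
    # Reduce the raw CNN features to a lower-dimensional vector with PCA.
    reduced = PCA(n_components=n_components).fit_transform(features)

    # Cosine similarity matrix between all pairs of reduced vectors.
    unit = reduced / np.linalg.norm(reduced, axis=1, keepdims=True)
    sim = unit @ unit.T

    loops = []
    for i in range(len(sim)):
        # Skip temporally adjacent frames, which are trivially similar.
        for j in range(i + min_gap, len(sim)):
            if sim[i, j] > sim_threshold:
                loops.append((i, j))
    return loops
```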
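For the Scan Context entry above, a minimal sketch of the two-stage search: rotation-invariant ring keys indexed in a KD-tree for candidate retrieval, then a yaw-shift-minimized cosine distance checked against a threshold. The bin resolution, threshold, and max-height cell encoding are simplified assumptions rather than the paper's exact settings:

```python
import numpy as np
from scipy.spatial import cKDTree

def scan_context(cloud, num_rings=20, num_sectors=60, max_range=80.0):
    """Ring x sector descriptor of an (N, 3) LiDAR scan: each polar bin
    keeps the maximum point height falling into it."""
    desc = np.zeros((num_rings, num_sectors))
    r = np.linalg.norm(cloud[:, :2], axis=1)
    theta = np.arctan2(cloud[:, 1], cloud[:, 0]) + np.pi
    ring = np.clip((r / max_range * num_rings).astype(int), 0, num_rings - 1)
    sector = np.clip((theta / (2 * np.pi) * num_sectors).astype(int), 0, num_sectors - 1)
    np.maximum.at(desc, (ring, sector), cloud[:, 2])
    return desc

def ring_key(desc):
    """Rotation-invariant key: the mean of each ring over all sectors."""
    return desc.mean(axis=1)

def cosine_distance(d1, d2):
    """Column-wise cosine distance, minimized over all sector shifts of d2
    so the score is robust to yaw rotation between the two scans."""
    best = 1.0
    for shift in range(d2.shape[1]):
        shifted = np.roll(d2, shift, axis=1)
        num = (d1 * shifted).sum(axis=0)
        den = np.linalg.norm(d1, axis=0) * np.linalg.norm(shifted, axis=0) + 1e-9
        best = min(best, 1.0 - np.mean(num / den))
    return best

def find_loop(query_desc, db_descs, keys_tree, threshold=0.2, k=10):
    """Two-stage query: KD-tree over ring keys, then cosine-distance check.
    Build the tree once: cKDTree(np.stack([ring_key(d) for d in db_descs]))."""
    _, idxs = keys_tree.query(ring_key(query_desc), k=k)
    best_idx, best_dist = None, threshold
    for i in np.atleast_1d(idxs):
        if i >= len(db_descs):  # KD-tree pads short results with out-of-range indices
            continue
        d = cosine_distance(query_desc, db_descs[i])
        if d < best_dist:
            best_idx, best_dist = i, d
    return best_idx  # None when no candidate beats the threshold
```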
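The KM (Kuhn-Munkres) step from the semantic loop-closure paper can be sketched with SciPy's Hungarian solver; the node attributes (class, image coordinates, depth) and the cost/score formulas below are illustrative guesses, not the paper's exact formulation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_semantic_graphs(nodes_a, nodes_b, dist_weight=0.1):
    """Match two semantic sub-graphs whose nodes are detected objects.

    Each node is a (class_id, x, y, depth) tuple. Returns the matched
    node pairs and a similarity score in [0, 1]; a higher score suggests
    a more likely loop closure.
    """
    NO_MATCH = 1e6  # prohibitive cost for nodes of different classes
    cost = np.zeros((len(nodes_a), len(nodes_b)))
    for i, (ca, xa, ya, da) in enumerate(nodes_a):
        for j, (cb, xb, yb, db) in enumerate(nodes_b):
            if ca != cb:
                cost[i, j] = NO_MATCH
            else:
                # Geometric cost: image-plane distance plus depth difference.
                cost[i, j] = dist_weight * np.hypot(xa - xb, ya - yb) + abs(da - db)
    rows, cols = linear_sum_assignment(cost)  # Kuhn-Munkres assignment
    valid = cost[rows, cols] < NO_MATCH
    score = valid.sum() / max(len(nodes_a), len(nodes_b))
    return list(zip(rows[valid], cols[valid])), score
```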
SLAM book: https://github.com/gaoxiang12/slambook-en / https://www.youtube.com/watch?v=DK1DIcPwCOU

[Adaptive Object Detection in adverse weather conditions](https://github.com/wenyyu/Image-Adaptive-YOLO)

[SORNet: Spatial Object-Centric Representations for Sequential Manipulation](https://arxiv.org/pdf/2109.03891.pdf)

# Robot Kidnapping

[New approach in solving the kidnapped robot problem](https://www.researchgate.net/publication/224232498_New_approach_in_solving_the_kidnapped_robot_problem)
- Compares point clouds built from SURF features extracted during offline map building and after kidnapping, using the MSAC algorithm (a variant of RANSAC).

# Pose Graph Based

* RTAB-Map SLAM
* [LeGO-LOAM-SC](https://github.com/irapkaist/SC-LeGO-LOAM) / [LeGO-LOAM](https://github.com/RobustFieldAutonomyLab/LeGO-LOAM)
* [VINS-Mono](https://github.com/HKUST-Aerial-Robotics/VINS-Mono)
* [Graph SLAM visualizer](https://github.com/jaejunlee0538/graph_slam_visualizer)
* [Visual SLAM packages](https://github.com/klintan/vo-survey)

# Object Representation

* [Weakly-supervised Object Representation Learning for Few-shot Semantic Segmentation](https://openaccess.thecvf.com/content/WACV2021/papers/Ying_Weakly-Supervised_Object_Representation_Learning_for_Few-Shot_Semantic_Segmentation_WACV_2021_paper.pdf)
  1. Although the task here is semantic segmentation, the authors build an object representation generator (ORG) that takes a feature map and the label map and creates a kind of attention map from them.
  2. Simultaneously, they pass the original image through a U-Net-style architecture along with the attention map and produce a score for comparison.
  3. The ORG could be useful for us as well if we plan to detect local features and aggregate them.

  ![](https://i.imgur.com/1NqXzSH.png)
  ![](https://i.imgur.com/NyoxZEc.png)

* [Region Similarity Representation Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Xiao_Region_Similarity_Representation_Learning_ICCV_2021_paper.pdf)
  1. Initially augments images and then creates feature maps of the cropped images (RRC).
  2. A sliding-window algorithm is run on these cropped maps only, two or three times.
  3. They also use a feature pyramid network during this (in experiments).
  4. They propose a region-level similarity learning component which enforces that the same regions within different views encode to spatially and semantically consistent feature representations.

  ![](https://i.imgur.com/JwCU2aI.png)

* [Scene Image Representation by Foreground, Background and Hybrid Features](https://arxiv.org/pdf/2006.03199.pdf)
  1. Uses VGG-16 for scene labelling in all three cases: foreground, background, and hybrid images.
  2. The paper proposes to fuse the information from all three cases by extracting features at the end of each VGG-16 model.
  3. These features are aggregated using different methods (max, min, mean, concat); see the sketch after this list.
  4. The model is tested on MIT-67 and SUN-397, comparing results across the aggregation methods.
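A small sketch of that aggregation step, assuming the three VGG-16 feature vectors are already extracted; the function simply mirrors the Max/Min/Mean/Concat variants compared in the paper:

```python
import numpy as np

def aggregate(fg, bg, hybrid, method="concat"):
    """Fuse foreground, background, and hybrid VGG-16 feature vectors.
    All three are 1-D arrays of equal length (concat relaxes this)."""
    if method == "concat":
        return np.concatenate([fg, bg, hybrid])
    stack = np.stack([fg, bg, hybrid])
    if method == "max":
        return stack.max(axis=0)
    if method == "min":
        return stack.min(axis=0)
    if method == "mean":
        return stack.mean(axis=0)
    raise ValueError(f"unknown aggregation method: {method}")
```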
* [Unsupervised Visual Representation Learning by Context Prediction](https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Doersch_Unsupervised_Visual_Representation_ICCV_2015_paper.pdf)
  1. Proposes a model that learns a representation of an object in the context of its spatial configuration.
  2. The model is trained on random patch pairs from an object, predicting the position of one patch relative to the other.
  3. The model is inspired by the AlexNet architecture and fuses the patch information in the fully connected layers, reducing it to an 8-D vector that is passed through a softmax (a PyTorch sketch follows at the end of this section).

  ![](https://i.imgur.com/F6azLfQ.png)

  4. This learned representation is applied to find similar-looking patches in different images.

  ![](https://i.imgur.com/n55KvtR.png)

Not sure if this is relevant: https://openaccess.thecvf.com/content_ICCV_2017/papers/Wang_Orientation_Invariant_Feature_ICCV_2017_paper.pdf
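A toy PyTorch sketch of the Doersch-style context-prediction setup above: a shared patch encoder, late fusion in fully connected layers, and 8-way logits over the relative patch positions. The small conv trunk is a stand-in for the paper's AlexNet-style network, and the patch size is arbitrary:

```python
import torch
import torch.nn as nn

class ContextPrediction(nn.Module):
    """Embed two patches with shared weights, fuse the embeddings in
    fully connected layers, and predict which of the 8 neighboring
    positions the second patch occupies relative to the first."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(  # shared (siamese) patch encoder
            nn.Conv2d(3, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(     # fusion + 8-way classification
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 8),         # softmax is applied by the loss
        )

    def forward(self, patch_a, patch_b):
        fused = torch.cat([self.encoder(patch_a), self.encoder(patch_b)], dim=1)
        return self.head(fused)        # train with nn.CrossEntropyLoss

# e.g. logits = ContextPrediction()(torch.randn(4, 3, 96, 96),
#                                   torch.randn(4, 3, 96, 96))
```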