# Scripts
## Pipeline scripts
### `create_nodes_in_the_point_cloud.py`
#### Description:
Stage 1 of the pipeline. It post-processes the masks created by `create_truview_scene_data.py` and saves the resulting nodes to JSON and pickle files. The postprocessing steps are geometry fitting, classification correction and instance matching correction. The script returns a visualization, the PointCloudPaths object (without paths) serialized to pickle and JSON, and the `config.json` that was used to produce the nodes.
#### Arguments:
* `--output_dir` Directory to save files generated by `create_nodes_in_the_point_cloud.py`
* `--config_path` Path to the config used to generate nodes. The default config can be found in `hexagon-pid3d/pid3d/configs/defaults/default_point_cloud_nodes_config.json`
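A hypothetical stage 1 invocation, assuming the default config shipped with the repo; the output directory is illustrative and the exact working directory depends on your checkout:

```bash
# Stage 1: post-process masks into nodes (output directory is illustrative)
python create_nodes_in_the_point_cloud.py \
    --output_dir outputs/nodes \
    --config_path hexagon-pid3d/pid3d/configs/defaults/default_point_cloud_nodes_config.json
```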
### `create_paths_in_the_point_cloud.py`
#### Description:
Stage 2 of the pipeline. This script finds connections between the nodes created by
`create_nodes_in_the_point_cloud.py`, also creating connections through unknown nodes. It returns a visualization, the PointCloudPaths object (now with paths) serialized to pickle and JSON, and the `config.json` that was used to produce the paths.
#### Arguments:
* `--output_dir` Directory to save files generated by `create_paths_in_the_point_cloud.py`
* `--config_path` Path to the config used to generate paths. The default config can be found in `hexagon-pid3d/pid3d/configs/defaults/default_point_cloud_paths_config.json`
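A hypothetical stage 2 invocation with the default paths config; the output directory is again illustrative:

```bash
# Stage 2: find connections between the stage 1 nodes (output directory is illustrative)
python create_paths_in_the_point_cloud.py \
    --output_dir outputs/paths \
    --config_path hexagon-pid3d/pid3d/configs/defaults/default_point_cloud_paths_config.json
```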
### `match_point_cloud_paths_to_pid.py`
#### Description:
Stage 3 of the pipeline. This script runs matching between the paths generated by `create_paths_in_the_point_cloud.py` and the PIDDatapoint generated by `predict_and_visualise_digitized_pid.py`. It returns a JSON file of predicted matches.
#### Arguments:
* `--output_path` Directory to save files generated by `match_point_cloud_paths_to_pid.py`
* `--point_cloud_paths_pickle_path` Path to the pickle generated by `create_paths_in_the_point_cloud.py`
* `--pid_datapoint_pickle_path` Path to the pickle generated by `predict_and_visualise_digitized_pid.py`
* `--features_config_path` Path to the config that defines parameters for the features used in matching. The default config can be found in `hexagon-pid3d/pid3d/configs/defaults/default_features_config.json`
* `--matcher_config_path` Path to the config used to match nodes. The default config can be found in `hexagon-pid3d/pid3d/configs/defaults/default_matching_config.json`
* `--pid_starting_node_id` ID of the starting node in the PID file
* `--point_cloud_starting_node_id` ID of the starting node in the point_cloud_paths file
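A hypothetical stage 3 invocation; the pickle file names and starting node ids are illustrative and depend on what the earlier stages actually produced:

```bash
# Stage 3: match point cloud paths to the PID (pickle names and node ids are illustrative)
python match_point_cloud_paths_to_pid.py \
    --output_path outputs/matching \
    --point_cloud_paths_pickle_path outputs/paths/point_cloud_paths.pickle \
    --pid_datapoint_pickle_path outputs/pid/pid_datapoint.pickle \
    --features_config_path hexagon-pid3d/pid3d/configs/defaults/default_features_config.json \
    --matcher_config_path hexagon-pid3d/pid3d/configs/defaults/default_matching_config.json \
    --pid_starting_node_id 0 \
    --point_cloud_starting_node_id 0
```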
## Out of pipeline scripts
### `predict_and_visualise_digitized_pid.py`
#### Description:
A script that generates PID predictions using phase 1; these predictions are used in stage 3. The PID predictions are saved in JSON and pickle format.
#### Arguments:
* `--metadata_path` Input file storing the manually digitized PID (sloth annotations), in JSON format
* `--save_path` Output directory path to store digitized PID
* `--pids` PID id(s) to process
* `--save_gt` If True, saves GT labels.
* `--save_predictions` If True, saves predictions.
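A hypothetical invocation; the metadata path and PID id are illustrative, and whether the `--save_*` options are switches or take explicit boolean values depends on the script's argument parsing:

```bash
# Generate and save PID predictions for one PID (paths and id are illustrative)
python predict_and_visualise_digitized_pid.py \
    --metadata_path data/sloth_annotations.json \
    --save_path outputs/pid \
    --pids 12 \
    --save_predictions
```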
### `create_truview_scene_data.py`
#### Description:
A script that generates segmentation masks using phase 2; these masks are the main input used in the pipeline. The most important file that this script returns is `prediction.pickle`, which is directly used in the `create_nodes` stage of the pipeline.
#### Arguments:
* `--ground_truth` Whether to use ground-truth data (masks that were labeled by us and then joined automatically by the instance matching algorithm)
* `--prediction` Whether to use data predicted by the mask detection models and joined by the instance matching algorithm
* `--locations` Provided as consecutive numbers corresponding to the location numbers, e.g. `400 401`
* `--visibility_distance` Float determining the visibility distance
* `--output_dir` Directory to dump the created point_cloud scene data
* `--sample_frac` Fraction of point clouds to be dumped, between 0 and 1
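A hypothetical invocation that dumps predicted scene data for two example locations; the numeric values and output directory are illustrative, and the `--prediction`/`--ground_truth` options are shown as switches, which may differ from the actual argument parsing:

```bash
# Dump predicted scene data for locations 400 and 401 (values are illustrative)
python create_truview_scene_data.py \
    --prediction \
    --locations 400 401 \
    --visibility_distance 5.0 \
    --output_dir outputs/scene_data \
    --sample_frac 0.1
```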
### `filter_pointcloud.py`
#### Description:
Reads a raw scene point cloud and filters out the ground and the predictions, to obtain a point cloud of undetected objects only. The filtered point cloud is saved in CSV format, corresponding to the input one. If a segmented point cloud is not specified, only the ground is filtered out.
#### Arguments:
* `--raw_pointcloud_csv` Path to a raw scene point cloud CSV generated by `get_raw_truview_scan_data.py`
* `--thres` Distance between the highest deleted point and the lowest point in the point cloud
* `--segmentation_csv` Path to the UNSAMPLED segmentation point cloud CSV file created by `create_truview_scene_data.py`
* `--output_path` Directory to save the point cloud
* `--dump_voxelized_prediction` Whether to also save a voxelized predictions point cloud
* `--voxelsize` Voxel size to use (if unspecified, voxelization is not performed). In case of filtering using an already voxelized prediction, remember to use a sufficiently large voxel size here
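A hypothetical invocation that filters both the ground and the predictions out of a raw scene; all paths and numeric values are illustrative:

```bash
# Remove ground and detected objects from a raw scene point cloud (paths and values are illustrative)
python filter_pointcloud.py \
    --raw_pointcloud_csv outputs/raw/scene_raw.csv \
    --thres 0.5 \
    --segmentation_csv outputs/scene_data/segmentation.csv \
    --output_path outputs/filtered \
    --voxelsize 0.05
```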
### `fit_cylinder_to_a_pipe.py`
#### Description:
A script that helps to visualize individual pipes and shows how the cylinder fitting algorithm fits them. The script returns an HTML file with a 3D visualization of the pipe and the cylinder that was fitted to it.
#### Arguments:
* `--input_file` File produced by the `create_nodes` stage of the pipeline or by the `create_nodes_in_the_point_cloud.py` script
* `--output_dir` Directory to save the visualization
* `--node_id` ID of the node from `--input_file` that we want to look at
* `--sample_size` Number of points of the `--node_id` node to use for the visualization
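A hypothetical invocation; the input pickle name, node id and sample size are illustrative:

```bash
# Visualize one pipe node and the cylinder fitted to it (paths and ids are illustrative)
python fit_cylinder_to_a_pipe.py \
    --input_file outputs/nodes/point_cloud_nodes.pickle \
    --output_dir outputs/cylinder_fit \
    --node_id 17 \
    --sample_size 5000
```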
### `get_raw_truview_scan_data.py`
#### Description:
A script that generates a raw TruView scan, i.e. a point cloud without predicted masks. The raw point cloud is saved to a CSV file.
#### Arguments:
* `--locations` Provided as consecutive numbers corresponding to the location numbers, e.g. `400 401`
* `--output_dir` Directory to save raw pointcloud
* `--sample_frac` Fraction of point clouds to be dumped, between 0 and 1. It is advised not to exceed 10% in the case of multiple locations.
* `--voxel_size` Size of the voxel used by the denoising algorithm; the smaller it is, the longer denoising takes. The runtime grows quadratically as the voxel size shrinks (a voxel two times smaller makes the algorithm four times slower)
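A hypothetical invocation for two example locations; the sampling fraction and voxel size are illustrative:

```bash
# Dump a subsampled raw scan for locations 400 and 401 (values are illustrative)
python get_raw_truview_scan_data.py \
    --locations 400 401 \
    --output_dir outputs/raw \
    --sample_frac 0.05 \
    --voxel_size 0.02
```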
### `calculate_matching_metrics.py`
#### Description:
A script that calculates the F1 score of matched nodes.
Matching of two pairs, (gt_pid, gt_pc) and (pred_pid, pred_pc), gives:
* true positive (tp): gt_pid.ID == pred_pid.ID AND condition_func(gt_pc, pred_pc) returns True, which corresponds to the point clouds being matched in the sense of the condition_func specified by the user
* false positive (fp): gt_pid.ID == pred_pid.ID AND condition_func(gt_pc, pred_pc) returns False, so the point clouds do not match
* false negative (fn): gt_pid.ID != pred_pid.ID, so there is no corresponding PID in the ground truth data
* true negative (tn): not counted, since in matching we do not have negative examples in the ground truth
The default condition_func checks whether the ground truth node is inside the predicted node point cloud. The F1 score is then computed as 2 * tp / (2 * tp + fp + fn).
#### Arguments:
* `--gt_path` Path to the Excel (.xlsx) file with ground truth matches
* `--gt_sheet_name` Name of the sheet inside the Excel file
* `--pred_path` Path to the pickle file with predicted matches
* `--scene_id` ID of the scene, from 1 to 6
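A hypothetical invocation; the spreadsheet, sheet name and pickle paths are illustrative:

```bash
# Compute the F1 score of predicted matches for scene 1 (paths and sheet name are illustrative)
python calculate_matching_metrics.py \
    --gt_path data/ground_truth_matches.xlsx \
    --gt_sheet_name scene_1 \
    --pred_path outputs/matching/predicted_matches.pickle \
    --scene_id 1
```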
## Exploratory Phase Scripts
### `analyse_digitized_pid_object.py`
#### Description:
Analyses digitized PID predictions and/or ground truth labels. Outputs descriptions of the PID objects.
#### Arguments:
* `--input_path` Digitized PID directory path
* `--pid` PID id to read and analyse.
* `--gt` Read GT - pickled PID_DataPoint objects.
* `--predictions` Read predictions - pickled PID_DataPoint objects.
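A hypothetical invocation that inspects the predicted PID objects; the input path and PID id are illustrative, and `--predictions` is shown as a switch, which may differ from the actual argument parsing:

```bash
# Print descriptions of the pickled prediction objects for one PID (path and id are illustrative)
python analyse_digitized_pid_object.py \
    --input_path outputs/pid \
    --pid 12 \
    --predictions
```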
### `find_missing_connections_in_point_cloud_path.py`
#### Description:
Script for postprocessing the precomputed point cloud paths.
It requires a filtered point cloud CSV path and tries to find missing connections between previously unconnected nodes that are no further apart than a threshold.
#### Arguments:
* `--output_dir` Directory to dump the appended PointCloudPaths object, the visualization and the used config
* `--point_cloud_path_object_path` Path to the pickle file with the PointCloudPaths object, created with the `create_paths_in_the_point_cloud.py` script
* `--config_path` Path to the config used to find missing connections. The default config can be found in `hexagon-pid3d/pid3d/configs/defaults/default_missing_connections_config.json`
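A hypothetical invocation with the default missing-connections config; the input pickle name and output directory are illustrative:

```bash
# Append missing connections to previously computed paths (paths are illustrative)
python find_missing_connections_in_point_cloud_path.py \
    --output_dir outputs/paths_appended \
    --point_cloud_path_object_path outputs/paths/point_cloud_paths.pickle \
    --config_path hexagon-pid3d/pid3d/configs/defaults/default_missing_connections_config.json
```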
### `traverse_point_cloud_with_marching_cubes.py`
#### Description:
Based on a raw point cloud, obtains a path that covers the points. Traversal policy: BFS.
#### Arguments:
* `--save_path` Output directory path to store the output files: the pickled traversal object and its visualization
* `--config_path` Path to the config used to run marching cubes. The default config can be found in `hexagon-pid3d/pid3d/configs/defaults/default_marching_cubes_config.json`
* `--input_file_path` Input raw point cloud file path, in CSV format, generated by `get_raw_truview_scan_data.py`
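A hypothetical invocation with the default marching cubes config; the input CSV and output directory are illustrative:

```bash
# Traverse a raw point cloud with marching cubes (paths are illustrative)
python traverse_point_cloud_with_marching_cubes.py \
    --save_path outputs/marching_cubes \
    --config_path hexagon-pid3d/pid3d/configs/defaults/default_marching_cubes_config.json \
    --input_file_path outputs/raw/scene_raw.csv
```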