# Narayanan Knowledge Transfer
## Building Object Detection
### Clone ALiVE
```bash=
git clone git@bitbucket.org:alive_iiitd/alive-dev.git
cd alive-dev
git checkout trajectory_generation
```
### Download Libtorch and set environment variables
- [Libtorch 1.12 - CUDA 11.6 - (cxx11 ABI)](https://download.pytorch.org/libtorch/cu116/libtorch-cxx11-abi-shared-with-deps-1.12.0%2Bcu116.zip)
- Unzip
```bash=
unzip libtorch-cxx11-abi-shared-with-deps-1.12.0+cu116.zip
```
- Place in ~
```bash=
mv libtorch ~/
```
- Set Torch_DIR in .bashrc
```bash=
echo 'export Torch_DIR=$HOME/libtorch' >> ~/.bashrc
```
### Building object detection
- vision_opencv
```bash=
git submodule update --init --remote
cd src/vision_opencv
git apply ../../vision_opencv_melodic.patch
cd ../..
catkin init
catkin build vision_opencv
```
- object_detection
```bash=
export PATH=$HOME/cmake-install/bin:$PATH
export CMAKE_PREFIX_PATH=$HOME/cmake-install:$CMAKE_PREFIX_PATH
catkin build object_detection
```
- Create models directory
```bash=
cd src/object_detection/src
mkdir models
```
Download the models from [here](https://drive.google.com/drive/folders/1qfFAZnGiz7t5qzPcoDY0M5kAr98ztcaQ?usp=sharing) and place them in the `models` directory
## Components

### Object Detector

#### ImageConverter->imageCb
Inputs
```
sensor_msgs::ImageConstPtr&
```
Outputs
```
results - std::vector<std::vector<Detection>>
Detection - struct - {cv::Rect bbox, float score, int class_idx}
```
Pseudocode -
```
Preprocess - Resize, Divide by 255
Create tensor
Call model.forward
Extract the detection from the tuple
Postprocess - nms
return result
```
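The node itself is C++ using libtorch; the same preprocess / forward / NMS flow is sketched below in Python for reference. The input size, thresholds, and output tensor layout are assumptions, not the node's actual parameters.
```python
import cv2
import torch
import torchvision

def detect(model, image_bgr, size=640, conf_thres=0.25, iou_thres=0.45):
    """Sketch of the imageCb flow: preprocess, forward, extract detections, NMS."""
    # Preprocess - resize and scale pixel values to [0, 1]
    resized = cv2.resize(image_bgr, (size, size))
    tensor = torch.from_numpy(resized).permute(2, 0, 1).float().unsqueeze(0) / 255.0

    # Forward pass; the TorchScript model is assumed to return a tuple with detections first
    with torch.no_grad():
        out = model(tensor)
    preds = out[0] if isinstance(out, tuple) else out   # assumed [1, N, 5 + num_classes]
    preds = preds[0]

    # Postprocess - confidence filter, then NMS
    preds = preds[preds[:, 4] > conf_thres]
    keep = torchvision.ops.nms(preds[:, :4], preds[:, 4], iou_thres)
    return preds[keep]   # one row per kept Detection (bbox, score, class scores)
```
Note that torchvision.ops.nms expects [x1, y1, x2, y2] boxes, so a center/width-height output layout would need converting before that call.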
### Tracker

#### ROS summary
#### image_converter.boxcallback
Inputs
```
alive_msgs.msg.DetectedObjectArray
```
Outputs
```
alive_msgs.msg.DetectedObjectArray
```
Pseudocode -
```
if num_objects == 0:
    publish empty message
    return
if last_processed_timestamp == current_message_timestamp:
    # bounding boxes already processed in boxcallback_lidar
    return
call update function of sort_tracker
create output message and publish
```
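A condensed Python sketch of this callback; the message field names, `self.pub`, `self.sort_tracker`, and the conversion helpers are illustrative, not the actual members.
```python
from alive_msgs.msg import DetectedObjectArray

def boxcallback(self, msg):
    """Sketch of image_converter.boxcallback."""
    # No detections: publish an empty array and stop
    if len(msg.objects) == 0:
        self.pub.publish(DetectedObjectArray(header=msg.header))
        return

    # Same stamp already handled by boxcallback_lidar: nothing to do
    if self.last_processed_timestamp == msg.header.stamp:
        return

    # SORT update on image-only detections: rows of [x1, y1, x2, y2, conf, class]
    dets = self.msg_to_dets(msg)
    tracks = self.sort_tracker.update(msg.header.stamp, dets)

    # One output object per surviving track
    self.pub.publish(self.tracks_to_msg(tracks, msg.header))
```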
#### image_converter.boxcallback_lidar
Inputs
```
alive_msgs.msg.DetectedObjectArray
PointCloud2
Image
```
Outputs
```
alive_msgs.msg.DetectedObjectArray
```
Pseudocode -
```
if num_objects == 0:
    publish empty message
    return
update_bbox = True
if last_processed_detection_timestamp >= current_message_detection_timestamp:
    # bounding boxes already processed in boxcallback
    update_bbox = False
call object_distance_calculations to retrieve the distance of each detected object
call update_with_distance with update_bbox, the distance array, detections, and timestamps as arguments
create output message and publish
```
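And the LiDAR-assisted variant, again with illustrative helper and member names; the tracker call mirrors the update_distance inputs documented under Sort below.
```python
from alive_msgs.msg import DetectedObjectArray

def boxcallback_lidar(self, det_msg, cloud_msg, image_msg):
    """Sketch of image_converter.boxcallback_lidar."""
    if len(det_msg.objects) == 0:
        self.pub.publish(DetectedObjectArray(header=det_msg.header))
        return

    # Only refresh the bbox part of the state if boxcallback has not already done so
    update_bbox = self.last_processed_detection_timestamp < det_msg.header.stamp

    # Per-object depth from the LiDAR points projected into each bounding box
    cloud_xyz = self.cloud_to_xyz(cloud_msg)        # PointCloud2 -> Nx3 numpy array
    distances = self.object_distance_calculations(det_msg, cloud_xyz, image_msg)

    dets = self.msg_to_dets(det_msg)                # rows of [x1, y1, x2, y2, conf, class]
    tracks = self.sort_tracker.update_distance(
        distances, det_msg.header.stamp, cloud_msg.header.stamp, update_bbox, dets)

    self.pub.publish(self.tracks_to_msg(tracks, det_msg.header))
```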
#### object_distance_calculations
Inputs
```
alive_msgs.msg.DetectedObjectArray
pointcloud - numpy array
image
```
Outputs
```
Array with the depth for each object in input
```
Pseudocode
```
Filter out the ground plane from the pointcloud
Convert Lidar points to camera frame using T matrix - transforms.lidarToCam3D
Convert points in camera frame to camera pixel coordinates using intrinsic matrix K - getPointsOnImage
Check if the coordinates are within the image bounds
Take mean of all Lidar points mapped within a bounding box to represent the depth of that particular bounding box - create_distance_viz
return array of distances
```
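A minimal numpy sketch of this projection-and-average step. The transform T, intrinsics K, ground-height threshold, and the use of camera-frame z as depth are assumptions standing in for transforms.lidarToCam3D and getPointsOnImage.
```python
import numpy as np

def object_distances(boxes, cloud_xyz, T_lidar_to_cam, K, image_shape, ground_z=-1.5):
    """Per-bbox depth as the mean of projected LiDAR points inside the box."""
    # Filter out (approximate) ground-plane points
    pts = cloud_xyz[cloud_xyz[:, 2] > ground_z]

    # LiDAR frame -> camera frame via a 4x4 homogeneous transform
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    cam = (T_lidar_to_cam @ pts_h.T).T[:, :3]
    cam = cam[cam[:, 2] > 0]                        # keep points in front of the camera

    # Camera frame -> pixel coordinates with the intrinsic matrix
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]

    # Keep projections that land inside the image bounds
    h, w = image_shape[:2]
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    uv, depth = uv[inside], cam[inside, 2]

    # Mean depth of the points falling inside each bounding box
    distances = []
    for x1, y1, x2, y2 in boxes:
        in_box = (uv[:, 0] >= x1) & (uv[:, 0] <= x2) & (uv[:, 1] >= y1) & (uv[:, 1] <= y2)
        distances.append(float(depth[in_box].mean()) if in_box.any() else np.nan)
    return np.array(distances)
```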
#### KalmanBoxTracker
- State - [u, v, s, r, d, u_dot, v_dot, s_dot, d_dot] (see the sketch after the predict functions below)
    - u = x coordinate of the box center
    - v = y coordinate of the box center
    - s = area of the box
    - r = aspect ratio of the box
    - d = distance (m)
    - u_dot = velocity of the box center along x (pixels/s)
    - v_dot = velocity of the box center along y (pixels/s)
    - s_dot = rate of change of the area
    - d_dot = velocity of the object (m/s)
- #### update_bbox
Inputs
```
bbox - [x1, y1, x2, y2, conf, class]
```
Outputs
```
Updates the state of the Kalman Filter and the associated matrices related to the bounding box
```
- #### update_distance
Inputs
```
bbox - [x1, y1, x2, y2, conf, class]
distance - distance associated with bounding box
```
Outputs
```
Updates the state of the Kalman Filter and the associated matrices related to distance and velocity
```
- #### update_bbox_and_distance
Inputs
```
bbox - [x1, y1, x2, y2, conf, class]
distance - distance associated with bounding box
```
Outputs
```
Updates the state of the Kalman Filter and the associated matrices
```
- #### predict functions
Inputs
```
Timestamps
```
Outputs
```
Updates the state and associated matrices by predicting up to the given timestamp
```
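The update and predict steps above are standard Kalman filter algebra over this 9-dimensional state. A minimal sketch assuming a constant-velocity model; the matrix layout, the bbox_to_z helper, and the noise matrices Q/R are illustrative, not the actual class internals.
```python
import numpy as np

def make_F(dt):
    """Constant-velocity transition for [u, v, s, r, d, u_dot, v_dot, s_dot, d_dot]."""
    F = np.eye(9)
    F[0, 5] = F[1, 6] = F[2, 7] = dt    # u, v, s advance by their rates
    F[4, 8] = dt                        # d advances by d_dot
    return F

# Measurement models: a bbox observes [u, v, s, r]; LiDAR observes d
H_BBOX = np.eye(4, 9)
H_DIST = np.zeros((1, 9)); H_DIST[0, 4] = 1.0

def bbox_to_z(bbox):
    """[x1, y1, x2, y2, ...] -> measurement [u, v, s, r]."""
    x1, y1, x2, y2 = bbox[:4]
    w, h = x2 - x1, y2 - y1
    return np.array([x1 + w / 2.0, y1 + h / 2.0, w * h, w / float(h)])

def kf_predict(x, P, F, Q):
    """Propagate state and covariance to the requested timestamp (dt baked into F)."""
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    """Fuse one measurement into the state: bbox via H_BBOX, distance via H_DIST."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P
```
In this picture, update_bbox corresponds to an update with H_BBOX, update_distance to one with H_DIST, and update_bbox_and_distance to both.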
#### associate_detections_to_trackers
Inputs
```
detections - array of [x1, y1, x2, y2, conf, class]
trackers - array of SORT trackers, one for each object seen so far that is still alive
```
Outputs
```
indices of matched detections as [detection index, tracker index] pairs, unmatched detections, and unmatched trackers
```
Pseudocode -
```
Calculate IoU between each detection and each tracker's predicted box
Do linear assignment (scipy.optimize function) and find matched indices
return matched detections, unmatched detections and unmatched trackers
```
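A self-contained sketch of the IoU matrix plus Hungarian matching; the IoU threshold is an assumed value.
```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(detections, predicted_boxes, iou_thres=0.3):
    """Match detections to trackers; low-IoU pairs stay unmatched."""
    iou_mat = np.array([[iou(d[:4], t) for t in predicted_boxes] for d in detections])
    det_idx, trk_idx = linear_sum_assignment(-iou_mat)     # maximize total IoU

    matches, unmatched_dets, unmatched_trks = [], [], []
    for d, t in zip(det_idx, trk_idx):
        if iou_mat[d, t] >= iou_thres:
            matches.append([d, t])
        else:
            unmatched_dets.append(d)
            unmatched_trks.append(t)
    unmatched_dets += [d for d in range(len(detections)) if d not in det_idx]
    unmatched_trks += [t for t in range(len(predicted_boxes)) if t not in trk_idx]
    return np.array(matches), np.array(unmatched_dets), np.array(unmatched_trks)
```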
#### Sort
Maintains one tracker (KalmanBoxTracker) for each uniquely ID'd object
- #### update and update_distance
They differ only in which tracker update functions they call, as can be seen from the code
Inputs for update
```
dets_timestamp - ROS timestamp
dets - array of [x1, y1, x2, y2, conf, class]
```
Inputs for update_distance
```
distances - numpy array of distances
dets_timestamp - ROS timestamp
lidar_timestamp - ROS timestamp
update_bbox - Bool - whether the bbox update has to be performed, or whether update has already been called with these same detections (in which case only the distance is updated)
dets - array of [x1, y1, x2, y2, conf, class]
```
Outputs
```
Array of [x1, y1, x2, y2, distance, id, distance, velocity]
```
Pseudocode -
```
Run prediction step for each existing tracker
Call associate_detections_to_trackers to get matched and unmatched detections and trackers
For matched detections call the update function
Initialize unmatched detections with new trackers
Reduce time to live of unmatched trackers and discard trackers whose time to live is 0
return the output array described above
```
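Method-style sketch of that loop; the KalmanBoxTracker constructor, time_to_live counter, and get_state accessor are illustrative names for whatever the code actually exposes.
```python
def update(self, dets_timestamp, dets):
    """Sketch of Sort.update; dets is an array of [x1, y1, x2, y2, conf, class]."""
    # 1. Predict every live tracker forward to the detection timestamp
    predicted = [trk.predict(dets_timestamp) for trk in self.trackers]

    # 2. Match detections to trackers (IoU + Hungarian assignment)
    matched, unmatched_dets, unmatched_trks = \
        associate_detections_to_trackers(dets, predicted)

    # 3. Matched detections refine their tracker's state
    for d, t in matched:
        self.trackers[t].update_bbox(dets[d])

    # 4. Unmatched detections spawn new trackers
    for d in unmatched_dets:
        self.trackers.append(KalmanBoxTracker(dets[d], dets_timestamp))

    # 5. Unmatched trackers lose a life; dead ones are dropped
    for t in unmatched_trks:
        self.trackers[t].time_to_live -= 1
    self.trackers = [trk for trk in self.trackers if trk.time_to_live > 0]

    # 6. One output row per surviving track
    return [trk.get_state() for trk in self.trackers]
```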
### Obstacle Mapper
Used to map the output of the object detector into the world by computing the intersection of the bottom of each bounding box with the ground plane.

The code runs two asynchronous streams, and the only difference between them is the ground-plane segmentation: the stream with LiDAR as input computes the ground plane, whereas the other reuses the ground-plane equation computed by the LiDAR stream.
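The geometric core of the mapping is a ray-plane intersection: back-project the midpoint of the bounding box's bottom edge through the camera intrinsics and intersect that ray with the ground plane. A minimal sketch, assuming the plane is given as [a, b, c, d] with a*x + b*y + c*z + d = 0 in the camera frame:
```python
import numpy as np

def bbox_bottom_on_ground(bbox, K, plane):
    """3D ground point below a detection, via ray-plane intersection."""
    x1, y1, x2, y2 = bbox
    u, v = (x1 + x2) / 2.0, y2                 # midpoint of the bottom edge (pixels)

    # Back-project the pixel to a ray direction in the camera frame
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])

    # Solve n . (t * ray) + d = 0 for the ray parameter t
    n, d = np.asarray(plane[:3]), plane[3]
    t = -d / float(n @ ray)
    return t * ray                             # 3D point on the ground plane (camera frame)
```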
## Pointcloud Registration with Lanelet
[Link to doc](https://hackmd.io/r9MnF_JlQwajUAZWg-eZDQ)
Things left to document
- Traffic Light Classification
- OSM to Lanelet
- Visualization code
- Estimator
- FCW
- Calibration