# vesseltype_identification_dae

###### tags: `TSAR`

On my laptop:

```
git clone https://gitlab.com/simula_ais_message/vesseltype_identification_dae
```

The source has been fetched at:

```
/Users/annef/Documents/T-SAR/AF/vesseltype_identification_dae/script/data_processing
```

## 1st step: preprocessing

See `/Users/annef/Documents/T-SAR/AF/vesseltype_identification_dae/script/data_processing`

Through `srun`, in the `ais` conda environment:

```
srun conda run -n ais python -m src.data_processing -a -o 1.h5 -d 1 -m
```

Options:

- `-a`: compute the distance to the closest anchorage.
- `-m`: filter MMSI (not sure what that means...). MMSI is the Maritime Mobile Service Identity, the identifier a vessel broadcasts in its AIS messages.
- `-d`: month to process.

The same command, run directly:

```
python -m src.data_processing -o 1.h5 -d 1 -m -a
```

Then for all months:

```
for f in {1..12}; do python -m src.data_processing -o $f.h5 -d $f -m -a; done
```

## 2nd step: create training, test and validation datasets

This step creates 3 datasets: training, validation and test.

```
python datasets_creation.py -m 1 -M 1
```

Options:

- `-m`: start month.
- `-M`: end month.
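
The two steps above could be chained in a small driver script. The sketch below is not part of the repository: the function names (`preprocessing_command`, `dataset_command`, `run_pipeline`) and the `dry_run` switch are hypothetical, and it assumes that both commands can be launched from the same working directory and that `datasets_creation.py` picks up the `<month>.h5` files written by step 1, neither of which these notes confirm. Only the flags already listed above are used.

```python
import subprocess
import sys


def preprocessing_command(month: int) -> list[str]:
    """Build the step-1 command for one month (flags copied from the notes)."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    return [
        sys.executable, "-m", "src.data_processing",
        "-o", f"{month}.h5",  # one output file per month, as in the shell loop
        "-d", str(month),     # month to process
        "-m",                 # filter MMSI
        "-a",                 # distance to the closest anchorage
    ]


def dataset_command(start_month: int, end_month: int) -> list[str]:
    """Build the step-2 command for a range of months."""
    if start_month > end_month:
        raise ValueError("start_month must not be after end_month")
    return [
        sys.executable, "datasets_creation.py",
        "-m", str(start_month),  # start month
        "-M", str(end_month),    # end month
    ]


def run_pipeline(start_month: int = 1, end_month: int = 12,
                 dry_run: bool = True) -> list[list[str]]:
    """Print, and optionally run, step 1 for each month followed by step 2."""
    commands = [preprocessing_command(m)
                for m in range(start_month, end_month + 1)]
    commands.append(dataset_command(start_month, end_month))
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing command.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default: only prints the commands that would be executed.
    run_pipeline(1, 12, dry_run=True)
```

With `dry_run=True` the script only prints the commands, so the list can be checked before anything is launched. Using `sys.executable` keeps every call in the interpreter that started the script, for example the one from the `ais` conda environment.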