# vesseltype_identification_dae
###### tags: `TSAR`
On my laptop:
```
git clone https://gitlab.com/simula_ais_message/vesseltype_identification_dae
```
The data-processing scripts of the clone are located at:
```
/Users/annef/Documents/T-SAR/AF/vesseltype_identification_dae/script/data_processing
```
## 1st step: preprocessing
See `/Users/annef/Documents/T-SAR/AF/vesseltype_identification_dae/script/data_processing`
```
srun conda run -n ais python -m src.data_processing -a -o 1.h5 -d 1 -m
```
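Flags:
- `-a`: compute the distance to the closest anchorage.
- `-m`: filter on MMSI (Maritime Mobile Service Identity, the nine-digit identifier of the transmitting station in AIS messages); the exact filtering rule is not documented here.
- `-d`: month to process.
- `-o`: output file, judging by the `1.h5` argument.

The same command without `srun` and `conda run`: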
```
python -m src.data_processing -o 1.h5 -d 1 -m -a
```
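The two forms above differ only in the `srun conda run -n ais` prefix. As a minimal sketch, assuming the prefixed form is meant for a machine where SLURM's `srun` is installed and the plain form for everything else, a small helper can pick the right one. The function name `launch` is hypothetical and not part of the repository.

```shell
# Sketch only: "launch" is a hypothetical helper, not part of the repository.
# Prefix the command with "srun conda run -n ais" when srun is on PATH,
# otherwise run it directly in the current environment.
launch() {
    if command -v srun >/dev/null 2>&1; then
        srun conda run -n ais "$@"
    else
        "$@"
    fi
}
```

Usage would then be `launch python -m src.data_processing -o 1.h5 -d 1 -m -a` on either machine.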
Then for all months:
```
for f in {1..12}; do python -m src.data_processing -o $f.h5 -d $f -m -a; done
```
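The loop above keeps going when one month fails and recomputes months that are already done. A possible variant is sketched below. It is only a sketch: `process_months` and the `RUN` variable are hypothetical names, and skipping a month whenever its `.h5` file exists is an assumption about what is wanted, not something the repository prescribes.

```shell
# Sketch only: process_months and RUN are hypothetical, not part of the repository.
# Set RUN=echo to print each command instead of running it (dry run).
process_months() {
    local first="$1" last="$2" month
    for month in $(seq "$first" "$last"); do
        # Assumption: an existing output file means the month is already done.
        if [ -e "${month}.h5" ]; then
            echo "skip ${month}.h5 (already exists)"
            continue
        fi
        # Same flags as the loop above; stop at the first month that fails.
        ${RUN:-} python -m src.data_processing -o "${month}.h5" -d "$month" -m -a || {
            echo "month ${month} failed" >&2
            return 1
        }
    done
}
```

`process_months 1 12` covers the whole year; `RUN=echo` followed by `process_months 1 12` only lists what would be run.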
## 2nd step: create training, test and validation datasets
This step builds the training, validation and test datasets for a range of months.
```
python datasets_creation.py -m 1 -M 1
```
- `-m`: start month (not the MMSI filter of step 1).
- `-M`: end month.

The result is three datasets: one each for training, validation and testing.
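Since `-m` and `-M` are plain month numbers, a wrong range is easy to type. The wrapper below is a sketch that checks the range before calling the script. `create_datasets` and `RUN` are hypothetical names, and the rules it enforces (months between 1 and 12, start not after end) are assumptions about what `datasets_creation.py` expects.

```shell
# Sketch only: create_datasets and RUN are hypothetical, not part of the repository.
# Set RUN=echo to print the command instead of running it (dry run).
create_datasets() {
    local first="$1" last="$2" m
    for m in "$first" "$last"; do
        # Reject empty or non-numeric arguments.
        case "$m" in
            ''|*[!0-9]*) echo "not a month number: $m" >&2; return 1 ;;
        esac
        # Assumption: months are numbered 1 to 12.
        if [ "$m" -lt 1 ] || [ "$m" -gt 12 ]; then
            echo "month out of range 1..12: $m" >&2
            return 1
        fi
    done
    # Assumption: the start month must not come after the end month.
    if [ "$first" -gt "$last" ]; then
        echo "start month $first is after end month $last" >&2
        return 1
    fi
    ${RUN:-} python datasets_creation.py -m "$first" -M "$last"
}
```

For example, `create_datasets 1 12` would build the datasets from the whole year, while `create_datasets 5 2` stops with an error before the script is started.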