# long/lat data
###### tags: `progress update`
**Goals:**
- To extract the long/lat data from the dataset
- As part of the initial exploration of the data, to look at the spatial clustering of the already classified images
- To compare the distributions of the already classified images to the distributions of the images we classify
- To also produce a map of different classes' distributions to add context to the report
## Extracting long/lat data from the dataset
## Preparing the dataset & visualising it in QGIS
- Import the CSV file into QGIS as a delimited text layer, using 'GPSLongitude' as the 'X field' and 'GPSLatitude' as the 'Y field'. Make sure the Geometry CRS is set to EPSG:4326 - WGS 84. Rename the layer to 'plankton_points'.
- Import the CSV that has columns for the class types and the file name, and name this layer 'classifications'.
- In the properties of the 'plankton_points' layer, join the 'classifications' layer to the attributes of 'plankton_points', using 'FileName' as the common field to join on.
- Once the join is complete, experiment with the symbology in the 'properties' of 'plankton_points' (example results in the next step).
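The join in the steps above can also be checked outside QGIS. The following is a minimal sketch using only the Python standard library, not the method used for this update. The file contents and the `level1_class` column name are hypothetical placeholders; only 'FileName', 'GPSLongitude' and 'GPSLatitude' come from the steps above.

```python
import csv
import io

# Hypothetical sample data standing in for the two CSV files.
POINTS_CSV = """FileName,GPSLongitude,GPSLatitude
img_001.tif,-4.15,50.25
img_002.tif,-4.20,50.30
img_003.tif,-4.18,50.27
"""

# 'level1_class' is an assumed column name, not taken from the real dataset.
CLASSES_CSV = """FileName,level1_class
img_001.tif,plankton
img_002.tif,debris
"""


def left_join(point_rows, class_rows, key="FileName"):
    """Attach the classification columns to every point row, matching on `key`.

    Points without a matching classification keep their coordinates and get
    None in the classification columns.
    """
    class_cols = [c for c in class_rows[0] if c != key] if class_rows else []
    lookup = {row[key]: row for row in class_rows}
    joined = []
    for row in point_rows:
        match = lookup.get(row[key], {})
        merged = dict(row)
        for col in class_cols:
            merged[col] = match.get(col)
        joined.append(merged)
    return joined


points = list(csv.DictReader(io.StringIO(POINTS_CSV)))
classes = list(csv.DictReader(io.StringIO(CLASSES_CSV)))
joined = left_join(points, classes)

# Images that have coordinates but no classification yet.
unmatched = [row["FileName"] for row in joined if row["level1_class"] is None]
print(unmatched)
```

Listing the unmatched file names is a quick way to spot images whose 'FileName' values differ between the two files (for example in extension or case) before styling the layer.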
## Explore the spatial clustering of the already classified images
(I WILL MAKE THIS A MORE ATTRACTIVE MAP FOR THE FINAL REPORT, THIS IS JUST FOR QUICK ILLUSTRATIVE PURPOSES)
### The three levels of classification are visualised below:
**Distribution of debris (brown) vs plankton (green):**

**Distribution of copepods (pink), detritus (green), & non-copepods (yellow):**

**Distribution of 3rd level of classification (I need to look into a better way of visualising it):**

- A density plot is needed because the point map cannot show whether detritus points are overshadowed by plankton points.
- Non-metric multidimensional scaling (NMDS).
## Density of the plankton/detritus/etc.
Spatial density plots (using kernel density estimation, KDE) were created to distinguish the distributions of plankton and detritus. This addresses a limitation of the point plots noted in previous group discussions: points overshadow one another, so only the distribution of the topmost points is visible on the map.
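To illustrate why a density surface avoids the overplotting problem, here is a minimal sketch of a two-dimensional Gaussian kernel density estimate evaluated on a regular grid, computed separately for each class. It is not the QGIS implementation: the Gaussian kernel, the bandwidth, the grid extent, and the coordinates are all assumptions chosen for illustration.

```python
import math


def kde_grid(points, bandwidth, x_min, x_max, y_min, y_max, nx, ny):
    """Return a ny-by-nx grid of Gaussian kernel density estimates.

    grid[j][i] is the density at the i-th x position and j-th y position.
    """
    norm = 1.0 / (len(points) * 2.0 * math.pi * bandwidth ** 2)
    grid = []
    for j in range(ny):
        y = y_min + (y_max - y_min) * j / (ny - 1)
        row = []
        for i in range(nx):
            x = x_min + (x_max - x_min) * i / (nx - 1)
            # Sum the kernel contribution of every point at this grid node.
            total = sum(
                math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * bandwidth ** 2))
                for px, py in points
            )
            row.append(norm * total)
        grid.append(row)
    return grid


# Hypothetical coordinates: a small plankton cluster and a single debris point.
plankton = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
debris = [(1.0, 1.0)]

# One surface per class, so neither class can hide the other.
plankton_grid = kde_grid(plankton, 0.2, 0.0, 1.0, 0.0, 1.0, 11, 11)
debris_grid = kde_grid(debris, 0.2, 0.0, 1.0, 0.0, 1.0, 11, 11)

plankton_peak = max(max(row) for row in plankton_grid)
debris_peak = max(max(row) for row in debris_grid)
print(plankton_peak, debris_peak)
```

Because each class gets its own surface, the drawing order of points no longer matters. For real long/lat data, distances measured in degrees are distorted, so reprojecting to a projected CRS before estimating density may be preferable.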
### The three levels of classification are visualised below:
#### Level 1 Class (Debris/Plankton)
##### Heatmaps (kernel density estimation)
Distribution of plankton:

Distribution of debris:
