Ticket
Data:
- ship radar measurements
- recorded at 1-minute intervals
- Dec 19 - May 20 -> 6 months -> ~180 days (~4320 hours) -> ~260,000 measurements
- stored in 2-week periods -> ~13 datasets
- in total ~1.2 TB -> ~6.7 GB per day (sanity-checked below)
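A quick arithmetic check of the figures above (a sketch; the exact day count for Dec 2019 - May 2020 is an assumption):

import math

DAYS = 183                        # Dec 2019 - May 2020 inclusive (2020 is a leap year)
PER_DAY = 24 * 60                 # 1-minute intervals -> 1440 measurements per day
TOTAL = DAYS * PER_DAY            # ~264k measurements in total
DATASETS = math.ceil(DAYS / 14)   # 2-week storage periods -> ~13 datasets
GB_PER_DAY = 1.2e12 / DAYS / 1e9  # ~1.2 TB in total -> ~6.6 GB per day

print(f"{TOTAL:,} measurements, {DATASETS} datasets, ~{GB_PER_DAY:.1f} GB/day")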
Info:
- each day can be processed independently of the other days; one day comprises 1440 measurements (so the work parallelizes naturally per day, see the sketch after this list)
- within one day, each processing step depends on the previous steps of that same day
- one tcsh script runs multiple binaries for one measurement
- tested: one hour of runtime processes ~2.5 hours' worth of data, i.e. ~150 measurements -> ~10 hours of processing time for one day of data
- the process always takes two timepoints/images/measurements and compares them -> results stored in one temporary text file per timepoint
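Since days are independent but steps within a day are ordered, the natural layout is one sequential per-day pipeline with several days running concurrently: at ~10 h per day serially, ~180 days cost ~1800 hours of compute, and running N days side by side divides the wall time by N. A minimal sketch, assuming a hypothetical wrapper script process_day.tcsh that runs one day's pipeline end to end:

import subprocess
from concurrent.futures import ProcessPoolExecutor

# Example day list for one 2-week dataset; real runs would cover all ~180 days.
DAYS = [f"2019-12-{d:02d}" for d in range(1, 15)]

def process_day(day: str) -> int:
    # One tcsh invocation per day; the 1440 ordered measurements are
    # handled sequentially inside the (hypothetical) script itself.
    return subprocess.run(["tcsh", "process_day.tcsh", day]).returncode

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=7) as pool:  # 7 days in flight at once
        for day, rc in zip(DAYS, pool.map(process_day, DAYS)):
            print(day, "ok" if rc == 0 else f"failed with code {rc}")

On a cluster, the same structure maps naturally onto a Slurm array job with one task per day, which also scales past a single node.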
Questions:
- I/O load: how often is data read and written while processing one measurement / one day's worth of measurements? -> data is read in once; one temporary output file is written per measurement; two final output files per day (one with trajectories, one with timepoints)
- Memory requirements: how much memory is needed to process one measurement / one day's worth of measurements? -> still open; see the measurement sketch below
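One simple way to answer the memory question empirically (a sketch; run_one.tcsh is a hypothetical name for the per-measurement script): run one measurement's pipeline to completion and read the peak resident set size of the child processes.

import resource
import subprocess

# Run one measurement's pipeline, then ask the kernel for the peak RSS
# across the terminated children (Linux reports ru_maxrss in kilobytes).
subprocess.run(["tcsh", "run_one.tcsh", "measurement_0001"], check=True)
peak_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
print(f"peak child RSS: {peak_kb / 1024:.0f} MB")

The same number can be read off `/usr/bin/time -v` ("Maximum resident set size") or, after a test job on a Slurm system, from accounting via `sacct --format=MaxRSS`.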
Suggestions:
- store the compressed data in the project's scratch directory; from there it is faster to transfer to local scratch (NVMe) for processing
- alternatively, keep the data in scratch or in the Allas object storage, from where it can be copied directly to local scratch for processing (faster I/O); see the staging sketch below
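A staging sketch for either variant, assuming a Slurm-provided $LOCAL_SCRATCH directory on the NVMe node and a hypothetical dataset path:

import os
import shutil
import tarfile

local = os.environ["LOCAL_SCRATCH"]                   # node-local NVMe directory
src = "/scratch/project_XXXX/radar/period_01.tar.gz"  # hypothetical 2-week dataset

staged = shutil.copy(src, local)   # one large sequential copy: cheap on Lustre
with tarfile.open(staged) as tar:
    tar.extractall(path=local)     # unpack on the fast local disk
# ... run the per-day pipeline against the files under `local`, then copy
# only the two final per-day output files back to the project's scratch.

From Allas, tools such as rclone (or CSC's a-get) can pull an object straight to $LOCAL_SCRATCH, skipping the shared filesystem hop entirely.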
"Thinking out loud":