# Synthetic Data - Collaboration
Recent collaboration with INESC TEC - Portugal
Team:
- Yusuke Tanimura
- Haga Jason
- Joao Paulo 
- Martinez Edgar
The working field: Massive storage for large systems, using technology such as BeeGFS. Middleware at the OS level for high-speed file retrieval.
The problem: They want to profile HPC applications so that these can leverage the PFS (parallel file system) and use large systems more efficiently. These applications range from weather prediction and molecular dynamics to deep learning.
The SOTA: There is no real benchmark to profile these large systems. LAPACK and other matrix-multiplication benchmarks do not reflect the profile seen when real-use-case applications are executed.
Proposal: I can create a synthetic data application similar to the Fractal Generator, which is optimized and can generate "real-data-like" datasets.
Fractals ---> 
Weather Prediction ---> 
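
To make the fractal idea in the proposal concrete, here is a minimal sketch of a synthetic image generator based on the Mandelbrot set. It is not the actual Fractal Generator mentioned above. The function names (`mandelbrot_escape_counts`, `to_grayscale_bytes`), the image size, and the iteration limit are all assumptions chosen for illustration, and the code uses only the Python standard library, so it is not optimized.

```python
def mandelbrot_escape_counts(width, height, max_iter=50,
                             x_range=(-2.0, 1.0), y_range=(-1.5, 1.5)):
    """Return a height x width grid of escape-iteration counts.

    Hypothetical helper for illustration; parameters are assumptions.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    grid = []
    for row in range(height):
        # Map the pixel row to the imaginary axis.
        y = y_min + (y_max - y_min) * row / (height - 1)
        line = []
        for col in range(width):
            # Map the pixel column to the real axis.
            x = x_min + (x_max - x_min) * col / (width - 1)
            c = complex(x, y)
            z = 0j
            count = 0
            # Iterate z -> z^2 + c until it escapes or the limit is hit.
            while count < max_iter and abs(z) <= 2.0:
                z = z * z + c
                count += 1
            line.append(count)
        grid.append(line)
    return grid


def to_grayscale_bytes(grid, max_iter=50):
    """Flatten the grid into 8-bit grayscale pixel values (0..255)."""
    return bytes(255 * v // max_iter for row in grid for v in row)


pixels = mandelbrot_escape_counts(64, 64)
image = to_grayscale_bytes(pixels)
print(len(image), "bytes of synthetic image data")
```

Changing `x_range`, `y_range`, or `max_iter` gives a different image from the same code, which is one way a fractal generator could produce many distinct samples without any stored source data.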
## Progress 20/01/2025
Not much progress on this end. We are scheduled to have a meeting at the end of January.
- I will try to push for a long stay (2 months) in Portugal in August next year.
- I will contact Jens to talk about HPCI applications and their benchmark procedures.
## Progress 20/02/2025
We had a first meeting to prepare tests on ABCI soon. We started sharing code.
- I am getting familiar with their technology: https://github.com/marianasamiranda/trace-collector
- This is a tool like BeeGFS.
- I am running tests with the tracer. It can profile I/O load and tries to make automatic adjustments, such as caching on SSDs.
- I will try to generate some synthetic data to help profile this tracer on images.
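
As a rough sketch of the kind of image workload that could be run while the tracer is collecting data, the snippet below writes a small set of synthetic grayscale images (binary PGM files) and times the writes. It does not call the trace-collector at all. The function name `write_synthetic_images`, the file naming, the image sizes, and the use of seeded noise as pixel content are all assumptions for illustration, not the planned dataset.

```python
import os
import random
import tempfile
import time


def write_synthetic_images(out_dir, n_images=8, width=64, height=64, seed=0):
    """Write n_images binary PGM files of seeded noise.

    Hypothetical helper; returns (total_bytes_written, elapsed_seconds).
    """
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    os.makedirs(out_dir, exist_ok=True)
    # Binary PGM header: magic number, dimensions, max gray value.
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    total_bytes = 0
    start = time.perf_counter()
    for i in range(n_images):
        pixels = bytes(rng.getrandbits(8) for _ in range(width * height))
        payload = header + pixels
        path = os.path.join(out_dir, f"img_{i:05d}.pgm")
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # push the data to storage, not just the page cache
        total_bytes += len(payload)
    elapsed = time.perf_counter() - start
    return total_bytes, elapsed


with tempfile.TemporaryDirectory() as tmp:
    total, seconds = write_synthetic_images(tmp, n_images=4, width=32, height=32)
    n_files = len(os.listdir(tmp))
    print(f"wrote {n_files} files, {total} bytes in {seconds:.4f} s")
```

Pointing `out_dir` at a directory on the file system under test, and scaling `n_images`, `width`, and `height`, would turn this into a simple write-heavy load. The noise pixels are only a placeholder and could be swapped for more realistic generated images.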