## Papers

Possible benchmarks, well written and worth mentioning:

- TabDDPM: Modelling Tabular Data with Diffusion Models
- Time-series Generative Adversarial Networks

Other papers:

- CTAB-GAN: Effective Table Data Synthesizing
- A Review of Tabular Data Synthesis Using GANs on an IDS Dataset
- Tabular Transformers for Modeling Multivariate Time Series
- Modeling Tabular Data using Conditional GAN
- Conditional Tabular GAN-Based Two-Stage Data Generation Scheme for Short-Term Load Forecasting
- TabFairGAN: Fair Tabular Data Generation with Generative Adversarial Networks
- Awesome-timeseries-spatiotemporal-lm-llm (section on large language models and foundation models for time series and spatio-temporal data): https://github.com/qingsongedu/awesome-timeseries-spatiotemporal-lm-llm

## Reviews:

Time-series Generative Adversarial Networks

- autoregressive models for sequence prediction
- GAN-based methods for sequence generation
- time-series representation learning
- Let $S$ be a vector space of static features and $X$ a vector space of temporal features. The joint distribution over a sequence of length $T$ factorizes autoregressively as $p(S,X_{1:T})=p(S)\prod_t p(X_t|S,X_{1:t-1})$

![](https://hackmd.io/_uploads/SJGDnSTZa.png)

## TabDDPM: Modelling Tabular Data with Diffusion Models

It seems that TabDDPM gives a reasonable plan for the paper, and the benchmarks are gradually taking shape.

1. Conditional generation on class or on quantiles of a real-valued target (need to think about how to avoid leakage in the embedding)
2. Time embeddings with SiLU?
3. Generated dataset distributions?
4. Privacy analysis: focus on privacy and make the analysis deeper than in TabDDPM (use a model-based approach to find specific users)

Benchmarks:

1. Compare with the SMOTE approach / adapt SMOTE for sequences (generation)
2. Compare with the Time-series Generative Adversarial Networks paper (generation + cls)
3. CoLESS (cls only)

![](https://hackmd.io/_uploads/H1Whe3RZ6.png)
![](https://hackmd.io/_uploads/ByrbWnRZ6.png)
![](https://hackmd.io/_uploads/B1JgQ2C-6.png)
![](https://hackmd.io/_uploads/BkEgE3RZ6.png)
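
For the "adapt SMOTE for sequences" benchmark item, one possible starting point is sketched below. It is only a rough baseline under strong assumptions that are not in these notes: all sequences of the minority class have the same length, every feature is numeric, and neighbours are found by Euclidean distance on the flattened sequence. The function name `smote_sequences` and its parameters are hypothetical.

```python
import numpy as np


def smote_sequences(X, n_new, k=5, seed=0):
    """SMOTE-style oversampling for equal-length numeric sequences.

    X      : array of shape (n, T, d) with sequences of one class
    n_new  : number of synthetic sequences to generate
    k      : number of nearest neighbours to interpolate towards
    Assumption: distances are Euclidean on the flattened (T * d) vectors.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n < 2:
        raise ValueError("need at least two sequences to interpolate")
    k = min(k, n - 1)
    rng = np.random.default_rng(seed)

    # Pairwise squared distances between flattened sequences.
    flat = X.reshape(n, -1)
    sq = (flat ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * flat @ flat.T
    np.fill_diagonal(d2, np.inf)  # a sequence is not its own neighbour

    # Indices of the k nearest neighbours of every sequence, shape (n, k).
    neighbours = np.argsort(d2, axis=1)[:, :k]

    # Pick a base sequence, one of its neighbours, and a mixing weight.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    lam = rng.random(size=(n_new, 1, 1))

    # Linear interpolation between the base sequence and its neighbour.
    return X[base] + lam * (X[chosen] - X[base])


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(size=(20, 6, 3))
    synthetic = smote_sequences(data, n_new=50, k=5)
    print(synthetic.shape)
```

Interpolating whole sequences keeps each synthetic sample inside the convex hull of the real ones, which is also why this baseline is worth including in the privacy comparison: samples that lie close to a single real sequence may leak it. Categorical features and variable-length sequences would need a different treatment.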