# Go Big Planning
Forward model code: https://github.com/maho3/ltu-cmass/tree/main
Configuration code: https://github.com/maho3/ltu-gobig/tree/main
## Goals
Study cosmology constraints as a function of:
* Different simulators
* Different summaries
* Number of training simulations
Validate the forward model w.r.t. Abacus, MTNG:
* How far off are we?
* What can we do to fix it?
Apply to real (blinded?) CMASS data.
## Current Forward Model

| Stage | Current | Desired |
| -------------- | ---------------------------------- | --- |
| Gravity Solver | BORG2LPT, jax-lpt, pmwd | |
| Halo biasing | TruncatedPowerLaw | CHARM, CNN-NPE |
| Galaxy biasing | Zheng07, Zheng07ex | Flow models? |
| Lightcone | cuboid remapping | BORG lightcones |
| Survey effects | fiber collisions, BOSS survey mask | |
| Filtering | FKP weighting | Custom weighting schemes |
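
For reference, the Zheng07 galaxy biasing stage can be sketched as a mean halo occupation function. This is a minimal sketch following the functional form of Zheng et al. (2007), not the ltu-cmass implementation: the function name and the parameter values are placeholders, and whether satellites are modulated by the central occupation varies between codes.

```python
import math


def zheng07_mean_occupation(log_mass, log_mmin, sigma_logm, log_m0, log_m1, alpha):
    """Mean (central, satellite) occupation for a halo of mass 10**log_mass."""
    # Centrals: a step in log-mass, smoothed by an error function.
    n_cen = 0.5 * (1.0 + math.erf((log_mass - log_mmin) / sigma_logm))
    # Satellites: power law above the cutoff mass M0, scaled by the central
    # occupation (assumption: modulated form, as in Zheng et al. 2007).
    mass, m0, m1 = 10.0**log_mass, 10.0**log_m0, 10.0**log_m1
    n_sat = n_cen * ((mass - m0) / m1) ** alpha if mass > m0 else 0.0
    return n_cen, n_sat


# Placeholder parameters for illustration only (not fitted to CMASS).
PARAMS = dict(log_mmin=13.0, sigma_logm=0.4, log_m0=13.0, log_m1=14.0, alpha=1.0)

if __name__ == "__main__":
    for log_mass in (12.5, 13.0, 14.0, 15.0):
        n_cen, n_sat = zheng07_mean_occupation(log_mass, **PARAMS)
        print(f"log10(M) = {log_mass:4.1f}  <N_cen> = {n_cen:.3f}  <N_sat> = {n_sat:.3f}")
```

By construction, the central occupation is exactly 0.5 at `log_mmin` and the satellite occupation vanishes at or below `log_m0`.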
## TODO [Deprecated since the 02/24 hack week]
### Gravity Solvers
- [x] Fix discrepancy in initial transfer function between pmwd and BORG ([Issue #2](https://github.com/maho3/ltu-cmass/issues/2))
- [ ] Remove jaxlpt (outdated)
- [ ] Implement [ALPAGA](https://bitbucket.org/aquila-consortium/alpaga/src/main/) for 1LPT and 2LPT, including the lightcone [Lucas: 1LPT and 2LPT are now implemented]
### Halo biasing
- [x] Fit halo biasing models on emulated DM fields, not on Quijote DM fields
- [ ] Fit halo biasing as a function of redshift
- [ ] Implement CHARM
- [ ] Implement CNN NPE
### Lightcone
- [ ] Use pre-implemented lightcone effects in Borg and ALPAGA
- [ ] Extend [cuboid-remapping](https://github.com/maho3/cuboid_remap_jax) to interpolate between snapshots
### Summaries
- [ ] Adapt [ili-summarizer](https://github.com/florpi/ili-summarizer) for survey data
### Logistical
- [x] Fix a Latin hypercube of 5-10k cosmologies ([Quijote Big Sobol](https://quijote-simulations.readthedocs.io/en/latest/bsq.html))
- [ ] Port MTNG and Abacus simulations for validation
- [ ] Estimate storage requirements
## Reference Simulations
### Quijote
LH (standard, fixed)
* 2,000 simulations at different cosmologies in 3 sets (standard/fixed/high-res)
* Matched ICs (fixed) and different ICs (standard/high-res)
* Standard/Fixed: 1 Gpc/h at $512^3$; 2 Mpc/h
* High-res: 1 Gpc/h at $1024^3$; 1 Mpc/h
* FoF and Rockstar halos
* $z$ : 0, 0.5, 1, 2, 3
Big Sobol Sequence
* 32,768 simulations at different cosmologies
* Each with different seeds
* 1 Gpc/h at $512^3$; 2 Mpc/h
* FoF and Rockstar halos. Rockstar merger trees.
* $z$ : 0, 0.2, 0.5, 0.7, 1, 1.5, 2, 3, 4, 5, 6
Access: https://quijote-simulations.readthedocs.io/en/latest/access.html
### AbacusSummit
Emulator Grid (AbacusSummit_base_c{130-181}_ph000)
* 51 simulations at different cosmologies
* Matched ICs
* 2 Gpc/h at $6912^3$; 0.3 Mpc/h
* CompaSO Halos. FOF -> Spherical Overdensities
* $z$ : 0.1, 0.2, 0.3, 0.4, 0.5, 0.8, 1.1, 1.4, 1.7, 2.0, 2.5, 3.0
Access: https://abacussummit.readthedocs.io/en/latest/data-access.html
### MillenniumTNG
Large volume with neutrinos
* 2 simulations w/neutrinos at 1.5 Gpc/h and 2 Gpc/h. Same cosmology.
* 1 Gpc/h at $5120^3$ or 2 Gpc/h at $10240^3$; 0.2 Mpc/h
* Galaxy catalogs from SAMs
* $z$ : 0
Reference: https://arxiv.org/pdf/2210.10059.pdf
Access: Internal
### Outer Rim
Large volume lightcone
* 1 simulation at WMAP-7 cosmology
* 3 Gpc/h at $10240^3$; 0.3 Mpc/h
* Halo lightcone out to z=3
Access: https://cosmology.alcf.anl.gov/outerrim
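
The resolutions quoted above are the mean interparticle spacings, i.e. the box side divided by the number of particles per dimension. A quick check, using only the box sizes and particle counts listed in this section (the dictionary layout and function name are just for illustration):

```python
# (box side in Mpc/h, particles per dimension), copied from the lists above.
REFERENCE_SIMS = {
    "Quijote standard/fixed": (1000.0, 512),
    "Quijote high-res": (1000.0, 1024),
    "AbacusSummit base": (2000.0, 6912),
    "MillenniumTNG": (2000.0, 10240),
    "Outer Rim": (3000.0, 10240),
}


def interparticle_spacing(box_mpc_h, n_per_dim):
    """Mean interparticle spacing in Mpc/h."""
    return box_mpc_h / n_per_dim


if __name__ == "__main__":
    for name, (box, n_per_dim) in REFERENCE_SIMS.items():
        spacing = interparticle_spacing(box, n_per_dim)
        print(f"{name:24s} {spacing:.2f} Mpc/h")
```

The quoted values are these spacings rounded to one significant figure.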
## Simulation suites (to be run!)
### Quijote calibration (2k sims)
* **Purpose**:
    * to calibrate the AFM halo population models using the Quijote LH-HR
* **ICs**: Phase-matched with Quijote ICs
* **Cosmo**: 2000 simulations at Quijote LH
* **Volume**: 1 Gpc/h
* **Resolution**: 8 Mpc/h
* **Simulators**:
    * 2-LPT, BORG-PM, pmwd
* **Data products**:
    * Halo count field
    * Trained NPE/CHARM
### 1 Gpc/h inference (~5k sims)
* **Purpose**:
    * to train ILI models for parameter inference on 1 Gpc/h.
    * to validate bias w.r.t. Quijote
* **ICs**: Random phases
* **Cosmo**: ~5k simulations at Quijote Sobol
* **Resolution**: 8 Mpc/h
* **Simulators**:
    * 2-LPT, BORG-PM, pmwd
* **Halo bias**: TruncatedPowerLaw, CNN-NPE, CHARM
* **Galaxies**: Zheng07, Zheng07-ex
* **Data products**:
    * DM field
    * Halo count field
    * Galaxy lightcone (w/o observational selection)
    * Summaries
### 2 Gpc/h inference (~5k sims)
* **Purpose**:
    * to train ILI models for parameter inference on 2 Gpc/h.
    * to validate bias w.r.t. Abacus/MTNG
* **ICs**: Random phases
* **Cosmo**: ~5k simulations at Quijote Sobol + 51 Abacus cosmologies + 1 MTNG cosmology
* **Resolution**: 8 Mpc/h
* **Simulators**:
    * 2-LPT, BORG-PM, pmwd
* **Halo bias**: TruncatedPowerLaw, CNN-NPE, CHARM
* **Galaxies**: Zheng07, Zheng07-ex
* **Data products**:
    * DM field
    * Halo count field
    * Galaxy lightcone (w/o observational selection)
    * Summaries
### 3 Gpc/h inference (~5k sims)
* **Purpose**:
    * to train ILI models for parameter inference on 3 Gpc/h (CMASS NGC).
    * to validate bias w.r.t. OuterRim
* **ICs**: Random phases
* **Cosmo**: ~5k simulations at Quijote Sobol + 51 Abacus cosmologies + 1 MTNG cosmology
* **Resolution**: 8 Mpc/h
* **Simulators**:
    * 2-LPT, BORG-PM, pmwd
* **Halo bias**: TruncatedPowerLaw, CNN-NPE, CHARM
* **Galaxies**: Zheng07, Zheng07-ex
* **Data products**:
    * DM field
    * Halo count field
    * Galaxy lightcone (w/ observational selection)
    * Summaries
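
As a starting point for the storage estimate, the grid size and raw field size for each suite follow directly from the box size and the 8 Mpc/h resolution. This is a rough sketch under stated assumptions: one scalar field per simulation, stored uncompressed as float32, at a single snapshot, for a single simulator, with the "~5k" counts taken as exactly 5000. Galaxy lightcones and summaries are not included.

```python
CELL_MPC_H = 8.0      # resolution shared by all planned suites
BYTES_PER_VALUE = 4   # assumption: float32 storage

# (suite name, box side in Mpc/h, number of simulations)
SUITES = [
    ("Quijote calibration", 1000.0, 2000),
    ("1 Gpc/h inference", 1000.0, 5000),
    ("2 Gpc/h inference", 2000.0, 5000),
    ("3 Gpc/h inference", 3000.0, 5000),
]


def cells_per_dim(box_mpc_h, cell_mpc_h=CELL_MPC_H):
    """Number of grid cells along one side of the box."""
    return round(box_mpc_h / cell_mpc_h)


def suite_field_bytes(box_mpc_h, n_sims):
    """Bytes needed to store one scalar field for every simulation in a suite."""
    return n_sims * cells_per_dim(box_mpc_h) ** 3 * BYTES_PER_VALUE


if __name__ == "__main__":
    for name, box, n_sims in SUITES:
        n = cells_per_dim(box)
        gigabytes = suite_field_bytes(box, n_sims) / 1e9
        print(f"{name:20s} grid {n}^3  ~{gigabytes:.1f} GB per field")
```

Totals scale linearly with the number of simulators (three are listed per suite), the number of stored fields, and the number of snapshots.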
## Notes from meeting
* Rank our simulation suites in terms of realism
* The focus should be to validate against Abacus, MTNG, etc. To get unbiased results first, then to measure constraining power.
* CHARM might not yet be ready for application. Still only trained on one cosmology
* Perhaps we can train to infer an EFT galaxy bias model?
* We should have different suites at different box sizes and resolutions, some at 1, 2, and 3 Gpc/h to compare to Quijote, Abacus, and MTNG
* Split observational sample into smaller volumes?
## Notes from BP meeting
In order to apply BP SZ signal:
* Mass resolution down to $10^{12}$
* Need mass accretion history (if we want it to be differentiable)
* Classical methods don't need mass accretion history (mass and concentration, or just mass)
* $0 < z < 2$
## AFM possibilities
* BACCO clustering emulator: Arico et al. (2020), Angulo et al. (2020)
* Emulator for many clustering statistics using extended Subhalo Abundance Matching, compared to MillenniumTNG & LGalaxies SAM: Contreras, Angulo, et al. (2023)
* Emulator for 2pt statistics using Aemulus with hybrid Lagrangian biasing: Kokron et al. (2021)
* GP emulators for less standard galaxy clustering statistics in redshift space: Storey-Fisher et al. (2024)
## Notes from meeting with Greg
* Go Small:
* What signal can we use? Not power spectrum. Graph stuff?
* Dust model inference.