---
title: ASTR 597 W23 HW 7
author: Andy Tzanidakis and friends
tags: LSST Class
---
[TOC]
# Homework 7 - ASTR 597 W23
- Author: Anastasios (Andy) Tzanidakis
- Course ID: ASTR 597 W23
# Problem 1
## Exercise 1: Searching for Supernovae
Without using the truth table information, identify a sample of 40 or more candidate Ia supernovae.
As advised, we began with the approach from tutorial notebook 07a to find possible SN Ia candidates in the DiaObject/DiaSource tables without using the truth table. I adjusted the parameters slightly relative to the tutorial, in particular allowing a somewhat shorter minimum and longer maximum event duration:
```python=
from lsst.rsp import get_tap_service
from astropy.cosmology import FlatLambdaCDM

service = get_tap_service()  # TAP service
# Assumed fiducial flat LambdaCDM; any astropy cosmology instance works here
cosmo = FlatLambdaCDM(H0=70, Om0=0.3)

# Conditions for a possible SN Ia event
redshift_min = 0.01
redshift_max = 1
snia_peak_mag = -19.0        # typical SN Ia peak absolute magnitude
snia_peak_mag_range = 1
snia_peak_mg_max = 24.0
snia_peak_mi_max = 24.0
snia_ampl_mr_min = 1         # r-band amplitude limits [mag]
snia_ampl_mr_max = 1000
snia_nDiaSources_min = 10    # number of DiaSource detections
snia_nDiaSources_max = 150
snia_duration_min = 10       # event duration limits [days]
snia_duration_max = 450

# Using the cosmological distance modulus, the expected apparent r-band peak magnitude range:
snia_peak_mr_min = cosmo.distmod(redshift_min).value + snia_peak_mag - snia_peak_mag_range
snia_peak_mr_max = cosmo.distmod(redshift_max).value + snia_peak_mag + snia_peak_mag_range

results = service.search("SELECT ra, decl, diaObjectId, nDiaSources, "
                         "scisql_nanojanskyToAbMag(rPSFluxMin) AS rMagMax, "
                         "scisql_nanojanskyToAbMag(rPSFluxMax) AS rMagMin, "
                         "scisql_nanojanskyToAbMag(gPSFluxMax) AS gMagMin, "
                         "scisql_nanojanskyToAbMag(iPSFluxMax) AS iMagMin, "
                         "scisql_nanojanskyToAbMag(rPSFluxMin)"
                         " - scisql_nanojanskyToAbMag(rPSFluxMax) AS rMagAmp "
                         "FROM dp02_dc2_catalogs.DiaObject "
                         "WHERE nDiaSources > "+str(snia_nDiaSources_min)+" "
                         "AND nDiaSources < "+str(snia_nDiaSources_max)+" "
                         "AND scisql_nanojanskyToAbMag(rPSFluxMax) > "+str(snia_peak_mr_min)+" "
                         "AND scisql_nanojanskyToAbMag(rPSFluxMax) < "+str(snia_peak_mr_max)+" "
                         "AND scisql_nanojanskyToAbMag(gPSFluxMax) < "+str(snia_peak_mg_max)+" "
                         "AND scisql_nanojanskyToAbMag(iPSFluxMax) < "+str(snia_peak_mi_max)+" "
                         "AND scisql_nanojanskyToAbMag(rPSFluxMin)"
                         " - scisql_nanojanskyToAbMag(rPSFluxMax) < "+str(snia_ampl_mr_max)+" "
                         "AND scisql_nanojanskyToAbMag(rPSFluxMin)"
                         " - scisql_nanojanskyToAbMag(rPSFluxMax) > "+str(snia_ampl_mr_min)+" ",
                         maxrec=200)
DiaObjs = results.to_table()
```
Now we can filter the events based on the event duration:
```python=
import numpy
from tqdm import tqdm

# Figure out the duration of each event from its DiaSource epochs
DiaObjs['duration'] = numpy.zeros(len(DiaObjs), dtype='float')
for j, DiaObjId in tqdm(enumerate(DiaObjs['diaObjectId'])):
    results = service.search("SELECT diaObjectId, midPointTai "
                             "FROM dp02_dc2_catalogs.DiaSource "
                             "WHERE diaObjectId = "+str(DiaObjId))
    results = results.to_table()
    DiaObjs['duration'][j] = numpy.max(results['midPointTai']) - numpy.min(results['midPointTai'])

# Keep only events whose duration falls within the limits defined above
tx = numpy.where((DiaObjs['duration'] > snia_duration_min)
                 & (DiaObjs['duration'] < snia_duration_max))[0]

# DiaObject IDs of the potential SNe Ia
lc_ids = DiaObjs['diaObjectId'][tx]
```
Using this technique, I found 113 potential SN Ia candidates that met all of the criteria set above.
# Problem 2
## Exercise 2: Ia Lightcurve fitting.
Using `sncosmo`, fit a [SALT2](https://ui.adsabs.harvard.edu/abs/2007A%26A...466...11G/abstract) model to each of the candidate Ia SNe you identified in Exercise 1.
The fit to each lightcurve will yield parameters `z` (redshift), `t0` (time of peak brightness), `x0` (amplitude), `x1` (stretch), and `c` (color). Save them and their errors.
Note that the fit may fail for some of your lightcurves; that's okay.
We will first write a simple function that takes a `DiaObjectId`, queries its light curve from the `DiaSource` table, and returns it as an `astropy.table.Table` in the format `SNCosmo` expects:
```python=
from astropy.table import Table

def fetch_my_lc(_id):
    """Given a diaObjectId, return the light curve table in the required SNCosmo format.

    Parameters:
    -----------
    _id (int): diaObjectId

    Returns:
    --------
    astropy.table.Table: time, band, flux, fluxerr, zp, zpsys
    """
    res = service.search("SELECT ra, decl, diaObjectId, diaSourceId, "
                         "filterName, midPointTai, psFlux, psFluxErr, "
                         "scisql_nanojanskyToAbMag(psFlux) AS psAbMag "
                         "FROM dp02_dc2_catalogs.DiaSource "
                         "WHERE diaObjectId = "+str(_id))
    res = res.to_table()

    # Instrumental AB zeropoints per band
    zeropoints = {'u': 27.03, 'g': 28.38, 'r': 28.16,
                  'i': 27.85, 'z': 27.46, 'y': 26.68}
    ZP = [zeropoints[f] for f in res['filterName']]

    # SNCosmo bandpass names and AB zeropoint system for every epoch
    bands = ['lsst' + f for f in res['filterName']]
    zp_sys = ['ab'] * len(res)

    return Table([res['midPointTai'], bands, res['psFlux'], res['psFluxErr'], ZP, zp_sys],
                 names=('time', 'band', 'flux', 'fluxerr', 'zp', 'zpsys'))
```
Using the above function, we can see what a SALT2 fit to one of our candidate SN Ia light curves looks like:

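As a minimal sketch of what such a single-object fit could look like (assuming the `fetch_my_lc` helper above; the index `0` into `lc_ids` is an arbitrary example):
```python=
import sncosmo

# Fit one candidate light curve with a SALT2 model (sketch; any lc_ids entry works)
lc = fetch_my_lc(lc_ids[0])
model = sncosmo.Model(source='salt2')
result, fitted_model = sncosmo.fit_lc(lc, model,
                                      ['z', 't0', 'x0', 'x1', 'c'],
                                      bounds={'z': (0.001, 1)})
print(result.parameters)  # fitted [z, t0, x0, x1, c]

# Overlay the best-fit model on the multi-band photometry
fig = sncosmo.plot_lc(lc, model=fitted_model, errors=result.errors)
```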
We can now run this over the entire sample defined above and record the fitted SALT2 model parameters (z, t0, x0, x1, c):
```python=
import numpy as np
import sncosmo
from tqdm import tqdm

master_params = []
chi2 = []
model = sncosmo.Model(source='salt2')  # initialize SALT2 model
for j in tqdm(range(len(lc_ids))):
    lc = fetch_my_lc(lc_ids[j])
    # run the fit; skip light curves where the fit fails
    try:
        result, fitted_model = sncosmo.fit_lc(
            lc, model,
            ['z', 't0', 'x0', 'x1', 'c'], bounds={'z': (0.001, 1)})
    except Exception:
        continue
    if np.log10(result.chisq) > 3:  # chi-square quality cut
        continue
    master_params.append(result.parameters)
    chi2.append(result.chisq)

# Stack for later array slicing: columns are (z, t0, x0, x1, c)
master_params = np.array(master_params)
chi2 = np.array(chi2)
```
We can also look at the redshift distribution recovered from the SALT2 fits. Overall, I find that the recovered sample is likely biased toward higher-redshift events. It would also be interesting to compare these redshifts to the photo-z of the associated host galaxies!
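As a quick sketch, the recovered redshifts are the first column of `master_params` and can be histogrammed directly (plot styling is arbitrary):
```python=
import matplotlib.pyplot as plt

z_fit = master_params[:, 0]  # fitted SALT2 redshift per candidate

plt.hist(z_fit, bins=20)
plt.xlabel('Recovered redshift (SALT2 z)')
plt.ylabel('Number of candidates')
plt.show()
```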

# Problem 3
## Exercise 3a: My First Hubble Diagram
Now we will use our SALT2 fit values to create a rudimentary Hubble diagram. We will plot the distance modulus $\mu$ derived from the SALT fits as a function of redshift.
<div class="alert alert-block alert-warning">
<b>Warning:</b> This analysis approach is intended to be illustrative, and doesn't reflect the current state of the art for Ia cosmology. The interested reader is encouraged to consult the literature (e.g., <a href="https://ui.adsabs.harvard.edu/abs/2022ApJ...938..111B/abstract">Brout+22</a>, <a href="https://ui.adsabs.harvard.edu/abs/2022arXiv221107657D/abstract">Dhawan+22</a>, <a href="https://ui.adsabs.harvard.edu/abs/2022ApJ...934...96S/abstract">Sanchez+22</a>, and references therein) to learn more.
</div>
The [Tripp Law](https://ui.adsabs.harvard.edu/abs/1998A%26A...331..815T/abstract) can be [written](https://ui.adsabs.harvard.edu/abs/2011ApJ...740...72M/abstract) in terms of the SALT2 model parameters:
$\mu = -2.5 \log_{10}{x_0} + \alpha x_1 - \beta c - M_0$
The parameters $\alpha$, $\beta$, and $M_0$ are typically fit along with the cosmological model. For simplicity, we will adopt the DC2 values $\alpha = 0.137$, $\beta=3.21$ (see Table 1 of [Sanchez+22](https://ui.adsabs.harvard.edu/abs/2022ApJ...934...96S/abstract)).
Using the above formula, plot the distance modulus versus redshift for our Ia supernovae. Overplot the theoretical luminosity distance vs. redshift curve for a fiducial cosmology. Use "chi by eye" to adjust $M_0$ until the points overlap reasonably with the model.
First, I computed the $\mu$ estimates while keeping $\alpha$ and $\beta$ fixed. Then I adjusted the offset until I got good agreement with the data; an additive offset of 34 mag (i.e., $M_0 = -34$ in the sign convention above) was a decent fit. Below I show how I compute the model distance modulus from the astropy luminosity distance:
```python=
alpha, beta = 0.137, 3.21  # DC2 values (Sanchez+22, Table 1)

# Tripp relation with a "chi-by-eye" offset of 34 mag
mu_data = (-2.5*np.log10(master_params[:, 2])
           + alpha*master_params[:, 3]
           - beta*master_params[:, 4] + 34)

def lum_dist(z):
    """Distance modulus from the cosmological luminosity distance."""
    d = cosmo.luminosity_distance(z).value  # Mpc
    return 5*np.log10(d/1e-5)               # 5*log10(d_pc / 10 pc)

z_range = np.linspace(0.01, 1, 1000)
lum_dist_model = lum_dist(z_range)
```
In the figure below, I plot the distance modulus versus redshift for the SN Ia sample, along with the theoretical distance modulus curve for the fiducial cosmology. Each point is color-coded by the logarithm of the $\chi^2$ statistic from the `SNCosmo` fit. There are a handful of outliers, likely originating from the SALT2 fitting or from our sample-selection technique. Overall this is in decent agreement with the [Sanchez+22](https://ui.adsabs.harvard.edu/abs/2022ApJ...934...96S/abstract) study.
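For reference, a minimal sketch of how such a Hubble diagram could be drawn with matplotlib, assuming the arrays defined above (`master_params`, `mu_data`, `chi2`, `z_range`, `lum_dist_model`):
```python=
import matplotlib.pyplot as plt

z_fit = master_params[:, 0]

plt.scatter(z_fit, mu_data, c=np.log10(chi2), cmap='viridis', s=15,
            label='SN Ia candidates')
plt.plot(z_range, lum_dist_model, color='k', label='Fiducial cosmology')
plt.colorbar(label=r'$\log_{10}\,\chi^2$')
plt.xlabel('Redshift')
plt.ylabel(r'Distance modulus $\mu$ [mag]')
plt.legend()
plt.show()
```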

## Exercise 3b:
Compute the standard deviation of the residuals between the fit $\mu$ values and the luminosity distance expected at that redshift. How does that compare to the simulated "intrinsic scatter" of 0.15 mag in the DC2 simulations (Table 1 of [Sanchez+22](https://ui.adsabs.harvard.edu/abs/2022ApJ...934...96S/abstract))?
What steps could you imagine taking to reduce the scatter in your Hubble diagram? In what ways will the DC2 analysis tend to _underestimate_ the scatter that will be seen in real LSST analysis?
Essentially, we want to trace the standard deviation of the Hubble residuals in bins of redshift. Below, I binned the fitted redshifts into bins of width 0.1 and estimated the standard deviation of the residuals in each bin. I found one bin near z ~ 0.4 where the standard deviation was particularly high, driven by a few outliers in that bin. In most of the other bins the scatter of the residuals was below ~1 mag.
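A minimal sketch of that residual computation, assuming the quantities defined above (`master_params`, `mu_data`, and the `lum_dist` helper); the bin edges are an illustrative choice:
```python=
# Hubble residuals: fitted distance modulus minus the cosmological expectation
z_fit = master_params[:, 0]
residuals = mu_data - lum_dist(z_fit)

# Standard deviation of the residuals in redshift bins of width 0.1
bin_edges = np.arange(0.0, 1.1, 0.1)
for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
    in_bin = (z_fit >= lo) & (z_fit < hi)
    if in_bin.sum() > 1:
        print(f"z in [{lo:.1f}, {hi:.1f}): sigma = {np.std(residuals[in_bin]):.2f} mag "
              f"({in_bin.sum()} SNe)")

# Overall scatter, to compare with the 0.15 mag intrinsic scatter in DC2
print('Overall residual std:', np.round(np.std(residuals), 2), 'mag')
```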

While this is not the precision we would want in a more realistic scenario, it is actually surprising to me that we can get somewhat close to the 0.15 mag scatter. There are a few clear flaws in our approach: our selection technique for SNe Ia is not perfect, since we based our criteria only on the characteristics of the transients (i.e., rise time, duration, and peak magnitude). There are certainly other techniques we could use to find a purer sample of SNe Ia.
Some improvements we can make to this study to get better accuracy:
- Stricter requirements on the number of detections per SN light curve, which would likely lead to more confident SALT2 fits.
- Better quality cuts that would give us a purer sample of SNe Ia.
- A larger number of events: we likely need thousands of SNe Ia to obtain more robust statistics.
It is evident that the DC2 analysis works directly with a confirmed SN Ia sample: their Hubble diagram shows no obvious outliers, which suggests their sample is "too pure" relative to what a real search would deliver. In a more realistic scenario like this exercise, the scatter penalty will be considerably larger, depending on the quality of the light curves, how well constrained the fits are, and the purity of the sample.
# Acknowledgements
Many concepts and ideas for the workflow of this project were inspired by conversations and code I shared with Tom Wagg, David Wang, Tobin Wainer, Jake Kurlander, and John Franklin Crenshaw.