---
title: ASTR 597 W23 HW #2
author: Andy Tzanidakis and friends
tags: LSST Class
---
[TOC]
# Homework 2 - ASTR 597 W23
- Author: Anastasios (Andy) Tzanidakis
- Collaborators: None
- Course ID: ASTR 597 W23
# Problem 1
We will initialize our set-up for this problem and import our favorite libraries:
```python=
# Import general python packages
import numpy as np
import matplotlib.pyplot as plt
import pandas
from astropy import units as u
from astropy.coordinates import SkyCoord
# Import the Rubin TAP service utilities
from lsst.rsp import get_tap_service, retrieve_query
%matplotlib inline
%config InlineBackend.figure_format = "retina"
from tqdm import tqdm
from matplotlib import rcParams
rcParams["savefig.dpi"] = 100
rcParams["font.size"] = 20
# create the TAP service client
service = get_tap_service()
# we will all use the same input coordinates and radii
center_coords = "62, -37"
radius_deg = 0.5 # degrees
```
## Exercise 1a. How many `isPrimary`
How many `isPrimary` objects are in all of DP0.2?
```python=
results = service.search("SELECT COUNT(detect_isPrimary) "
                         "FROM dp02_dc2_catalogs.Object "
                         "WHERE detect_isPrimary = 1 ").to_table()
print(f"Total number of rows: {results[0][0]}")
>> Total number of rows: 161025835
```
Hooray, our query is complete. We find that the DP0.2 Object catalogue contains over 160,000,000 rows that satisfy `detect_isPrimary`.
## Exercise 1b. Memory
How much memory do you need to load one row of the Object table?
*Hint: use the dtypes of all the columns.*
To answer this, we can loop over each column, use its `dtype` to count the bits, and convert the total to bytes:
```python=
job = service.submit_job("SELECT TOP 1 * "
                         "FROM dp02_dc2_catalogs.Object ")
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
# final table
res = job.fetch_result().to_table()
bits = 0
for k in res.keys():
    d_format = res[k].dtype
    if d_format == 'float32' or d_format == 'int32':
        bits += 32
    elif d_format == 'float64' or d_format == 'int64':
        bits += 64
    elif d_format == 'bool':
        bits += 1
print(f"Total row memory: {(bits * u.bit).to(u.kilobyte)}")
>> Total row memory: 4.1725 kbyte
```
One row of the DP0.2 Object table is roughly 4.2 kilobytes!
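As a cross-check (a sketch, not part of the solution above): numpy's `dtype.itemsize` gives bytes per element directly, avoiding the manual dtype matching. The toy structured dtype below stands in for the real Object schema; the column names are illustrative only.

```python
import numpy as np

# Bytes per row via itemsize; a toy structured dtype stands in for the
# ~1000-column Object table schema
row_dtype = np.dtype([("coord_ra", "f8"),          # float64 -> 8 bytes
                      ("r_psfFlux", "f4"),         # float32 -> 4 bytes
                      ("detect_isPrimary", "?")])  # bool    -> 1 byte
row_bytes = sum(row_dtype[name].itemsize for name in row_dtype.names)
print(row_bytes)  # 13
```

Note that numpy stores a boolean as a full byte, so this count comes out slightly larger than the 1-bit-per-bool estimate used above.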
## Exercise 1c. Maximum rows on `data.lsst.cloud`
What's the maximum number of full rows of the Object table that you can load into a Medium instance on data.lsst.cloud?
Let's assume a Medium data.lsst.cloud instance provides up to 6.1 gigabytes of RAM:
```python=
print (((6.1*u.gigabyte).to(u.kilobyte))/(4.14*u.kilobyte))
```
$$\begin{equation}
\text{Max Rows} = \frac{\text{Max Memory Medium Instance}}{\text{Memory per Row}} = \frac{6.1\ \mathrm{GB}}{4.14\times 10^{-6}\ \mathrm{GB}} \approx 1\,473\,430\ \text{rows}
\end{equation}$$
# Problem 2
## Exercise 2a. Extendedness
What values can the `{band}_extendedness` entry take? What fraction of the time do the values disagree between bands? What if you only look at rows where none of the `{band}_extendedness_flag` are set?
We will use the query from the first problem and look at the top 100000 entries for this problem:
```python=
# coords and radius (in deg)
center_coords = "62, -37"
rad = 0.5
results = service.search("SELECT TOP 100000 * "
"FROM dp02_dc2_catalogs.Object "
"WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec), "
f"CIRCLE('ICRS', {center_coords}, {rad})) = 1 ")
table = results.to_table()
```
To find out the unique entries for `{band}_extendedness`:
```python=
print (np.unique(table['z_extendedness'].data.data))
>> array([ 0., 1., nan])
```
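An aside on the `.data.data` above: the TAP result comes back as a masked column, so we unwrap it before calling `np.unique`. A toy masked array (standing in for `z_extendedness`) shows the difference:

```python
import numpy as np

# Toy masked column: the third entry is masked (no measurement in that band)
col = np.ma.MaskedArray([0.0, 1.0, 1.0, 0.0], mask=[False, False, True, False])
print(np.unique(col.data))            # raw buffer, mask ignored -> [0. 1.]
print(np.unique(col.filled(np.nan)))  # masked entries as NaN -> [0. 1. nan]
```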
Next, we measure what fraction of the time the `{band}_extendedness` values disagree between bands:
```python=
# coords and radius (in deg)
center_coords = "62, -37"
rad = 0.5
results = service.search("SELECT u_extendedness, g_extendedness, r_extendedness, i_extendedness, z_extendedness, y_extendedness "
"FROM dp02_dc2_catalogs.Object "
"WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec), "
f"CIRCLE('ICRS', {center_coords}, {rad})) = 1 "
"AND detect_isPrimary = 1")
table = results.to_table()
passed_1 = np.where((table['u_extendedness']==1) & (table['g_extendedness']==1) & (table['r_extendedness']==1) & (table['i_extendedness']==1) & (table['z_extendedness']==1) & (table['y_extendedness']==1))
passed_0 = np.where((table['u_extendedness']==0) & (table['g_extendedness']==0) & (table['r_extendedness']==0) & (table['i_extendedness']==0) & (table['z_extendedness']==0) & (table['y_extendedness']==0))
total = len(passed_1[0]) + len(passed_0[0])  # number of rows where all bands agree
frac_disagree = 100 - total / len(table) * 100  # percentage with any disagreement (or missing values)
print(f"Percentage of rows where the bands disagree: {frac_disagree:.2f} percent")
>> 56.62 percent
```
We find that the bands disagree (or are missing a measurement) in roughly 57% of rows. We can also investigate how this fraction changes when none of the `{band}_extendedness_flag` columns are set:
```python=
cond_1 = np.where((table['u_extendedness_flag']==False) & (table['g_extendedness_flag']==False) & (table['r_extendedness_flag']==False)
& (table['i_extendedness_flag']==False) & (table['z_extendedness_flag']==False) & (table['y_extendedness_flag']==False))
tab_new = table[cond_1] # new table
passed_1_ = np.where((tab_new['u_extendedness']==1) & (tab_new['g_extendedness']==1) & (tab_new['r_extendedness']==1) &
(tab_new['i_extendedness']==1) & (tab_new['z_extendedness']==1) & (tab_new['y_extendedness']==1))
passed_0_ = np.where((tab_new['u_extendedness']==0) & (tab_new['g_extendedness']==0) &
(tab_new['r_extendedness']==0) & (tab_new['i_extendedness']==0) &
(tab_new['z_extendedness']==0) & (tab_new['y_extendedness']==0))
print(f"Fraction of rows where X_extendedness disagrees when no flag is set: {1 - (len(passed_1_[0]) + len(passed_0_[0]))/len(tab_new)}")
>> Fraction of rows where X_extendedness disagrees when no flag is set: 0.18672121303107636
```
The fraction drops substantially, to about 19%, when none of the `_flag` columns are set.
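The chained per-band boolean comparisons above can also be written band-agnostically: stack the extendedness columns and flag rows whose values are not all equal (NaN never compares equal, so rows with missing bands count as disagreements, matching the treatment above). A sketch on a toy array:

```python
import numpy as np

# Columns stand in for the six ugrizy extendedness values
ext = np.array([[1., 1., 1., 1., 1., 1.],   # all bands agree (extended)
                [0., 0., 0., 0., 0., 0.],   # all bands agree (point-like)
                [1., 0., 1., 1., 1., 1.]])  # bands disagree
agree = np.all(ext == ext[:, :1], axis=1)   # compare every band to the first
frac_disagree = 1 - agree.mean()
print(frac_disagree)  # 1/3 of the toy rows disagree
```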
## Exercise 2b. Signal to Noise
Make a plot of PSF magnitude vs. magnitude error for r-band point sources in our region of interest. Exclude sources with the PSF Flux general failure flag set. Set the axes extents to a reasonable range and label your axes!
Hints: Consult the [schema browser](https://dm.lsst.org/sdm_schemas/browser/dp02.html#Object) for column names. Consider using [hexbin](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hexbin.html) instead of a scatter plot with `bins='log'`...
We first query the data, using `scisql_nanojanskyToAbMag` to convert the r-band PSF fluxes to AB magnitudes, selecting point sources (`r_extendedness = 0`) and excluding sources with the PSF flux general failure flag set:
```python=
# convert to AB magnitudes & errors for LSST-r
results = service.search("SELECT coord_ra, coord_dec, "
                         "scisql_nanojanskyToAbMag(r_psfFlux) as r_psf_mag, "
                         "scisql_nanojanskyToAbMagSigma(r_psfFlux, r_psfFluxErr) as r_psf_mag_err "
                         "FROM dp02_dc2_catalogs.Object "
                         "WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec), "
                         "CIRCLE('ICRS', "+center_coords+", 0.5)) = 1 "
                         "AND detect_isPrimary = 1 "
                         "AND r_extendedness = 0 "
                         "AND r_psfFlux_flag = 0 ",
                         maxrec=250_000)
# fetch table
table = results.to_table()
```
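The figure below uses `hexbin`, as the hint suggests. A minimal plotting sketch follows; the random toy data and rough error scaling stand in for the queried `r_psf_mag` / `r_psf_mag_err` columns, and the axis limits are illustrative:

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for scripting
import matplotlib.pyplot as plt
import numpy as np

# Toy stand-ins for the query results
rng = np.random.default_rng(42)
mag = rng.uniform(17, 27, 50_000)
mag_err = 10 ** (0.4 * (mag - 27.0))  # rough faint-end error growth, illustrative only

fig, ax = plt.subplots(figsize=(6, 6))
ax.hexbin(mag, mag_err, gridsize=100, bins='log', cmap='viridis')
ax.set_xlim(16, 28)
ax.set_ylim(0, 0.5)
ax.set_xlabel('r-band PSF magnitude [AB]')
ax.set_ylabel('r-band PSF magnitude error [mag]')
```

`bins='log'` color-codes the hexagon counts logarithmically, which keeps the dense bright-end locus from washing out the sparse faint end.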
|<img src="https://i.imgur.com/6fcFe6p.png" alt="Image" width="500" height="500" style="display: block; margin: 0 auto" />|
|:--:|
| Figure 1. -- AB apparent magnitude and error estimated from the PSF flux estimates.|
# Problem 3
## Exercise 3: Find a Transient or Variable
Using the `DIAObject` table, find an object with more than 1 magnitude of variability and plot its lightcurve.
One quick way to search through the light curves and check for >1 mag variability (e.g., in the LSST-r band magnitudes):
```python=
# `table` holds the candidate objectIds from our earlier region query
for i in tqdm(range(100)):
    sel_objid = table[i]['objectId']
    # From Tutorial 07b to fetch the light curve information
    query = "SELECT src.band, src.ccdVisitId, src.coord_ra, src.coord_dec, "\
            "src.objectId, src.psfFlux, src.psfFluxErr, "\
            "scisql_nanojanskyToAbMag(psfFlux) as psfMag, "\
            "visinfo.ccdVisitId, visinfo.band, "\
            "visinfo.expMidptMJD, visinfo.zeroPoint "\
            "FROM dp02_dc2_catalogs.ForcedSource as src "\
            "JOIN dp02_dc2_catalogs.CcdVisit as visinfo "\
            "ON visinfo.ccdVisitId = src.ccdVisitId "\
            "WHERE src.objectId = "+str(sel_objid)+" "
    # Query
    results = service.search(query)
    lc = results.to_table()  # light curve table
    # Select band (the duplicated `band` column is renamed `band2` in the result)
    r = lc['band2'] == 'r'
    diff = np.median(lc['psfMag'][r]) - min(lc['psfMag'][r])
    if diff > 1:
        print(f"Candidate obj_id {sel_objid} has >1 mag variability in the LSST-r bandpass")
        # Plotting light curve
        plt.figure(figsize=(10, 4))
        plt.scatter(lc['expMidptMJD'][r], lc['psfMag'][r], color='crimson')
        plt.axhline(np.median(lc['psfMag'][r]), color='k', ls='--')
        plt.xlabel('MJD')
        plt.ylabel('m$_{LSST-r}$ [AB mag]')
        plt.ylim(plt.ylim()[::-1])  # invert y-axis: brighter is up
        break
```
|<img src="https://i.imgur.com/ukNM9XM.png" alt="Image" width="500" height="230" style="display: block; margin: 0 auto" />|
|:--:|
| Figure 2. -- Sample LSST-r AB mag light curve with >1 mag variability.|
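One caveat on the `median - min` amplitude used above: a single outlier epoch (e.g., one bad measurement) can trigger it. A percentile range is a slightly more robust alternative; a sketch on toy magnitudes:

```python
import numpy as np

# Toy light curve: quiescence at ~21 mag plus a handful of outburst epochs
rng = np.random.default_rng(0)
mags = np.concatenate([21.0 + 0.05 * rng.standard_normal(50),
                       19.5 + 0.05 * rng.standard_normal(5)])
# 5th-95th percentile range ignores single stray epochs
amplitude = np.percentile(mags, 95) - np.percentile(mags, 5)
print(amplitude > 1.0)  # True: this toy object varies by more than 1 mag
```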