Remote Sensing Project
===
###### tags:
## Overview
The goal is to build a model that estimates poverty for a given region from satellite imagery of Nigeria.
### Dataset
- **Images**: 1,252,984 daytime satellite images covering all of Nigeria, taken in 2020, at 400x400 pixels per image and 2.5 m spatial resolution.
- **Targets**:
    - **Survey data (~22k labels at household level)**:
        - **rwi_Subset**: computed over electricity, numasset_smartphone, numasset_regmobilephone, numasset_car, numasset_motorbike, numasset_fridge, numasset_tv, numasset_radio, main_water, improved_water, cookstovetype, refusetype, sanitation_type, floortype, wallstype, rooftype, numsleepingrooms
        - **rwi_Full**: computed over every column except hhid, interview__key, sector, zone, state, lga, ea, cluster, month_of_interview, year_of_interview, hh_gps_latitude, hh_gps_longitude, hh_gps_accuracy, hh_gps_altitude, popw, weight, totcons_pc, totcons_adj, pl
        - **log(adj_consumption)**: the logarithm of consumption per capita, adjusted for regional price differences and for inflation between the beginning and end of the survey, i.e. log(totcons_adj).
        - **Fraction of households below $1.90 per capita/day**: a binary classification of adjusted consumption per capita (totcons_adj) as below or above $1.90 per capita/day, which is equivalent to 247,679 Naira per capita.
    - **Poverty indices from the DHS Survey**: ~9,000 labels (with 5 km jittering from the true point).
    - **Inferred wealth indices from another model**: ~1.8M labels. There are more labels than the 1.2M satellite images because inference was run on images of a different size. This model was trained on labels from the DHS Survey.
- **Building dataset**: covers the entire country. Buildings are present in 215,766 of the 1.2M satellite images of Nigeria.
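
The two consumption-based targets can be derived from `totcons_adj` roughly as follows. This is a minimal sketch, not the project's actual preprocessing: the function names and the example values are hypothetical, and treating "below" as a strict inequality is an assumption.

```python
import math

# Poverty line quoted in the dataset description, in Naira per capita.
POVERTY_LINE_NAIRA = 247_679


def consumption_targets(totcons_adj):
    """Return (log consumption, below-poverty-line flag) for one household."""
    if totcons_adj <= 0:
        raise ValueError("adjusted consumption must be positive to take a log")
    # Assumption: "below" means strictly less than the poverty line.
    return math.log(totcons_adj), int(totcons_adj < POVERTY_LINE_NAIRA)


def fraction_below_line(values):
    """Fraction of households whose adjusted consumption is below the line."""
    flags = [consumption_targets(v)[1] for v in values]
    return sum(flags) / len(flags)


# Hypothetical totcons_adj values for four households sharing one image.
households = [120_000.0, 247_679.0, 300_000.0, 90_000.0]
log_cons, is_poor = consumption_targets(households[0])
share = fraction_below_line(households)
```

The per-household flag suits a classification head, while the fraction aggregates those flags over all households that fall in the same image.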
---
## Feature
### Regression
- Currently, the poverty index is represented by the rwi_Subset feature. This means that it was calculated from the household asset and housing columns listed under rwi_Subset in the Dataset section.
### Classification
```
# Integer encodings for the categorical survey attributes.
refuse_type_map = {
    'DISPOSAL IN A RIVER/STREAM': 3,
    'DISPOSAL IN THE BUSH': 5,
    'DISPOSAL WITHIN COMPOUND (INCL BURNING)': 7,
    'GOVT BIN OR SHED': 1,
    'HH BIN COLLECTED BY GOV': 4,
    'HH BIN COLLECTED BY PRIVATE FIRM OR INDIVIDUAL': 2,
    'OTHER (SPECIFY)': 0,
    'UNAUTHORIZED REFUSE HEAP': 6,
}
roof_type_map = {
    'CORRUGATED IRON SHEETS': 3,
    'ASBESTOS SHEET': 0,
    'ZINC SHEET': 10,
    'LONG/SHORT SPAN SHEETS': 4,
    'CONCRETE/CEMENT': 2,
    'THATCH (GRASS OR STRAW)': 9,
    'STEP TILES': 8,
    'OTHER (SPECIFY)': 6,
    'MUD': 5,
    'PLASTIC SHEET': 7,
    'CLAY TILES': 1,
}
electricity_map = {'yes': 1, 'no': 0}
slum_ea_map = {'yes': 1, 'no': 0}
slum_hh_map = {'yes': 1, 'no': 0}
sanitation_type_map = {
    'FLUSH TO SEPTIC TANK': 5,
    'FLUSH TO PIT LATRINE': 4,
    'FLUSH TO SOMEWHERE ELSE': 6,
    'PIT LATRINE WITH SLAB': 11,
    'PIT LATRINE W/O SLAB/OPEN PIT': 10,
    'NO FACILITIES,BUSH, OR FIELD': 8,
    'HANGING TOILET/HANGING LATRINE': 7,
    'VENTILIATED IMPROVED LATRINE': 12,
    'COMPOSTING TOILET': 1,
    'FLUSH TO OPEN DRAIN': 2,
    'FLUSH TO PIPED SEWAGE SYSTEM': 3,
    'BUCKET': 0,
    'OTHER (SPECIFY)': 9,
}
job_map = {
    '03 nfe': 2,
    '01 wage': 0,
    '04 not working': 3,
    '02 agriculture': 1,
}
rwi_full_map = {'upper': 3, 'mid_upper': 2, 'mid_lower': 1, 'lower': 0}
# Continuous attributes and their types.
continuous_variables = {"lon": float, "lat": float, "num_houses": float}
```
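
As an illustration of how such maps might be applied, the sketch below converts one survey row into integer class labels. The helper `encode_household` and the column names `electricity` and `job` are hypothetical; only the two map definitions are copied from the block above.

```python
# Two of the maps from the block above, repeated so the sketch runs alone.
electricity_map = {'yes': 1, 'no': 0}
job_map = {'03 nfe': 2, '01 wage': 0, '04 not working': 3, '02 agriculture': 1}


def encode_household(record, maps):
    """Replace each categorical answer with its integer class index."""
    encoded = {}
    for column, mapping in maps.items():
        answer = record[column]
        if answer not in mapping:
            # Fail loudly instead of silently inventing a class.
            raise KeyError(f"unexpected value {answer!r} in column {column!r}")
        encoded[column] = mapping[answer]
    return encoded


# Hypothetical survey row; the column names are assumed for illustration.
row = {'electricity': 'yes', 'job': '02 agriculture'}
labels = encode_household(row, {'electricity': electricity_map, 'job': job_map})
```

Raising on unknown answers keeps misspelled or new survey categories from being mapped to an arbitrary class.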
---
### Methodology
#### Data preparation
---
## Models
- ### Using the 22k Dataset
- Auxiliary labels: 11 attributes
- Regression is on the continuous poverty index (PI) value

| Regression model | MSE | R2 |
|----------------------------------------------|-------|-------|
| M1_PI_only | 0.516 | 0.483 |
| Pretrained model on satellite images_PI_only | 0.39 | 0.591 |

| Classification model | Accuracy |
| --------------------------------------------- | ----- |
| M1_(PI_only) | 0.53 |
| Pretrained model on satellite images(PI_only) | 0.55 |
| M1_(PI_rooftypes) | 0.56 |
| M1_(PI_rooftypes_refuse_type_sanitation_type) | 0.54 |

- **M1-reg**: a simple CNN trained on the 22k Dataset. Input: satellite image (400x400); output: poverty index (float). [[1]](#M1-model)
- **M1-class**: a simple CNN trained on the 22k Dataset. Input: satellite image (400x400); output: poverty index (classes). [[1]](#M1-model)
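
For reference, the metrics reported in the tables can be computed as in the sketch below. It assumes that R2 means the coefficient of determination and that accuracy is the plain fraction of correct class predictions; the notes do not state either definition, and the example values are made up.

```python
def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(labels, predictions):
    """Fraction of predictions that match the true class."""
    return sum(int(l == p) for l, p in zip(labels, predictions)) / len(labels)


# Hypothetical poverty-index values and class labels, not project results.
y_true = [0.0, 1.0, 2.0, 3.0]
y_pred = [0.0, 1.0, 2.0, 2.0]
reg_mse = mse(y_true, y_pred)
reg_r2 = r2(y_true, y_pred)
cls_acc = accuracy([0, 1, 2, 3], [0, 1, 2, 2])
```

Under this definition, R2 compares the model's squared error with that of always predicting the mean target, so a model with R2 = 0 is no better than that constant baseline.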
<!---
## Methodology
- Use a **clustering technique** to group the images into x groups so that we can create models better suited to each group, since we do not have enough labels to represent the entire country.
- ***YB: I am not at all convinced this is a good idea. Deep learning likes more data, and we deal with the issue of small amounts of labels of interest with multi-task learning here. Clustering loses too much information. At the very least, let us start by using standard deep learning approaches as baseline. Then try the clustering idea and compare (but personally I would not even do it).***
- Images that contain buildings and roads are expected to be grouped together, while images dominated by water bodies or open fields should fall into a different group. This way, we reduce the variance in the types of images seen by the model trained on places where people live, which are the areas we are interested in. This could be beneficial and more realistic, since the ground-truth data covers almost exclusively images where people live.
- ***YB: if you have a way to cluster images in the way you say, just use those cluster categories as extra input (near the top level of the deep net hierarchy). Sharing parameters across data is most of the time winner. Not having a separate model for each category.***
---

**1. Create a baseline model**:
- Get a pretrained model and fine-tune it using the 22k dataset. This will be predicting the poverty index using just the mean.
- y = f(x), mse/CE loss
***YB: what is x here? an image? Why do you say it is just the mean?***
**2. Add the other attributes to the prediction**:
- Get a pretrained model and fine-tune it using the 22k dataset. This will be making predictions using the mean square error and the covariance matrix of the other attributes.
- ***YB: sorry but I don't follow your idea above. Please explain with math and clear specifications.***
- Since there can be multiple households per image, we will treat each household as an individual data point (i.e. the same satellite image can have multiple target values). In addition, when we went back to the original source of the data, we realized that, apart from the wealth index, we have access to survey details such as the household's monthly income, occupation categories, the number of bikes owned, etc. These data could be fully utilized.
***YB: yes! they should, and the best way according to me is to treat all those labels as auxiliary tasks, all sharing the same image-to-features pipeline but each having a different feature-to-prediction head branch.***
- 1) Add random noise to the loss
***YB: why? It will probably just slow the training. If you want to regularize, there are better ways, like L1 and L2 weight decay, dropout, early stopping, etc.***
- 2) Add more realistic noise to the MSE by introducing a covariance prior, expressed as the Cholesky decomposition of the covariance matrix of our attributes.
***YB: I am completely lost as to why you are doing this. It does make sense to learn the joint distribution over all the labels, given the image, though. But I am not sure this is the intention here.***
- y = f(h(x)+ n(x,Lu))
- The attributes used first will be real-valued ones and those that correlate with features visible in an image, e.g. the number of buildings in an image, the number of assets, whether there is electricity, the number of sleeping rooms, etc.
-->
<!---
### Implementation
- #### Using the Pretrained network as a feature extractor.
:::info
**Model:** VGG16 pretrained on BigEarthNet
- **Model Info:** To construct BigEarthNet with Sentinel-2 image patches (called as BigEarthNet-S2 now, previously BigEarthNet), 125 Sentinel-2 tiles acquired between June 2017 and May 2018 over the 10 countries (Austria, Belgium, Finland, Ireland, Kosovo, Lithuania, Luxembourg, Portugal, Serbia, Switzerland) of Europe were initially selected. All the tiles were atmospherically corrected by the Sentinel-2 Level 2A product generation and formatting tool (sen2cor). Then, they were divided into 590,326 non-overlapping image patches. Each image patch was annotated by the multiple land-cover classes (i.e., multi-labels) that were provided from the CORINE Land Cover database of the year 2018 (CLC 2018).
- **Preprocessing:** We removed the last layer of the model.
- We passed each image through a pretrained VGG16 CNN model to extract 4096 features from the daytime satellite images.
:::
- #### Fine-tuning a Pretrained network.
:::info
**Models:** VGG16, ResNet101 pretrained on BigEarthNet
- **Model Info:** To construct BigEarthNet with Sentinel-2 image patches (called as BigEarthNet-S2 now, previously BigEarthNet), 125 Sentinel-2 tiles acquired between June 2017 and May 2018 over the 10 countries (Austria, Belgium, Finland, Ireland, Kosovo, Lithuania, Luxembourg, Portugal, Serbia, Switzerland) of Europe were initially selected. All the tiles were atmospherically corrected by the Sentinel-2 Level 2A product generation and formatting tool (sen2cor). Then, they were divided into 590,326 non-overlapping image patches. Each image patch was annotated by the multiple land-cover classes (i.e., multi-labels) that were provided from the CORINE Land Cover database of the year 2018 (CLC 2018).
- **Preprocessing:** Removed the last 6 layers of the model.
:::
- #### Cluster 1.2M images and then train a weakly supervised model.
:::info
**Clustering pipeline:**
- Load data
- Resize the images into 224x224 for VGG16
- Reshape the input array into batches
- Load feature extractor(pretrained model)
- Using VGG16 model: this should not matter much since we are only interested in grouping images which look similar together.
- Remove the last layer so that the output array is of size (4096,1)
- Use the model to extract features
- Dump features into a pickle file
- Reduce the dimensionality
- Use PCA to reduce the dimensionality of the feature vectors from (4096,1) to (100,1)
- Cluster images
- Decide on K: the number of clusters
- Cluster images, and store the list of labels on disk
:::
-->
---
## Evaluation
:closed_book: Tasks
---
==Importance== (1 - 5) / Name
### TODO:
- [x] ==5== Divide images into clusters/Groups
- [x] ==3== Find a suitable clustering technique and create a pipeline
- [x] ==3== Divide the images into clusters
- [x] ==3== Store the clusters on disk
- [ ] ==5== Configure pretrained network
- Configure the BigEarthNet model trained with TensorFlow 1 (**current task**)
- [ ] ==4== Add model to existing pipeline
- [ ] ==4== Fine-tune model
- [ ] ==3== Evaluate the model
- [ ] ==2== Train a weakly supervised model using the clusters
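
The clustering tasks above could look roughly like the following sketch. The notes do not say which clustering algorithm was used, so plain k-means is an assumption here, and the tiny 2-D vectors are hypothetical stand-ins for per-image feature vectors. The cluster labels are serialized to JSON as one possible on-disk format.

```python
import json
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on lists of floats; returns one cluster index per point."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assignments[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


# Hypothetical feature vectors: two images near the origin, two far away.
features = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
cluster_ids = kmeans(features, k=2)
# JSON string that could be written to a file to store the labels on disk.
serialized = json.dumps(cluster_ids)
```

The number of clusters `k` is a free parameter that has to be chosen separately, and the cluster indices themselves are arbitrary: only the grouping is meaningful.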
---
## Supplementary Materials
### M1 model

### M1_all_labels

### M1_trained from scratch_PI

### M2_pretrained_PI
