###### tags: `Paper Study`
# Crop Yield Forecasting
### [2017] Deep Gaussian Process for Crop Yield Prediction Based on Remote Sensing Data
**Highlights**
1. **Data**: MODIS satellite images (500 m resolution); USDA yearly average soybean yields at the county level; 11 states accounting for over 75% of national soybean production; non-crop pixels are removed using the DAAC 2015 worldwide land cover data.
**Github**: [https://github.com/JiaxuanYou/crop_yield_prediction](https://github.com/JiaxuanYou/crop_yield_prediction)
2. Generate histograms from raw images as training data. Assumption of permutation invariance: only the number of pixels of each type in an image (pixel counts) is informative, not their positions.

3. Combine a Gaussian Process with a deep learning model for crop yield prediction. Many features relevant to crop growth, such as soil type and fertilizer rate, are not revealed in RS images. These features are inherent to specific locations and can be learned by the Gaussian Process (g_loc, g_year).
**<img src="https://i.imgur.com/ep6Nuxz.png" alt="Screen Shot 2020-08-24 at 1.12.38 PM" style="zoom:50%;" />**
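
The histogram step in item 2 can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `image_to_histograms`, the bin count of 32, the value range, and the 9-band random test image are all assumptions made for the example, and masking of non-crop pixels is left out.

```python
import numpy as np

def image_to_histograms(image, n_bins=32, value_range=(0.0, 1.0)):
    """Turn an image of shape (bands, height, width) into normalized
    per-band pixel-count histograms of shape (bands, n_bins)."""
    bands = image.shape[0]
    hists = np.zeros((bands, n_bins))
    for b in range(bands):
        counts, _ = np.histogram(image[b].ravel(), bins=n_bins, range=value_range)
        total = counts.sum()
        hists[b] = counts / total if total > 0 else counts
    return hists

rng = np.random.default_rng(0)
image = rng.random((9, 48, 48))   # hypothetical 9-band image with values in [0, 1)
hist = image_to_histograms(image)

# Shuffling pixel positions does not change the histogram,
# which is the permutation-invariance assumption in practice.
perm = rng.permutation(48 * 48)
shuffled = image.reshape(9, -1)[:, perm].reshape(9, 48, 48)
hist_shuffled = image_to_histograms(shuffled)
```

Because the histogram discards pixel positions, `hist` and `hist_shuffled` are identical, so any model trained on histograms cannot use spatial layout.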
**Ideas**
1. Important bands for yield prediction: band 2 (near infrared); band 7 (short-wave infrared); band 8,9 (land surface temperature).
<img src="https://i.imgur.com/pEdsZFl.png" alt="Screen Shot 2020-08-24 at 1.16.09 PM" style="zoom:50%;" />
2. Soybean planting starts on day 110 and ends on day 190; harvest starts on day 250. This indicates that the most useful data are collected during the growing season, peaking in the days just before harvest.
<img src="https://i.imgur.com/Ywumpbb.png" alt="Screen Shot 2020-08-24 at 1.18.47 PM" style="zoom:50%;" />
### [2017] Understanding Satellite-Imagery-Based Crop Yield Predictions
**Highlights**
1. **Data**: MODIS images 500m; DAAC land cover mask; NASS county-level crop yield.
**Github**: [https://github.com/brad-ross/crop-yield-prediction-project](https://github.com/brad-ross/crop-yield-prediction-project)
2. A deeper CNN achieves better accuracy on yield prediction.
3. Compute saliency maps and rescale them to the raw image size. A forward pass is performed through the model, and the gradient of the output with respect to the input is computed and plotted as an image.
<img src="https://i.imgur.com/bqgjuf5.png" alt="Screen Shot 2020-08-24 at 1.20.19 PM" style="zoom:50%;" />
4. Find that the model distinguishes between different crops.
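
The saliency computation in item 3 can be illustrated with a small stand-in model. This is a sketch under stated assumptions, not the paper's CNN: the one-hidden-layer ReLU network, its random weights, the 8x8 input size, and the helper names `predict`, `saliency`, and `upscale` are all hypothetical. The gradient is derived by hand here; a deep learning framework would compute it automatically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in model: one hidden ReLU layer on a flattened 8x8 input.
H, W, HIDDEN = 8, 8, 16
W1 = rng.normal(size=(HIDDEN, H * W))
b1 = rng.normal(size=HIDDEN)
w2 = rng.normal(size=HIDDEN)

def predict(x):
    """Forward pass: scalar prediction for an (H, W) input."""
    z = W1 @ x.ravel() + b1
    return float(w2 @ np.maximum(z, 0.0))

def saliency(x):
    """Absolute gradient of the prediction with respect to every input pixel."""
    z = W1 @ x.ravel() + b1
    grad = W1.T @ (w2 * (z > 0))      # chain rule through the ReLU
    return np.abs(grad).reshape(x.shape)

def upscale(sal, factor):
    """Nearest-neighbour rescaling of the saliency map to a larger image size."""
    return np.kron(sal, np.ones((factor, factor)))

x = rng.normal(size=(H, W))
sal = saliency(x)
sal_full = upscale(sal, 4)   # 8x8 map -> 32x32, standing in for the raw image size
```

Large values in `sal_full` mark the pixels whose small changes would move the prediction the most.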
**Ideas**
1. Deeper CNN architecture
### [2018] Deep Transfer Learning for Crop Yield Prediction with Remote Sensing Data
**Highlights**
1. **Data**: MODIS images; county-level and province-level yield from the Argentine Undersecretary of Agriculture and the Brazilian Institute of Geography and Statistics, after excluding the bottom 5% by production.
2. Transfer learning from a large dataset (Argentina) to a small dataset (Brazil) gives better results.
<img src="https://i.imgur.com/Nb7Xmdh.png" alt="Screen Shot 2020-08-24 at 1.04.38 PM" style="zoom:50%;" />
<img src="https://i.imgur.com/964tD5y.png" alt="Screen Shot 2020-08-24 at 1.05.24 PM" style="zoom:50%;" />
3. Transfer learning method for the LSTM yield prediction model: strip out the last dense layer of the pre-trained model and replace it with an untrained dense layer of the same dimension before training the modified model.
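
The layer-swap step in item 3 can be sketched independently of any framework. The example below is an assumption-laden illustration: the model is represented as a plain list of dense layers (the paper's LSTM layers are not reproduced), and `init_dense`, `swap_last_layer`, and the layer sizes are hypothetical names and values.

```python
import numpy as np

def init_dense(rng, n_in, n_out):
    """Randomly initialised dense layer stored as a dict of weights and biases."""
    return {"W": rng.normal(scale=0.1, size=(n_in, n_out)), "b": np.zeros(n_out)}

def swap_last_layer(pretrained, rng):
    """Copy every layer except the last, which is replaced by an
    untrained dense layer of the same shape."""
    kept = [{k: v.copy() for k, v in layer.items()} for layer in pretrained[:-1]]
    n_in, n_out = pretrained[-1]["W"].shape
    return kept + [init_dense(rng, n_in, n_out)]

rng = np.random.default_rng(0)
# Hypothetical "pre-trained" source model: two feature layers and a dense output head.
source_model = [init_dense(rng, 32, 64), init_dense(rng, 64, 64), init_dense(rng, 64, 1)]
target_model = swap_last_layer(source_model, rng)
```

After the swap, `target_model` would be trained on the small target dataset, starting from the transferred feature layers and the fresh output layer.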
**Ideas**
1. Apply transfer learning to deep learning yield prediction models in regions with small datasets.
### [2018] Estimating smallholder crops production at village level from Sentinel-2 time series in Mali's cotton belt
**Highlights**
1. **Data**: Sentinel-2A 2016 L2A; MACCS algorithm to detect clouds and shadows, correct atmospheric perturbations, and retrieve aerosol optical thickness; 2016 WorldView-3 image at 1.24-m multispectral resolution; ICRISAT STARS project, including 2014 and 2015 seasonal time series of VHR DigitalGlobe imagery; tree mask from soil-specific NDVI thresholding of a 2014 early-season WorldView-2 image; soil types from farmer interviews.
2. Used supervised Random Forest classification to discriminate main crop types within the cropland area.
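
The tree mask from soil-specific NDVI thresholding mentioned in the data description can be sketched as below. NDVI is the standard (NIR - Red) / (NIR + Red) index; everything else is an assumption for illustration: the function names, the tiny 2x2 arrays, the soil-type ids, and the threshold values are hypothetical and not taken from the paper.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(float)
    red = red.astype(float)
    denom = nir + red
    # Leave NDVI at 0 where both bands are zero to avoid division by zero.
    return np.divide(nir - red, denom, out=np.zeros_like(denom), where=denom != 0)

def tree_mask(nir, red, soil_type, thresholds):
    """Mark a pixel as tree when its NDVI exceeds the threshold of its soil type.
    `soil_type` holds integer ids that index into `thresholds`."""
    return ndvi(nir, red) > thresholds[soil_type]

nir = np.array([[0.6, 0.5], [0.3, 0.0]])
red = np.array([[0.1, 0.3], [0.2, 0.0]])
soil_type = np.array([[0, 1], [0, 1]])    # hypothetical soil-type ids per pixel
thresholds = np.array([0.4, 0.3])         # hypothetical NDVI threshold per soil type
mask = tree_mask(nir, red, soil_type, thresholds)
```

Only the top-left pixel (NDVI of about 0.71 against a threshold of 0.4) is flagged as tree in this toy input.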
**Ideas**
1. Crop discrimination in cropland area.
### [2018] Machine learning methods for crop yield prediction and climate change impact assessment in agriculture
**Highlights**
1. **Data**: US NASS QuickStats database, county-level corn yields from 1979 through 2016; historical weather data from METDATA (4km^2 resolution, including minimum and maximum air temperature and relative humidity, precipitation, incoming shortwave radiation (sunlight), and average wind speed); future weather simulation MACA data (4km^2 resolution); soil data from the SSURGO database (39 measures of soil physical and chemical properties).
2. Semiparametric deep neural network (SNN), OLS regression
**Ideas**
1. Future weather simulation for future yield prediction.
2. Importance measures
3. Weather variables in mid-summer are the most important (particularly daily precipitation and minimum relative humidity).
4. Soil variables, proportion of land irrigated, and geographic coordinates have low importance values. These would be more important in a model trained over a larger, less homogeneous area.
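
These notes do not record how the importance measures were computed. As one common, model-agnostic option, permutation importance is sketched below; it is an assumption that this resembles the paper's measure. The OLS model, the synthetic data, the column meanings, and the function names are all hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, coefficients...]."""
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]

def permutation_importance(coef, X, y, rng):
    """Increase in mean squared error when one feature column is shuffled."""
    base = np.mean((predict_ols(coef, X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])   # break the link between feature j and y
        scores[j] = np.mean((predict_ols(coef, Xp) - y) ** 2) - base
    return scores

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))   # hypothetical columns: precipitation, humidity, soil
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.0 * X[:, 2] + rng.normal(scale=0.1, size=500)
coef = fit_ols(X, y)
scores = permutation_importance(coef, X, y, rng)
```

In this synthetic setup the third column has no effect on `y`, so its score stays near zero, mirroring how a variable with little variation or influence in the training area receives a low importance value.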

### [2019] Crop Yield Prediction Using Deep Neural Networks
**Highlights**
1. **Data**: 2018 Syngenta Crop Challenge (Crop genotype: 627 * {-1, 0, 1}, Yield performance, Environment: 8 soil variables and 72 weather variables)
2. Deep neural network structure.

3. Feature selection: the guided backpropagation method is used, which backpropagates only the positive gradients to find the input variables that maximize the activation of the neurons of interest.
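
The guided backpropagation rule in item 3 can be shown on a tiny hand-built network. This is a sketch, not the paper's model: the two-layer ReLU network, its weights, and the function name `guided_backprop` are assumptions chosen so the result can be followed by hand.

```python
import numpy as np

def guided_backprop(x, W1, b1, W2, b2, target):
    """Guided-backpropagation signal of output neuron `target` with respect to x
    for the network x -> ReLU(W1 x + b1) -> ReLU(W2 h + b2)."""
    # Forward pass, keeping pre-activations for the ReLU masks.
    z1 = W1 @ x + b1
    h1 = np.maximum(z1, 0.0)
    z2 = W2 @ h1 + b2

    # Start from a one-hot signal at the neuron of interest.
    g = np.zeros_like(z2)
    g[target] = 1.0

    # Guided ReLU rule: pass the signal only where the forward input was
    # positive AND the backward signal itself is positive.
    g = g * (z2 > 0) * (g > 0)
    g = W2.T @ g
    g = g * (z1 > 0) * (g > 0)
    return W1.T @ g

x = np.array([1.0, 2.0])
W1 = np.array([[1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
b1 = np.zeros(3)
W2 = np.array([[1.0, 1.0, -1.0]])
b2 = np.array([5.0])
relevance = guided_backprop(x, W1, b1, W2, b2, target=0)
```

Here the third hidden unit is active but carries a negative gradient, so guided backpropagation drops it and `relevance` is `[1, 0]`, whereas the ordinary gradient would be `[0, -1]`. Inputs with large relevance are candidates to keep during feature selection.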



**Ideas**
1. Use feature selection method to find meaningful input features.
### [2018] Convolutional Neural Networks for Crop Yield Prediction using Satellite Images
**Highlights**
1. **Data**: USDA NASS Quick Stats soybean yield data; MODIS satellite data (Surface Reflectance, Land Surface Temperature, Land Cover).

2. Existing method of You et al.: Histogram CNN


3. 3D Convolutional Neural Network

Paper method structure:

Input size is 24 * 10 * 64 * 64. (1) A channel compression module (two 2D convolutional layers) reduces the channel dimension from 10 to 3; (2) 3D convolutional layers, then fully connected layers, produce the prediction.

4. Training and inference input data workflow:
Random crop for training and sliding crop for inference.
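
The two cropping modes can be sketched as follows. The 64-pixel patch size and the 24 x 10 leading dimensions follow the input size given above; the function names, the 128 x 192 region size, the stride, and the random data are assumptions for illustration.

```python
import numpy as np

def random_crop(image, size, rng):
    """Training: pick one random size x size window from a (..., H, W) array."""
    h, w = image.shape[-2:]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return image[..., top:top + size, left:left + size]

def sliding_crops(image, size, stride):
    """Inference: enumerate windows on a regular grid over a (..., H, W) array."""
    h, w = image.shape[-2:]
    crops = []
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            crops.append(image[..., top:top + size, left:left + size])
    return np.stack(crops)

rng = np.random.default_rng(0)
# Hypothetical (time, channels, H, W) stack for one region.
county = rng.random((24, 10, 128, 192), dtype=np.float32)
train_patch = random_crop(county, 64, rng)     # shape (24, 10, 64, 64)
test_patches = sliding_crops(county, 64, 64)   # shape (6, 24, 10, 64, 64)
```

Random crops act as data augmentation during training, while the sliding grid covers the whole region at inference. How per-patch predictions are combined is not recorded in these notes; averaging them would be one simple choice.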


**Ideas**

1. 3D CNN structure and workflow
2. Time control experiment

Older years only bring noise as they are not representative of the soybean growth scheme anymore.

Using Feb-Sep data generates slightly better results, confirming You et al.'s finding that land surface temperature is correlated with crop growth, especially in the early months.
3. Location control experiment

(1) Crop yield prediction from satellite images is very domain sensitive; it is essential to train on a source region that resembles the target region as closely as possible.
(2) Some regions can generalize to new target regions better than the others. It is useful to identify these regions and leverage their generalizability for transfer learning.