# Data reduction workflow for SANS-1 at MLZ with Mantid
The main goal is to integrate the data reduction workflow developed by the instrument scientists and users of SANS-1 at MLZ into Mantid. A similar workflow has already been developed at the ILL for the D22 instrument, and most of the ILL algorithms can be extended (with minor changes) to support the data reduction workflow at SANS-1.
Currently, data reduction at SANS-1 is done with the BerSANS GUI. Our goal is to transfer the functionality of this GUI into Mantid, which will also give SANS-1 users access to additional features, since Mantid provides continuous user support and many useful built-in algorithms (e.g. data can easily be written to the NeXus format, errors are propagated automatically, etc.).
## Brief overview of stages for data reduction with BerSANS
- **Load raw input files.** The software takes as input "raw" ASCII data files with the extension `.001`. Such files contain the counts for each of the 128\*128 detector pixels; each pixel is 8 mm \* 8 mm in size. It should be noted that the actual instrument consists of 128 He-3 gas tubes, each split into 960 pixels, which means that the "raw" `.001` files contain data that have already been processed and binned by an external (which?) software. For the moment, however, we treat `.001` files as raw data (see the parsing sketch after this list).
- **Reduce/correct raw input files.** Once all the necessary `.001` files are loaded into BerSANS, data reduction can be performed (the detailed stages are described in [this section](#How-to-reduce-raw-files)) and the software generates reduced files with the extension `.002`. These files still contain data in the pixel format (128\*128), but with processed (reduced) counts.
- **Transform reduced data to momentum space.** `.002` files can be processed further by transforming the coordinate-space counts to momentum space, taking into account the geometry of the sample-detector system. The output of this stage is saved to files with the extension `.003` and contains a 2D picture of the scattering process in momentum space ($q_x$, $q_y$). For details see [this section](#How-to-transform-data-to-momentum-space).
- **Perform radial averaging.** `.003` files are then processed further by performing the [radial averaging](#How-to-perform-radial-averaging) (binning). The output is written to files with the extension `.004` and contains the 1D intensity function $I(q)$.
- **Merge intensities.** At the final stage, the 1D $I(q)$ data written to several `.004` files (they probe the same sample, but at different collimation and/or sample-detector distances, i.e. different momentum ranges) are merged together and the output is written to files with the extension `.005`. This is the final output.
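To make the `.001` format concrete, here is a minimal Python sketch of a reader; the `%Counts` marker, the comma-separated layout, and the function name are assumptions for illustration, not a documented specification:
```python
import numpy as np

def load_bersans_raw(path):
    """Minimal sketch of a BerSANS `.001` reader (format details assumed)."""
    with open(path) as fh:
        lines = fh.readlines()
    # Assumed marker: the 128*128 counts block starts after a '%Counts' line.
    start = next(i for i, line in enumerate(lines)
                 if line.startswith('%Counts')) + 1
    values = []
    for line in lines[start:]:
        # Assumed separator: comma- (or whitespace-) separated integers.
        values.extend(int(v) for v in line.replace(',', ' ').split())
        if len(values) >= 128 * 128:
            break
    return np.asarray(values[:128 * 128]).reshape(128, 128)
```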
## Results
The SANS-1 loader algorithm was developed to load only the raw `.001` files. All processed files can then be saved in the NeXus format and loaded by Mantid using the ILL algorithms already available in Mantid. Whenever necessary, users can compare the processed data against the legacy files (`.002`, `.003`, etc.) using the BerSANS software.
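As a usage sketch (the file names are illustrative, and we rely on Mantid's generic `Load` dispatching to the registered SANS-1 loader), the round trip to NeXus could look like:
```python
# Load a raw SANS-1 file and re-save it as NeXus so that the ILL workflow
# algorithms can consume it later; file and workspace names are illustrative.
from mantid.simpleapi import Load, SaveNexusProcessed

ws = Load(Filename='sample_run.001', OutputWorkspace='sans1_raw')
SaveNexusProcessed(InputWorkspace=ws, Filename='sample_run.nxs')
```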
## How to reduce raw files
*The data reduction described below follows the BerSANS tutorial.*
To perform data reduction from the `.001` to the `.002` extension, a user usually deals with two types of datasets (type-T and type-S):
- **Data for transmission evaluation (type-T).** The first type of data is collected for the evaluation of transmission coefficients. Such data have to be measured without a beam stop, so the neutron flux is lowered with the help of an attenuator. Typically, these measurements are characterized by shorter measurement times and far fewer counts than the second type of data. In general, four data files are required to estimate three transmission coefficients: 1) sample transmission, 2) empty cell transmission, 3) transmission of an isotropic scattering sample (typically water or plexiglass); the fourth file is an Empty Beam measurement, which serves as the reference for all three coefficients.
- **Data for sample reduction corrections (type-S).** The second type of data contains measurements with a higher neutron flux (typically ~100 times that of the first type) for better statistics, and it is measured with a beam stop to prevent the detector from being damaged.
Neutrons of the same wavelength must be employed for both types of data. A summary of the measurements required for a full data reduction is provided below.
| Measurement name | Sample name | Required (*T* / *S*) |
| :------------- | :--------------------------------------- | :-------------------: |
| Empty Beam | | yes / no |
| Empty Cell | | yes / yes |
| Dark Current | *'b4c'* or *'cadmium'* | yes / yes |
| Isotropic | *'h2o'* or *'plexiglas'* | yes / no |
| Sample | *'s1'* or *'Graphit, mesoporoes'* *etc.* | yes / yes |
<br/>
**The general formula** for data reduction from the `.001` to the `.002` extension is:
<br/>
$$I_{ij}=\frac{ \frac{S_{ij} - D_{ij}}{T_{s}} \cdot A_s - \frac{EC_{ij} - D_{ij}}{T_{ec}} \cdot(1-p_s) }{\frac{W_{ij} - D_{ij}}{T_w} - \frac{EC_{ij} - D_{ij}}{T_{ec}} } \cdot \frac{F_w}{F_s},$$
where each detector pixel is uniquely identified by its indices *i* and *j*. Moreover,
$I_{ij}$ - Reduced intensity function
$S_{ij}$ - Normalised counts for scattering from the sample
$D_{ij}$ - Normalised counts for scattering from the shielding sample (for the dark current correction)
$EC_{ij}$ - Normalised counts for scattering from the empty cell (for the background correction)
$W_{ij}$ - Normalised counts for scattering from water (for the detector efficiency correction)
$T_{X}$ - Transmission coefficient of the respective sample ($X = s, ec,$ or $w$ for the sample, the empty cell, or water, respectively)
$A_{s}$ - Sample attenuation factor (we assume for now $A_s = 1$)
$p_{s}$ - Probability factor (we assume for now $p_s = 0$)
$F_X$ - Scaling factor (depends on the sample geometry)
<br/>
To simplify the task, we can split `dataReduction` into the following steps.
**Steps to calculate $I_{ij}$:**
1. Normalize both types of raw data (type-S and type-T) by the monitor counts (or the measurement time):
$\frac{X_{ij}}{mon2}$, where $mon2$ denotes the monitor counter used for normalisation.
2. *Calculate transmission coefficients* $\to T_X$, using the type-T data files (described below):
$T_{s}=\frac{\sum (S_{ij} - D_{ij})}{\sum (EB_{ij} - D_{ij})}$
$T_{ec}=\frac{\sum (EC_{ij} - D_{ij})}{\sum (EB_{ij} - D_{ij})}$
$T_{w}=\frac{\sum (W_{ij} - D_{ij})}{\sum (EB_{ij} - D_{ij})}$
where $EB$ represents a measurement with an empty beam and the sums run over all detector pixels
3. Dark Current correction $\to X_{ij} - D_{ij}$ for each measurement: $S_{ij}, EC_{ij}, W_{ij}$
4. Transmission coefficient correction $\to \frac{X_{ij} - D_{ij}}{T_X}$ for each measurement: $S_{ij}, EC_{ij}, W_{ij}$
5. Background correction (separately for the sample and the isotropic material) $\to \frac{S_{ij} - D_{ij}}{T_{s}} - \frac{EC_{ij} - D_{ij}}{T_{ec}}$, $\frac{W_{ij} - D_{ij}}{T_{w}} - \frac{EC_{ij} - D_{ij}}{T_{ec}}$
6. Detector efficiency correction $\to \frac{\frac{S_{ij} - D_{ij}}{T_{s}} - \frac{EC_{ij} - D_{ij}}{T_{ec}}}{\frac{W_{ij} - D_{ij}}{T_w} - \frac{EC_{ij} - D_{ij}}{T_{ec}}}$
7. Scaling factor correction $\to$ step 6 $\times \frac{F_w}{F_s}$
For detailed information see *BERSANS_Tutorial.pdf*, section *5.3 Data Reduction / Absolute Scaling, some Basic know-how*.
[Example of dataReduction (Google Colab)](https://colab.research.google.com/drive/1YQKaDanooFNA8nCvvnxudblzplWE0VZR#scrollTo=6i3cN8qORpMF)
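As a compact companion to the Colab example, here is a numpy sketch of steps 2-7, assuming every array is a 128\*128 counts map already normalised by its monitor counts (step 1); all function and variable names are illustrative:
```python
import numpy as np

def transmission(X_t, EB_t, D_t):
    """Step 2: transmission coefficient T_X from monitor-normalised type-T
    arrays (X_t: sample/empty cell/water, EB_t: empty beam, D_t: dark)."""
    return np.sum(X_t - D_t) / np.sum(EB_t - D_t)

def reduce_counts(S, EC, W, D, T_s, T_ec, T_w,
                  A_s=1.0, p_s=0.0, F_w=1.0, F_s=1.0):
    """Steps 3-7: the general reduction formula on type-S arrays."""
    # Steps 3-5: dark current, transmission and background corrections.
    sample = (S - D) / T_s * A_s - (EC - D) / T_ec * (1.0 - p_s)
    water = (W - D) / T_w - (EC - D) / T_ec
    # Step 6: detector efficiency correction; step 7: scaling factors.
    return sample / water * (F_w / F_s)
```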
## How to transform data to momentum space
The scattering vector is defined as $\vec{q} = \vec{k_i} - \vec{k_f}$ and the scattering angle is $\theta$, so that:
$q = \frac{4 \pi}{\lambda} \sin \frac{\theta}{2}$.
If we choose $\vec{k_i} = k_i \cdot \vec{e}_z$ and $\vec{k_f} = k_f \cdot \vec{e}_r$, then:
$q_x = - \frac{2 \pi}{\lambda} \sin \theta \cos \phi$,
$q_y = - \frac{2 \pi}{\lambda} \sin \theta \sin \phi$,
$q_z = \frac{2 \pi}{\lambda} (1 - \cos \theta)$,
where $\phi$ is the corresponding azimuthal angle.
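A numpy sketch of the per-pixel transform, assuming a known beam centre (in pixel units), the 8 mm pixel pitch and the sample-detector distance $z$; all names are illustrative:
```python
import numpy as np

def pixel_to_q(wavelength, z, center=(63.5, 63.5), pixel=8e-3, n=128):
    """Map each detector pixel (i, j) to (qx, qy) via the formulas above."""
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
    x = (i - center[0]) * pixel        # horizontal offset from beam centre
    y = (j - center[1]) * pixel        # vertical offset from beam centre
    theta = np.arctan2(np.hypot(x, y), z)   # scattering angle
    phi = np.arctan2(y, x)                  # azimuthal angle
    k = 2.0 * np.pi / wavelength
    return (-k * np.sin(theta) * np.cos(phi),
            -k * np.sin(theta) * np.sin(phi))
```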
## How to perform radial averaging
First, we perform the radial binning by drawing inscribed circles around the beam center, creating radial strips whose width equals the pixel size. Each strip ($i=1, 2, ..., N$) is described by the polar angle $\theta_{i_{min}} < \theta_i \le \theta_{i_{max}}$ with
$\theta_{1_{min}} = 0^{\circ}$,
$\theta_{(i+1)_{min}} = \theta_{i_{max}}$,
$\theta_{i_{max}} = \arctan \frac{i \cdot y_{width}}{z}$,
where $y_{width}$ is the width of one pixel and $z$ is the sample-detector distance. The corresponding values of $q_{i_{min}}$ and $q_{i_{max}}$ can be obtained using the first equation of the previous section. Ultimately, each bin is then assigned the value
$q_{i} = \frac{q_{i_{min}} + q_{i_{max}}}{2}$.
Once the list of $q_i$ values for the $N$ bins is evaluated, each pixel is assigned to the corresponding bin using the following strategy:
1. For each pixel with momentum $q_{pixel}$, calculate $dq_i = |q_{pixel} - q_i|$.
2. Find $dq_j = \min(dq_1, dq_2, ..., dq_N)$.
3. Assign this pixel to the $j^{th}$ bin.
4. Average the counts in each bin.
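Putting the last two sections together, a numpy sketch of this binning strategy (it reuses the per-pixel $|q|$ map, e.g. built from `pixel_to_q` above; names are illustrative):
```python
import numpy as np

def radial_average(counts, q_pixel, wavelength, z, pixel=8e-3):
    """Average `counts` into radial bins of one pixel width."""
    n_bins = counts.shape[0]
    # Bin edges: theta_i_max = arctan(i * y_width / z), converted to q.
    theta_max = np.arctan(np.arange(1, n_bins + 1) * pixel / z)
    q_max = 4.0 * np.pi / wavelength * np.sin(theta_max / 2.0)
    q_min = np.concatenate(([0.0], q_max[:-1]))
    q_centres = 0.5 * (q_min + q_max)
    # Steps 1-3: assign each pixel to the bin with the nearest q centre.
    idx = np.abs(q_pixel.ravel()[:, None] - q_centres[None, :]).argmin(axis=1)
    # Step 4: average the counts in each bin.
    total = np.bincount(idx, weights=counts.ravel(), minlength=n_bins)
    n_px = np.bincount(idx, minlength=n_bins)
    return q_centres, total / np.maximum(n_px, 1)
```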
## Some thoughts about what's wrong with the ILL algorithm
Run `SANSILLReduction()` using the parameter `Runs`:
1. Add some SANS-1 parameters to the ILL code: [github commit](https://github.com/mantidproject/mantid/commit/68c839a8e85b6211d35e31410918f830b901d67c)
2. In line 1003, 'remove' `self.apply_solid_angle(ws)`: [github](https://github.com/mantidproject/mantid/blob/68c839a8e85b6211d35e31410918f830b901d67c/Framework/PythonInterface/plugins/algorithms/WorkflowAlgorithms/SANSILLReduction2.py)

After these changes we can run the algorithm, but the raw data files first have to be saved in the NeXus format, because the parameter `Runs` accepts only `.nxs` files as input.
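A minimal call sketch under these constraints; the file and workspace names are illustrative:
```python
# Drive the patched algorithm through its `Runs` property, which only
# accepts NeXus files; input file and output names are illustrative.
from mantid.simpleapi import SANSILLReduction

SANSILLReduction(Runs='sans1_sample.nxs',
                 ProcessAs='Sample',
                 OutputWorkspace='reduced_sample')
```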
Run `SANSILLReduction()` using the parameter `SampleWorkspace` (we can use either `Runs` or `SampleWorkspace`):
If we use the `SampleWorkspace` parameter, then regardless of the parameter `ProcessAs`, the algorithm behaves exactly as with `ProcessAs=Sample`. It is not clear to me why `ProcessAs` can still be chosen, because it does not affect anything.
In my understanding, `SampleWorkspace` should be renamed to `InputWorkspace` and the algorithm should let the user choose `ProcessAs`; that would allow working with already loaded workspaces and bring several benefits: better performance, easier data visualisation, and simpler detector masking. [github commit](https://github.com/mantidproject/mantid/commit/783ba15a02b793c611741937e66d387d4be3acfe)
### Should be tested
* `apply_solid_angle` (*SANSILLReduction2*, line 1007): calculates the solid angle and divides by it; the call is removed for now (see above). The underlying algorithm [SolidAngle_v1](https://docs.mantidproject.org/nightly/algorithms/SolidAngle-v1.html) shows unexpected behavior during data reduction.
* [CalculateDynamicRange](https://docs.mantidproject.org/v6.1.0/algorithms/CalculateDynamicRange-v1.html): performs automatic calculation of the qmin and qmax values (works in a different way than BerSANS); see the sketch below.
* Theory: cross-check the computed qmin and qmax values.
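For the `CalculateDynamicRange` check, a small sketch; the workspace name is illustrative and we assume the algorithm stores its results in the sample logs `qmin` and `qmax`:
```python
# Cross-check the automatic q-range against the BerSANS values.
from mantid.simpleapi import CalculateDynamicRange, mtd

CalculateDynamicRange(Workspace='ws')   # 'ws' is an already loaded workspace
run = mtd['ws'].getRun()
print(run.getLogData('qmin').value, run.getLogData('qmax').value)
```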