
# Checking features of interest

Following data analysis, you will be faced with a limited set of features of interest (FOIs). This document covers the steps needed to manually check the quality of these FOIs.

## Revert clustered metabolite features back to analyte features

Any FOI that represents a RAMCluster in reality comprises multiple analyte signals, corresponding to isotopes, adducts and in-source fragments. RAMClust features consequently cannot be checked directly in the instrument data. Such RAMClust features must therefore be exploded back into their underlying analyte features.

```
functions code stuff
```

It is not always apparent which is the main analyte feature in a RAMCluster. It is easy to assume that the signal with the highest intensity constitutes the main feature. While this is true for H and C isotopes, this may not be the case for fragments. Remember that some analytes are only or mostly represented by in-source fragments and that higher m/z signals at lower intensity could instead correspond to the main feature. Knowing and looking for neutral losses can be helpful. You may thus want to extract multiple analyte features for downstream checking and metabolite identification! Also remember that the main feature could be an adduct and not necessarily [M+H]+ or [M-H]-.

## Confirm correspondence, peak picking and integration

The first check is a visualization of the extracted ion chromatogram (EIC) for the reported feature, with and without retention adjustment. This is easily performed using the `xcms::featureChromatograms()` function. This function will automatically extract rt and mz ranges for the feature. The Boolean argument `adjustedRtime` will allow you to plot with or without retention adjustment. Normally, you'll also want to expand the retention time region to view the peak in context, using the `expandRt` argument.

```
# Let's say you're interested in a feature with mz=193.131 and rt=258.744
# present in the xcms object `xdata`
library(xcms)

# Extract feature definitions
fd <- featureDefinitions(xdata)

# Find the feature of interest (m/z within 0.001, rt within 3 s)
target_mz <- 193.131
target_rt <- 258.7440
whichFt <- which(abs(fd$mzmed - target_mz) < 0.001 & abs(fd$rtmed - target_rt) < 3)

# Plot EICs without (top) and with (bottom) retention adjustment
par(mfrow = c(2, 1))
feature_chroms <- featureChromatograms(xdata, features = whichFt, expandRt = 20, adjustedRtime = FALSE)
plot(feature_chroms, peakType = 'rectangle', peakBg = NA)
feature_chroms <- featureChromatograms(xdata, features = whichFt, expandRt = 20, adjustedRtime = TRUE)
plot(feature_chroms, peakType = 'rectangle', peakBg = NA)
```

By plotting the EICs directly, you get an indication of peak shape and signal-to-noise, and an overall assessment of whether the raw data likely contain peaks. By comparing EICs with and without retention adjustment, you can assess whether the peaks reported in the feature actually belong to one feature, or whether some type of misalignment has likely occurred. The rectangles highlight the integration borders, so you can confirm whether the peak picking was performed accurately.

These plots can rapidly become cluttered and you may want to downsample to plot only a smaller set of injections. **TO CHECK: At present, I don't know if it's possible to keep the features and RT adjustment if you filter down?**

If these plots indicate that peaks and correspondence look fit-for-purpose, you have now successfully validated that your feature is an accurate representation of the original data.

## Investigate potential contaminants for MS2 analysis

Normally, FOIs will be identified by matching MS2 spectra to in-house and public/online databases. However, you'll need to remember that recorded MS2 spectra come from the collision of analytes (precursors) that have been isolated by a quadrupole.
While the mass of the MS1 feature is recorded in the TOF with high accuracy and used to set the precursor isolation in the quadrupole, the quadrupole only has a mass resolution of approx. 0.5 Da. **Consequently, any analyte that co-elutes with your FOI and has an m/z within your target m/z ± 0.5 will contaminate your MS2 spectrum!**

The easiest way to examine potential precursor contamination is by plotting the EIC for the feature, but with an expanded m/z range, using the `expandMz` argument.

```
# Plot EICs with the original feature m/z range (top) and with gradually
# increasing m/z ranges to investigate potential MS2 precursor contamination
par(mfrow = c(4, 1))

# Original feature m/z range
feature_chroms <- featureChromatograms(xdata, features = whichFt, expandRt = 20)
plot(feature_chroms, peakType = 'rectangle', peakBg = NA)

# m/z range expanded by 0.1
feature_chroms <- featureChromatograms(xdata, features = whichFt, expandRt = 20, expandMz = 0.1)
plot(feature_chroms, peakType = 'rectangle', peakBg = NA)

# m/z range expanded by 0.25
feature_chroms <- featureChromatograms(xdata, features = whichFt, expandRt = 20, expandMz = 0.25)
plot(feature_chroms, peakType = 'rectangle', peakBg = NA)

# m/z range expanded by 0.5 (approximate quadrupole resolution)
feature_chroms <- featureChromatograms(xdata, features = whichFt, expandRt = 20, expandMz = 0.5)
plot(feature_chroms, peakType = 'rectangle', peakBg = NA)
```

If these plots show no additional peaks appearing as the m/z range expands up to the mass resolution of the quadrupole, then recorded MS2 spectra are likely to represent your FOI exclusively. If not, then you should be aware of contamination, with possible effects on identification:

- Not all MS2 spectral peaks will match to pure standards
- Matching to data from pure reference standards will be harder

The contribution of contaminating analytes will vary according to their relative concentrations in the collision cell as well as their propensity for fragmentation.
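
As a complement to the EIC plots, the feature table itself can be screened for other picked features that fall inside the isolation window. The following is a minimal sketch and not part of the original workflow: it uses a toy data frame with invented values in place of `as.data.frame(featureDefinitions(xdata))`, the helper `find_coisolated()` is hypothetical, and the 10 s co-elution window is an assumption that you would need to adapt to your chromatography.

```r
# Toy stand-in for the feature definitions table; all values are invented
fd_toy <- data.frame(
  mzmed = c(193.131, 193.402, 192.710, 193.135, 250.200),
  rtmed = c(258.74, 259.90, 257.10, 400.00, 258.50),
  row.names = c("FT001", "FT002", "FT003", "FT004", "FT005")
)

# Hypothetical helper: list features that co-elute with the target feature
# and lie within the quadrupole isolation window around its m/z
find_coisolated <- function(fd, target, mz_window = 0.5, rt_window = 10) {
  mz_diff <- fd$mzmed - fd[target, "mzmed"]
  rt_diff <- fd$rtmed - fd[target, "rtmed"]
  hits <- abs(mz_diff) <= mz_window & abs(rt_diff) <= rt_window
  hits[rownames(fd) == target] <- FALSE  # exclude the target itself
  out <- data.frame(
    feature = rownames(fd),
    mz_diff = mz_diff,
    rt_diff = rt_diff,
    stringsAsFactors = FALSE
  )[hits, ]
  # Closest in m/z first
  out[order(abs(out$mz_diff)), ]
}

contaminants <- find_coisolated(fd_toy, target = "FT001")
print(contaminants)
```

A screen like this only sees signals that were picked and grouped into features, so an empty result does not rule out contamination. The EIC plots with expanded m/z range remain the more direct check.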