Hi all,
I'm having trouble getting reproducible peak areas with findChromPeaks and MatchedFilterParam on centroided high-res MS data. While the peak height (maxo) is spot-on with the manual integration results from the instrument software (Thermo Freestyle), the peak areas are in many cases off by up to 50%.
This seems to depend strongly on the FWHM setting in the MatchedFilterParam object, but no single setting gives consistent results across the range of peak widths.
Am I missing something, or is this just how the algorithm models the peak (instead of measuring the area under the peak)?
PS: I tried CentWaveParam first, but it missed obvious peaks entirely (and had the same issue with peak areas).
Thanks a lot, -Y
Settings used:
Object of class: MatchedFilterParam
Parameters:
binSize: 0.01
impute: none
baseValue:
distance:
fwhm: 6
sigma: 2.547987
max: 50
snthresh: 100
steps: 1
mzdiff: 0.03
index: FALSE
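For reference, the fwhm and sigma values in the listing above are linked by the usual Gaussian relation, and at a fixed apex height the area of an idealized Gaussian peak scales linearly with its width. The following is a minimal base R sketch of that arithmetic only. It does not claim to reproduce how matchedFilter computes 'into'. The helper names (fwhm_to_sigma, gauss_area) are hypothetical, and the factor 2.3548 is an assumption chosen because it reproduces the sigma printed above.

```r
# Sketch only: idealized Gaussian peak, not the xcms integration code.
# The factor 2.3548 (about 2 * sqrt(2 * log(2))) is an assumption chosen
# because it reproduces the sigma shown in the settings listing.
fwhm_to_sigma <- function(fwhm) fwhm / 2.3548

# Area under a Gaussian with apex height `height` and width `fwhm`
gauss_area <- function(height, fwhm) {
  height * fwhm_to_sigma(fwhm) * sqrt(2 * pi)
}

sigma <- fwhm_to_sigma(6)
print(sigma)

# Same apex height, different widths: the area scales with the width
widths <- c(3, 6, 12)
areas <- gauss_area(height = 1, fwhm = widths)
print(data.frame(fwhm = widths, area = areas, relative = areas / areas[2]))
```

With matching heights, any mismatch between the width a method assumes or integrates over and the true peak width shows up directly in the area, which would be consistent with maxo agreeing while the area does not.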
What exactly do you mean by reproducible peak areas? Do you get different peak areas each time you run findChromPeaks with MatchedFilterParam on the same file?
Hi Johannes,
What I mean is that, compared to the manual integration performed on the EIC in the Thermo Fisher software, the 'into' results of matchedFilter are off by up to 50%. The results shift around depending on the FWHM parameter, but never come close to the 'real' values (from the manual integration). Repeated calls to findChromPeaks, however, do return the same values.
I would understand (and accept) an offset of 10% or less, but this looks like a bug to me (or a user problem on my end).
Thanks for the help!
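One possible explanation for the FWHM dependence, offered as an assumption and not as a statement about xcms internals: if the integration limits follow the filter width and not the actual peak width, wide peaks get truncated and their area is underestimated. A minimal base R sketch with simulated Gaussian peaks and a fixed integration window illustrates the effect. The names trapz, peak and window_fraction are hypothetical helpers, and the half window of 6 seconds is arbitrary.

```r
# Sketch only: simulated peaks, fixed integration window (an assumption,
# not the xcms code path).

# Trapezoidal rule for the area under y(x)
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# Retention time axis in seconds, 0.1 s spacing
rt <- seq(-600, 600) / 10

# Gaussian peak with apex height 1 centred at rt = 0
peak <- function(rt, fwhm) exp(-rt^2 / (2 * (fwhm / 2.3548)^2))

# Fraction of the full peak area that falls inside +/- half_window
window_fraction <- function(peak_fwhm, half_window) {
  y <- peak(rt, peak_fwhm)
  inside <- abs(rt) <= half_window
  trapz(rt[inside], y[inside]) / trapz(rt, y)
}

# Peaks of increasing width, integrated over the same window
peak_fwhm <- c(3, 6, 12)
fractions <- sapply(peak_fwhm, window_fraction, half_window = 6)
print(data.frame(peak_fwhm = peak_fwhm, fraction_integrated = fractions))
```

The integrated fraction drops as the peak gets wider than the window, while the apex height is unaffected. Plotting the EIC together with the detected peak boundaries would show whether something similar happens in the real data.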
Thanks for the clarification! What I suggest is to evaluate the detected chromatographic peak by extracting the EIC (with the chromatogram function) and plotting it. It might be that not the full peak is integrated. You might find some information on that in the package vignette or in this tutorial.
Also, I would give centWave another try - the advantage over matchedFilter is that it can find chromatographic peaks with different rt widths. But you need to set the peakwidth parameter based on the expected size of your peaks. If you have peaks with an average peak width of e.g. 4 seconds you should set it to peakwidth = c(2, 10) - but this really depends on your data. Also, I would suggest using integrate = 2, which IMO better defines the peak boundaries/edges.
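As a starting point, the centWave settings suggested above could be written as follows. This is a minimal sketch that assumes the xcms package is installed. The commented lines use hypothetical placeholders (raw_data for the loaded data object, mz_lo/mz_hi and rt_lo/rt_hi for the m/z and retention time window of one compound), and the peakwidth values are just the example from the post, not a recommendation for any particular data set.

```r
# Sketch only: builds the parameter object; no data is processed here.
if (requireNamespace("xcms", quietly = TRUE)) {
  # peakwidth: expected min/max chromatographic peak width in seconds
  # integrate = 2: alternative peak boundary detection on the real data
  cwp <- xcms::CentWaveParam(peakwidth = c(2, 10), integrate = 2L)
  print(cwp)

  # Hypothetical usage, assuming `raw_data` holds the loaded raw files:
  # res <- xcms::findChromPeaks(raw_data, param = cwp)
  #
  # Extract and plot the EIC of one compound to inspect what was integrated
  # (mz_lo, mz_hi, rt_lo, rt_hi are placeholders to fill in):
  # eic <- xcms::chromatogram(res, mz = c(mz_lo, mz_hi), rt = c(rt_lo, rt_hi))
  # plot(eic)
}
```

The remaining centWave parameters (ppm, snthresh, prefilter and so on) are left at their defaults here and will likely need tuning for high-res data.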