Hi there,
I am trying to filter some mgf files with MSnbase. After reading in the mgf file with readMgfData(), I noticed that the output of writeMgfData() contains missing values in the SCANS field.
I believe this happens because the acquisition number of each spectrum is NA after import, even though the SCANS values are present in the feature data (see below).
Shouldn't the SCANS field be preserved in this case? What would be the best way to carry the SCANS value from the input mgf over to the output?
> library(MSnbase)
>
> system('head ~/Downloads/example.mgf')
BEGIN IONS
TITLE=RawFile: YHE010_02_Slot1-1_1_2987 Charge: 2 FeatureIntensity: 107857 Feature#: 1601 RtApex: 132.65 Precursor: 16
INSTRUMENT=ESI-QUAD-TOF
PEPMASS=414.712204
CHARGE=2+
RTINSECONDS=132.65
SCANS=1601
204.1344 76
209.1016 921
209.1097 174
> mgf <- readMgfData(filename = '~/Downloads/example.mgf')
>
> mgf@featureData@data
TITLE INSTRUMENT PEPMASS
X1 RawFile: YHE010_02_Slot1-1_1_2987 Charge: 2 FeatureIntensity: 107857 Feature#: 1601 RtApex: 132.65 Precursor: 16 ESI-QUAD-TOF 414.712204
X2 RawFile: YHE010_02_Slot1-1_1_2987 Charge: 2 FeatureIntensity: 478 Feature#: 1602 RtApex: 121.43 Precursor: 16 ESI-QUAD-TOF 414.865694
X3 RawFile: YHE010_02_Slot1-1_1_2987 Charge: 2 FeatureIntensity: 59363 Feature#: 4574001 RtApex: 1956.81 Precursor: 45740 ESI-QUAD-TOF 813.431709
X4 RawFile: YHE010_02_Slot1-1_1_2987 Charge: 3 FeatureIntensity: 1022373 Feature#: 4976001 RtApex: 2009.84 Precursor: 49760 ESI-QUAD-TOF 1270.314164
CHARGE RTINSECONDS SCANS
X1 2+ 132.65 1601
X2 2+ 121.43 1602
X3 2+ 1956.81 4574001
X4 3+ 2009.84 4976001
> acquisitionNum(mgf)
X1 X2 X3 X4
NA NA NA NA
>
> writeMgfData(object = mgf,con = '~/Downloads/output.mgf')
> system('head ~/Downloads/output.mgf')
COM=Experimentexported by MSnbase on Tue Nov 3 11:51:16 2020
BEGIN IONS
SCANS=NA
TITLE=msLevel 2; retentionTime 132.65; scanNum NA; scanIndex 1601; precMz 414.7122; precCharge 2
RTINSECONDS=132.65
PEPMASS=414.7122
CHARGE=2+
204.1344 76
209.1016 921
209.1097 174
>
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] MSnbase_2.16.0 ProtGenerics_1.22.0 S4Vectors_0.28.0 mzR_2.24.0 Rcpp_1.0.5 Biobase_2.50.0 BiocGenerics_0.36.0
loaded via a namespace (and not attached):
[1] BiocManager_1.30.10 pillar_1.4.6 compiler_4.0.2 plyr_1.8.6 iterators_1.0.13 zlibbioc_1.36.0
[7] tools_4.0.2 digest_0.6.27 ncdf4_1.17 MALDIquant_1.19.3 lifecycle_0.2.0 tibble_3.0.4
[13] preprocessCore_1.52.0 gtable_0.3.0 lattice_0.20-41 pkgconfig_2.0.3 rlang_0.4.8 foreach_1.5.1
[19] rstudioapi_0.11 dplyr_1.0.2 IRanges_2.24.0 generics_0.1.0 vctrs_0.3.4 grid_4.0.2
[25] tidyselect_1.1.0 glue_1.4.2 impute_1.64.0 R6_2.5.0 XML_3.99-0.5 BiocParallel_1.24.0
[31] limma_3.46.0 ggplot2_3.3.2 purrr_0.3.4 magrittr_1.5 scales_1.1.1 pcaMethods_1.82.0
[37] codetools_0.2-16 ellipsis_0.3.1 MASS_7.3-53 mzID_1.28.0 colorspace_1.4-1 affy_1.68.0
[43] doParallel_1.0.16 munsell_0.5.0 vsn_3.58.0 crayon_1.3.4 affyio_1.60.0