I'm trying to summarize an MSnSet object from PSM-level to peptide-level, and then to protein-level. I do this in two steps because I want to normalize on the peptide level.
However, doing so
pepqnt <- combineFeatures(qnt, groupBy = fData(qnt)$sequence, fun = sum)
pepqnt_S <- normalise(pepqnt, "sum")
pepqnt_norm <- normalise(pepqnt_S, "quantiles.robust")
protqnt <- combineFeatures(pepqnt_norm, groupBy = fData(pepqnt_norm)$accession, fun = sum)
results in the following error:
Error in value[[3L]](cond) :
  duplicate 'row.names' are not allowed
  AnnotatedDataFrame 'initialize' could not update varMetadata:
  perhaps pData and varMetadata are inconsistent?
In addition: Warning message:
non-unique values when setting 'row.names': ‘CV.TMT6.126’, ‘CV.TMT6.127’, ‘CV.TMT6.128’, ‘CV.TMT6.129’, ‘CV.TMT6.130’, ‘CV.TMT6.131’
Why isn't this possible? How can I resolve this problem?
FYI: it is possible to go from the PSM-level straight to the protein-level without grouping by sequence first, but that's not what I want. Also, I can't verify whether pData is consistent or not because I can't find it anywhere.
Full code sample below:
library("RforProteomics")
library(MSnbase)
library(mzR)
library(Rcpp)
library(rpx)
px1 <- PXDataset("PXD000001")
mztab <- pxget(px1, "PXD000001_mztab.txt")
qnt_incl_NA <- readMzTabData(mztab, what = "PEP", version = "0.9")
sampleNames(qnt_incl_NA) <- reporterNames(TMT6)
qnt <- filterNA(qnt_incl_NA)
pepqnt <- combineFeatures(qnt, groupBy = fData(qnt)$sequence, fun = sum)
pepqnt_S <- normalise(pepqnt, "sum")
pepqnt_norm <- normalise(pepqnt_S, "quantiles.robust")
protqnt <- combineFeatures(pepqnt_norm, groupBy = fData(pepqnt_norm)$accession, fun = sum)
Thanks Laurent, your solution #2 works!
Remarkably enough, solution #1 does not. It produces the same error. I'll also comment on the GitHub issue.
Thank you for your answer - I'll investigate the `CV = FALSE` issue.
It should be `cv = FALSE` in lower case - sorry for the confusion.
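For anyone landing here later, this is a rough sketch of how the lower-case `cv = FALSE` argument could be passed in the two-step summarisation. It assumes that `qnt` is the NA-filtered object from the full code sample above, and that the clash comes from the `CV.*` feature variables that `combineFeatures()` appends, as the warning message suggests. The argument is set in both calls here as a cautious choice; the thread does not confirm that this is exactly what either suggested solution looked like. Newer MSnbase releases may also expect `method` rather than `fun`.

```r
library(MSnbase)

# Assumption: `qnt` is the NA-filtered MSnSet built in the full code sample.
# cv = FALSE (lower case) asks combineFeatures() not to append CV.* columns
# to the feature data.
pepqnt <- combineFeatures(qnt, groupBy = fData(qnt)$sequence,
                          fun = sum, cv = FALSE)

# Normalise on the peptide level, as in the original workflow.
pepqnt_S <- normalise(pepqnt, "sum")
pepqnt_norm <- normalise(pepqnt_S, "quantiles.robust")

# Second summarisation step, from peptides to proteins.
protqnt <- combineFeatures(pepqnt_norm,
                           groupBy = fData(pepqnt_norm)$accession,
                           fun = sum, cv = FALSE)

# List any CV.* feature variables still present after both steps.
grep("^CV\\.", fvarLabels(protqnt), value = TRUE)
```

Regarding pData: for an MSnSet the sample metadata can be inspected with `pData(qnt)` and the feature metadata with `fData(qnt)`.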