I would like to use the msmsTests package for differential expression analysis of proteins (spectral counts). However, the vignette only covers working with an MSnSet object, not how to get your own data into that class.
The pRoloc vignette does show how to convert csv files to an MSnSet using the readMSnSet constructor function (section 2.2.2). The problem is that the msmsTests and RforProteomics vignettes use a differently structured MSnSet object, which produces different results. For example:
pRoloc creation of MSnSet:
library("MSnbase")  ## provides readMSnSet() and pData()
## Locate the three example csv files shipped with pRolocdata
f1 <- dir(system.file("extdata", package = "pRolocdata"), full.names = TRUE, pattern = "exprsFile.csv")
f2 <- dir(system.file("extdata", package = "pRolocdata"), full.names = TRUE, pattern = "fdataFile.csv")
f3 <- dir(system.file("extdata", package = "pRolocdata"), full.names = TRUE, pattern = "pdataFile.csv")
## Build the MSnSet from expression, feature and sample (pheno) data
tan2009r1 <- readMSnSet(exprsFile = f1, featureDataFile = f2, phenoDataFile = f3, sep = ",")
pData(tan2009r1)
RforProteomics creation of MSnSet:
library("msmsEDA")
library("msmsTests")
data(msms.dataset)
## Pre-process expression matrix
e <- pp.msms.data(msms.dataset)
pData(e)
As you can tell, the pData() results differ between the two objects. It would be nice if I could figure out how to convert msms.dataset (an MSnSet object) into something I can inspect to see how these objects differ, since calling head() or tail() directly on an MSnSet does not work. If I could see what it looks like, I could figure out how to convert my own data into this class for use in differential analysis and visualization.
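For readers with the same question, here is a minimal sketch of how the pieces of an MSnSet can be looked at, assuming the MSnbase package is installed. The toy spectral counts, the sample names, and the `treat` column are all made up for illustration; they are not taken from msms.dataset or from any vignette.

```r
library("MSnbase")

## Hypothetical spectral counts: 3 proteins (rows) x 4 samples (columns)
counts <- matrix(c(10, 0, 5,
                   12, 1, 4,
                   30, 0, 6,
                   28, 2, 7),
                 nrow = 3,
                 dimnames = list(c("P1", "P2", "P3"),
                                 c("ctl_1", "ctl_2", "trt_1", "trt_2")))

## Sample annotation: row names must match the column names of 'counts'
pd <- data.frame(treat = factor(c("ctl", "ctl", "trt", "trt")),
                 row.names = colnames(counts))

## Feature annotation: row names must match the row names of 'counts'
fd <- data.frame(accession = rownames(counts),
                 row.names = rownames(counts))

toy <- MSnSet(exprs = counts, fData = fd, pData = pd)

## The three slots are ordinary R objects, so head(), str() etc. work on them
head(exprs(toy))   # numeric matrix of counts
pData(toy)         # data.frame describing the samples
head(fData(toy))   # data.frame describing the proteins
```

The same three accessors should work on any MSnSet, for example `exprs(msms.dataset)` and `pData(msms.dataset)` after `data(msms.dataset)`, which is one way to compare how two such objects are laid out.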
Hello, and thank you for trying to help me, Laurent.
I'll put this into two comments for clarity.
One thing I wanted to know was how to actually inspect the sample data provided in the pRoloc data set. I'll use hyperLOPIT-SIData-ms3-rep12-intersect.csv since this is what you showed above.
I figured out how to inspect it by doing the following:
Then you can simply read that location in via read.csv() as a data.frame and inspect what it looks like with head(), tail(), str(), etc.
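A sketch of that inspection step, assuming the pRolocdata package is installed and that the file sits in its extdata directory as in the examples earlier in this thread. The variable names `f` and `raw` are placeholders, and the file may need extra read.csv() arguments (for instance `skip`) depending on its header layout.

```r
## Locate the csv file shipped with pRolocdata (same dir()/system.file() idiom as above)
f <- dir(system.file("extdata", package = "pRolocdata"),
         full.names = TRUE,
         pattern = "hyperLOPIT-SIData-ms3-rep12-intersect.csv")

## Read it as a plain data.frame; read.csv() also handles gzip-compressed files
raw <- read.csv(f[1], stringsAsFactors = FALSE)

## Ordinary data.frame inspection
head(raw)
tail(raw)
str(raw)
```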
What may not be clear to future people visiting this page is that when you did
8:27 refers to the columns of NSAF (normalized) spectral counts.
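To illustrate how a column range like this selects the quantitative columns, here is a small sketch using readMSnSet2() from MSnbase on a made-up data.frame. The column names and values are hypothetical, and this is not necessarily the exact call used in the answer above.

```r
library("MSnbase")

## Hypothetical table: two annotation columns followed by three sample columns
df <- data.frame(protein     = c("P1", "P2", "P3"),
                 description = c("kinase", "transporter", "unknown"),
                 s1 = c(10, 0, 5),
                 s2 = c(12, 1, 4),
                 s3 = c(30, 0, 6),
                 stringsAsFactors = FALSE)

## 'ecol' gives the columns holding expression data (here 3 to 5);
## the remaining columns become feature metadata.
## 'fnames' names the column used for feature (row) names.
x <- readMSnSet2(df, ecol = 3:5, fnames = "protein")

exprs(x)        # the three sample columns as a matrix
fvarLabels(x)   # the annotation columns kept as feature data
sampleNames(x)  # taken from the names of the 'ecol' columns
```

With a real file, the path to the csv would be passed instead of `df`, and `ecol` would be set to whichever columns hold the quantitative values.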
Okay, now that that's clear to me, let's move to the second part of the issue, as the example you provided does not work with the msmsTests package (see comment below).
Just one clarification to avoid confusing other readers: the protein expression data (in columns 8 to 27) aren't NSAF but TMT 10-plex data.