significance of changes in TMT proteomics data using vsn2 and limma
@lukasburger-7892
Last seen 9.2 years ago
Switzerland

I am working on TMT-labelled proteomics data (3 conditions in triplicate). The peptide-level mean-variance relationship of my data appears to agree well with an additive-multiplicative error model, so I used vsn2 to transform the data. Now that the variance is stabilized, I believe I could use limma directly to assess the significance of changes at the peptide level. However, I would like a model for significance at the protein level that takes into account the number of detected peptides per protein (which is quite variable) and assigns higher significance to a protein with several peptides showing consistent changes than to one with just a single peptide showing the same change. Is there a way to use limma (on the vsn-transformed data) to this end?

limma vsn2 • 3.2k views

I can't speak for what happens with proteomics data, but in general, a variance-stabilizing normalization is not a prerequisite for analyses with limma. Instead, you can model the mean-variance relationship by running eBayes with trend=TRUE.

Edit: To be clear, I'm referring to the VSN procedure done by method="vsn" in normalizeBetweenArrays. Most analyses start off by log-transforming the intensities, which already stabilizes the variance a bit. My point is that we usually don't bother with more sophisticated stabilization procedures, and trust limma (or voom, for RNA-seq) to handle the modelling of the mean-variance relationship.


Aaron - that's a surprising statement. Chapters 6 and 8 of the limma User's Guide recommend log-transformation and background correction, which together have an approximate variance-stabilising effect. Are you saying one shouldn't do this and should just feed untransformed intensities into limma (for microarrays, or any other technology)?


 


You're right, my apologies; I was referring specifically to the method="vsn" option in normalizeBetweenArrays. Comment's been amended.

@ryan-c-thompson-5618
Last seen 10 weeks ago
Icahn School of Medicine at Mount Sinai…

Limma fits a separate linear model to each "feature", which in your case is a peptide. However, you could use a self-contained gene set test like roast (also part of limma) to combine the results for proteins with multiple peptides, giving a p-value for each protein that tests the null hypothesis that none of its peptides are differentially expressed. You would treat each protein as a "gene set" consisting of all the peptides associated with it.

Also, if your proteomics data consists of discrete counts of peptides, you may want to try either voom or edgeR if their assumptions match your data, since both are designed precisely for analyzing count data. (roast is also available for these analyses)


Ryan, I think Lukas is looking for something more robust, and possibly more aware of technology-specific effects, than meta-analysis techniques from gene set enrichment analysis.

E.g. in the MaxQuant paper (http://www.nature.com/nbt/journal/v26/n12/full/nbt.1511.html) they quantify the protein with the median of the peptide-level data, and in http://www.mcponline.org/content/9/9/1885.long we (vaguely) recommended a trimmed mean. Others on this list probably know about more sophisticated summarisation methods.
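
To make the two summaries concrete, here is a minimal language-neutral sketch in plain Python (in practice this would be done in R on the vsn-transformed matrix). The peptide names, intensities, peptide-to-protein mapping and the 20% trimming fraction are all made up for illustration; neither paper is being quoted for these values.

```python
from statistics import mean, median


def trimmed_mean(values, trim=0.2):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    if not 0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim)  # number of values cut from each end
    return mean(ordered[k:len(ordered) - k])


def summarise_proteins(peptide_values, peptide_to_protein, summary=median):
    """Collapse peptide-level values of one sample into one value per protein."""
    grouped = {}
    for peptide, value in peptide_values.items():
        grouped.setdefault(peptide_to_protein[peptide], []).append(value)
    return {protein: summary(vals) for protein, vals in grouped.items()}


# Hypothetical transformed intensities for a single sample.
peptides = {"pep1": 10.0, "pep2": 10.4, "pep3": 14.0,
            "pep4": 8.0, "pep5": 9.8, "pep6": 7.5}
# Hypothetical mapping: five peptides for protA, a single one for protB.
mapping = {"pep1": "protA", "pep2": "protA", "pep3": "protA",
           "pep4": "protA", "pep5": "protA", "pep6": "protB"}

by_median = summarise_proteins(peptides, mapping, summary=median)
by_trimmed = summarise_proteins(peptides, mapping, summary=trimmed_mean)
print(by_median)
print(by_trimmed)
```

Both summaries damp the influence of the outlying peptides (pep3 and pep4), but note that neither carries the number of peptides per protein forward, so a single-peptide protein such as protB gets a value with no indication of how little evidence supports it.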

vsn2 is designed precisely for data where the variance v depends on the mean m through a relationship of the form v(m) = c*m^2 + b. One can easily check the fit of this assumption on real data. Statements of the form 'method X was designed for count data' on the other hand seem less verifiable.
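
As a rough illustration of what such a check could look like, the sketch below fits v = c*m^2 + b by ordinary least squares on (m^2, v) pairs and reports the residuals. This is only a plain-Python sanity check under made-up numbers, not the estimation procedure vsn2 itself uses; on real data the per-peptide means and variances would come from the replicates.

```python
def fit_quadratic_variance(means, variances):
    """Least-squares fit of v = c * m**2 + b; returns (c, b)."""
    xs = [m * m for m in means]  # regress the variance on the squared mean
    n = len(xs)
    x_bar = sum(xs) / n
    v_bar = sum(variances) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxv = sum((x - x_bar) * (v - v_bar) for x, v in zip(xs, variances))
    c = sxv / sxx
    b = v_bar - c * x_bar
    return c, b


# Synthetic example: variances generated exactly from c = 0.04, b = 2.0.
example_means = [1.0, 2.0, 4.0, 8.0, 16.0]
example_vars = [0.04 * m * m + 2.0 for m in example_means]

c_hat, b_hat = fit_quadratic_variance(example_means, example_vars)
residuals = [v - (c_hat * m * m + b_hat)
             for m, v in zip(example_means, example_vars)]
print(c_hat, b_hat)
print(max(abs(r) for r in residuals))
```

On real data, residuals that show a systematic trend with the mean (rather than scattering around zero) would suggest the quadratic form does not describe the data well.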

 
