Microarray PCA/MDS/SVD
Yannick Wurm ▴ 220
@yannick-wurm-2314
Last seen 10.2 years ago
Dear list,

could someone recommend a review or book on PCA/MDS/SVD/Factor Analysis techniques & best practices for gene expression data?

I want to visualize each of my samples (3 different conditions; 20 timepoints - no need to visualize replication because things will become too messy). But I'm frankly a bit overwhelmed by the plethora of options.

My doubts include:
- when it's appropriate to use which technique
- should I use it on my complete normalized gene expression data set? Or only on significant genes? Or on the covariance matrix between microarrays?
- even for a simple PCA, there are an overwhelming number of implementations in R (ade4's dudi.pca, prcomp, princomp, several in MASS, several in pcaMethods)

thanks :o)

yannick

--------------------------------------------
yannick . wurm @ unil . ch
Ant Genomics, Ecology & Evolution @ Lausanne
http://www.unil.ch/dee/page28685_fr.html
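On the question of raw data versus a matrix computed between arrays, and on why several R implementations can agree, the standard linear algebra can be written down in a few lines. This is only a sketch under one assumption that is not in the post: X holds the n arrays as rows and the p genes as columns, and every gene has been centered to mean zero. The symbols X, U, D and V are introduced here for illustration.

```latex
% Assumption: X is n x p (arrays x genes), each gene (column) centered to mean 0.
% Thin SVD of the centered data:
X = U D V^{\top}
% Gene-by-gene covariance matrix (p x p), whose eigenvectors are the loadings V:
\frac{1}{n-1} X^{\top} X = V \, \frac{D^{2}}{n-1} \, V^{\top}
% Array-by-array inner-product matrix (n x n), whose eigenvectors are U:
X X^{\top} = U D^{2} U^{\top}
% Coordinates of the arrays on the principal components:
X V = U D
```

Under that assumption, an SVD of the data (as prcomp does), an eigen-decomposition of the gene covariance matrix (as princomp does, with a divisor of n instead of n-1) and an eigen-decomposition of the array-by-array matrix all lead to the same sample coordinates U D, up to sign and scaling. Classical MDS on Euclidean distances between arrays recovers the same coordinates. Centering each array across genes instead, which is what a literal covariance between microarrays would use, is a different operation and generally gives a different picture.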
@wolfgang-huber-3550
Last seen 3 months ago
EMBL European Molecular Biology Laborat…
Hi Yannick

this one is a good start:

The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Series: Springer Series in Statistics
Hastie, Trevor, Tibshirani, Robert, Friedman, Jerome
1st ed. 2001. Corr. 3rd printing, 2003, 552 p., Hardcover
ISBN: 978-0-387-95284-0

A second edition is coming out early next year.

Re your points - see below:

> could someone recommend a review or book on PCA/MDS/SVD/Factor Analysis
> techniques & best practices for gene expression data?
>
> I want to visualize each of my samples (3 different conditions; 20
> timepoints - no need to visualize replication because things will become
> too messy) .

I am not sure I understand: making sure that variation between replicates is small compared to the variation between your conditions and timepoints seems like a basic sanity test - without which anything that follows would be a waste of time.

> But I'm frankly a bit overwhelmed by the plethora of options.
>
> My doubts include:
> - when it's appropriate to use which technique
> - should I use it on my complete normalized gene expression data
> set? Or only on significant genes?

Depends on what you want to see. The two options look for different things.

Best wishes
Wolfgang

----------------------------------------------------
Wolfgang Huber, EMBL-EBI, http://www.ebi.ac.uk/huber
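The sanity test described above (replicate-to-replicate variation should be small compared to variation between conditions and timepoints) can be sketched in a few lines. This is a minimal illustration in plain Python, not the method used by anyone in this thread: the `profiles` data, the `mean_within_between` helper and the choice of Euclidean distance are all assumptions made for the example.

```python
import math
from itertools import combinations

# Hypothetical toy data: (condition, replicate) -> log-expression of 3 genes.
profiles = {
    ("cond1", 1): [5.0, 7.0, 2.0],
    ("cond1", 2): [5.2, 6.9, 2.1],
    ("cond2", 1): [8.0, 3.0, 2.0],
    ("cond2", 2): [7.9, 3.2, 1.8],
}

def euclidean(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def mean_within_between(profiles):
    """Mean distance between replicates of the same condition,
    and mean distance between arrays from different conditions."""
    within, between = [], []
    for (key1, v1), (key2, v2) in combinations(profiles.items(), 2):
        d = euclidean(v1, v2)
        if key1[0] == key2[0]:      # same condition label: replicate pair
            within.append(d)
        else:                       # different conditions
            between.append(d)
    return sum(within) / len(within), sum(between) / len(between)

within, between = mean_within_between(profiles)
print(f"mean within-replicate distance: {within:.3f}")
print(f"mean between-condition distance: {between:.3f}")
```

If the within-replicate figure is not clearly smaller than the between-condition figure on real data, a PCA or MDS plot of condition averages is unlikely to be trustworthy.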
Thanks Wolfgang, I'll look into the book.

>> I want to visualize each of my samples (3 different conditions; 20
>> timepoints - no need to visualize replication because things will
>> become too messy) .
>
> I am not sure I understand: making sure that variation between
> replicates is small compared to the variation between your
> conditions and timepoints seems like a basic sanity test - without
> which anything that follows would be waste of time.

I tried several simple PCAs & see that variation between conditions & timepoints is much bigger than between replicates.

Now I want to be able to make some general statements about the conditions - e.g. which are more similar to each other. But to do that I'd like to understand what I'm doing. In some places I've seen that people simply disregard the first principal component for microarray data. And I see that some distances (e.g. using "maximum" distance with cmdscale) give quite different results than Manhattan or Euclidean distances...

Thanks & regards,
yannick
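One reason the "maximum" distance can disagree with Manhattan or Euclidean distance is that it depends only on the single gene with the largest difference, while the other two accumulate differences over all genes. The sketch below illustrates this with made-up numbers in plain Python; the `samples` values and the `nearest` helper are hypothetical and are not taken from the data discussed here.

```python
import math

def euclidean(x, y):
    """Square root of the summed squared per-gene differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Sum of absolute per-gene differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def maximum(x, y):
    """Largest absolute per-gene difference (supremum norm)."""
    return max(abs(a - b) for a, b in zip(x, y))

# Hypothetical profiles over 4 genes:
# B differs from A strongly in one gene, C differs moderately in every gene.
samples = {
    "A": [0.0, 0.0, 0.0, 0.0],
    "B": [3.0, 0.0, 0.0, 0.0],
    "C": [2.0, 2.0, 2.0, 2.0],
}

def nearest(ref, metric):
    """Name of the sample closest to `ref` under the given metric."""
    others = [name for name in samples if name != ref]
    return min(others, key=lambda name: metric(samples[ref], samples[name]))

for metric in (euclidean, manhattan, maximum):
    print(metric.__name__, "-> nearest to A:", nearest("A", metric))
```

With these numbers, Euclidean and Manhattan distance both place B closest to A, while the maximum distance places C closest, so an MDS layout built on the maximum distance can group samples differently.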