of limma and superfluous arrays


Yannick Wurm ▴ 220

@yannick-wurm-2314


Thanks Gordon!

cheers,
yannick

On Jan 31, 2008, at 0:21, Gordon Smyth wrote:

> Dear Yannick,
>
> From a statistical point of view, you should include in your limma
> analysis any arrays you have which will share the same genewise
> variances as the arrays involved in your contrasts.
>
> How do you know which arrays share the same genewise variances? In
> practice, this means you should include arrays with very comparable
> RNA samples (same tissue, similar treatment), same probe set,
> collected and hybridised at the same time, i.e., arrays which really
> are part of the same greater experiment. Arrays more different than
> that should not be included.
>
> Best wishes
> Gordon
>
>> Date: Tue, 29 Jan 2008 22:35:41 +0100
>> From: Yannick Wurm <yannick.wurm at unil.ch>
>> Subject: [BioC] of limma and superfluous arrays
>> To: bioconductor at stat.math.ethz.ch
>>
>> Dear List,
>>
>> I'm starting to do limma analyses on a small timecourse loop design
>> with 2-color cDNA chips as follows:
>>     0h vs 6h
>>     6h vs 24h
>>     24h vs 0h
>> Four biological replicates -> and then four biological replicates dye
>> balanced <-
>>
>> My targets file begins like this (only the first two sets of three
>> listed):
>>     US22502600_F82_S01.gpr     A_0h     A_24h
>>     US22502600_F65_S01.gpr     A_24h    A_6h
>>     US22502600_F153_S01.gpr    A_6h     A_0h
>>     US22502600_F85_S01.gpr     F_0h     F_6h
>>     US22502600_F60_S01.gpr     F_24h    F_0h
>>     US22502600_F72_S01.gpr     F_6h     F_24h
>> ... with eight such sets of three.
>>
>> But then I also have some chips -> against our labs "standard"
>> reference RNA:
>>     US22502600_F67_S01.gpr     A_24h    Ref
>>     US22502600_F83_S01.gpr     F_24h    Ref
>> ... and six more
>>
>> For my limma analysis, I have three options:
>>
>> *a*: use only the minimal number of chips (ie each loop of three,
>> and nothing to connect the loops). In this case, limma is unable to
>> estimate one parameter in each small loop (eg the 6h timepoint). I
>> can ask how many genes are differentially expressed between 24h
>> and 0h:
>> > design.noref = modelMatrix(targets.noref, ref="A_0h")
>> > fit.noref = lmFit(MA.noref.p, design.noref)
>> > cont.matrix= makeContrasts(T24_T0 =
>>     (A_24h+C_24h+F_24h+K_24h+N_24h+Q_24h+R_24h+T_24h
>>      -C_0h-F_0h-K_0h-N_0h-Q_0h-R_0h-T_0h)/8,
>>     levels=design.noref)
>> > fit.noref2= contrasts.fit(fit.noref, cont.matrix)
>> > fit.noref2=eBayes(fit.noref2)
>> > summary(topTable(fit.noref2,n=10000)$adj.P.Val<=0.05)
>>
>> ---> I get 3668 differentially expressed spots.
>>
>> *b*: provide my "24h" vs Ref chips as well
>> using ref="Ref" in my design and
>> > cont.matrix= makeContrasts(T24_T0 =
>>     (A_24h+C_24h+F_24h+K_24h+N_24h+Q_24h+R_24h+T_24h
>>      -A_0h-C_0h-F_0h-K_0h-N_0h-Q_0h-R_0h-T_0h)/8,
>>     levels=design)
>>
>> ---> I get 3796 differentially expressed spots.
>>
>> *c*: use those in *b*, as well as eight additional chips done in
>> parallel, that are XXX vs Ref. The XXX samples don't connect to
>> anything other than Ref (they're superfluous).
>>
>> ---> I get 3583 differentially expressed spots.
>>
>> Searching the archives, several posts mentioned that providing more
>> chips gives limma a better estimation of variance. Thus it makes
>> sense to provide more. And doing so finds more differentially
>> expressed genes in *b* than in *a*.
>> But so would it be defendable to input all the chips I did in that
>> batch to limma? All the chips I've ever done?
>>
>> And then I get a smaller number of differentially expressed spots in
>> *c* than in *b*. Which surprises me, because using more chips should
>> make my estimation of variance more precise. Comparing *b* with *c*
>> leads me to conclude that the chips I've added to the analysis in *c*
>> are funky because they increase estimates of variance, or that the
>> chips in *b* show artificially low variance.
>>
>> Does this make sense?
>> Obviously, in this analysis my numbers of differentially expressed
>> genes are quite similar in these three cases, and 5% more or less
>> significant spots probably won't make a difference. But it would be
>> good to know what is most valid for future analyses as well.
>>
>>
>> Thanks and regards,
>>
>> yannick
>>
>>
>>
>> --------------------------------------------
>> yannick . wurm @ unil . ch
>> Ant Genomics, Ecology & Evolution @ Lausanne
>> http://www.unil.ch/dee/page28685_fr.html
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
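Gordon's point about shared genewise variances can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the analysis from this thread: the data are simulated, and the array counts, the number of changed genes, the variance distribution and the 3x noise factor are all made up for illustration. It assumes limma is installed and uses only `lmFit()`, `eBayes()` and the `p.value` component of the fit. The design has one coefficient for the comparison of interest (`core`) and a second coefficient (`extra`) for arrays that, like the XXX vs Ref chips, do not inform the contrast and only contribute residual degrees of freedom.

```r
library(limma)  # Bioconductor package; assumed to be installed

set.seed(1)
n_genes <- 2000
n_core  <- 8   # arrays that estimate the comparison of interest
n_extra <- 8   # "superfluous" arrays that only add residual df

# Hypothetical genewise standard deviations (scaled inverse chi-square variances).
sigma_g <- sqrt(0.05 * 4 / rchisq(n_genes, df = 4))

# Core arrays: simulated log-ratios; the first 100 genes truly change.
core <- matrix(rnorm(n_genes * n_core, sd = sigma_g), nrow = n_genes)
core[1:100, ] <- core[1:100, ] + 1

# Extra arrays, variant 1: same genewise variances as the core arrays.
extra_similar <- matrix(rnorm(n_genes * n_extra, sd = sigma_g), nrow = n_genes)
# Extra arrays, variant 2: three times noisier than the core arrays.
extra_noisy <- matrix(rnorm(n_genes * n_extra, sd = 3 * sigma_g), nrow = n_genes)

design_core <- cbind(core = rep(1, n_core))
design_all  <- cbind(core  = rep(c(1, 0), times = c(n_core, n_extra)),
                     extra = rep(c(0, 1), times = c(n_core, n_extra)))

# Number of genes with BH-adjusted p <= 0.05 for the "core" coefficient.
count_sig <- function(y, design) {
  fit <- eBayes(lmFit(y, design))
  sum(p.adjust(fit$p.value[, "core"], method = "BH") <= 0.05)
}

results <- c(core_only    = count_sig(core, design_core),
             plus_similar = count_sig(cbind(core, extra_similar), design_all),
             plus_noisy   = count_sig(cbind(core, extra_noisy), design_all))
print(results)
```

Comparing the three counts shows how extra arrays affect the `core` test depending on whether they share the core arrays' genewise variances. With real data, the analogous check would be to compare the residual standard deviations (`fit$sigma`) with and without the additional chips.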

TimeCourse probe limma timecourse • 733 views

17.2 years ago • Yannick Wurm ▴ 220
