Question

Experimental design with edgeR and DESeq packages (RNA-seq)


Yvan Wenger ▴ 50


Hi everybody,

I just started using edgeR and DESeq and am looking for confirmation that I am not doing something silly.

Basically, we have 7 conditions, and for only 2 of these conditions do we have biological triplicates. Let us say that the samples are "A", "A", "A", "B", "C" (most of the genes are NOT regulated in my experiment). Finally, let us say we just want to compare "B" to "C", but using all the information available. Can we use the whole dataset for estimating the common and tagwise dispersion? Typically using the commands below (note that I compare "B" to "C" here, i.e. the samples without replicates).

edgeR:

    countTable <- read.table('mytable', header=FALSE, row.names=1)
    dge <- DGEList(counts=countTable, group=c("A","A","A","B","C"))
    dge <- calcNormFactors(dge)
    dge <- estimateCommonDisp(dge)
    dge <- estimateTagwiseDisp(dge)
    et <- exactTest(dge, pair=c("B","C"))

or DESeq:

    countTable = read.table('mytable.csv', header=FALSE, row.names=1)
    design = data.frame(row.names = colnames(countTable), condition = c("A","A","A","B","C"))
    condition = design$condition
    cds = newCountDataSet(countTable, condition)
    cds = estimateSizeFactors(cds)
    cds = estimateDispersions(cds)
    res = nbinomTest(cds, "B", "C")

Is it OK to do so (to use samples that are not compared in the end to estimate the dispersion)? Does this correspond to the example "working partially without replicates" from the DESeq manual? Or should I just consider that there are no replicates for samples B and C and proceed by ignoring the other samples completely?

Many thanks!

Yvan

edgeR DESeq • 1.3k views

updated 12.0 years ago by Gordon Smyth 51k • written 12.0 years ago by Yvan Wenger ▴ 50

score 0 · Answer 1 · 2012-11-17

> Date: Thu, 15 Nov 2012 12:09:10 +0100
> From: Yvan Wenger <yvan.wenger at unige.ch>
> To: bioconductor at r-project.org
> Subject: [BioC] Experimental design with edgeR and DESeq packages (RNA-seq)
>
> Finally, let us say we just want to compare "B" to "C", but using all
> the information available. Can we use all the dataset for estimating the
> common and tagwise dispersion? Typically using the commands (note that I
> compare here "B" to "C", thus samples without replicates).
>
> edgeR:
> countTable=read.table('mytable',header=F,row.names=1)
> dge <- DGEList(counts=countTable,group=c("A","A","A","B","C"))
> dge <- calcNormFactors(dge)
> dge <- estimateCommonDisp(dge)
> dge <- estimateTagwiseDisp(dge)
> et <- exactTest(dge, pair=c("B","C"))

Yes, this is a perfectly standard analysis. edgeR estimates the genewise dispersion values from the three replicates for Group A and uses these dispersions even though you are comparing B to C. The assumption here is obviously that A, B and C are similar populations, so that genes with a higher biological coefficient of variation (BCV) in condition A also tend to have a higher BCV in conditions B and C.

Gordon
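To make the point about where the dispersion information comes from more concrete, here is a minimal base-R sketch on simulated counts. It is not edgeR's actual estimator (edgeR uses conditional likelihood with empirical Bayes moderation); it is only a crude method-of-moments calculation, and the gene count, mean range, true dispersion and the helper `moment_dispersion` are all assumptions made up for illustration. It shows that the triplicate group A yields a per-gene dispersion estimate, while a single-sample group such as B yields none on its own.

```r
# Illustration only: crude method-of-moments dispersion estimate,
# NOT edgeR's conditional-likelihood / empirical Bayes procedure.
set.seed(1)

n_genes  <- 200                      # assumed number of genes
true_phi <- 0.1                      # assumed common dispersion (BCV ~ 0.32)
mu       <- runif(n_genes, 50, 500)  # assumed mean count per gene
group    <- c("A", "A", "A", "B", "C")

# Simulate negative binomial counts with var = mu + phi * mu^2 (size = 1 / phi).
# No gene is differentially expressed: every sample shares the same means.
counts <- matrix(
  rnbinom(n_genes * length(group), mu = rep(mu, length(group)), size = 1 / true_phi),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), group)
)

# Hypothetical helper: per-gene moment estimate of the dispersion in one group.
moment_dispersion <- function(x) {
  m <- rowMeans(x)
  v <- apply(x, 1, var)      # var() of a single value is NA
  pmax((v - m) / m^2, 0)     # truncate negative estimates at zero
}

phi_A <- moment_dispersion(counts[, group == "A", drop = FALSE])
phi_B <- moment_dispersion(counts[, group == "B", drop = FALSE])

mean(phi_A)        # average estimate from the three A replicates
all(is.na(phi_B))  # TRUE: one sample carries no variance information
```

In this sketch any test of B against C would have to borrow `phi_A`, which is only reasonable under the assumption stated above, namely that gene-level variability is similar across A, B and C.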