Hello,
I'm using edgeR to analyze RNA-seq data containing a Control (CR) and two Treatments (HR and SR), with 2 replicates for each.
Based on the common dispersion, the BCV, and the MDS plot, it looks like the biological replicates of the HR samples do not correlate well. The common dispersion I got is 0.39742 (BCV = 0.6304). Could someone please comment on whether this dispersion is too large, or whether it is OK to proceed with the DE analysis?
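For reference, as I understand it, edgeR reports the BCV as the square root of the dispersion, so the two numbers above are the same quantity on different scales. Here $\phi$ is just my notation for the common dispersion:

```latex
\mathrm{BCV} = \sqrt{\phi}, \qquad \sqrt{0.39742} \approx 0.6304
```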
Here is the code I used and my sessionInfo().
Thank you!
Avinash
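library(edgeR)
# Read the count table (genes in rows, samples in columns).
# filetxt is assumed to hold the path to the count file.
x <- read.delim(filetxt, row.names=1, stringsAsFactors=FALSE)
group <- factor(c("CR", "CR", "HR", "HR", "SR", "SR"))
y <- DGEList(counts=x, group=group)
# Keep genes with CPM > 1 in at least 2 samples
keep <- rowSums(cpm(y) > 1) >= 2
y2 <- y[keep,]
# Library sizes (in millions) after filtering, then reset lib.size to match
colSums(y2$counts) / 1e06
y2$samples$lib.size <- colSums(y2$counts)
# Normalization (TMM by default) and MDS plot
y3 <- calcNormFactors(y2)
plotMDS(y3)
# Design matrix with one column per group (no intercept)
design <- model.matrix(~0+group, data=y3$samples)
colnames(design) <- levels(group)
design
colnames(design)
# Estimate common, trended, and tagwise dispersions
y3 <- estimateGLMCommonDisp(y3, design, verbose=TRUE)
# Disp = 0.39742 , BCV = 0.6304
y3 <- estimateGLMTrendedDisp(y3, design)
y3 <- estimateGLMTagwiseDisp(y3, design)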
sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines parallel stats graphics grDevices utils datasets methods base

other attached packages:
 [1] BiocInstaller_1.10.4 edgeR_3.4.2 limma_3.17.21 topGO_2.12.0 SparseM_1.03 GO.db_2.9.0
 [7] RSQLite_0.11.4 DBI_0.2-7 AnnotationDbi_1.23.18 Biobase_2.21.6 BiocGenerics_0.7.3 graph_1.38.3
[13] plyr_1.8 reshape2_1.2.2 ggplot2_0.9.3.1