Xin Davis
We use DESeq to normalize RNA-seq data for mathematical modeling purposes.
Our concern is length bias. After reading the DESeq paper, I think data
normalized with DESeq should be more balanced than data normalized with
edgeR, since the dispersion estimate is included. We need to calculate
correlations between genes and construct a gene regulatory network. What is
your opinion on this?
Questions:
1. How should the control and treatment columns be arranged in a data set?
WT-2 WT-3 WT-4 4-3-1 4-4-2 4-5-1 4-6-2 7-1-2 7-2-3 7-3-1 8-3-1 8-3-2
These are the columns of one row: 3 wild types and 3 groups of replicates.
There are 4 replicates in group 1 (4-3-1 4-4-2 4-5-1 4-6-2), 3 replicates in
group 2 (7-1-2 7-2-3 7-3-1), and 2 replicates in group 3 (8-3-1 8-3-2).
The 3 wild types are the controls for all 3 groups. Is this similar to the
multi-factor designs?
How should the data columns be arranged? I only know the conditions, not the
libType.
The read counts/differential expression of each group should be compared
with its relevant controls. Even though replicates are pooled for the
dispersion estimates, I would not expect replicates from different groups to
be pooled together for the estimates.
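
To make the layout in question 1 concrete, here is a minimal sketch in base R of one condition label per column, in column order. The labels `WT`, `G1`, `G2`, `G3` and the names `samples`, `conditions`, `design` and `countTable` are placeholders chosen for illustration, and the DESeq calls in the trailing comments are only how I would expect a single-factor analysis to be run, not something I have confirmed for this design.

```r
# Column names exactly as they appear in the count table.
samples <- c("WT-2", "WT-3", "WT-4",
             "4-3-1", "4-4-2", "4-5-1", "4-6-2",
             "7-1-2", "7-2-3", "7-3-1",
             "8-3-1", "8-3-2")

# One label per column, in the same order (labels are placeholders).
conditions <- factor(rep(c("WT", "G1", "G2", "G3"), times = c(3, 4, 3, 2)),
                     levels = c("WT", "G1", "G2", "G3"))

# Small design table to check that every column got the intended label.
design <- data.frame(row.names = samples, condition = conditions)
print(table(design$condition))

# With DESeq loaded and a count matrix `countTable` (hypothetical name) whose
# columns are in this order, the calls would presumably be (not run here):
#   cds <- newCountDataSet(countTable, conditions)
#   cds <- estimateSizeFactors(cds)
#   cds <- estimateDispersions(cds)
#   res <- nbinomTest(cds, "WT", "G1")   # one call per group vs. WT
```

The physical order of the columns should not matter as long as the label vector follows the same order as the columns.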
2. The output file contains the statistical analysis results. I think I
should select the genes with padj < 0.05, but I don't see how to get the
normalized counts, which should be decimals rather than integers.
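
For question 2, my understanding from the DESeq paper is that normalization divides each column by a per-sample size factor, taken as the median ratio of that sample's counts to the per-gene geometric mean. The sketch below reproduces that idea in base R on a made-up 3 x 2 matrix (the numbers and the names `counts_toy`, `size_factors`, `norm_counts` are invented for illustration). It is not DESeq's own code; on a real `cds` object the call `counts(cds, normalized=TRUE)` would presumably return the same kind of table.

```r
# Toy count matrix: sample s2 is sequenced exactly twice as deep as s1.
counts_toy <- matrix(c(10, 100, 50,
                       20, 200, 100),
                     ncol = 2,
                     dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

# Per-gene log geometric mean across samples.
log_geo_means <- rowMeans(log(counts_toy))

# Size factor per sample: median ratio to the geometric mean,
# skipping genes whose geometric mean is zero (log is -Inf).
size_factors <- apply(counts_toy, 2, function(col) {
  exp(median((log(col) - log_geo_means)[is.finite(log_geo_means)]))
})

# Normalized counts: each column divided by its size factor.
norm_counts <- sweep(counts_toy, 2, size_factors, "/")

print(size_factors)
print(norm_counts)
```

After dividing by the size factors the two columns agree, and the values are no longer integers.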
3. MA plot: sometimes I get the plot and sometimes I do not, and I don't
know why. I just copied the code from the documentation.
plotDE <- function(res)
  plot(res$baseMean, res$log2FoldChange, log="x", pch=20, cex=.3,
       col=ifelse(res$padj < .1, "red", "black"))
This one doesn't work at all.
plotDispEsts <- function(cds) {
  plot(rowMeans(counts(cds, normalized=TRUE)),
       fitInfo(cds)$perGeneDispEsts,
       pch='.', log="xy")
  xg <- 10^seq(-.5, 5, length.out=300)
  lines(xg, fitInfo(cds)$dispFun(xg), col="red")
}
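
As a reproducible version of the MA plot in question 3, here is a minimal sketch that runs without DESeq. The `res` data frame is a toy stand-in with invented values, and `de_colours` is a helper name made up for this example. It guards against two things I am not sure are handled in the copied code, namely `NA` values in `padj` and a `baseMean` of zero on a log axis. I do not know whether either is the actual cause of the missing plot.

```r
# Toy stand-in for a results table; the values are invented for illustration.
res <- data.frame(baseMean       = c(5, 50, 500, 0),
                  log2FoldChange = c(1.2, -0.3, 2.5, NA),
                  padj           = c(0.01, 0.60, NA, NA))

# Colour by adjusted p-value; rows with padj = NA are drawn black instead of
# being handed to plot() as an NA colour.
de_colours <- function(padj, cutoff = 0.1)
  ifelse(!is.na(padj) & padj < cutoff, "red", "black")

plotDE <- function(res) {
  # A log-scaled x axis needs strictly positive values.
  keep <- res$baseMean > 0 & !is.na(res$log2FoldChange)
  plot(res$baseMean[keep], res$log2FoldChange[keep], log = "x",
       pch = 20, cex = .3, col = de_colours(res$padj[keep]))
}

pdf(NULL)     # null device; remove this line in an interactive session
plotDE(res)   # defining the function draws nothing until it is called
invisible(dev.off())
```

Both functions above are only definitions, so nothing is drawn until they are called on a results table or a `cds` object.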
Any guidance would be appreciated.
Thanks,
Xin