DEseq2 extracting results from multifactor design
1
0
Entering edit mode
Akhil • 0
@3a5eb334
Last seen 3.1 years ago
Norway

Hi,

I need some help in building the results tables. I have a multi-factor design i.e, 3 seedlots (Org, Hol, Ves), 4 cultivars (C1, C2, C3, C4) and 3 temperature points (K, P1, P2) and 2 biological replicates(72 samples in total). "Org" and "K" are reference levels for seed lot & temperature respectively. Here is the col data. ID is the grouped variable of seedlot, time and temperature

 ID seedlot cultivar temp
OC1_K    Org   C1   K
OC1_P1   Org   C1   P1
OC1_P2   Org   C1   P2 
OC2_K    Org   C2   K
OC2_P1   Org   C2   P1
OC2_P2   Org   C2   P2 
OC3_K    Org   C3   K
OC3_P1   Org   C3   P1
OC3_P2   Org   C3   P2
OC4_K    Org   C4   K
OC4_P1   Org   C4   P1
OC4_P2   Org   C4   P2
HC1_K    Hol   C1   K
HC1_P1   Hol   C1   P1
HC1_P2   Hol   C1   P2 
HC2_K    Hol   C2   K
HC2_P1   Hol   C2   P1
HC2_P2   Hol   C2   P2
HC3_K    Hol   C3   K
HC3_P1   Hol   C3   P1
HC3_P2   Hol   C3   P2
HC4_K    Hol   C4   K
HC4_P1   Hol   C4   P1
HC4_P2   Hol   C4   P2
VC1_K    Ves   C1   K
VC1_P1   Ves   C1   P1
VC1_P2   Ves   C1   P2 
VC2_K    Ves   C2   K
VC2_P1   Ves   C2   P1
VC2_P2   Ves   C2   P2
VC3_K    Ves   C3   K
VC3_P1   Ves   C3   P1
VC3_P2   Ves   C3   P2
VC4_K    Ves   C4   K
VC4_P1   Ves   C4   P1
VC4_P2   Ves   C4   P2

What I did so far

DEsl <-DESeqDataSetFromMatrix(countData= round(ordta), colData = mdta, design = ~ 1)

keep <- rowSums(counts(DEsl) >= 12) >= 2 
DEsl <- DEsl[keep,]
estimateSizeFactors(DEsl)

1) For genes with DE over temperature across seedlots and cultivars 
DEsl$sc<- factor(paste0(DEsl$seedlot, DEsl$cultivar))
design(DEsl)<- ~ sc + temp +sc:temp
dds<-DESeq(DEsl, reduced= ~ sc+temp , test= "LRT")

2) For genes responding to temp
design(DEsl)<- ~ sc + temp 
dds<-DESeq(DEsl, reduced= ~ sc , test= "LRT")

3)To extract DE results of all interesting pairwise comparisons of a cultivar (within cultivar) 
design$group<- DEsl$id
design(DEsl)<- ~ 0+group
dds<-DESeq(DEsl)
resultsNames(dds)
 "groupHC1_K"  "groupHC1_P1" "groupHC1_P2" "groupHC2_K"  "groupHC2_P1" "groupHC2_P2"
 "groupHC3_K"  "groupHC3_P1" "groupHC3_P2" "groupHC4_K"  "groupHC4_P1" "groupHC4_P2"
  ......
resOC1_1 <-results(dds, contrast= c("group","OC1_P1","OC1_K"))
resOC1_2<-results(dds, contrast= c("group","OC1_P2","OC1_P1"))
resHC2_1<-results(dds, contrast= c("group","HC2_P1","HC2_K"))
...

Correct me if i am wrong in any of the above.

I would like to find genes with DE of a cultivar at a temp between seedlots eg: test DE genes between HC1_P1 and OC1_P1 with HC1_K and OC1_K as reference levels respectively. How do i set up a contrast matrix for this ?

results(dds, contrast= list(c("groupHC1_P1" - "groupHC1_K") , c("groupOC1_P1" - "groupOC1_K") ) ?
timecourse DESeq2 RNASeq • 1.2k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 9 hours ago
United States

Unfortunately, I have to restrict my time to software related questions here. Your question is one of how to set up the statistical analysis and interpret the results, and for that I recommend working with a statistical collaborator at your institute, or someone familiar with linear models in R.

ADD COMMENT
0
Entering edit mode

Hi, After reading about linear models I think that the following design would let me extract the results I need

design(DEsl)<- ~0+sc+sc:temp
dds_nes<-DESeq(DEsl)
resultsNames(dds_nes)
 [1] "scHC1"        "scHC2"        "scC3"        "scHC4"        "scOC1"        "scOC2"       
 [7] "scOC3"        "scOC4"        "scVC1"        "scVC2"        "scVC3"        "scVC4"       
[13] "scHC1.tempP1" "scHC2.tempP1" "scHC3.tempP1" "scHC4.tempP1" "scOC1.tempP1" "scOC2.tempP1"
[19] "scOC3.tempP1" "scOC4.tempP1" "scVC1.tempP1" "scVC2.tempP1" "scVC3.tempP1" "scVC4.tempP1"
[25] "scHC1.tempP2" "scHC2.tempP2" "scHC3.tempP2" "scHC4.tempP2" "scOC1.tempP2" "scOC2.tempP2"
[31] "scOC3.tempP2" "scOC4.tempP2" "scVC1.tempP2" "scVC2.tempP2" "scVC3.tempP2" "scVC4.tempP2"

res<-results(dds_nes, contrast= list("scHC1.tempP1","scOC1.tempP1")

is this approach right to get this contrast? : - results(dds, contrast= list(c("groupHC1_P1" - "groupHC1_K") , c("groupOC1_P1" - "groupOC1_K") )

Assuming that this is right, I was expecting to see similar results between the following two, but they vary a little bit is this difference expected?

resOC1_1 <-results(dds, contrast= c("group","OC1_P1","OC1_K"), alpha = 0.05, lfcThreshold = log2(1.5))
summary(resOC1_1)

out of 354034 with nonzero total read count
adjusted p-value < 0.05
LFC > 0.58 (up)    : 1447, 0.41%
LFC < -0.58 (down) : 223, 0.063%
outliers [1]       : 0, 0%
low counts [2]     : 219645, 62%
(mean count < 12)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

res_nesOC1_1<- results(dds_nes, name="scOC1.tempP1", alpha = 0.05, lfcThreshold = log2(1.5))
summary(res_nesOC1_1)
out of 354034 with nonzero total read count
adjusted p-value < 0.05
LFC > 0.58 (up)    : 1393, 0.39%
LFC < -0.58 (down) : 222, 0.063%
outliers [1]       : 0, 0%
low counts [2]     : 219645, 62%
(mean count < 12)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
ADD REPLY
0
Entering edit mode

Again, I'll recommend you take questions about statistical design of your experiment to a statistical collaborator. I just don't have sufficient time to field statistical consulting questions on the support site, I have to restrict to software related issues.

ADD REPLY

Login before adding your answer.

Traffic: 648 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6