I have a group line with 2 treatments and one control and I did DE using EdgeR. My design matrix is as below and I run contrast for those samples using pairwise and anova test. Although their LigFC are the same for DE genes, their FDR are different.
data
Ensembl_ID Gene_name H9_wt_B1 H9_wt_B2 H9_KO_B1 H9_KO_B2 H9_KI_B1 H9_KI_B2
## 1 ENSG00000000003 TSPAN6 4274 4233 3755 4937 4681 4061
## 2 ENSG00000000005 TNMD 116 117 106 154 105 86
## 3 ENSG00000000419 DPM1 2443 2391 2389 2597 3026 2453
design
## samples
## H9_wt_B1 H9_wt
## H9_wt_B2 H9_wt
## H9_KO_B1 H9_KO
## H9_KO_B2 H9_KO
## H9_KI_B1 H9_KI
## H9_KI_B2 H9_KI
all(rownames(design) %in% colnames(data))
## [1] TRUE
all(rownames(design) == colnames(data))
## [1] TRUE
design.matrix_H9 <- model.matrix(~0+samples,design)
design.matrix_H9
## samplesH9_KI samplesH9_KO samplesH9_wt
## H9_wt_B1 0 0 1
## H9_wt_B2 0 0 1
## H9_KO_B1 0 1 0
## H9_KO_B2 0 1 0
## H9_KI_B1 1 0 0
## H9_KI_B2 1 0 0
## attr(,"assign")
## [1] 1 1 1
## attr(,"contrasts")
## attr(,"contrasts")$samples
## [1] "contr.treatment"
H9_list <- DGEList(counts = data[c(3:8)], group=design$samples, genes=data[1:2])
H9_filter <- filterByExpr(H9_list)
H9_keep<-H9_list[H9_filter, , keep.lib.sizes=FALSE]
H9_keep_norm <- calcNormFactors(H9_keep)
H9_dispersion<- estimateDisp(H9_keep_norm, design.matrix_H9, robust=TRUE)
H9_fited <- glmQLFit(H9_dispersion, design.matrix_H9, robust=TRUE)
contrasts_H9 <- makeContrasts(H9.KOvswt = samplesH9_KO - samplesH9_wt,
H9.KIvswt = samplesH9_KI - samplesH9_wt,
H9.KIvsKO = samplesH9_KI - samplesH9_KO, levels=design.matrix_H9)
contrasts_H9
## Contrasts
## Levels H9.KOvswt H9.KIvswt H9.KIvsKO
## samplesH9_KI 0 1 1
## samplesH9_KO 1 0 -1
## samplesH9_wt -1 -1 0
qlf_H9.contrasts <- glmQLFTest(H9_fited, contrast=contrasts_H9)
topTags(qlf_H9.contrasts)
## Coefficient: LR test on 2 degrees of freedom
## Ensembl_ID Gene_name logFC.H9.KOvswt logFC.H9.KIvswt
## 16448 ENSG00000187325 TAF9B 0.047290741 -8.5215156
## 2694 ENSG00000102710 SUPT20H -0.009879572 -9.8090108
## 6131 ENSG00000129317 PUS7L 0.071584025 -13.0416646
## 7625 ENSG00000138161 CUZD1 1.072742182 -3.5320786
## 17959 ENSG00000198934 MAGEE1 -10.949547315 0.9180734
## logFC.H9.KIvsKO logCPM F PValue FDR
## 16448 -8.568806 5.044505 955.5740 1.059827e-11 1.005574e-07
## 2694 -9.799131 5.032961 949.3324 1.093252e-11 1.005574e-07
## 6131 -13.113249 4.246053 834.5035 1.670160e-10 1.024142e-06
## 7625 -4.604821 3.376219 261.1062 4.800669e-09 2.207828e-05
## 17959 11.867621 2.659453 347.6069 6.771092e-09 2.491220e-05
qlf_H9.KOvswt <- glmQLFTest(H9_fited, contrast=contrasts_H9[,"H9.KOvswt"])
topTags(qlf_H9.KOvswt)
## Coefficient: 1*samplesH9_KO -1*samplesH9_wt
## Ensembl_ID Gene_name logFC logCPM F
## 50201 ENSG00000268658 LINC00664 -8.590411 2.3669917 404.3259
## 17959 ENSG00000198934 MAGEE1 -10.949547 2.6594526 516.2982
## 5095 ENSG00000120738 EGR1 4.377344 4.6205615 218.4309
## 15758 ENSG00000184515 BEX5 -4.938635 0.6839495 188.3181
## 27893 ENSG00000228065 LINC01515 -4.375104 0.9611039 187.7014
## PValue FDR
## 50201 4.027064e-09 5.809191e-05
## 17959 6.315711e-09 5.809191e-05
## 5095 8.535940e-08 5.107783e-04
## 15758 1.367709e-07 5.107783e-04
## 27893 1.388286e-07 5.107783e-04
I want to know why my result in
qlf_H9.contrasts
is different thanqlf_H9.KOvswt
? e.g the LogFC ofMAGEE1
gene in theqlf_H9.contrasts
andqlf_H9.KOvswt
is the same but its FDR is different, and these kind of differences in FDR cause I lost lots of significant genes and had different list of genes when considerqlf_H9.contrasts
orqlf_H9.KOvswt
?which one (Anova or pairwise) is better for my comparisons?
Is it because of my colnames of design.matrix_H9
or the Levels name of contrast_H9
are not the same as the colnames of data!
This isn’t a DESeq2 question. Please don’t add unnecessary tags as it automatically emails package maintainers who are not relevant to your question.
It's hard to answer your question because it isn't clear what you are asking. The anova test and the single contrast tests are different and test different hypotheses, as you already seem to understand. Obviously different tests will give different results. Why would you expect different tests to give the same FDRs?
What scientific comparison do you want to test and what do think you have tested?
Thanks for your reply! In fact I would like to test which Genes differentially Express in H9.kI vs wt and H9.ko vs wt samples, then select common up-down regulated genes between those comparisons. I would like to know can I use Anova or I should consider single comparison, then find common Genes , as righ one?