EdgeR glmLRT vs glmQL - hundreds of DE genes with glmLRT, almost none with glmQL
2
0
Entering edit mode
@miyakokodama-11409
Last seen 3.6 years ago
Denmark

Hi,

I am trying to find DE genes using two groups (4 biological samples with two different types of tissues) using RNAseq. I obtained a raw gene count table from FeatureCount. I am using both  glmLRT and glmQF.

glmLRT identified ~600 DE genes, while glmQF identified only 3. What was also surprising was that there was a big difference in FDR values; FDR values obtained from the glmLRT method were around ~ 4.71E-10, while those from glmQF were around 0.05.

I am aware that glmQF is a lot more conservative than glmLRT so it is understandable that I found a fewer DE genes using glmQF, but I am surprised by the degree of differences, both by the number DE genes and FDR values.

Does anyone happen to know what might have caused these differences?

I have attached my code below, as well as the plots produced by plotBCV() and plotQLDisp().

Many thanks in advance,

Miyako

#################

> x <- DGEList(counts=countData, group=colData$Tissue_Type)

# Filter out genes with no count/lowly expressed genes
> cpm <- cpm(x)
> table(rowSums(x$counts==0)==8) # As I have 8 samples
> keep.exprs <- rowSums(cpm>1)>=2
> x <- x[keep.exprs,]
> dim(x)

[1] 19791     8

> des <- factor(colData$Tissue_Type)
> design <- model.matrix(~0+des)
> colnames(design) <- levels(des)
> design

   phloem xylem
1      1     0
2      0     1
3      1     0
4      0     1
5      1     0
6      0     1
7      1     0
8      0     1
attr(,"assign")
[1] 1 1
attr(,"contrasts")
attr(,"contrasts")$des
[1] "contr.treatment"

> contr.matrix <- makeContrasts(PholemvsXylem = phloem-xylem,levels = colnames(design))
> contr.matrix

        Contrasts
Levels   PholemvsXylem
  phloem             1
  xylem             -1

> x <- calcNormFactors(x)
> x <- estimateDisp(x, design, robust=TRUE)

plotBCV(x)

# glmLRT Method
> fit <- glmFit(x, design, robust=TRUE)
> lrt <- glmLRT(fit, contrast=contr.matrix)
> significant_glm <- topTags(lrt, n=Inf, p.value=0.05) # FDR ~ 4.71E-10
> nrow(significant_glm)
[1] 598

# glmQL Method
> fit <- glmQLFit(x, design, robust=TRUE)
> plotQLDisp(fit)
> qlf <- glmQLFTest(fit, contrast=contr.matrix)
> significant_glmQL <- topTags(qlf, n=Inf, p.value=0.05) # FDR ~ 0.04156011
> nrow(significant_glmQL) 
[1] 3

 

##################

plotBCV()plotQLDisp()

edgeR • 2.9k views
ADD COMMENT
0
Entering edit mode

Please use a separate tag for edgeR, otherwise the maintainers won't be notified.

ADD REPLY
0
Entering edit mode

I just corrected it - thank you for pointing that out :)

ADD REPLY
3
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 3 hours ago
The city by the bay

One of the good features of the QL F-test is that it accounts for the uncertainty of the dispersion estimates. This is affected by the amount of empirical Bayes shrinkage - if there's more shrinkage, the precision of the estimates increases, resulting in greater power to detect DE. In contrast, the LRT doesn't consider the uncertainty of the tagwise dispersion estimates, rendering it susceptible to loss of type I error control, i.e., more false positives.

In your case, it seems like there's not that much shrinkage based on your plotQLDisp plot, which means that you won't have a lot of power. I would guess that your prior degrees of freedom would probably be less than 5, perhaps? Now, there's two possibilities here. The first possibility is that the dispersions truly are variable, meaning that a low amount of shrinkage would be appropriate (as it doesn't make sense to try to squeeze dispersions together if they are truly different). The second is that there's a batch effect in your experimental design that you haven't accounted for. Variability in the batch effects across genes will manifest as variability in the dispersions (as well as inflated values for the dispersions themselves), resulting in reduced shrinkage and loss of power.

In the first scenario, there is nothing you can do to improve power other than to increase the number of samples, because the limiting factor (the variability of the dispersions) is an inherent property of the biological system. In the second scenario, you would improve power by adding a blocking factor for the batch effect, if one exists in your experimental design. Make an MDS plot with plotMDS, if you haven't already, and see whether there are xylem/phloem pairings that you should be considering.

ADD COMMENT
0
Entering edit mode
@miyakokodama-11409
Last seen 3.6 years ago
Denmark

Hi,

Thanks so much for explaining it all in detail, Aaron - it's so helpful! 

You're right, my prior degrees of freedom is less than 5 (they're around 4). I did make a MDS plot with plotMDS(), however did not see any batch effect (I have attached the image below); in short, biological sample 1 and 2 (1P, 1X, 2P, 2X; P and X referring to phloem and xylem) were extracted on day 1, while biological sample 3 and 4 were extracted on day 2. All samples were sequenced using one lane. 

While I cannot really see batch effect from my MDS plot, I did add the extraction date to the model out of curiosity to account for it - however the number of DE genes from glmLRT and glmQL did not change much.

I am afraid that this may be a power issue, as you indicated by the first scenario. In that case, what do you think would be an appropriate way to analyze it? Do you think estimateDisp() with robust=TRUE, and then glmLRT would be a good way to go?

Many thanks in advance,

Miyako

###########

 

ADD COMMENT
0
Entering edit mode

For future reference, please post your responses via "Add comment" rather than as an answer.

Anyway, there's clearly a sample effect in your MDS plot, as you can clearly see the pairing of the X/Y samples. Thus, I would go further than blocking on the extraction date, and actually block on the sample number itself:

sample <- factor(c(1,1,2,2,3,3,4,4))
design <- model.matrix(~0+des + sample)

In general, if I don't get a lot of DE genes, I relax the FDR threshold to ~10% rather than switching to another method. This is based on the fact that the QL framework will provide more accurate error control (i.e., "better" p-values) than the LRT. If I use a higher threshold, then I'll get more false positives, but at least I'd be aware of that. If I switch to a less accurate method, then sure - I might get more "discoveries" at a 5% FDR threshold, but the actual FDR may be much higher, and I'll just be fooling myself and others by pretending that it's 5%. This is especially true if the prior d.f. is low where QL and LRT differ the most.

ADD REPLY
0
Entering edit mode

Sorry - I was rather unfamiliar with the logistics here, I will keep that in mind for my future posts :)

I have added the sample effect along with the tissue type - I'll move on with QL instead of LRT then, especially given the low prior d.f. Just as a side note, QL gave out 29 genes at FDR<0.1 (0 at FDR < 0.05), so at least that gives me something to work with...

Thanks so much for taking the time to answer my question! :)

Miyako 

ADD REPLY

Login before adding your answer.

Traffic: 859 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6