Hi, I have some transcriptomics data from a microorganism collected at different locations. I've been trying to perform a likelihood ratio test (LRT) with edgeR to test the effect of location on transcription. What I want is to fit a full model with all the coefficients and compare it with a null model in which location plays no role. I thought I had the correct formulas, but after the test absolutely all the features/tags are significant (with absurdly small FDR values), so I must be doing something wrong. A mockup of my code is as follows:
x <- normalized_counts
site <- factor(c("a","a","b","b","c","c","d","d","e","e"))
design <- model.matrix(~ 0 + site)
y <- DGEList(counts=x,group=site)
y <- calcNormFactors(y)
y <- estimateDisp(y,design)
fit <- glmFit(y,design)
lrt <- glmLRT(fit, coef=c(1:ncol(fit$design)))
In case anyone is familiar with the sleuth package, I previously used the LRT implemented there with the following configuration:
so <- sleuth_prep(s2c
, ~ site
, target_mapping = t2g
, aggregation_column = "ens_gene"
, transformation_function = function(x) log2(x + 0.5)
)
so <- sleuth_fit(so)
so <- sleuth_fit(so
, ~1
, 'reduced')
so <- sleuth_lrt(so, 'reduced', 'full')
I wanted to run a similar test using edgeR. I would greatly appreciate any insights.
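For comparison, a minimal sketch of what the closest edgeR analogue of the sleuth `~ site` versus `~ 1` comparison might look like. It is a sketch under stated assumptions, not a verified fix: the `counts` matrix is simulated and purely hypothetical, standing in for raw (not pre-normalized) counts. The idea is to keep an intercept in the design and drop only the site coefficients in `glmLRT`, so that the null model is intercept-only. With `~ 0 + site`, each column is a site's own mean log-abundance, so dropping every column appears to test whether all of those means are zero, which would be rejected for essentially any expressed gene.

```r
library(edgeR)

# Hypothetical stand-in for the real data: 200 genes x 10 samples of raw counts
set.seed(1)
site <- factor(rep(c("a", "b", "c", "d", "e"), each = 2))
counts <- matrix(rnbinom(200 * 10, mu = 100, size = 10), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:10)))

# Full model WITH an intercept: column 1 is the baseline (site a),
# columns 2..5 are the differences of sites b..e from site a
design <- model.matrix(~ site)

y <- DGEList(counts = counts, group = site)
y <- calcNormFactors(y)
y <- estimateDisp(y, design)
fit <- glmFit(y, design)

# Dropping columns 2..5 leaves the intercept-only (~ 1) model as the null,
# which mirrors the 'reduced' model in the sleuth call
lrt <- glmLRT(fit, coef = 2:ncol(design))
topTags(lrt)
```

Since the simulated counts carry no real site effect, few or no genes would be expected to pass an FDR threshold here, in contrast to the all-significant result described above.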