logFC when counts are zero across all samples in the two groups
@aditya-bandla-24072

My dataset from a microbiome study has a two-factor design, one factor with two levels and the other with three, and I am interested in both the main effects and the interaction. The first factor is fire, with levels burnt and non-burnt. The second factor is depth, with levels 0-5, 35-40 and 95-100.

I used the phyloseq wrapper phyloseq_to_deseq2 to convert my phyloseq object into a DESeqDataSet and then ran an LRT test:

fire_dds <-
  phyloseq_to_deseq2(btp_ps_fire_filt, ~ fire + depth + fire:depth) %>%  # full model with interaction
  DESeq(., sfType = "poscounts", test = "LRT", reduced = ~fire + depth)  # LRT against the main-effects-only reduced model

One of the contrasts I was interested in was the difference between 35-40 and 0-5 within the non-burnt samples, which I extracted with:

resultsNames(fire_dds)
'Intercept' 'fire_non_burnt_vs_burnt' 'depth_35_40_vs_0_5' 'depth_95_100_vs_0_5' 'firenon_burnt.depth35_40' 'firenon_burnt.depth95_100'

results(fire_dds,
  # depth_35_40_vs_0_5 (3rd coefficient) plus firenon_burnt.depth35_40 (5th):
  # the 35-40 vs 0-5 depth effect within the non-burnt level
  contrast = c(0, 0, 1, 0, 1, 0),
  test = "Wald"
)

However, when I inspected the results using plotCounts, and also by examining the counts matrix accessed with counts(fire_dds, normalized = TRUE), I could see that some of the features with a non-zero logFC and padj <= 0.05 in fact had zero counts in all samples across both levels. One such feature was ASV_325. I am confused about how to interpret this or what gives rise to such a result; I was expecting an NA to be returned in this case.
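
A sketch of this kind of check (the fire/depth level labels here are assumptions based on the resultsNames() output above):

# normalized counts for the flagged feature in the samples being compared
# (non-burnt, depths 0-5 and 35-40)
in_groups <- fire_dds$fire == "non_burnt" &
  fire_dds$depth %in% c("0_5", "35_40")
counts(fire_dds, normalized = TRUE)["ASV_325", in_groups]

# quick visual check across all samples, grouped by depth
plotCounts(fire_dds, gene = "ASV_325", intgroup = "depth")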

I also came across a handful of features that had counts in only one sample across both groups, yet had padj <= 0.05 and a non-zero logFC. I would appreciate any help on how to interpret these results, or any filtering that can suppress such features.

The dds object as an rds file is available here https://bit.ly/3jrcGx3

@mikelove

I've seen this happen in non-RNA-seq datasets when the counts are not plausibly negative binomial, e.g. bimodal within the groups. So then one approach would be to use non-parametric testing such as Wilcoxon-based methods.
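
For illustration, a per-feature Wilcoxon test on the normalized counts could look roughly like this (the group definitions and level labels are assumptions based on the contrast in the question):

# normalized counts and the two groups being compared
norm_cts <- counts(fire_dds, normalized = TRUE)
gA <- fire_dds$fire == "non_burnt" & fire_dds$depth == "0_5"
gB <- fire_dds$fire == "non_burnt" & fire_dds$depth == "35_40"

# drop features with no variation across the two groups (e.g. all zeros)
informative <- apply(norm_cts[, gA | gB, drop = FALSE], 1,
                     function(x) length(unique(x)) > 1)

# Wilcoxon rank-sum test per remaining feature, then BH adjustment
pvals <- apply(norm_cts[informative, , drop = FALSE], 1, function(x)
  wilcox.test(x[gA], x[gB], exact = FALSE)$p.value)
padj <- p.adjust(pvals, method = "BH")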

But I think another practical approach that I've seen work in these cases is to use lfcShrink on this comparison and then filter out any features with a small shrunken LFC.
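
For illustration, something along these lines (ashr shrinkage is used here because the comparison is a numeric contrast; the |LFC| cutoff of 1 is arbitrary and just for the sketch):

# needs the ashr package installed for type = "ashr"
res <- results(fire_dds, contrast = c(0, 0, 1, 0, 1, 0), test = "Wald")
res_shr <- lfcShrink(fire_dds, contrast = c(0, 0, 1, 0, 1, 0),
                     res = res, type = "ashr")

# keep only features whose shrunken LFC is still appreciable
keep <- !is.na(res_shr$padj) & res_shr$padj <= 0.05 &
  abs(res_shr$log2FoldChange) >= 1
res_filt <- res_shr[keep, ]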

@aditya-bandla-24072

Thanks, Mike. I gave it a go, but the problem persists: the feature still has a large LFC after shrinkage.

I used lfcShrink(dds = dds, res = res, contrast = contrast, type = "ashr"), since type = "normal" isn't compatible with a design containing interactions, and I am not sure how to specify coef for a contrast that combines multiple coefficients.

Given this, my current solution is post-hoc filtering: removing features that are not present in a minimum number of samples in either of the groups. It looks like I might have to take a deeper look at the distributions to see what is going on.
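
A sketch of that post-hoc filter, using the res object from the Wald contrast above (the minimum-sample threshold and the level labels are assumptions):

# raw counts and membership of the two contrasted groups
cts <- counts(fire_dds, normalized = FALSE)
grpA <- fire_dds$fire == "non_burnt" & fire_dds$depth == "0_5"
grpB <- fire_dds$fire == "non_burnt" & fire_dds$depth == "35_40"

# number of samples with non-zero counts per feature, per group
prevA <- rowSums(cts[, grpA, drop = FALSE] > 0)
prevB <- rowSums(cts[, grpB, drop = FALSE] > 0)

# keep features detected in at least min_n samples of either group
min_n <- 3   # illustrative; choose based on the group sizes
res_filt <- res[prevA >= min_n | prevB >= min_n, ]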

@mikelove

Sorry, I think this would work with apeglm specifically.

@aditya-bandla-24072

Thanks, Mike. I used lfcShrink with apeglm and filtered out features with an absolute log2 fold change below 1, and it worked like a charm!
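
For reference, a sketch of how the apeglm route can be set up: re-level fire so that non_burnt is the reference, which turns the within-non-burnt depth comparison into a single coefficient (apeglm shrinks one coefficient at a time). The level labels and the |log2FC| < 1 cutoff follow the discussion above but are otherwise assumptions.

# needs the apeglm package installed for type = "apeglm"
fire_dds$fire <- relevel(fire_dds$fire, ref = "non_burnt")
fire_dds <- DESeq(fire_dds, sfType = "poscounts")  # refit with Wald tests

resultsNames(fire_dds)                             # confirm the coefficient name
res_shr <- lfcShrink(fire_dds, coef = "depth_35_40_vs_0_5", type = "apeglm")

# drop features whose shrunken |log2FC| is below 1
res_filt <- res_shr[!is.na(res_shr$padj) &
                      res_shr$padj <= 0.05 &
                      abs(res_shr$log2FoldChange) >= 1, ]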
