Question

filtering out lowly expressed genes by groups

0

Entering edit mode

sohail.mbio ▴ 10

@335cd804

Last seen 17 months ago

United States

Hello DESeq2 community,

I'm working on a complex RNA-seq experiment with a large design, involving 5 different timepoints, 2 different stimulation conditions, and 2 disease states, resulting in a total of 96 samples. In my initial preprocessing steps, I've filtered out genes with low expression (<100 counts).

However, as I delve into differential expression analysis, I've encountered a situation where I'm observing many differentially expressed genes (DEGs) with adjusted p-values (padj) < 0.1 in certain comparisons. What's intriguing is that most of these genes have zero transcript counts in the majority of samples within those specific comparison groups. These genes were not filtered out initially because they exhibit expression in other groups. For example (plot below), Deseq2 DE show significance in first two groups, even there was only 1 sample with few transcripts. enter image description here

Now my question is how to get rid of these DEGs? As these are not really DEGs.

RNASeqData DESeq2 • 1.2k views

ADD COMMENT • link updated 18 months ago by James W. MacDonald 68k • written 18 months ago by sohail.mbio ▴ 10

0

Entering edit mode

Is there a question?

ADD REPLY • link 18 months ago James W. MacDonald 68k

0

Entering edit mode

My question is how to avoid these DEGs? There are so many DEGs that are not actually expressed at T0. Is there any filter ? my current code is `dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ Disease + Treatment + Timepoint)

dds <- dds[ rowSums(counts(dds)) > 100, ] dds <- DESeq(dds) dds$group <- factor(paste0(dds$Treatment, dds$Timepoint, dds$Disease)) design(dds) <- ~ group dds <- DESeq(dds)

resultsNames(dds)

res<- results(dds, contrast = c("group","unstimT0CRSwNP", "unstimT0HC"), independentFiltering=TRUE)

`

ADD REPLY • link 18 months ago sohail.mbio ▴ 10

score 0 · Answer 1 · 2023-09-22

0

Entering edit mode

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 1 day ago

United States

Something like

results(dds, c("condition","T0_unstim_CRWnP","T0_unstim_HC"))[rowMeans(assays(dds, "counts")[,colData(dds)$condition %in% c("T0_unstim_CRWnP","T0_unstim_HC")] > 10,]

Should do the trick, I imagine. You could even make a function

fun <- function(deseqobj, contvec, cutoff) 
        results(deseqobj, contvec)[rowMeans(assays(deseqobj, "counts")[,colData(dds)[,contvec[1]] %in% contvec[-1])] > cutoff,]

## and run it

fun(dds, c("condition","T0_unstim_CRWnP","T0_unstim_HC"))

None of this is tested, btw, so you might need to adjust.

ADD COMMENT • link 18 months ago James W. MacDonald 68k

0

Entering edit mode

You might want to filter first though.

fun <- function(deseqobj, contvec, cutoff) {
            lildds <- dds[rowMeans(assays(deseqobj, "counts")[,colData(dds)[,contvec[1]] %in% contvec[-1])] > cutoff,], 
             results(lildds, contvec)
}

Which won't invalidate your FDR values.

ADD REPLY • link 18 months ago James W. MacDonald 68k