Question

Using different filtering steps in different iterations of DESeq2


marinaw ▴ 20

@9c8b15cf

Last seen 14 months ago

Canada

Hi all,

I have a bit of a theoretical question here. I'm using DESeq2 for DGE analysis between controls and a disease group. Within both groups, I have males and females. I've done the DGE analysis controlling for sex as a covariate (so as to omit sex-specific drivers), and now I want to repeat the whole analysis with a separate contrast for each sex. The code itself isn't a problem; I'd be setting up the contrasts like so:

contrast = c("sex_group", "FemaleSA", "FemaleCTRL")
contrast = c("sex_group", "MaleSA", "MaleCTRL")
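For readers following along, here is a minimal sketch of how a combined factor such as `sex_group` could be built so that these contrasts are available. The vectors `sex` and `group` are hypothetical stand-ins for the sample annotations, and the commented DESeq2 lines assume a dataset object named `dds` with `sex` and `group` columns, which the post does not show.

```r
# Hypothetical sample annotations; in practice these would come from colData(dds)
sex   <- c("Male", "Male", "Female", "Female")
group <- c("SA", "CTRL", "SA", "CTRL")

# Combine the two variables so each sex-by-disease cell becomes its own level
sex_group <- factor(paste0(sex, group))
levels(sex_group)

# With DESeq2 (assumed object name `dds`), the same idea would look like:
#   dds$sex_group <- factor(paste0(dds$sex, dds$group))
#   design(dds) <- ~ sex_group
#   dds <- DESeq(dds)
#   results(dds, contrast = c("sex_group", "FemaleSA", "FemaleCTRL"))
```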

For the analysis that omitted sex differences, I opted to filter out genes with fewer than 20 counts in 90% of subjects. My results are informative. Am I obligated to use the exact same filtering step when redoing the analysis (with sex differences considered)? It seems like 20 counts in 90% of subjects might be too stringent. Can I find a better filtering step (and use the same step for the male analysis and the female analysis), while keeping the steps done in my first analysis as they are?

Thanks!

DESeq2 • 1.1k views

ADD COMMENT • link 22 months ago marinaw ▴ 20

score 0 · Answer 1 · 2023-06-12


Michael Love 43k

@mikelove

Last seen 4 days ago

United States

Am I obligated to use the exact same filtering step when redoing the analysis

No, you can use different filters on the two datasets; that's not an issue.

For consistency, you could specify at least 10 counts for X samples in males and X samples in females. For X you might choose the smallest number in table(dds$sex, dds$group).
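A minimal sketch of this kind of filter, written in base R on simulated data so it runs without DESeq2. The sample table `coldata`, the simulated `counts` matrix, and the name `keep_by_sex` are illustrative assumptions; the group sizes are the ones given elsewhere in this thread. With a real `DESeqDataSet`, `counts(dds)` and `dds$sex` would take the place of the simulated objects.

```r
set.seed(1)

# Hypothetical sample table using the group sizes given elsewhere in this thread
n <- c(17, 14, 7, 8)
coldata <- data.frame(
  sex   = factor(rep(c("Male", "Male", "Female", "Female"), times = n)),
  group = factor(rep(c("SA", "CTRL", "SA", "CTRL"), times = n))
)

# Simulated count matrix (genes x samples); with DESeq2 this would be counts(dds)
counts <- matrix(rnbinom(200 * nrow(coldata), mu = 15, size = 2), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), NULL))

# X = size of the smallest sex-by-group cell
x <- min(table(coldata$sex, coldata$group))

# For each sex, keep genes with >= 10 counts in at least x samples of that sex
keep_by_sex <- lapply(split(seq_len(nrow(coldata)), coldata$sex), function(idx) {
  rowSums(counts[, idx, drop = FALSE] >= 10) >= x
})
sapply(keep_by_sex, sum)  # number of genes retained per sex
```

Each element of `keep_by_sex` is a logical vector over genes, so the male and female analyses can each be subset with their own mask while sharing the same threshold and the same X.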

ADD COMMENT • link 22 months ago Michael Love 43k


Thank you so much for your feedback!

Since I'm mildly surprised it's not an issue, I might as well ask you about a second concern that I think/thought I had the correct answer to.

The disease I'm looking at affects both sexes but affects a greater number of males, so my cohort has more control males and diseased males than females; we're quite literally limited in female samples. Because of this, I opted for the contrasts I stated above.

However, I know from your DESeq2 vignette that a more sophisticated approach is to use an interaction term like so (only showing relevant code steps):

DE_dataset_with_svs$Group # check the factor levels; the first level is the reference (CTRL should be the reference)
DE_dataset_with_svs$Sex # check the factor levels; the first level is the reference
DE_dataset_with_svs$Sex <- relevel(DE_dataset_with_svs$Sex, ref = "Male") # make Male the reference when investigating females

# Disease effect in females: main Group effect plus the Sex:Group interaction term
DE_results <- DESeq2::results(DE_analysis, alpha = 0.05, independentFiltering = FALSE,
                              contrast = list(c("Group_SA_vs_CTRL", "SexFemale.GroupSA")))

Should I avoid this interaction approach when raising the question of sex differences, given the unequal numbers of males and females? I have 17 diseased males, 14 control males, 7 diseased females and 8 control females.
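To make explicit what the contrast list above adds together, here is a small base R sketch. It assumes a design of the form `~ Sex + Group + Sex:Group` with Male and CTRL as reference levels, which the post implies but does not show. Note that `model.matrix` names the columns `GroupSA` and `SexFemale:GroupSA`, whereas DESeq2 reports the corresponding coefficients as `Group_SA_vs_CTRL` and `SexFemale.GroupSA`.

```r
# One row per sex-by-group combination; Male and CTRL are the reference levels
cells <- expand.grid(
  Sex   = factor(c("Male", "Female"), levels = c("Male", "Female")),
  Group = factor(c("CTRL", "SA"), levels = c("CTRL", "SA"))
)
X <- model.matrix(~ Sex + Group + Sex:Group, data = cells)
rownames(X) <- paste0(cells$Sex, cells$Group)

# Disease effect within each sex, expressed as weights on the model coefficients
female_effect <- X["FemaleSA", ] - X["FemaleCTRL", ]
male_effect   <- X["MaleSA", ] - X["MaleCTRL", ]
rbind(female_effect, male_effect)
```

Under these assumptions the female disease effect puts weight on both the main Group coefficient and the interaction coefficient, while the male disease effect uses the main Group coefficient alone.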

Thank you in advance!

ADD REPLY • link 22 months ago marinaw ▴ 20


It's hard to say whether you're sufficiently powered to detect the sex-specific disease effects. Maybe consult with a biostatistician.