Question

Overwhelmed by DESeq2 options and how to answer my questions: two conditions and three families

lgspeight • 0


I have been working with DESeq2 for the past couple of months analyzing my data. I have read over the vignette many times, found other workshops, and read message boards, but I still second-guess my decisions and the options I have chosen.

Basically, my design is that I have multiple clam lines, let's say 3 (A, B, C), and two salinities I am comparing (35 ppt vs 15 ppt). Salinity 35 ppt is my control, but I do not have a control or reference clam line.

The questions I am asking are: 1) How does the hard clam respond to low salinity (15 vs 35 ppt)? That is, regardless of clam line, how does this species respond to 15 ppt? 2) Do different clam lines respond differently to low salinity, and which genes are differentially expressed between lines (A vs B, B vs C, C vs A) at 15 ppt?

I have come up with many different ways to approach these questions, but which approach is best or the right one?

I have had suggestions that I need to flip these questions and first ask question number 2 then 1.

I have struggled with whether I need an interaction term or just groups. Do I just put salinity in the model and leave clam line out, and vice versa, to answer these different questions? When I put multiple variables or an interaction term in, the coefficients start to get very confusing, especially since I don't have a reference clam line, but DESeq2 makes one of my clam lines the reference.

Then there is the choice of shrinkage estimator. I have decided that apeglm is best for my data. With ashr, dispersion outliers remain among my significant genes. However, contrast statements cannot be used with apeglm.

Do I need to run multiple models or can I use one? What is ethical? I am defending my thesis in January and am in the process of creating my results. But I am terrified of doing something wrong and finding out at the very end, when I go to defend or publish, that all my results are incorrect.

If you have any guidance or suggestions, that would be great. Please don't just point me to a link or the vignette, because I have most likely read it and feel like everyone has a different solution to similar problems.

My advisors have limited experience with RNA-Seq and neither has used DESeq2, so I have been figuring this all out on my own.

I appreciate your time reading this and responding.

Cheers,

Leslie

DESeq2 • 988 views


score 0 · Answer 1 · 2022-09-07

I have struggled with whether I need an interaction term or just groups

Well, you use different things for different questions.

Use the grouping approach described at the beginning of the "Interactions" section of the vignette (a single combined factor with levels such as A_15, C_35, etc.) to do a simple comparison of one subset of samples against another. You could do this with interaction terms, but the grouping approach is much easier to understand. You could also subset dds to a single line, but in general it is better not to, and you don't have to: a contrast lets you specify exactly which groups of samples to compare, no matter which level is set as the reference.
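To see why the reference level does not matter for a group contrast, here is a small sketch of the arithmetic in plain Python (not DESeq2 itself). The group means are invented numbers for a single gene, and the helper names are hypothetical. In DESeq2 the corresponding call would be along the lines of results(dds, contrast=c("group", "A_15", "A_35")), assuming the combined factor is stored in a column called group.

```python
# Hypothetical log2-scale mean expression of one gene in each line_salinity group.
# These numbers are invented purely for illustration.
group_mean_log2 = {
    "A_35": 5.0, "A_15": 6.0,
    "B_35": 5.5, "B_15": 7.5,
    "C_35": 4.0, "C_15": 4.5,
}

def fit_with_reference(means, reference):
    """Re-express group means as an intercept plus offsets from `reference`,
    mimicking how a treatment-coded design such as ~ group is parameterised."""
    intercept = means[reference]
    offsets = {group: value - intercept for group, value in means.items()}
    return intercept, offsets

def contrast(offsets, numerator, denominator):
    """Log2 fold change of `numerator` over `denominator` computed from offsets."""
    return offsets[numerator] - offsets[denominator]

# Try every group as the reference level and compute the same contrast each time.
results = {}
for ref in group_mean_log2:
    _, offsets = fit_with_reference(group_mean_log2, ref)
    results[ref] = contrast(offsets, "A_15", "A_35")
    print(f"reference={ref}: A_15 vs A_35 log2FC = {results[ref]:.2f}")
```

Whichever group plays the role of reference, the reference's own mean cancels out of the subtraction, so the A_15 vs A_35 comparison is unchanged.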

To see whether different lines respond differently to the salinity challenge, use interactions. Here, you will pretty much have to use the names from resultsNames, which means you might have to relevel so that one of the names refers to the two lines you want to compare.
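An interaction term is a difference of differences, which is why the reference line matters here. The sketch below works this out in plain Python on invented per-cell means for one gene; the function names are hypothetical and this is not DESeq2 code. With line A as the reference, DESeq2 would be expected to report interaction names of the form lineB.salinity15 and lineC.salinity15 (assuming columns named line and salinity), neither of which compares C with B directly.

```python
# Hypothetical log2-scale mean expression of one gene per (line, salinity) cell.
# The values are invented for illustration.
cell_mean_log2 = {
    ("A", "35"): 5.0, ("A", "15"): 6.0,
    ("B", "35"): 5.5, ("B", "15"): 7.5,
    ("C", "35"): 4.0, ("C", "15"): 4.5,
}

def salinity_effect(line):
    """Low-salinity response (15 vs 35 ppt) within a single line."""
    return cell_mean_log2[(line, "15")] - cell_mean_log2[(line, "35")]

def interaction(line, reference_line):
    """How much the salinity response of `line` differs from that of
    `reference_line`, i.e. the interaction term under that reference."""
    return salinity_effect(line) - salinity_effect(reference_line)

# With A as the reference line, the model yields B-vs-A and C-vs-A terms.
b_vs_a = interaction("B", "A")
c_vs_a = interaction("C", "A")

# C vs B is not one of those terms. It can be recovered as the difference of
# the two terms above, or read off directly after making B the reference.
c_vs_b_from_difference = c_vs_a - b_vs_a
c_vs_b_after_relevel = interaction("C", "B")

print("B vs A:", b_vs_a)
print("C vs A:", c_vs_a)
print("C vs B (difference of terms):", c_vs_b_from_difference)
print("C vs B (B as reference):", c_vs_b_after_relevel)
```

The last two numbers agree, so releveling does not change the answer. It only changes which comparisons appear as single named coefficients, which matters when a shrinkage method needs a coefficient name rather than a contrast.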

I think you should just try to tackle this empirically. Get all your normalized counts and compute averages for the different groupings. The DESeq2 log2 fold changes should be fairly close to the differences between those averages (on the log2 scale) that you work out in Excel. It's important to make sure that the answer DESeq2 is giving you matches the question you intended to ask.
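The same sanity check can be done outside Excel. This is a minimal sketch in plain Python, assuming the normalized counts for one gene have already been exported and grouped by line and salinity; the counts and names below are invented. It gives only a rough reference value, not a replacement for the model fit.

```python
import math
from statistics import mean

# Hypothetical normalized counts for one gene, three replicates per group.
normalized_counts = {
    "A_35": [100, 110, 120], "A_15": [210, 220, 230],
    "B_35": [80, 90, 100],   "B_15": [350, 360, 370],
    "C_35": [50, 60, 70],    "C_15": [55, 60, 65],
}

def log2_ratio_of_means(numerator_group, denominator_group):
    """Log2 of (mean of numerator group / mean of denominator group)."""
    return math.log2(mean(normalized_counts[numerator_group])
                     / mean(normalized_counts[denominator_group]))

# Per-line response to low salinity (15 vs 35 ppt).
per_line_lfc = {
    line: log2_ratio_of_means(f"{line}_15", f"{line}_35")
    for line in ("A", "B", "C")
}

# A rough species-level response: the average of the per-line responses.
average_lfc = mean(per_line_lfc.values())

for line, lfc in per_line_lfc.items():
    print(f"line {line}: 15 vs 35 log2 ratio of means = {lfc:.2f}")
print(f"average across lines = {average_lfc:.2f}")
```

If a DESeq2 result for a gene differs in sign or by a large amount from the hand-computed value, that is usually a hint that the coefficient or contrast being extracted answers a different question from the one intended. Shrunken estimates will naturally be somewhat smaller in magnitude.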