Dear Michael,
I want to understand if what I did is correct.
Here is the condition_df:
         disease_state  condition
HC1_BND  Healthy        BND
HC1_MN   Healthy        MN
HC2_BND  Healthy        BND
HC2_MN   Healthy        MN
HC3_BND  Healthy        BND
HC3_MN   Healthy        MN
HC4_BND  Healthy        BND
HC4_MN   Healthy        MN
S1_BND   Diseased       BND
S1_MN    Diseased       MN
S2_BND   Diseased       BND
S2_MN    Diseased       MN
S3_BND   Diseased       BND
S3_MN    Diseased       MN
S4_BND   Diseased       BND
S4_MN    Diseased       MN
Now, my goal is to find genes that are different for "Diseased and BND" category as opposed to all others. To accomplish that, I did the following:
dds <- DESeqDataSetFromMatrix(countData = htseq_counts[rownames(condition_df)],
                              colData = condition_df,
                              design = ~ disease_state + condition + disease_state:condition)
dds <- DESeq(dds)
Running the following command gives me the bulleted list below
resultsNames(dds)
- "Intercept"
- "disease_state_Healthy_vs_Diseased"
- "condition_MN_vs_BND"
- "disease_stateHealthy.conditionMN"
Then I did this:
result_A <- results(dds, contrast=list( c("disease_state_Healthy_vs_Diseased","condition_MN_vs_BND") ))
Running the "results" function gave me a few (30) genes that clustered well for the Diseased and BND option, along with other (50) genes that clustered for other categories.
Now my questions for you are:
- Is my understanding of the above method correct?
- I am not quite sure I understand how the resultsNames are generated, and why there isn't a combination such as disease_stateDiseased.conditionBND, as this is what I would be interested in. How could I get that?
- Is there a better way to just get the genes that are differentially expressed in the Diseased_BND category as opposed to any other?
Thanks much,
Ashu
The goal is to find differentially expressed genes in the group with diseased patients that have the MND condition as opposed to all others. I thought the average of all others would work but now I am unsure. I think I don't understand the right approach to consider to get those genes.
Hi Michael, Not sure if you got my previous message!
I didn't really see a direct question there with respect to the software. I see it as my responsibility to make sure users know how to use the software, but I can't also provide statistical consulting to all the DESeq2 users. The statistical approach is up to you.
Completely understandable!
Not sure why you bothered asking me to create a new thread.
Good day!!
If you pick among the three possibilities I listed, I can help by showing you an example, but I leave the actual choice of the question to the user when there are multiple different questions one might ask.
Dear Michael,
I would like to try the methods listed below and compare the results, but I don't know how to run DESeq2 for them. I got confused when you listed so many methods before, as I did not even know that DESeq2 could do all of that; maybe that is why I sounded like I was asking for statistical consulting. I went through the DESeq2 tutorials and examples earlier, but I wanted confirmation that what I understood from them is correct. English is not my first language and I sometimes misunderstand what people are asking me; I apologize for that.
I wish to try these methods based on your comments:
1.) Compare 1 group against the average of other 3
2.) Compare DE of 1 group against all the other 3
It would be good if you could just tell me how to run DESeq2 for these 2 scenarios. Also, which scenario would you say the method I followed in my first message falls under?
I really appreciate the help and taking the time.
Thanks,
Ashu
I'll make some example data with the example data generating function in DESeq2:
Four groups:
To compare one group against the average of the other three, you could do:
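A minimal sketch of such a setup and contrast, under stated assumptions: the `group` column, its levels A to D, and the simulated data size are made up for illustration and are not taken from the poster's data. It uses a no-intercept design so that each group gets its own coefficient, then weights group A by 1 and each of the other three by -1/3 through `listValues`.

```r
library(DESeq2)

set.seed(1)
# Simulated counts from the example-data generator that ships with DESeq2:
# 1000 genes, 16 samples (sizes chosen arbitrarily for this sketch).
dds <- makeExampleDESeqDataSet(n = 1000, m = 16)

# Hypothetical grouping: four groups of four samples each.
dds$group <- factor(rep(c("A", "B", "C", "D"), each = 4))

# No-intercept design, so every group has its own coefficient.
design(dds) <- ~ 0 + group
dds <- DESeq(dds)
resultsNames(dds)

# Group A versus the average of B, C and D:
# numerator coefficient weighted by 1, each denominator coefficient by -1/3.
res_avg <- results(dds,
                   contrast = list("groupA", c("groupB", "groupC", "groupD")),
                   listValues = c(1, -1/3))
head(res_avg[order(res_avg$padj), ])
```

With real data, the same idea would apply after pasting the two factors into one column, for example a group factor with levels such as Diseased_BND, Diseased_MN, Healthy_BND and Healthy_MN.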
To compare one against all the other three, you would have to do three pair-wise comparisons and then take the intersection. There is not a single contrast which gives you a p-value that the one group is different than all other three in pairwise comparisons.
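The pairwise-then-intersect route could be sketched as follows. As before, the `group` column, the levels A to D, and the 0.1 threshold are assumptions for illustration only; the simulated data have no built-in group differences, so the intersection here will usually be empty.

```r
library(DESeq2)

set.seed(1)
# Same hypothetical four-group setup on simulated data.
dds <- makeExampleDESeqDataSet(n = 1000, m = 16)
dds$group <- factor(rep(c("A", "B", "C", "D"), each = 4))
design(dds) <- ~ 0 + group
dds <- DESeq(dds)

alpha <- 0.1  # assumed adjusted p-value cutoff
others <- c("groupB", "groupC", "groupD")

# One pairwise comparison of group A against each other group;
# keep the names of genes passing the cutoff in each comparison.
sig_sets <- lapply(others, function(other) {
  res <- results(dds, contrast = list("groupA", other), alpha = alpha)
  rownames(res)[which(res$padj < alpha)]
})
names(sig_sets) <- others

# Genes called significant in all three pairwise comparisons.
common <- Reduce(intersect, sig_sets)
length(common)
```

Depending on the question, one might additionally require the log2 fold change to have the same sign in all three comparisons; that choice is left to the analyst.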
Great, thanks. I just ran the average method you described above and it seems to be picking the genes that I expect. Just for my understanding, would you be able to explain the difference between using interactions in the design as opposed to just using 4 groups?
To be precise, what do these commands (also mentioned in my first message above) actually give me:
# two conditions, two genotypes, with an interaction term,
design = ~ disease_state + condition + disease_state:condition
result_A <- results(dds, contrast=list( c("disease_state_Healthy_vs_Diseased","condition_MN_vs_BND") ))
Thanks again,
Ashu
That doesn’t seem like a meaningful contrast to me. I don’t see a way to produce what you’re interested in from the interaction design.
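To see what the coefficients of the interaction design correspond to, a small base-R sketch (no DESeq2 needed) can translate coefficient weights into weights on the four group means. The data frame below is rebuilt from the condition_df in the first post; `cells` and `to_means` are names invented for this illustration, and the default alphabetical reference levels (Diseased, BND) are assumed, which matches the resultsNames shown earlier.

```r
# Rebuild the sample table from the first post (8 Healthy, 8 Diseased,
# alternating BND / MN).
condition_df <- data.frame(
  disease_state = factor(rep(c("Healthy", "Diseased"), each = 8)),
  condition     = factor(rep(c("BND", "MN"), times = 8))
)

# Design matrix for the interaction design. The columns correspond, in order,
# to Intercept, disease_state_Healthy_vs_Diseased, condition_MN_vs_BND and
# disease_stateHealthy.conditionMN in resultsNames(dds).
X <- model.matrix(~ disease_state + condition + disease_state:condition,
                  data = condition_df)
colnames(X)

# One design row per group: how each group mean is built from the coefficients.
grp <- paste(condition_df$disease_state, condition_df$condition, sep = "_")
cells <- X[!duplicated(grp), ]
rownames(cells) <- grp[!duplicated(grp)]

# Convert weights on coefficients into the equivalent weights on group means.
to_means <- function(w) {
  out <- drop(w %*% solve(cells))
  names(out) <- rownames(cells)
  round(out, 10)
}

# The contrast from the first post: the two main-effect coefficients summed.
to_means(c(0, 1, 1, 0))
# Healthy_BND = 1, Healthy_MN = 0, Diseased_BND = -2, Diseased_MN = 1

# The interaction coefficient on its own.
to_means(c(0, 0, 0, 1))
# Healthy_BND = -1, Healthy_MN = 1, Diseased_BND = 1, Diseased_MN = -1
```

Under these assumptions, the contrast from the first post compares Healthy_BND plus Diseased_MN against twice Diseased_BND and leaves Healthy_MN out entirely, so it is not "Diseased_BND versus all others". The interaction coefficient asks a different question: whether the MN versus BND difference in Healthy samples differs from the MN versus BND difference in Diseased samples. There is no disease_stateDiseased.conditionBND term because the reference levels are absorbed into the intercept.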
I would still like to understand where the interaction design could be used. I will try to read more and find some examples.
Thanks for your help!