Question

edgeR heatmap construction and interpretation

ulin.1 • 0

This might not be the right place to post this. If so, please point me in the right direction so I can solve my problem. Thank you!

So... The reason I'm posting this is because my graduate advisor wants me to fully understand how edgeR constructs heatmaps that have three comparative groups as opposed to two, and I agree. Now, I am not the most experienced in bioinformatics but I am learning.

Based on what I've researched, you read the map the same way as a map with two comparative groups. Am I incorrect in thinking that you read heatmaps the same way no matter how many groups you have?

Another thing that I need to understand is how edgeR constructs the heatmap itself. I have looked at the edgeR manual to see if that says anything and it doesn't seem to help much. After talking to other grad students, it seems like it does TMP, normalizes the data using logs, and then compares the groups across genes, correct? Don't you have to compare one group to another in order to get up and down regulated information? How would this work with three groups?

Any help is appreciated!

heatmaps • 2.4k views

Updated 18 months ago by Gordon Smyth • written 18 months ago by ulin.1

score 0 · Answer 1 · 2023-10-17

I suggest that you go through an edgeR workflow like this one: edgeRQL workflow. All the code is given and all the steps are explained in detail. If you want more explanation of any particular function used in the workflow, just consult the online help, for example help("cpm").

Comparing groups does not play any role in the construction of a heatmap, although sometimes one selects genes to display that we know will separate the groups. edgeR doesn't actually make heatmaps at all; it simply creates data suitable for a heatmap and passes that to a heatmap function.

Heatmaps can be exploratory or they can be used as presentation plots to illustrate the results of a DE analysis. In an exploratory heatmap, one plots genes that are variable between samples without regard to group information. In a presentation plot, one usually plots genes that have already been selected by edgeR as being DE. The presentation plot doesn't have a lot of interpretation except to show visually that DE genes are different between groups but consistent within groups. Which of course we knew anyway, otherwise the genes wouldn't be significantly DE. The process is much the same regardless of the number of groups in your experiment. The workflow that I linked to above gives an example of a presentation heatmap.
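To make the exploratory case concrete, here is a minimal sketch in base R. It is not taken from the workflow: the matrix is simulated as a stand-in for real log-CPM values, and the three group labels, the gene names and the choice of the top 20 genes are all assumptions made for illustration. Note that the group factor is never used to choose the genes; they are ranked purely by their variance across samples.

```r
# Simulated stand-in for a log-CPM matrix: 200 genes x 9 samples, three groups.
set.seed(1)
group <- factor(rep(c("A", "B", "C"), each = 3))
logcpm <- matrix(rnorm(200 * 9, mean = 5, sd = 0.3), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0(group, 1:3)))

# Hypothetical signal: the first 20 genes differ between the three groups.
shift <- c(A = -2, B = 0, C = 2)
logcpm[1:20, ] <- logcpm[1:20, ] + rep(shift[as.character(group)], each = 20)

# Exploratory selection: rank genes by variance across all samples,
# ignoring the group labels entirely.
gene_var <- apply(logcpm, 1, var)
top <- names(sort(gene_var, decreasing = TRUE))[1:20]

# Base R heatmap; scale = "row" standardizes each gene so that colours show
# expression relative to that gene's own mean across samples.
heatmap(logcpm[top, ], scale = "row")
```

For a presentation heatmap the only change would be how `top` is chosen: instead of ranking by variance, one would take the genes that the DE analysis reported as significant. The heatmap call itself is the same for two groups or three.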

I suggest you go back to basics and try to clarify what scientific questions you are trying to answer and what role edgeR and heatmaps play in that process. That would be best done in the context of one of your own datasets.

By the way, there is no step called "TMP". Perhaps you meant "TMM".

You say you didn't find anything to help in the edgeR User's Guide, but the User's Guide has a section specifically on heatmaps, which says:

To draw a heatmap of individual RNA-seq samples, we suggest using moderated log-counts-per-million. This can be calculated by cpm with positive values for prior.count, for example
> logcpm <- cpm(y, log=TRUE)
where y is the normalized DGEList object.

The logcpm matrix is then input to a heatmap function. The edgeR Guide doesn't go into details about the heatmap itself because that is external to edgeR.
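As a rough sketch of how the pieces fit together for a three-group experiment, the following uses simulated counts rather than real data. The group labels, gene names, the choice of the 30 most variable genes and the colours are all assumptions made for illustration, and base R's heatmap() is only one of several heatmap functions that could be used. The value prior.count = 2 is written out explicitly; as far as I know it is also the default in cpm.

```r
library(edgeR)

# Simulated counts (an assumption for illustration): 500 genes x 9 samples.
set.seed(42)
group <- factor(rep(c("A", "B", "C"), each = 3))
counts <- matrix(rnbinom(500 * 9, mu = 100, size = 10), nrow = 500,
                 dimnames = list(paste0("gene", 1:500), paste0(group, 1:3)))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)                        # TMM normalization (default method)
logcpm <- cpm(y, log = TRUE, prior.count = 2)  # moderated log2 counts-per-million

# Pick genes to display; here simply the 30 most variable across samples.
keep <- order(apply(logcpm, 1, var), decreasing = TRUE)[1:30]

# The heatmap itself is drawn outside edgeR. The group factor is used only
# to colour the column side bar, not to construct the heatmap.
group_cols <- c("red", "green", "blue")[as.integer(group)]
heatmap(logcpm[keep, ], scale = "row", ColSideColors = group_cols)
```

Nothing in this code compares one group to another. Up- and down-regulation between specific groups comes from the DE analysis, not from the heatmap.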