Volcano plot becomes flipped when I add a third condition
1
0
Entering edit mode
Michael • 0
@e828643d
Last seen 15 months ago
United States

I am playing around with some test data and thought to add a third condition (let's just say A & B are treatments, C is the control) to try getting pairwise comparisons but am stuck on two questions. 1) how do I get a lfcshrink for my third condition (i.e. A vs B) and 2) why does -log10p value vs log2 FC plot become flipped even though the reference is still C?

Additional info: I followed the instructions here on multiple comparisons (https://github.com/tavareshugo/tutorial_DESeq2_contrasts/blob/main/DESeq2_contrasts.md) and I duplicated and renamed A to make dataset B. So A vs C and B vs C should be the same while A vs B is 0.

Here is my code

dds <- DESeqDataSetFromMatrix(countData = my_cts,
                          colData = my_coldata,
                          design = ~ condition)
dds

#prefilter row sums with less than 10 reads
keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep,]


#factor levels
dds$condition <- relevel(dds$condition, ref = "C") #isn't this how I set the control?

#do i need these?
    #dds <- estimateDispersions(dds)
    #dds <- nbinomWaldTest(dds) 

#deseq
dds <- DESeq(dds)

#check coefficients estimated by DEseq
resultsNames(dds)

#obtain results
res1_aa_ctrl <- results(dds, contrast = c("condition_A_vs_C")) #still backwards
res1_bb_ctrl <- results(dds, contrast = list("condition_B_vs_C"))
res1_aa_bb <- results(dds, contrast = list("condition_A_vs_C", 
                                                 "condition_B_vs_C"))

# define the model matrix
mod_mat <- model.matrix(design(dds), colData(dds))
mod_mat

# calculate coefficient vectors for each group
ctrl <- colMeans(mod_mat[dds$condition == "C", ])
aa <- colMeans(mod_mat[dds$condition == "A", ])
bb <- colMeans(mod_mat[dds$condition == "B", ])

#obtain results for each pairwise contrast
res2_aa_ctrl <- results(dds, contrast = ctrl - aa) #order can affect distribution (I reversed these to get the "correct" distribution but originally it was written the other way to show that res2 gets the same result as res1. 
res2_bb_ctrl <- results(dds, contrast = ctrl - bb)
res2_aa_bb <- results(dds, contrast = aa - bb)

# plot the results from the two approaches to check that they are identical (therefore you could use any of the appraches)
plot(res1_aa_ctrl$log2FoldChange, res2_aa_ctrl$log2FoldChange) 

#shrink results for better viewing
resLFC <- lfcShrink(dds, coef = "condition_A_vs_C", type = "apeglm")

Afterwards, I plot my graphs:

#volcano plot
data <- res1_aa_ctrl
library(EnhancedVolcano)
EnhancedVolcano(data,
    lab = rownames(data),
    x = 'log2FoldChange',
    y = 'pvalue')

gives me res1 aa ctrl

but what I expected/wanted was this (which I got yesterday without having to use this method and with only A and C datasets) res2 aa ctrl

Also when should I relevel the dataset to find A vs B? I tried using this code but it didn't work (cant set dds to another variable name for some reason)

dds2 <- dds
relevel(dds$condition, ref="B")
dds <- nbinomWaldTest(dds)
resultsNames(dds)
lfcShrink(dds = dds2, coef = 3, type = "apeglm")
DESeq2 volcanoplot deseq HELP contrast • 1.9k views
ADD COMMENT
0
Entering edit mode

If it helps clarify, my numerator should be A and denominator should be C. A is significantly downregulated compared to C, but for some reason I am seeing the opposite. Am I flipping the conditions somewhere in my code? I triple checked that I am loading the samples and controls correctly, so it's not that.

ADD REPLY
0
Entering edit mode
Basti ▴ 780
@7d45153c
Last seen 4 days ago
France

Hi,

When you set a reference level, the group is compared to the reference level.

Setting results(dds, contrast = c("condition","A","C")) is equivalent to results(dds, contrast = aa - ctrl) and the inverse of results(dds, contrast = ctrl - aa), so this is normal that you observe inverse volcanoplots because your reference levels are inverse.

Pay attention to your code syntax, you can either use results(dds, contrast = c("condition","A","C")) or results(dds, name = c("condition_A_vs_C")) but results(dds, contrast = c("condition_A_vs_C")) does not exist.

Concerning the lfcshrink part, you were on the right way but pay attention to your syntax :

dds2=dds
dds2$condition <- relevel(dds2$condition, ref = "B")  #pay attention here that you need to ASSIGN the variable with new levels to the current dds2
dds2 <- DESeq(dds2)
resultsNames(dds2)
lfcShrink(dds = dds2, coef = "condition_A_vs_B", type = "apeglm")
ADD COMMENT
0
Entering edit mode

I understand but the part I was confused by was that I thought the last element in contrast needed to be the reference and that my results should show that all the genes are downregulated in treatment A. I used results(dds, contrast = c("condition","A","C")) before on my 2 treatment dataset and got this result, but when I tried to make a 3 treatment dataset, this plot was flipped unless I made a new equation and changed the order results(dds, contrast = ctrl - aa). But I don't know why this happened.

ADD REPLY
0
Entering edit mode

Please include a small example dataset (cts and coldata), because I cannot reproduce your issue here (to add a dataset, use dput(head(cts)), etc)

ADD REPLY

Login before adding your answer.

Traffic: 511 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6