EnhancedVolcano - highlight specific points by changing shape
n.tear ▴ 20
@ntear-23651
Last seen 3.3 years ago
United Kingdom

Hi,

I am attempting to highlight a number of genes that intersect with a list of EMT genes I have.

When I run the code below, any genes that are not EMT genes disappear. How can I fix this?

top20padj <- head(res.smIHW.cp1df.rmNA.ord$symbol, n=20)
topLFCgenes <- res.smIHW.cp1df.rmNA.ord[order(-abs(res.smIHW.cp1df.rmNA.ord$shrunkLFC)), ]
top20LFCgenes <- head(topLFCgenes$symbol, n=20)

# define genes that will show as shape 17
VolcanoEMTgenes <- as.vector(EMTgenes$`intersect(EMTdbgenes$GeneSymbol, res.smIHW.cp1df.rmNA.sig$symbol)`)

# create custom key-value pairs for defined genes
# this can be achieved with ifelse statements
keyvals.shape <- ifelse(res.smIHW.cp1df.rmNA.ord$symbol %in% VolcanoEMTgenes, 17, 19)
keyvals.shape[is.na(keyvals.shape)] <- 19
names(keyvals.shape)[keyvals.shape == 17] <- 'EMT genes'


library(EnhancedVolcano)
EnhancedVolcano(res.smIHW.cp1df.rmNA.ord,
                lab = res.smIHW.cp1df.rmNA.ord$symbol,
                x = 'shrunkLFC',
                y = 'padj',
                title = '',
                subtitle = '',
                pCutoff = .05,
                FCcutoff = 2,
                pointSize = 3.0,
                labSize = 4.0,
                legendLabels=c('Not sig.','Log2 FC','p-value',
                               'p-value & Log2 FC'),
                selectLab = c(top20LFCgenes, top20padj),
                shapeCustom = keyvals.shape,
                drawConnectors = TRUE,
                widthConnectors = 0.2,
                colConnectors = 'grey30')
Kevin Blighe ★ 4.0k
@kevin
Last seen 9 days ago
Republic of Ireland

In the code that you have presented, only genes specified in c(top20LFCgenes, top20padj) will be labeled, and it is expected that these will form part of res.smIHW.cp1df.rmNA.ord$symbol. It follows, also, that only those genes that can be fit onto the plot space without overlaps will be displayed. This part is okay, based on your question.

Regarding the use of shapeCustom, it seems that you simply need to set the names for the non-EMT genes via, for example:

names(keyvals.shape)[keyvals.shape == 19] <- 'non-EMT'
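
A minimal sketch of what the missing names look like, using made-up data: `symbols` and `emt` are hypothetical stand-ins for res.smIHW.cp1df.rmNA.ord$symbol and VolcanoEMTgenes. Naming only the shape-17 elements leaves every other element with an NA name, which appears to be why those points were dropped from the plot.

```r
# Hypothetical toy data standing in for the real results table
symbols <- c("CDH1", "VIM", "GAPDH", "ACTB")
emt     <- c("CDH1", "VIM")   # assumed EMT gene list

keyvals.shape <- ifelse(symbols %in% emt, 17, 19)

# Naming only one group leaves the remaining elements with NA names
names(keyvals.shape)[keyvals.shape == 17] <- 'EMT genes'
print(names(keyvals.shape))

# Naming the second group as well gives every point a name
names(keyvals.shape)[keyvals.shape == 19] <- 'non-EMT'
print(keyvals.shape)
```

A quick check such as anyNA(names(keyvals.shape)) before plotting would flag any point that still lacks a name.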

Kevin


Hi Kevin,

Many thanks for your quick response. Your comments have fixed my issue.

I am however wondering if there is a way to bring these EMT genes to the front? And can I also simultaneously change the colour of these EMT genes without disrupting the colour scheme of LFC/pvalue/NS genes?

Many thanks,

Nathan


> I am however wondering if there is a way to bring these EMT genes to the front?

This can be done by re-ordering your input data-frame such that these genes are positioned at the end, meaning that they will be plotted last. This has to be done manually, unfortunately.
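
One possible way to do that reordering in base R, as a sketch on made-up data (the `df` table and `emt` vector are hypothetical; the column names mirror those used in this thread):

```r
# Hypothetical toy table; column names mirror those used in the thread
df <- data.frame(
  symbol    = c("CDH1", "GAPDH", "VIM", "ACTB"),
  shrunkLFC = c(2.1, 0.1, -1.8, 0.3),
  padj      = c(0.001, 0.8, 0.01, 0.6),
  stringsAsFactors = FALSE
)
emt <- c("CDH1", "VIM")   # assumed EMT gene list

# FALSE sorts before TRUE, so EMT rows move to the end of the table;
# ties keep their original relative order
df.ord <- df[order(df$symbol %in% emt), ]
print(df.ord$symbol)
```

Any shape or colour vectors should then be built from the reordered table, so that they stay aligned with it row by row.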

> And can I also simultaneously change the colour of these EMT genes without disrupting the colour scheme of LFC/pvalue/NS genes

For this, you can try one of the new 'highlighting' features (for now, only available in the Development and GitHub versions), but it may not give you what you want, in which case you may have to manually encode the colour scheme too.


Hi Kevin,

Many thanks. I have tried to do as you say: I reordered the original data frame and then attempted to manually encode the colour scheme. Unfortunately, the plot does not appear to follow what I coded.

I wonder if you would be able to spot the issue more easily?

I have included the output I get in the link: https://ibb.co/Yfnc3PV

# import reordered df
library(readr)
EMTdbgenes.ord <- read_csv("res.smIHW.cp1df.rmNA.EMTord.csv")
toppadjgenes <- EMTdbgenes.ord[order(EMTdbgenes.ord$padj), ]
top20padj <- head(toppadjgenes$symbol, n=20)
topLFCgenes <- EMTdbgenes.ord[order(-abs(EMTdbgenes.ord$shrunkLFC)), ]
top20LFCgenes <- head(topLFCgenes$symbol, n=20)

# define genes that will show as shape 17 and black colour
VolcanoEMTgenes <- as.vector(EMTgenes$`intersect(EMTdbgenes$GeneSymbol, res.smIHW.cp1df.rmNA.sig$symbol)`)

keyvals.colour <- ifelse(abs(EMTdbgenes.ord$shrunkLFC) < 1 & EMTdbgenes.ord$padj > .05, 'grey30',
  ifelse(abs(EMTdbgenes.ord$shrunkLFC) >= 1 & EMTdbgenes.ord$padj > .05, 'forestgreen',
  ifelse(abs(EMTdbgenes.ord$shrunkLFC) < 1 & EMTdbgenes.ord$padj <= .05, 'royalblue',
  ifelse(abs(EMTdbgenes.ord$shrunkLFC) >= 1 & EMTdbgenes.ord$padj <= .05, 'red2',
  ifelse(EMTdbgenes.ord$symbol %in% VolcanoEMTgenes & abs(EMTdbgenes.ord$shrunkLFC) >= 0 & EMTdbgenes.ord$padj <= 1, 'black', 'gold')))))
keyvals.colour[is.na(keyvals.colour)] <- 'gold'
names(keyvals.colour)[keyvals.colour == 'grey30'] <- 'Not sig.'
names(keyvals.colour)[keyvals.colour == 'forestgreen'] <- 'Log2 FC'
names(keyvals.colour)[keyvals.colour == 'royalblue'] <- 'p-value'
names(keyvals.colour)[keyvals.colour == 'red2'] <- 'p-value & Log2 FC'
names(keyvals.colour)[keyvals.colour == 'black'] <- 'EMT'

# create custom key-value pairs for defined genes
# this can be achieved with ifelse statements
keyvals.shape <- ifelse(EMTdbgenes.ord$symbol %in% VolcanoEMTgenes, 17, 19)
keyvals.shape[is.na(keyvals.shape)] <- 19
names(keyvals.shape)[keyvals.shape == 17] <- 'EMT genes'
names(keyvals.shape)[keyvals.shape == 19] <- 'non-EMT'


library(EnhancedVolcano)
EnhancedVolcano(EMTdbgenes.ord,
                lab = EMTdbgenes.ord$symbol,
                x = 'shrunkLFC',
                y = 'padj',
                title = '',
                subtitle = '',
                pCutoff = .05,
                FCcutoff = 1,
                pointSize = 3.0,
                labSize = 4.0,
                legendLabels=c('Not sig.','Log2 FC','p-value',
                               'p-value & Log2 FC'),
                selectLab = c(top20LFCgenes, top20padj),
                shapeCustom = keyvals.shape,
                colCustom = keyvals.colour,
                drawConnectors = TRUE,
                widthConnectors = 0.2,
                colConnectors = 'grey30')


Sorry, yes, this is getting complex, isn't it.

For the default colour scheme, in the main code, what I'm doing is:

toptable <- as.data.frame(toptable)
toptable$Sig <- 'NS'
toptable$Sig[(abs(toptable[[x]]) > FCcutoff)] <- 'FC'
toptable$Sig[(toptable[[y]] < pCutoff)] <- 'P'
toptable$Sig[(toptable[[y]] < pCutoff) &
  (abs(toptable[[x]]) > FCcutoff)] <- 'FC_P'
toptable$Sig <- factor(toptable$Sig,
  levels=c('NS','FC','P','FC_P'))

[ source: https://github.com/kevinblighe/EnhancedVolcano/blob/master/R/EnhancedVolcano.R#L262-L269 ]

Perhaps following that (but replacing with the colour names) would be easier in this case. The order of these commands is critical for it to work. The mappings are:

toptable[[x]] --> fold-changes
toptable[[y]] --> p-values

Then, do the final part, ifelse(EMTdbgenes.ord$symbol %in% VolcanoEMTgenes & abs(EMTdbgenes.ord$shrunkLFC) >= 0, as a separate command.

The legend position would also work better on the left or right, as there is not enough room horizontally.

You may also try not setting legendLabels to see how the function behaves, and then setting it again (it will require 5 values I think).


Hi Kevin,

Thank you for your continued responses. I'm not sure I completely follow (I still consider myself a novice at this).

Do you mean something like this?

I get the error 'Error: Aesthetics must be either length 1 or the same as the data (26242): colour'

Perhaps it's an easy fix?

library(readr)
EMTdbgenes.ord <- read_csv("res.smIHW.cp1df.rmNA.EMTord.csv")
toppadjgenes <- EMTdbgenes.ord[order(EMTdbgenes.ord$padj), ]
top20padj <- head(toppadjgenes$symbol, n=20)
topLFCgenes <- EMTdbgenes.ord[order(-abs(EMTdbgenes.ord$shrunkLFC)), ]
top20LFCgenes <- head(topLFCgenes$symbol, n=20)

 # define genes that will show as shape 17
VolcanoEMTgenes <- as.vector(EMTgenes$`intersect(EMTdbgenes$GeneSymbol, res.smIHW.cp1df.rmNA.sig$symbol)`)

EMTdbgenes.ord$Sig <- 'grey30'
EMTdbgenes.ord$Sig[(abs(EMTdbgenes.ord$shrunkLFC) >= 1)] <- 'forestgreen'
EMTdbgenes.ord$Sig[(EMTdbgenes.ord$padj <= 0.05)] <- 'royalblue'
EMTdbgenes.ord$Sig[(EMTdbgenes.ord$padj <= 0.05) & (abs(EMTdbgenes.ord$shrunkLFC) >= 1)] <- 'red2'
EMTdbgenes.ord$Sig <- factor(EMTdbgenes.ord$Sig, levels=c('grey30','forestgreen','royalblue','red2')) 

keyvals.colour <- ifelse(EMTdbgenes.ord$symbol %in% VolcanoEMTgenes, 'black', 'gold')

# create custom key-value pairs for defined genes
# this can be achieved with ifelse statements
keyvals.shape <- ifelse(EMTdbgenes.ord$symbol %in% VolcanoEMTgenes, 17, 19)
keyvals.shape[is.na(keyvals.shape)] <- 19
names(keyvals.shape)[keyvals.shape == 17] <- 'EMT genes'
names(keyvals.shape)[keyvals.shape == 19] <- 'non-EMT'


library(EnhancedVolcano)
EnhancedVolcano(EMTdbgenes.ord,
                lab = EMTdbgenes.ord$symbol,
                x = 'shrunkLFC',
                y = 'padj',
                title = '',
                subtitle = '',
                pCutoff = .05,
                FCcutoff = 1,
                pointSize = 3.0,
                labSize = 4.0,
                selectLab = c(top20LFCgenes, top20padj),
                shapeCustom = keyvals.shape,
                colCustom = keyvals.colour,
                drawConnectors = TRUE,
                widthConnectors = 0.2,
                colConnectors = 'grey30')

Oh no, I was just providing a sort of template. In place of EMTdbgenes.ord$Sig, you would have keyvals.colour. However, you'd have to start off with:

keyvals.colour <- rep('grey30', nrow(EMTdbgenes.ord))

..and then proceed with:

keyvals.colour[(abs(EMTdbgenes.ord$shrunkLFC) >= 1)] <- 'forestgreen'
... ...
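
As a sketch of how the whole sequence could look on made-up data (the `lfc` and `padj` vectors below are hypothetical stand-ins for EMTdbgenes.ord$shrunkLFC and EMTdbgenes.ord$padj, and the cutoffs are the ones used in this thread):

```r
# Hypothetical toy values standing in for shrunkLFC and padj
lfc  <- c(0.2, 1.5, 0.3, 2.4)
padj <- c(0.50, 0.30, 0.01, 0.001)

# Start with the 'not significant' colour for every gene
keyvals.colour <- rep('grey30', length(lfc))

# Each later assignment overwrites earlier ones, so the most
# specific condition (both cutoffs passed) has to come last
keyvals.colour[abs(lfc) >= 1] <- 'forestgreen'
keyvals.colour[padj <= 0.05] <- 'royalblue'
keyvals.colour[padj <= 0.05 & abs(lfc) >= 1] <- 'red2'
print(keyvals.colour)
```

Because the vector is created with one element per row and only ever modified by indexing, its length always matches the data.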

Ah, I see.

So how can I avoid overriding these colours with the ifelse statement?

To confirm, I just want to overwrite the colour of the EMT genes with black.

keyvals.colour <- rep('grey30', nrow(EMTdbgenes.ord))
keyvals.colour[(abs(EMTdbgenes.ord$shrunkLFC) >= 1)] <- 'forestgreen'
keyvals.colour[(EMTdbgenes.ord$padj <= 0.05)] <- 'royalblue'
keyvals.colour[(EMTdbgenes.ord$padj <= 0.05) & (abs(EMTdbgenes.ord$shrunkLFC) >= 1)] <- 'red2'
keyvals.colour <- ifelse(EMTdbgenes.ord$symbol %in% VolcanoEMTgenes, 'black', '**I dont want to override previous colours here**')

I am not sure what you mean. The first 4 lines of your code account for every variable (gene) in your data. If you want to add an additional colour for EMT genes, then, by default, this will over-ride some of the original colours, and add a new [fifth] colour.

Perhaps you want some other way to do this, but you are already using a different shape for these EMT genes. There are a few suggestions in the vignette from this part onward: https://bioconductor.org/packages/devel/bioc/vignettes/EnhancedVolcano/inst/doc/EnhancedVolcano.html#encircle-highlight-certain-variables


Many thanks for your continued help Kevin.

Perhaps I am confused, but doesn't 'gold' in

keyvals.colour <- ifelse(EMTdbgenes.ord$symbol %in% VolcanoEMTgenes, 'black', 'gold')

lead to overwriting all of the colours I have already set in lines 1-4 of my last message?

table(keyvals.colour)
keyvals.colour
black  gold 
  275 25967

Instead, is there a way I can overwrite only the positions at which EMTdbgenes.ord$symbol %in% VolcanoEMTgenes is TRUE with black? Something like:

keyvals.colour[EMTdbgenes.ord$symbol %in% VolcanoEMTgenes] <- 'black'

(I'm not very good at base R and would be grateful for some guidance on this.)


Yes, there you would just instead need:

keyvals.colour[EMTdbgenes.ord$symbol %in% VolcanoEMTgenes] <- 'black'

This may also work, assuming that you already have set up keyvals.colour with the previous 4 colours / levels:

keyvals.colour <- ifelse(EMTdbgenes.ord$symbol %in% VolcanoEMTgenes,
  'black',
  keyvals.colour)

Then you still have to add the names, which should then not be too difficult.
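
A minimal sketch of these last two steps on made-up data. The `symbols`, `emt` and starting `keyvals.colour` values are hypothetical, and the `legend.labels` lookup vector is just one convenient way to attach the names; the repeated names(keyvals.colour)[keyvals.colour == ...] assignments used earlier in this thread would work equally well.

```r
# Hypothetical toy data; variable names mirror those in the thread
symbols <- c("CDH1", "GAPDH", "VIM", "ACTB")
emt     <- c("CDH1")   # assumed EMT gene list
keyvals.colour <- c('red2', 'grey30', 'royalblue', 'forestgreen')

# Overwrite only the EMT positions; all other colours are left alone
keyvals.colour[symbols %in% emt] <- 'black'

# Look up a legend label for each colour and attach it as the name
legend.labels <- c(grey30 = 'Not sig.', forestgreen = 'Log2 FC',
                   royalblue = 'p-value', red2 = 'p-value & Log2 FC',
                   black = 'EMT')
names(keyvals.colour) <- unname(legend.labels[keyvals.colour])
print(keyvals.colour)
```

The named vector can then be passed as colCustom in the same way as in the calls above.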


Thank you Kevin that has worked.
