MAplot() uses ggplot2 to do the plotting, so it's not as direct to deal with as base graphics would be. You first need to capture the output.
p <- MAplot(genes(cuff_data),'LC','LT',logMode=T,pseudocount=1,smooth=F)
I think there is an iron clad law that all ggplot objects are named 'p', so consider yourself warned...
You then tell ggplot what else you want to do, thusly
p <- p + geom_point(aes(A, log2(M), colour = factor(ifelse(log2(M) < 2, 1,2))), size = 0.8) + geom_hline(yintercept = c(-2,2)) + theme(legend.position = "none") + scale_colour_manual(values = c("black","red"))
The geom_point business just adds the color specification, the geom_hline just adds the horizontal lines, theme just tells ggplot not to make a legend, and the scale_colour_manual sets the colors to black and red, instead of the defaults of salmon and aqua. See the ggplot help for more information.
Edit: I forgot - you won't get a plot until you type 'p' at the R prompt
Thanks a lot James! That works really well!
But I´ve got two more questions:
a- The script you wrote only paint in red the points log2(M)>2. Is there a way of making the ones with log2(M)<-2 also red? I´m trying to solve the problem but I have not solve it yet.
b- Is there a way of drawing the p<0.01 genes (points) in red, despite their log2(M)?
Thank you in advanced!
Pablo
The first part of the command is this:
And the part that sets the color is
So I obviously missed a step, because you want the absolute value of
log2(M).
Sometimes it is hard to find the function you want, so in this case you should use apropos(). Also note that it is better to use shorter arguments rather than longer for apropos():And I'll let you take it from there.
As for question (b), the answer is always yes. This is R after all. Unless of course you mean 'is there an argument that I can use to make that happen', in which case the answer appears to be no. However, as I said, this is R so you can do pretty much whatever you want, if you are willing to work for it.
The problem here is that MAplot() doesn't compute a p-value, so you need to supply one. Also, MAplot() just returns a 'gg' object, so we have to get all hackalicious. I'll use
example(MAplot)
for this next step.This creates data that I can now use, and also creates a 'gg' object (called 'd') that we are going to hack on. MAplot() doesn't output the M and A values directly - they are in the gg object, which is just a fancy list.
We don't really want to jam all sorts of things into d, but instead want to use the functions in ggplot2 to do it for us. So we extract the data object from d, and modify:
Note that the second step I am doing is not what you want to do. You instead want to put the p-values in z, in the correct order. I assume you have a data.frame floating about that has the transcript IDs as well as p-values? If so, let's say you called it d.f. I further assume this data.frame has a column called 'gene_id' that has the gene IDs, and a column called 'p.value' that has the p-values. This is probably not true, so you need to modify to suit. Then you want to do something like
EDIT - did this backwards the first time
You should look at ?match to figure out what I am doing there. Now you can plot the MAplot.
> d + geom_point(data = z, aes(A, log2(M), colour = factor(ifelse(p > 0.05, 1, 2)))) + theme(legend.position = "none") + scale_colour_manual(values = c("black","red"))
Note that I edited the last comment.
Thank you very much! That works well until the last comment. Sorry about the silly question, but I really do not know what do you mean about the data.frame. I understand that is the file where my data is, however I do not know which is in particular. This file was created when I wrote:
a<-readCufflinks(system.file("extdata", package="cummeRbund")) #Create CuffSet object from sample data
genes<-a@genes #Create CuffData object for all 'genes'
d<-MAplot(genes,'hESC','Fibroblasts')
(?)
I can't help you any further. I don't use cufflinks, nor cummeRbund, as none of my collaborators use either enough replicates nor sufficient depth for something like that to make sense.
If you want to color things using p-values then you will have to figure out how you generate p-values using cummeRbund (there is a vignette that may help), and then you will have to add the p-values to the data as I showed above.
THANKS James! I will try to do it!
CheerS!
Sorry I am writting again, but I´ve got a question about your script. I add a column of my p-values to the z data. I could make it. However, when I write
The pvalues number changes and have nothing to do with the real ones. So, when I draw the graph with
> d + geom_point(data = z, aes(A, log2(M), colour = factor(ifelse(p > 0.05, 1, 2)))) + theme(legend.position = "none") + scale_colour_manual(values = c("black","red"))
The red points did not correspond with the gene_id with p_value<0.05.
However, when I try to draw the plot without writing
Nothings gets red. Do you know why or how can I solve this?
Thanks in advanced
Yes. It's because you aren't paying close enough attention.
Here is the relevant part from my post:
Note that the second step I am doing is not what you want to do. You instead want to put the p-values in z, in the correct order.
The thing is I already put the p-values in z, in the correct order. However when I write:
> d + geom_point(data = z, aes(A, log2(M), colour = factor(ifelse(p > 0.05, 1, 2)))) + theme(legend.position = "none") + scale_colour_manual(values = c("black","red"))
No point gets red. I guess it is something related with colour = factor(ifelse(p > 0.05, 1, 2)); but I can´t realize what it is.
The z data is the following table:
gene_id A M p
1 XLOC_000001 0.3925539832 0.99284785 0.8965
2 XLOC_000002 0.0657089848 6.32477626 1
3 XLOC_000003 1.4985551631 1.19541239 0.0448
4 XLOC_000004 0.0000000000 NaN 1
5 XLOC_000005 0.6504746723 1.06764279 0.7078
6 XLOC_000006 0.0581513680 0.66861219 1
7 XLOC_000007 0.4789051940 0.68426498 0.00235
Thanks in advanced again!
Are the p-values you add to z numeric or factor?