The Bioconductor website offers an excellent RNA-Seq workflow vignette on Time-Course experiments however the ggplot2 only selects the gene with the smallest p-value, using the which.min() function:
data <- plotCounts(ddsTC, which.min(resTC$padj),
intgroup=c("minute","strain"), returnData=TRUE)
What I want to do is to either:
(A) make a plot showing multiple genes of interest (e.g. 4 genes from a gene family, in my case circadian genes)
(B) Ideally, create multi-panel plots where each plot is a gene of specific interest (e.g. isogroup298, isogroup4598)
Any help is greatly appreciated!
If the gene name is isogroup4222, then you use that! I think you are getting hung up on the way the gene name is being selected. If you type topGene at the R prompt, it should return "isogroup3584". As an example, from the workflow:
I could do either
and I would get the same exact results. If you use returnData = TRUE, you get
Now to get entirely explicit about what I was saying before, say we want the top 5 genes.
Don't get hung up on how I got that - it's just a vector of gene names. The rest of this is homework for you, to go through and figure out what I am doing.
Thank you very much James for your well written and well described answer. I don't have much computational support in my lab so figuring these things out can often be challenging - even with Google.
I wracked my brain for nearly two weeks on your "homework" but to no avail. I thought I may be able to do something with grep like:
But this grabbed me anything starting with isogroup3676 (isogroup367601, isogroup 36769, etc.)
Many Google searches later and I was still no closer to sub-setting my data.
Then I had the idea that I could look at DESeq2 contrasts results, find out how many DEGs there were then make alterations by upping the following code to 75 (for example):
The problem is that ggplot can only do a max of 1:25. So next I though maybe I run the analysis multiple times:
topGenes <- rownames(res)[order(res$padj)][1:25]1)
2)
topGenes <- rownames(res)[order(res$padj)][26:50]3) etc.
This still isn't ideal though.
I figured out how to subset vectors by gene names by doing the following:
interestingGene <- res[row.names(res) == "isogroup3676"]
Unfortunately, I can't figure out how to make it work in the bioconductor example they used for which.min()