GO annotation and gene set testing for plant dataset
@myprogramming2016-9741
Last seen 7.5 years ago

Hi,

I am working on a dataset from a plant species that is a close relative of Arabidopsis.

I performed a differential expression analysis using edgeR and found significantly DE genes. I would like to identify the GO terms associated with them.

My transcripts are identified by scaffold names, not Entrez IDs. Is it possible to perform GO annotation and gene set testing in R for a plant dataset like this?
If not, could you please suggest a different tool?
 

Thanks in advance for your help.

 

go edger annotation topgo bioconductor
Aaron Lun ★ 28k
@alun
Last seen 3 hours ago
The city by the bay

All of the standard GO-based analysis pipelines assume that there is an existing relationship between the features (e.g., genes, transcripts) and the GO terms. If you have this information in your dataset, then it's easy; just define all transcripts belonging to a single GO term as a gene set, and then use standard methods like roast, camera, etc. for gene set overrepresentation/enrichment testing. If you don't have the existing GO relationships... well, it gets a lot harder. I would probably suggest finding the homologous gene in Arabidopsis for each of your transcripts, and using the GO annotation of the homologs to define your gene sets. I rarely work with Arabidopsis, but I would assume that it's been studied thoroughly enough to have good GO annotation.
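As a rough illustration of "define all transcripts belonging to a single GO term as a gene set", here is a minimal sketch in Python (chosen only for illustration; the thread itself concerns R). It assumes a hypothetical mapping from transcript IDs to GO terms, inverts it into one set per term, and converts each set into row positions in the expression matrix. The scaffold names and the term assignments are made up. In limma, `ids2indices` performs a comparable conversion, and the resulting list is what `roast` and `camera` accept as their `index` argument.

```python
from collections import defaultdict

# Hypothetical annotation: transcript ID -> GO terms (made-up assignments).
transcript_to_go = {
    "Scaffold_1": ["GO:0006950", "GO:0009414"],
    "Scaffold_2": ["GO:0006950"],
    "Scaffold_3": ["GO:0009414"],
}

# Assumed row order of the expression matrix used in the DE analysis.
row_ids = ["Scaffold_1", "Scaffold_2", "Scaffold_3", "Scaffold_4"]


def build_gene_sets(annotation):
    """Invert transcript -> terms into term -> set of transcripts."""
    sets = defaultdict(set)
    for transcript, terms in annotation.items():
        for term in terms:
            sets[term].add(transcript)
    return dict(sets)


def sets_to_indices(gene_sets, row_ids):
    """Convert each gene set to sorted 0-based row positions.

    Transcripts absent from the matrix are dropped. R indices are
    1-based, so add 1 before using these positions in R.
    """
    position = {t: i for i, t in enumerate(row_ids)}
    return {
        term: sorted(position[t] for t in members if t in position)
        for term, members in gene_sets.items()
    }


gene_sets = build_gene_sets(transcript_to_go)
index = sets_to_indices(gene_sets, row_ids)
```

Transcripts with no annotation (here `Scaffold_4`) simply belong to no set; they still count as background in a competitive test such as camera.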

@myprogramming2016-9741

Hi Aaron,

I identified the homologous Arabidopsis gene, and its GO annotation, for each of my transcripts using the Arabidopsis genome. I am not clear on how to define all transcripts belonging to a single GO term as a gene set. Could you please send me an example file and R code for roast and camera?

Thanks for your help!

 


I would have thought it was fairly self-explanatory. For each GO term, find all Arabidopsis genes annotated with that term; then, find the transcripts in your species that are homologous to those genes. The set of homologs constitutes the gene set corresponding to that GO term in your species. If that's not clear enough for you, then perhaps you need to find a local bioinformatician to help you out. This site isn't the place to get wholesale code for your analysis - well, not for free, anyway.
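The two lookups described above can be sketched as follows, in Python for illustration only. This is not code from the thread: the function name `homolog_gene_sets`, the input dictionaries, the scaffold names, and the pairing of Arabidopsis gene IDs with GO terms are all hypothetical. It assumes each transcript has at most one Arabidopsis homolog, with `None` marking transcripts that have none.

```python
from collections import defaultdict

# Hypothetical inputs (made-up pairings, for illustration only).
# GO term -> Arabidopsis genes annotated with that term.
go_to_ath_genes = {
    "GO:0009414": {"AT1G01010", "AT2G02020"},
    "GO:0006950": {"AT1G01010"},
    "GO:0008150": {"AT5G05050"},
}
# Transcript in the studied species -> its Arabidopsis homolog (or None).
transcript_to_homolog = {
    "Scaffold_1": "AT1G01010",
    "Scaffold_2": "AT2G02020",
    "Scaffold_3": "AT1G01010",
    "Scaffold_4": None,
}


def homolog_gene_sets(go_to_genes, transcript_to_homolog):
    """For each GO term, collect the transcripts homologous to its genes."""
    # Reverse the homolog table: Arabidopsis gene -> transcripts.
    homolog_to_transcripts = defaultdict(set)
    for transcript, gene in transcript_to_homolog.items():
        if gene is not None:
            homolog_to_transcripts[gene].add(transcript)

    gene_sets = {}
    for term, genes in go_to_genes.items():
        members = set()
        for gene in genes:
            members |= homolog_to_transcripts.get(gene, set())
        if members:  # skip terms with no homologs in this species
            gene_sets[term] = members
    return gene_sets


go_sets = homolog_gene_sets(go_to_ath_genes, transcript_to_homolog)
```

Several transcripts can share one homolog (here `Scaffold_1` and `Scaffold_3`), so a set may contain more transcripts than the term has Arabidopsis genes. Terms with no homologs in the dataset are dropped because an empty set cannot be tested.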

@myprogramming2016-9741

Thanks, Aaron!
