How to retrieve the IDs of topGO significant genes after an enrichment test?
David ROUX ▴ 20
@david-roux-11055
Last seen 5.8 years ago
France (Avignon University)

Hello, I ran a topGO enrichment analysis on my RNA-seq data. The output looks like this:

> allRes
        GO.ID                                        Term Annotated Significant Expected classicFisher
1  GO:0043531                                 ADP binding       406          21     4.05       6.5e-10
2  GO:0016760 cellulose synthase (UDP-forming) activit...        29           3     0.29        0.0029
3  GO:0015035 protein disulfide oxidoreductase activit...        71           4     0.71        0.0055
4  GO:0004089              carbonate dehydratase activity        17           2     0.17        0.0122
5  GO:0016765 transferase activity, transferring alkyl...        59           3     0.59        0.0210

Example: the first line shows 406 annotated and 21 significant genes.

According to the vignette, the topGO “sigGenes()” function appears to retrieve only the annotated genes. Here it fetches the IDs of the 406 annotated genes.

The vignette then proposes to use the “printGenes()” function, but “only when the chip used has an annotation package available in Bioconductor”.

Here we are working on Prunus persica, for which no annotation package is available. So, how can I get the IDs of the 21 significant genes in my example?

Many thanks in advance.

David ROUX ▴ 20
@david-roux-11055
Last seen 5.8 years ago
France (Avignon University)

I am answering my own question, in case it helps someone later. :-)

I found the solution in other threads (https://support.bioconductor.org/p/65856/ and https://www.biostars.org/p/239032/ ).

A simple way is to reuse the “genesOfInterest” list created earlier in the topGO pipeline, i.e. here:

geneListTemp <- read.csv("Diff_Express_Genes_liste.csv",header=TRUE) 
genesOfInterest <- geneListTemp[,1]

Later, according to the topGO vignette, we do:

topGO_results <- GenTable(myGOdata, etc… )

Finally, with the following statement, we can produce the list of significant gene IDs for each significant GO node reported by GenTable():

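# For each GO term in the table, fetch its annotated genes and keep
# only those that are also in genesOfInterest.
topGO_results$genes <- sapply(topGO_results$GO.ID, function(x) {
  genes <- genesInTerm(myGOdata, x)
  genes[[1]][genes[[1]] %in% genesOfInterest]
})
View(topGO_results)  # opens the table in a viewer (interactive sessions only)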
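
To make the filtering step easier to follow, here is a minimal sketch in base R that does not need topGO. The named list `annotated` is only a stand-in for what genesInTerm() returns (GO IDs mapped to annotated gene IDs), and all gene IDs are made up for illustration.

```r
# Toy stand-in for genesInTerm(myGOdata, go_ids):
# a named list mapping each GO ID to its annotated gene IDs (hypothetical IDs).
annotated <- list(
  "GO:0043531" = c("geneA", "geneB", "geneC", "geneD"),
  "GO:0016760" = c("geneC", "geneE")
)

# Hypothetical list of differentially expressed genes.
genesOfInterest <- c("geneB", "geneC", "geneX")

# Keep, for each GO term, only the annotated genes that are of interest.
sig_by_term <- lapply(annotated, function(ids) ids[ids %in% genesOfInterest])

# Number of retained genes per term.
sig_counts <- lengths(sig_by_term)
print(sig_by_term)
print(sig_counts)
```

On real data, the per-term counts obtained this way should normally agree with the "Significant" column of GenTable(), which makes a quick sanity check.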

And these last lines will produce a clean CSV table:

# Collapse each vector of gene IDs into one comma-separated string.
# (Stripping the deparsed "c(...)" text with gsub("[c()]", "", ...) would
# also delete every letter "c" inside the gene IDs themselves.)
topGO_results$genes <- sapply(topGO_results$genes, paste, collapse = ", ")
topGO_results <- as.data.frame(topGO_results)
write.csv(topGO_results, file = "out.csv")
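
As a small illustration of the export step, the sketch below builds a toy table with a list column, joins each gene vector into a single string, and reads the CSV back to check the result. The data frame, the gene IDs and the ";" separator are assumptions made for the example, not part of the topGO output.

```r
# Toy results table; GO IDs come from the thread, gene IDs are made up.
results <- data.frame(GO.ID = c("GO:0043531", "GO:0016760"),
                      stringsAsFactors = FALSE)

# A list column: one character vector of gene IDs per GO term.
results$genes <- list(c("gene_c1", "gene_b2"), "gene_c3")

# Join each vector into one string so the column can be written to CSV.
results$genes <- vapply(results$genes, paste, character(1), collapse = ";")

# Write to a temporary file and read it back.
out_file <- tempfile(fileext = ".csv")
write.csv(results, out_file, row.names = FALSE)
round_trip <- read.csv(out_file, stringsAsFactors = FALSE)
print(round_trip)
```

Joining with paste() leaves the gene IDs untouched, so letters and parentheses that happen to occur in an ID survive the export.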

Best.
