usage of topGO after limma
I.V. Lynn • 0
@iv-lynn-20357
Last seen 5.8 years ago
Russia/Irkutsk/SIPPB SB RAS

Hi!

I'm trying to carry out GO enrichment analysis of microarray data, but I can't work out how to adapt the limma output as input for topGO. For example, I have a data.frame containing gene symbols, expression values, t-statistics, B-statistics and adjusted p-values; basically the results of:

library(limma)

# Read the targets file and the Agilent two-colour arrays
trgts <- readTargets("targets4.csv", sep = ";")
rough <- read.maimages(trgts, source = "agilent",
                       columns = list(R = "rDyeNormSignal", G = "gDyeNormSignal",
                                      rIsFeatNonUnifOL = "rIsFeatNonUnifOL", gIsFeatNonUnifOL = "gIsFeatNonUnifOL",
                                      rIsBGNonUnifOL = "rIsBGNonUnifOL", gIsBGNonUnifOL = "gIsBGNonUnifOL",
                                      rIsFeatPopnOL = "rIsFeatPopnOL", gIsFeatPopnOL = "gIsFeatPopnOL",
                                      rIsBGPopnOL = "rIsBGPopnOL", gIsBGPopnOL = "gIsBGPopnOL",
                                      rIsSaturated = "rIsSaturated", gIsSaturated = "gIsSaturated"),
                       other.columns = c("rIsFeatNonUnifOL", "gIsFeatNonUnifOL", "rIsBGNonUnifOL", "gIsBGNonUnifOL",
                                         "rIsFeatPopnOL", "gIsFeatPopnOL", "rIsBGPopnOL",
                                         "gIsBGPopnOL", "rIsSaturated", "gIsSaturated"),
                       annotation = c("accessions", "chr_coord", "Sequence",
                                      "ProbeUID", "ControlType", "ProbeName", "GeneName",
                                      "SystematicName", "Description"))

# Normalize between arrays, then average replicate probes
roughbet <- normalizeBetweenArrays(rough, method = "Aquantile")
roughave <- avereps(roughbet, ID = roughbet$genes$ProbeName)
design <- modelMatrix(trgts, ref = "Col0")

# Fit the linear model and compute moderated statistics
fitRC <- lmFit(roughave, design)
fitRC <- eBayes(fitRC)

# Keep probes with |logFC| >= 1 and BH-adjusted p < 0.05, then drop control probes
signifC <- topTable(fitRC, coef = "mut1", lfc = 1, p.value = 0.05, adjust.method = "BH", number = Inf)
signifCC <- signifC[signifC$ControlType == 0, ]

# The next function annotates probes from the Agilent database table (AGIDB2)
# and cbinds the probe info, including GO IDs.
agilentannC <- function(x) {
  # Row of AGIDB2 matching each probe name (NA if the probe is not found)
  x$ID <- match(x$ProbeName, AGIDB2$ID)
  cbind(x, AGIDB2[x$ID, ])
}
XannotateC <- agilentannC(signifCC)
CCC <- XannotateC
row.names(CCC) <- CCC$GENE_SYMBOL
PREPAREDC <- CCC
mut1data <- PREPAREDC

So I have all of this: a table of genes of interest, selected by logFC and p-value, with their p-values, logFC, average expression and even GO IDs. I really can't work out how to build a topGOdata object from it. Please help!

> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=Russian_Russia.1251  LC_CTYPE=Russian_Russia.1251    LC_MONETARY=Russian_Russia.1251
[4] LC_NUMERIC=C                    LC_TIME=Russian_Russia.1251    

attached base packages:
 [1] grid      stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] limma_3.38.3         Rgraphviz_2.26.0     hgu95av2.db_3.2.3    org.Hs.eg.db_3.7.0   topGO_2.34.0        
 [6] SparseM_1.77         GO.db_3.7.0          AnnotationDbi_1.44.0 IRanges_2.16.0       S4Vectors_0.20.1    
[11] Biobase_2.42.0       graph_1.60.0         BiocGenerics_0.28.0 

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0         bit_1.1-14         lattice_0.20-38    blob_1.1.1         tools_3.5.3        DBI_1.0.0         
 [7] matrixStats_0.54.0 yaml_2.2.0         bit64_0.9-7        digest_0.6.18      BiocManager_1.30.4 memoise_1.1.0     
[13] RSQLite_2.1.1      compiler_3.5.3     pkgconfig_2.0.2   
limma topGO microarray • 1.5k views

I go over a working example on Biostars, here: https://www.biostars.org/p/350710/

If your genes are not HGNC symbols, then you can use biomaRt to convert them. topGO also works with Ensembl gene IDs and Entrez identifiers.


Thank you for your reply. But I already have GO IDs from my annotation function, and my experiment uses an Arabidopsis thaliana Agilent microarray, so the genes carry TAIR identifiers such as AT5G15324. The problem is that I don't understand how to put my data into the topGOdata format.

GOdata <- new("topGOdata", ontology="BP", allGenes=???, annot = ???, GO2genes=???, geneSel=selection, nodeSize=10)

If I run it this way:

GOdata <- new("topGOdata", ontology="BP", allGenes=named vector of genes' p.values, annot = ???, genes2GO= data.frame contains genesymbol and goid columns, geneSel=I don't think I need It. my genes data is already a selection, nodeSize=10)

it doesn't work at all.

Ideally, there would be some way to construct the topGOdata object manually.
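One likely reason the call above fails is that a data.frame is passed where topGO's custom-annotation functions (annFUN.gene2GO, annFUN.GO2genes) expect a named list of character vectors. A minimal base-R sketch of that conversion follows. The column names genesymbol and goid come from the description above; the rows are made-up placeholders, not real annotations.

```r
# Hypothetical long-format annotation: one row per (gene, GO term) pair.
# The identifiers below are illustrative only.
ann <- data.frame(
  genesymbol = c("AT1G01010", "AT1G01010", "AT1G01020", "AT1G01030"),
  goid       = c("GO:0006355", "GO:0009414", "GO:0006355", "GO:0009414"),
  stringsAsFactors = FALSE
)

# Gene -> GO terms: named list, one character vector of GO IDs per gene
# (the shape expected by annFUN.gene2GO via the gene2GO argument).
gene2GO <- split(ann$goid, ann$genesymbol)

# GO term -> genes: the inverse mapping
# (the shape expected by annFUN.GO2genes via the GO2genes argument).
GO2genes <- split(ann$genesymbol, ann$goid)

str(gene2GO)
```

Only one of the two lists is needed, together with the matching annot function.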


You may follow this previous example on Biostars: https://www.biostars.org/p/250927/#250936
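For a custom annotation such as the one in this thread, the construction could look roughly like the sketch below. It is not taken from the linked post, and it rests on several assumptions: the toy table stands in for an unfiltered limma result (topTable(..., number = Inf) without the lfc and p.value cut-offs, because topGO needs the whole gene universe and not only the significant genes), the GO_ID column with semicolon-separated IDs is hypothetical, and all values are made up. nodeSize = 1 is used only because the toy data is tiny.

```r
# Toy stand-in for an annotated, UNFILTERED limma table (values are made up;
# the GO_ID column name and its ";" separator are assumptions).
tab <- data.frame(
  GENE_SYMBOL = c("AT1G01010", "AT1G01020", "AT1G01030", "AT1G01040"),
  adj.P.Val   = c(0.001, 0.20, 0.03, 0.80),
  GO_ID       = c("GO:0006355;GO:0009414", "GO:0006355", "GO:0009414", "GO:0009737"),
  stringsAsFactors = FALSE
)

# 1. Gene universe: named numeric vector of p-values, one entry per gene.
allGenes <- setNames(tab$adj.P.Val, tab$GENE_SYMBOL)

# 2. Gene -> GO map: named list of character vectors of GO IDs.
gene2GO <- setNames(strsplit(tab$GO_ID, ";", fixed = TRUE), tab$GENE_SYMBOL)

# 3. Selection function: topGO calls it on allGenes to flag genes of interest.
selection <- function(p) p < 0.05

# The topGO part runs only if the package is installed.
if (requireNamespace("topGO", quietly = TRUE)) {
  suppressPackageStartupMessages(library(topGO))
  GOdata <- new("topGOdata", ontology = "BP",
                allGenes = allGenes, geneSel = selection,
                annot = annFUN.gene2GO, gene2GO = gene2GO,
                nodeSize = 1)
  res <- runTest(GOdata, algorithm = "classic", statistic = "fisher")
  print(GenTable(GOdata, classicFisher = res, topNodes = 5))
}
```

With real data, nodeSize = 10 as in the call above is a more sensible setting, and a filtered table such as mut1data would have to be replaced by the full list of tested genes.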
