Question

Need help with FoldGO input -> FuncAnnotGroupsTopGO function

lessismore

Hi there!

I have been looking at FoldGO today and wanted to try its functions on some of my own data, following the workflow published on Bioconductor (https://www.bioconductor.org/packages/release/bioc/vignettes/FoldGO/inst/doc/vignette.html).

However, I quickly ran into an error, although I believe all my inputs have the right format.

I have a data frame containing gene IDs and fold changes, along with p and q values and the corresponding GO IDs.

> str(minimal_GOfiltered)
'data.frame':   6670 obs. of  5 variables:
 $ ProteinID                  : chr  "PMI19_RS05545" "PMI19_RS05540" "PMI19_RS05525" "PMI19_05982" ...
 $ Student.s.T.test.Difference: num  0.276 15.231 11.235 4.967 0.216 ...
 $ Student.s.T.test.p.value   : num  9.92e-02 1.07e-05 4.20e-06 8.09e-05 4.48e-01 ...
 $ Student.s.T.test.q.value   : num  0.0541 0 0 0 0.4264 ...
 $ GO.IDs                     : chr  "GO:0009056; GO:0009058; GO:0034641; GO:0003677; GO:0003700; GO:0005737" "GO:0009056; GO:0009058; GO:0016491; GO:0043167" "GO:0006520; GO:0006629; GO:0006950; GO:0009056; GO:0009058; GO:0034641; GO:0016491; GO:0005576; GO:0005737" "GO:0000003; GO:0006950; GO:0030154; GO:0040007; GO:0048646; GO:0022857; GO:0005764; GO:0005768; GO:0005886" ...

From this data frame I retain all significantly upregulated genes and create gene groups like so:

foldGO_groups_up <- GeneGroups(minimal_GOfiltered %>% filter(Student.s.T.test.Difference > 0 & Student.s.T.test.q.value < 0.005), quannumber=6, logfold=T)

Then I create a vector of background genes from the original data frame:

foldGO_background_genes <- minimal_GOfiltered[,1]

and retrieve a map between GO terms and gene IDs that I created (and used successfully) in an earlier topGO analysis:

> head(GO2ID_map)
$ProteinID
[1] "GO.IDs"

$PMI19_RS05545
[1] "GO:0009056"

$PMI19_RS05545
[1] "GO:0009058"

$PMI19_RS05545
[1] "GO:0034641"
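For context, a gene-to-GO list of this general shape can be derived from the GO.IDs column shown above. This is only a sketch on a toy data frame: `toy_df` and `gene2go` are made-up names, and the "; " separator is assumed from the str() output.

```r
# Toy stand-in for minimal_GOfiltered (two rows copied from the str() output above).
toy_df <- data.frame(
  ProteinID = c("PMI19_RS05545", "PMI19_RS05540"),
  GO.IDs = c(
    "GO:0009056; GO:0009058; GO:0034641; GO:0003677; GO:0003700; GO:0005737",
    "GO:0009056; GO:0009058; GO:0016491; GO:0043167"
  ),
  stringsAsFactors = FALSE
)

# Split each ";"-separated string into a character vector of GO IDs
# and name each list element by its gene ID.
gene2go <- setNames(strsplit(toy_df$GO.IDs, ";\\s*"), toy_df$ProteinID)

str(gene2go)
```

Each element is then named by a gene ID and holds that gene's GO terms, i.e. the list runs from gene to GO terms.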

I thought I now had all the necessary input for the next step in the workflow:

foldGO_annotobj_up <- FuncAnnotGroupsTopGO(genegroups = foldGO_groups_up, bggenes = foldGO_background_genes, namespace = "BP", mapping = "custom", customAnnot = GO2ID_map, annot = topGO::annFUN.GO2genes)

However, I can't run this command, as I get this error message:

Building most specific GOs .....
    ( 0 GO terms found. )

Build GO DAG topology ..........
    ( 0 GO terms and 0 relations. )
Nothing to do:
Error in split.default(names(sort(nl)), f.index) : 
  first argument must be a vector

I can't figure out where I am deviating from the published workflow. Can anybody help? Thanks so much!

FoldGO GSEA

4.4 years ago · lessismore

score 0 · Answer 1 · 2020-12-03

Never mind, I found out what the mistake was.

My custom annotation pointed in the wrong direction: FoldGO uses the reverse of topGO's default GO term/gene ID mapping.

Although FoldGO uses topGO functions, it requires a list keyed by GO term, with the gene IDs as values, like this:

str(head(ID2GO_map))
List of 6
 $ GO:0000003: chr [1:117] "PMI19_05982" "PMI19_05614" "PMI19_05222" "PMI19_04833" ...
 $ GO:0000014: chr "PMI20_01122"
 $ GO:0000027: chr [1:4] "PMI19_01612" "PMI20_03196" "PDC04_CDS_03654" "PP_RS23815"

while topGO by default requires a list keyed by gene ID, with the GO terms as values:

str(head(GO2ID_map))
List of 6
 $ ProteinID    : chr "GO.IDs"
 $ PMI19_RS05545: chr [1:6] "GO:0009056" "GO:0009058" "GO:0034641" "GO:0003677" ...
 $ PMI19_RS05540: chr [1:4] "GO:0009056" "GO:0009058" "GO:0016491" "GO:0043167"
 $ PMI19_RS05525: chr [1:9] "GO:0006520" "GO:0006629" "GO:0006950" "GO:0009056" ...
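In case it helps anyone else: a gene-to-GO list like GO2ID_map can be turned into the GO-to-gene form by inverting it. Below is a minimal base-R sketch on toy data, not the exact code I ran; `gene2go` and `go2genes` are illustrative names, and dropping the stray ProteinID entry assumes it is just a header row that ended up in the list.

```r
# Toy gene -> GO list in the same shape as GO2ID_map above,
# including the header row that ended up as a list element.
gene2go <- list(
  ProteinID     = "GO.IDs",
  PMI19_RS05545 = c("GO:0009056", "GO:0009058", "GO:0034641"),
  PMI19_RS05540 = c("GO:0009056", "GO:0009058", "GO:0016491")
)

# Keep only values that look like GO IDs (this removes the header entry).
gene2go <- lapply(gene2go, function(ids) ids[grepl("^GO:[0-9]{7}$", ids)])
gene2go <- gene2go[lengths(gene2go) > 0]

# Invert: repeat each gene ID once per GO term, then group gene IDs by GO term.
go2genes <- split(
  rep(names(gene2go), lengths(gene2go)),
  unlist(gene2go, use.names = FALSE)
)

str(go2genes)

# Sanity check: every name of the inverted list should be a GO ID.
stopifnot(all(grepl("^GO:[0-9]{7}$", names(go2genes))))
```

A list in this shape (GO terms as names, gene IDs as values) is what topGO::annFUN.GO2genes expects. topGO's inverseList() helper should give an equivalent result if you prefer not to write the inversion by hand.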