topGO question
H. Stotz • 0
@9dc04a2c

I am trying to use the topGO package but I get this error message "Error in .local(.Object, ...) : allGenes must be a named vector" when I execute the following command.

# Data preparation of reference dataset
selGenes <- genefilter(fitted, filterfun(pOverA(0.20, log2(100)), function(x) (IQR(x) > 0.25)))
eSet <- fitted[selGenes, ]
AllNames <- rownames(eSet)
head(AllNames)
as.factor(AllNames)

## My genes of interest
IntGenes <- read.csv("D1D0_genes2.csv", header = TRUE) # 2-fold or more
## Convert dataframe to matrix with row and column names
IntGenes2 <- IntGenes[,-1]
rownames(IntGenes2) <- IntGenes[,1]
GeNamen <- rownames(IntGenes2)
head(GeNamen)
as.factor(GeNamen)

## Set up connection to ensembl database
ensembl <- useMart(biomart = "plants_mart", dataset = "bnapus_eg_gene",
                   host = "plants.ensembl.org")
# list the available datasets (species)
listDatasets(ensembl) %>% filter(str_detect(description, "Brassica"))
# specify a data set to use
ensembl = useDataset("bnapus_eg_gene", mart=ensembl)

#Get Ensembl gene IDs and GO terms
GTOGO <- getBM(attributes = c("external_gene_name", 
                              "go_id"),
                              mart = ensembl)
head (GTOGO)
#Remove blank entries
GTOGO <- GTOGO[GTOGO$go_id != '',]
# convert from table format to list format
geneID2GO <- by(GTOGO$go_id,
                GTOGO$external_gene_name,
                function(x) as.character(x))
# examine result
head(geneID2GO)

GOdata <- new("topGOdata",
              description = "GO analysis of 1 dpi vs mock",
              ontology = "BP",
              allGenes = AllNames,
              geneSel = GeNamen,
              annot = geneID2GO,
              nodeSize = 5)

I looked at the Ensembl annotations and noticed that the gene names that are commonly used in publications correspond to "external_gene_name" not the "ensembl_gene_id". Is this why it is not working? Do I have to access the "ensembl_gene_id"?

Thank you,

Henrik

@james-w-macdonald-5106

The short answer is that you need to read the error message. It says that allGenes has to be a named vector. This is referring to your object AllNames. Is AllNames a vector? Is it a named vector?
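A quick way to answer those two questions is to inspect the object directly. This is only a minimal sketch: the gene IDs below are hypothetical stand-ins for `rownames(eSet)`, not real data from the post.

```r
# Hypothetical stand-in for rownames(eSet) in the question
AllNames <- c("geneA", "geneB", "geneC")

is.vector(AllNames)        # TRUE: it is a (character) vector
is.null(names(AllNames))   # TRUE: but it carries no names

# Calling as.factor() without assigning the result only prints a factor;
# AllNames itself is left unchanged
as.factor(AllNames)
class(AllNames)            # still "character"
```

So the object is a plain, unnamed character vector, which is what the error message is complaining about.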

It's often instructive when working with topGO, which has IMO suboptimal help pages, to do things like

library(topGO)
data(geneList)
head(geneList)
1095_s_at   1130_at   1196_at 1329_s_at 1340_s_at 1342_g_at 
1.0000000 1.0000000 0.6223795 0.5412240 1.0000000 1.0000000

To see what the data used in the vignette look like, so you can emulate that.
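For a predefined list of interesting genes, one shape topGO accepts for allGenes is a two-level factor (0/1) whose names are the gene IDs. The following is a rough sketch under that assumption; the gene IDs are hypothetical placeholders for `AllNames` and `GeNamen` from the question, and the `new("topGOdata", ...)` call is left commented out because it needs the poster's `geneID2GO` mapping.

```r
# Hypothetical placeholders for rownames(eSet) and the genes of interest
AllNames <- c("geneA", "geneB", "geneC", "geneD", "geneE")
GeNamen  <- c("geneB", "geneD")

# 1 = gene of interest, 0 = background gene
geneList <- factor(as.integer(AllNames %in% GeNamen), levels = c(0, 1))

# The names are what make this a *named* vector
names(geneList) <- AllNames

str(geneList)
table(geneList)

# With a named two-level factor, geneSel is not needed. The gene-to-GO
# mapping is passed via gene2GO, with annFUN.gene2GO as the annotation
# function (assumes geneID2GO is a named list of GO ID vectors):
# GOdata <- new("topGOdata",
#               description = "GO analysis of 1 dpi vs mock",
#               ontology = "BP",
#               allGenes = geneList,
#               annot = annFUN.gene2GO,
#               gene2GO = geneID2GO,
#               nodeSize = 5)
```

Note that the names of `geneList` have to be the same kind of identifier as the names of the gene-to-GO mapping, otherwise no genes will be annotated.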
