Converting gene symbol list to Entrez IDs
@imalumberjack-15042
Last seen 6.4 years ago

Hello all, 

I'm not very experienced with Bioconductor and R, and I am struggling to convert a list of gene symbols, which I've read into R from a .csv file, into their corresponding Entrez ID(s). Does anyone have any tips for how to address this? The code I've been attempting to use is the following:

> prog<-read.csv(file="mydata.csv", header=TRUE, sep="/")

> gns<-select(org.Hs.eg.db, prog, c("ENTREZID","GENENAME"))

Error in .testForValidKeys(x, keys, keytype, fks) :

  None of the keys entered are valid keys for 'ENTREZID'. Please use the keys method to see a listing of valid arguments.

Many thanks for your help!

org.hs.eg.db entrez gene identifiers genesymbols • 14k views

Where did you get the list of gene symbols from? From a published paper? I ask because many published sources include gene symbols that are no longer current official symbols.
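If some of the symbols do turn out to be outdated, one possible check is to look them up through the ALIAS keytype, which covers older names as well as current ones. This is only a sketch, assuming human genes and that org.Hs.eg.db is installed; the two symbols are illustrative and not taken from the original post.

```r
library(org.Hs.eg.db)  # also loads AnnotationDbi

# Illustrative symbols (not from the original post); "MLL" is an older name for KMT2A
old_symbols <- c("TP53", "MLL")

# Treat the input as aliases and ask for the current official symbol and Entrez ID.
# AnnotationDbi:: is spelled out because other packages (e.g. dplyr) also export select().
res <- AnnotationDbi::select(org.Hs.eg.db,
                             keys = old_symbols,
                             keytype = "ALIAS",
                             columns = c("SYMBOL", "ENTREZID"))
res
```

An alias can belong to more than one gene, so any symbol that comes back with several rows is worth checking by hand.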

Your file has a "csv" extension, suggesting that it is a comma-separated file, but then you specify sep="/". What gives with that? Can you show us the first few lines of your file? Does your data file have a column containing gene symbols?
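A quick way to answer these questions is to look at the raw lines before parsing. The sketch below writes a small stand-in file so that it runs on its own; the file contents and the column names (symbol, logFC) are hypothetical and should be replaced by the real mydata.csv.

```r
# Stand-in for mydata.csv so the sketch is self-contained; contents are hypothetical
path <- tempfile(fileext = ".csv")
writeLines(c("symbol,logFC", "TP53,1.2", "BRCA1,-0.8", "EGFR,2.1"), path)

# Look at the raw text first to see which separator the file really uses
readLines(path, n = 3)

# read.csv() already defaults to sep = ","; keep strings as character, not factor
prog <- read.csv(path, header = TRUE, stringsAsFactors = FALSE)
str(prog)
```

If str() reports a single column holding whole lines of text, the separator passed to read.csv does not match the file.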

What will you do with the Entrez Gene Ids when you get them? What will be the next step?

@james-w-macdonald-5106
Last seen 2 hours ago
United States

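You are passing a data.frame to select, rather than a character vector. Presumably one of the columns of prog contains the gene symbols, so you should subset to that column and tell select that the keys are symbols by setting keytype = "SYMBOL" (the default keytype is ENTREZID, which is why the error complains about invalid ENTREZID keys). Also note that in R versions before 4.0.0 the default of read.csv is to convert strings to factors, so you should probably include stringsAsFactors = FALSE in your call to read.csv.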
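Put together, the call might look like the following sketch. It assumes org.Hs.eg.db is installed and that the symbols sit in a column called symbol; that column name and the three genes are made up for illustration, so substitute whatever the real file contains.

```r
library(org.Hs.eg.db)  # also loads AnnotationDbi

# Hypothetical stand-in for the data.frame returned by read.csv()
prog <- data.frame(symbol = c("TP53", "BRCA1", "EGFR"),
                   stringsAsFactors = FALSE)

# select() needs a plain character vector of keys, not a data.frame or factor
symbols <- unique(as.character(prog$symbol))

# keytype says what the keys are; columns says what to return
gns <- AnnotationDbi::select(org.Hs.eg.db,
                             keys = symbols,
                             keytype = "SYMBOL",
                             columns = c("ENTREZID", "GENENAME"))
gns
```

The result is a data.frame with one row per symbol-to-ID mapping, so a symbol that maps to several Entrez IDs appears more than once.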

mat149 ▴ 80
@mat149-11450
Last seen 5 days ago
United States

Here is a code chunk that I use to convert zebrafish gene symbols to Entrez gene IDs:

("t" in this case is a character vector of a few genes that I'm interested in, but you can use the gene symbol column of your "read.csv" object instead)

library(org.Dr.eg.db)
keytypes(org.Dr.eg.db)
library(clusterProfiler)

t <- c("lepa","lepr","lepb","leprot")
et <- bitr(t, fromType="SYMBOL", toType=(c("ENTREZID","PATH","GO","ALIAS","GENENAME")), OrgDb="org.Dr.eg.db")
head(et)

and the reverse:

tt<-c("100150233","567241","564348","550484")
ett <- bitr(tt, fromType="ENTREZID", toType="SYMBOL", OrgDb="org.Dr.eg.db")
head(ett)
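For the human symbols in the original question, the same approach might look like the sketch below. It assumes clusterProfiler and org.Hs.eg.db are installed; the three symbols are illustrative stand-ins for the column read from the .csv file.

```r
library(clusterProfiler)
library(org.Hs.eg.db)

# Illustrative human symbols; replace with the symbol column from your own file
symbols <- c("TP53", "BRCA1", "EGFR")

# Same call as for zebrafish, but pointed at the human annotation package
eh <- bitr(symbols,
           fromType = "SYMBOL",
           toType = c("ENTREZID", "GENENAME"),
           OrgDb = org.Hs.eg.db)
head(eh)
```

By default bitr drops symbols that fail to map and prints a warning, so comparing nrow(eh) with length(symbols) is a quick check for losses.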