Question

Get Gene IDs for a list of Gene Names in R

Bayram Sarilmaz • 0

I have a large list of gene names, and I'd like to map each name to its corresponding gene ID. I've tried the R package org.Hs.eg.db, but it returns more IDs than names, which makes it hard to match the results back to the input, especially when the list is long.

Example of an input file (7 gene names):

RPS6KB2
PSME4
PDE4DIP
APMAP
TNRC18
PPP1R26
NAA20

Ideal output would be 7 IDs, one per gene name.

Current output (8 IDs):

6199
23198
9659
57136
27320 *undesired output ID*
84629
9858
51126

Any suggestions on how to solve this? How can I get rid of such multiple mappings so that each name gets exactly one ID?

This is the code I'm using:

library("org.Hs.eg.db")  # load the annotation package
input <- read.csv("myfile.csv", header = TRUE, sep = ",")  # read input file
GeneCol <- as.character(input$Gene.name)  # column that holds the gene names
output <- unlist(mget(x = GeneCol, envir = org.Hs.egALIAS2EG, ifnotfound = NA))  # get IDs
write.csv(output, file = "GeneIDs.csv")  # write the list of IDs to a CSV file
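
The count mismatch comes from flattening: mget() returns a list with one element per name, and unlist() spreads any element that holds several IDs over several entries. The base-R sketch below shows the effect and one way to keep a single ID per name. The names and IDs in it (geneA, 101, and so on) are made-up placeholders, not real mappings, and taking the first ID is an arbitrary choice made only for illustration.

```r
# Toy stand-in for the list mget() returns: one element per name, where a
# name may carry several IDs. All names and IDs here are made up.
hits <- list(geneA = "101", geneB = c("202", "203"), geneC = NA)

# Flattening yields one entry per ID, not per name.
flat <- unlist(hits)
length(flat)  # 4 entries for 3 names
names(flat)   # "geneA" "geneB1" "geneB2" "geneC"

# Names that matched more than one ID, worth inspecting by hand.
ambiguous <- names(hits)[lengths(hits) > 1]

# Keep exactly one ID per name (here: the first one).
first_id <- vapply(hits, function(x) as.character(x[1]), character(1))

# One row per input name, so names and IDs stay aligned.
mapped <- data.frame(Gene.name = names(hits), EntrezID = unname(first_id))
```

Keeping the result as a list until the last step means the names never lose their alignment with the IDs, and the ambiguous vector shows which names need a manual decision.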

r bioinformatics org.Hs.eg.db genetics • 2.2k views

score 0 · Answer 1 · 2018-06-26

Martin Morgan 25k

Please see my (updated, in response to your code) answer on StackOverflow.
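
The StackOverflow answer is not reproduced in this thread. For readers who land here, the following is only a minimal sketch of one common approach, and not necessarily what that answer proposes. It uses mapIds() from AnnotationDbi with multiVals = "first", which returns one entry per input key. It assumes the inputs are official gene symbols (keytype "SYMBOL") rather than aliases; alias names can be shared between genes, which is one way a single name ends up with several IDs. The helper name symbols_to_entrez is hypothetical.

```r
# Sketch: one Entrez ID per gene symbol, in input order.
# Assumes the Bioconductor packages AnnotationDbi and org.Hs.eg.db are installed.
symbols_to_entrez <- function(symbols, keytype = "SYMBOL") {
  # multiVals = "first" keeps only the first ID when a key matches several,
  # so the result has one entry (possibly NA) per input key.
  AnnotationDbi::mapIds(
    org.Hs.eg.db::org.Hs.eg.db,
    keys      = symbols,
    column    = "ENTREZID",
    keytype   = keytype,
    multiVals = "first"
  )
}

# The seven gene names from the question.
genes <- c("RPS6KB2", "PSME4", "PDE4DIP", "APMAP", "TNRC18", "PPP1R26", "NAA20")

# Run the lookup only when the annotation package is available.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  ids <- symbols_to_entrez(genes)
  # Names and IDs side by side, one row per input gene.
  result <- data.frame(Gene.name = genes, EntrezID = unname(ids))
  write.csv(result, file = "GeneIDs.csv", row.names = FALSE)
}
```

Passing keytype = "ALIAS" would mirror the original org.Hs.egALIAS2EG lookup; with multiVals = "first" it still returns one ID per name, but which ID is kept for an ambiguous alias is then arbitrary, so checking those names by hand is safer.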
