Your call to getBM has an extra '#' in it that is probably not helping things. You are also asking for a lot of attributes in one query, which will often result in a lot of rows coming back. It's usually better to ask for just a few things; if you just want the protein name, I would get only that. You should also always include your filter as an attribute, because the results won't come back in the same order as the IDs you provided. And if you have NCBI Gene IDs, it's probably better to query a resource that is based on them rather than something like the Biomart server, which is based on Ensembl. So, a few examples.
> library(biomaRt)
> mart <- useEnsembl("ensembl","hsapiens_gene_ensembl")
> library(org.Hs.eg.db)
## get some random Gene IDs
> egids <- head(keys(org.Hs.eg.db), 20)
> z <- getBM(c("entrezgene_id","ensembl_gene_id","uniprotswissprot","description","hgnc_symbol"), "entrezgene_id", egids, mart)
## check for duplicates
> table(z$entrezgene_id)
1 2 9 10 12 13 14 15 16 18 19 20 21 22 23 24 25
2 2 2 2 2 2 2 2 2 2 2 2 2 2 14 2 2
> head(z)
entrezgene_id ensembl_gene_id uniprotswissprot
1 23 ENSG00000225989 Q8NE71
2 23 ENSG00000225989
3 23 ENSG00000236149 Q8NE71
4 23 ENSG00000236149
5 10 ENSG00000156006 P11245
6 10 ENSG00000156006
description
1 ATP binding cassette subfamily F member 1 [Source:HGNC Symbol;Acc:HGNC:70]
2 ATP binding cassette subfamily F member 1 [Source:HGNC Symbol;Acc:HGNC:70]
3 ATP binding cassette subfamily F member 1 [Source:HGNC Symbol;Acc:HGNC:70]
4 ATP binding cassette subfamily F member 1 [Source:HGNC Symbol;Acc:HGNC:70]
5 N-acetyltransferase 2 [Source:HGNC Symbol;Acc:HGNC:7646]
6 N-acetyltransferase 2 [Source:HGNC Symbol;Acc:HGNC:7646]
hgnc_symbol
1 ABCF1
2 ABCF1
3 ABCF1
4 ABCF1
5 NAT2
6 NAT2
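Most of the duplication above comes from rows with a blank uniprotswissprot, so one option is to drop those and de-duplicate on just the columns you care about. This is only a minimal sketch: z_head is a hand-typed copy of the first six rows printed above (not a live query), and it assumes the blanks are empty strings, although they could be NA in your session.

```r
## hand-typed copy of the first six rows of z shown above (not a live query)
z_head <- data.frame(
  entrezgene_id = c(23, 23, 23, 23, 10, 10),
  ensembl_gene_id = c("ENSG00000225989", "ENSG00000225989", "ENSG00000236149",
                      "ENSG00000236149", "ENSG00000156006", "ENSG00000156006"),
  uniprotswissprot = c("Q8NE71", "", "Q8NE71", "", "P11245", ""),
  stringsAsFactors = FALSE
)
## keep rows that have a Swiss-Prot accession (handles both "" and NA)
has_acc <- !is.na(z_head$uniprotswissprot) & z_head$uniprotswissprot != ""
## keep only the filter column and the accession, then drop repeated rows
slim <- unique(z_head[has_acc, c("entrezgene_id", "uniprotswissprot")])
rownames(slim) <- NULL
slim
```

The same two steps should work on the full z, but whether one row per Gene ID is left depends on what the server returns.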
## Alternative method
> library(UniProt.ws)
> ws <- UniProt.ws()
> select(ws, egids, "id", "GeneID")
From Entry Entry.Name
1 1 P04217 A1BG_HUMAN
2 1 V9HWD8 V9HWD8_HUMAN
3 2 P01023 A2MG_HUMAN
4 9 P18440 ARY1_HUMAN
5 9 F5H5R8 F5H5R8_HUMAN
6 9 Q400J6 Q400J6_HUMAN
7 10 P11245 ARY2_HUMAN
8 10 A4Z6T7 A4Z6T7_HUMAN
9 12 P01011 AACT_HUMAN
10 12 A0A024R6P0 A0A024R6P0_HUMAN
11 13 P22760 AAAD_HUMAN
12 14 Q13685 AAMP_HUMAN
13 14 C9JEH3 C9JEH3_HUMAN
14 15 Q16613 SNAT_HUMAN
15 15 F1T0I5 F1T0I5_HUMAN
16 16 P49588 SYAC_HUMAN
17 18 P80404 GABT_HUMAN
18 18 X5D8S1 X5D8S1_HUMAN
19 19 O95477 ABCA1_HUMAN
20 19 A0A7I2V5U0 A0A7I2V5U0_HUMAN
21 19 B2RUU2 B2RUU2_HUMAN
22 19 B7XCW9 B7XCW9_HUMAN
23 20 Q9BZC7 ABCA2_HUMAN
24 21 Q99758 ABCA3_HUMAN
25 21 Q4LE27 Q4LE27_HUMAN
26 22 O75027 ABCB7_HUMAN
27 22 A0A087WW65 A0A087WW65_HUMAN
28 22 A0A0S2Z2Z3 A0A0S2Z2Z3_HUMAN
29 23 Q8NE71 ABCF1_HUMAN
30 23 A0A1U9X609 A0A1U9X609_HUMAN
31 23 Q2L6I2 Q2L6I2_HUMAN
32 24 P78363 ABCA4_HUMAN
33 24 Q6AI28 Q6AI28_HUMAN
34 25 P00519 ABL1_HUMAN
35 25 A0A024R8E2 A0A024R8E2_HUMAN
36 25 Q59FK4 Q59FK4_HUMAN
Warning message:
IDs not mapped: 11, 17, 3
That still has duplicates, but not as many.
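If several accessions per Gene ID are fine and you just want them grouped, something like split() may be enough. Again a sketch only: res_head is a hand-typed copy of the first six rows of the UniProt.ws output above, with From typed in as character, which may not match the column type you get back.

```r
## hand-typed copy of the first six rows of the select() output above
res_head <- data.frame(
  From = c("1", "1", "2", "9", "9", "9"),
  Entry = c("P04217", "V9HWD8", "P01023", "P18440", "F5H5R8", "Q400J6"),
  stringsAsFactors = FALSE
)
## one list element per Gene ID, holding all of its UniProt accessions
by_gene <- split(res_head$Entry, res_head$From)
by_gene[["9"]]
## number of accessions per Gene ID
lengths(by_gene)
```

This keeps every accession rather than picking one, so you can decide afterwards which entry you want for each gene.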
Oh wait, I think I misunderstood.
Is that what you meant?