The error message tries to give you some hints as to what to do next:
Invalid attribute(s): uniprot_swissprot, uniprot_genename
Please use the function 'listAttributes' to get valid attribute names
The first line above tells you that both uniprot_swissprot and uniprot_genename are invalid attributes, i.e. there is no column in the Ensembl database with either name, so you need to find the correct names. The second line suggests using the function listAttributes to get the set of all attributes that can be retrieved from the dataset you're accessing, e.g.
listAttributes(mart)
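That table is long, so it can help to filter it by a keyword. A minimal sketch, assuming the data frame returned by listAttributes has a `name` column; `find_attributes` is a helper name made up for this example, and the `toy` table is only a stand-in built from names mentioned in this thread so the snippet runs offline:

```r
# Hypothetical helper: keep only the rows whose attribute name matches a pattern.
find_attributes <- function(attrs, pattern) {
  attrs[grepl(pattern, attrs$name, ignore.case = TRUE), , drop = FALSE]
}

# With a live connection (needs network access):
# find_attributes(listAttributes(mart), "uniprot")

# Offline illustration with a stand-in table (assumed, not real biomaRt output).
toy <- data.frame(
  name = c("uniprotswissprot", "uniprot_gn", "illumina_humanht_12_v4"),
  stringsAsFactors = FALSE
)
hits <- find_attributes(toy, "uniprot")
print(hits)
```

Matching on the name alone is the simplest choice; the same idea should work on the description column if you don't know what the attribute might be called.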
Did you get those original names from a document somewhere? Sometimes Ensembl changes the column names and examples go out of date, so if you found that in the biomaRt documentation let me know and I'll update it.
For your case I think the two attributes you actually want are uniprotswissprot and uniprot_gn, e.g.
library(biomaRt)
mart = useMart("ensembl")
mart = useDataset("hsapiens_gene_ensembl", mart)
prot <- c("P03891", "P03885")
db = getBM(attributes = c("uniprotswissprot", "uniprot_gn", "illumina_humanht_12_v4"),
           filters = "uniprotswissprot",
           values = prot,
           mart = mart)
> db
  uniprotswissprot uniprot_gn illumina_humanht_12_v4
1           P03891     P03891                     NA
2           P03891     Q7GXY9                     NA
The names in my original script worked with biomart but probably Ensembl changed them. How can I know when and how names change in Ensembl?
The script now works, thank you.
Not easily, is the short answer. Ensembl usually lists them in its release notes, e.g. ftp://ftp.ensembl.org/pub/release-88/release_88_biomart_changes.txt, and Thomas Maurel generally mentions them in his Ensembl release posts here (Ensembl 88 is out!), but normally I find out when an existing script stops working with the error you posted.
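If you'd rather catch a rename before the query fails, one option is to compare the names your script uses against what the dataset currently offers. This is only a sketch: `missing_attributes` is a hypothetical helper, and the `available` vector below is a stand-in taken from names in this thread; in a real script you would fill it from listAttributes(mart)$name.

```r
# Hypothetical helper: return the requested names that are not in the valid set.
missing_attributes <- function(requested, available) {
  setdiff(requested, available)
}

# With a live connection (needs network access):
# available <- listAttributes(mart)$name

# Offline illustration with names taken from this thread (assumed, not a full list).
available <- c("uniprotswissprot", "uniprot_gn", "illumina_humanht_12_v4")
wanted <- c("uniprot_swissprot", "uniprot_genename", "illumina_humanht_12_v4")

bad <- missing_attributes(wanted, available)
if (length(bad) > 0) {
  message("Not available in this dataset: ", paste(bad, collapse = ", "))
}
```

This doesn't tell you what the new name is, but it does turn a failure halfway through a script into an early, readable message.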