Hello, I am trying to annotate a DESeq2 results table of significantly differentially expressed genes with gene start (bp), gene end (bp), and SNP ID information using the biomaRt package in RStudio. When I look at the data sets available in BioMart using the commands below, I only see the human GRCh38 data set. How can I get the GRCh37 data set from BioMart?
library(biomaRt)
# look at top 10 databases
head(biomaRt::listMarts(host = "www.ensembl.org"), 10)
###marts providing annotation for specific classes of organisms###
head(biomaRt::listDatasets(biomaRt::useMart("ENSEMBL_MART_ENSEMBL", host = "www.ensembl.org")), 100)
Hello @james-w-macdonald-5106, I followed the exact commands in section 2.4; see my commands below.
But I get this error message: Incorrect BioMart name, use the listMarts function to see which BioMart databases are available
You must have an old version of R/Bioc. Setting aside the fact that Ensembl 95 is not GRCh37, I get
Assuming you are using old versions of R and Bioconductor, you should first update both.
Hello, you're right, my mistake: I should not have selected Ensembl 95 if I want to use GRCh37. I used the right host, as you suggested:
Then I used this command to specify GRCh37:
But I get a new error: Error in useEnsembl("ensembl", "hsapiens_gene_ensembl", version = GRCh37) : object 'GRCh37' not found
So the first call got you a Mart object that points to the archive site for GRCh37. You can now use that to do things, and you didn't need to do anything more.
What the second call does is figure out which archive you want, based on the version argument. You could use that, but it's not necessary because the first one worked. But for pedantic reasons, please note that 95 is always something in R (because it's a number), but GRCh37 isn't, because unless you put that in quotes, R thinks you want an object in the global workspace called GRCh37 that doesn't exist - it's not a thing.
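The quoting point can be seen in base R alone, without biomaRt. This is only a small sketch; the variable names `unquoted`, `quoted` and `number` are made up for illustration, and it assumes no object called GRCh37 already exists in your workspace.

```r
# An unquoted name is evaluated as a variable lookup, so this fails
# unless an object called GRCh37 happens to exist in the workspace.
unquoted <- tryCatch(
  GRCh37,
  error = function(e) conditionMessage(e)
)
print(unquoted)  # the error message, captured as a string

# A quoted name is just a character string, and a number is just a number;
# both evaluate without needing any object to exist.
quoted <- "GRCh37"
number <- 95
print(quoted)
print(number)
```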
So if you say 'version = 95', that will work because 95 is a number and by definition exists, and R will happily use it. Basically, under the hood what happens is listEnsemblArchives is called, and then whatever you used for the version argument is matched to the 'version' column of the data.frame that is output by listEnsemblArchives. This works because 95 exists, and will be coerced to character when used for matching. But since GRCh37 isn't an existing object, you get the error you see. In other words:
Make sense?
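To make the matching step concrete, here is a rough sketch of the coercion described above. The vector `archive_versions` is a made-up stand-in for the 'version' column (the real column is longer and changes with each release), and `connect_grch37` is a hypothetical wrapper that is defined but not called, since it needs biomaRt and network access. It assumes, as described above, that the quoted string "GRCh37" is matched against the 'version' column.

```r
# Illustrative stand-in for the 'version' column of listEnsemblArchives();
# these three values are an assumption for the example only.
archive_versions <- c("GRCh37", "104", "95")

# match() coerces the number 95 to the string "95" before comparing
idx_number <- match(95, archive_versions)
idx_string <- match("GRCh37", archive_versions)
print(idx_number)  # 3
print(idx_string)  # 1

# Hypothetical wrapper: needs biomaRt and a network connection,
# so it is only defined here, not run.
connect_grch37 <- function() {
  biomaRt::useEnsembl(
    biomart = "ensembl",
    dataset = "hsapiens_gene_ensembl",
    version = "GRCh37"  # quoted, so R treats it as a string, not an object
  )
}
```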
Thank you for clarifying. You said I only need this first line to access GRCh37 biomart:
But when I use the command (see below) to view the items in the mart, it shows me items from Ensembl version 104 and not from GRCh37. For some reason it redirects me to Ensembl version 104, despite my using the host server link for GRCh37.
These are the items it shows
I have to say this is getting a bit frustrating for me. I feel like I give you the answer, and then you check my work (incorrectly) and then tell me it's not working, rather than just accepting that I might actually know a bit about this subject and taking my advice at face value.
So anyway, if you use listMarts without an argument, you are asking for what marts are available if you use the default arguments for that function. Because, to reiterate, if you don't provide any arguments to a function it uses the default arguments! It doesn't know anything about the existing Mart object in your workspace, and you shouldn't expect that it would (what if you have two Mart objects?). Anyway, the help page should clear that up for you.
And in fact you are meant to be able to do something like listMarts(mart), but so far as I can tell it doesn't work as expected unless you are pointing to the default, most current version.
HOWEVER!
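One way to ask the GRCh37 archive directly, rather than relying on the defaults, is to pass the host explicitly. This is a minimal sketch, assuming the archive is served at https://grch37.ensembl.org and that you have network access; `grch37_host`, `list_grch37_marts` and `list_grch37_datasets` are hypothetical names, and the functions are defined but not called here.

```r
# Assumed address of the GRCh37 archive site
grch37_host <- "https://grch37.ensembl.org"

# List the marts served by the GRCh37 archive instead of the default host
list_grch37_marts <- function(host = grch37_host) {
  biomaRt::listMarts(host = host)
}

# List the data sets in the main Ensembl gene mart on that same host
list_grch37_datasets <- function(host = grch37_host) {
  mart <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL", host = host)
  biomaRt::listDatasets(mart)
}
```

Calling `list_grch37_marts()` in an interactive session should then report the marts on the archive host rather than on the current release.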
Thank you for further clarifying