First we'll set it up so we are using the Ensembl human genes mart:
library(biomaRt)
human = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
Now create a vector with the names of the genes we're interested in. In this example we'll look for paralogs of a single gene. If you've got more than one you can provide all of them at this step. I'm also using the 'external gene name'. If your list of genes is HGNC symbols, Entrez IDs, etc., you'll have to choose the correct field in the final step.
gene_id <- "ZPBP"
Then we submit the query to Ensembl BioMart. filters is the field we want to search, values are the specific entries we want to look for, and attributes specifies the fields we want to get back. So here we are searching for the gene called ZPBP and getting back its gene name and Ensembl ID, plus the gene name and Ensembl ID of anything that is annotated as being a paralog. If there is more than one paralog we will get more than one row in the result. If there are no paralogs then you'll get nothing back.
results <- getBM(attributes = c("ensembl_gene_id",
                                "external_gene_name",
                                "hsapiens_paralog_ensembl_gene",
                                "hsapiens_paralog_associated_gene_name"),
                 filters = "external_gene_name",
                 values = gene_id,
                 mart = human)
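As an aside, if your identifiers are not external gene names, a rough sketch like the one below may help you find the right field. It uses biomaRt's searchAttributes() and searchFilters() helpers to look up field names, then repeats the query with the "hgnc_symbol" filter for two genes at once. The variable names gene_ids and results_hgnc are my own, and I'm assuming your biomaRt version is recent enough to include the search helpers and that the filter is still called "hgnc_symbol" in the current Ensembl release. It needs a network connection to Ensembl.

```r
library(biomaRt)

human <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

# Look up attribute names that mention paralogs
searchAttributes(mart = human, pattern = "paralog")

# Look up filter names that mention HGNC
# (assumption: the symbol filter is called "hgnc_symbol")
searchFilters(mart = human, pattern = "hgnc")

# Illustrative input: several genes, given as HGNC symbols
gene_ids <- c("ZPBP", "ZPBP2")

# Same attributes as in the main query; only the filter changes
results_hgnc <- getBM(attributes = c("ensembl_gene_id",
                                     "external_gene_name",
                                     "hsapiens_paralog_ensembl_gene",
                                     "hsapiens_paralog_associated_gene_name"),
                      filters = "hgnc_symbol",
                      values = gene_ids,
                      mart = human)
```

I've kept the returned attributes identical to the main query on purpose, since BioMart can refuse queries that mix attributes from different attribute pages.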
We can look at the result to see what was returned.
results
  ensembl_gene_id external_gene_name hsapiens_paralog_ensembl_gene hsapiens_paralog_associated_gene_name
1 ENSG00000042813               ZPBP               ENSG00000186075                                 ZPBP2
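If you query several genes you get one row per gene/paralog pair, so it can be handy to collapse the table into one entry per gene. Here is a minimal sketch of that. paralogs_by_gene() is a hypothetical helper of my own, not part of biomaRt. It also drops rows whose paralog field is NA or empty, in case a gene comes back without a paralog filled in. The data frame is just the single row shown above, rebuilt by hand so the example runs without contacting Ensembl.

```r
# Hypothetical helper (not part of biomaRt): collapse a getBM() result
# into a named list of paralog symbols, one element per query gene.
paralogs_by_gene <- function(results) {
  paralog <- as.character(results$hsapiens_paralog_associated_gene_name)
  # Assumption: a missing paralog would appear as NA or an empty string
  keep <- !is.na(paralog) & nzchar(paralog)
  split(paralog[keep], as.character(results$external_gene_name)[keep])
}

# The single row from the output above, rebuilt by hand
results <- data.frame(
  ensembl_gene_id = "ENSG00000042813",
  external_gene_name = "ZPBP",
  hsapiens_paralog_ensembl_gene = "ENSG00000186075",
  hsapiens_paralog_associated_gene_name = "ZPBP2",
  stringsAsFactors = FALSE
)

paralogs_by_gene(results)
```

For this one-row result the list has a single element named ZPBP containing "ZPBP2". With more genes you would get one element per gene that has at least one paralog.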