Question

make TxDb from list of genes

rtwest • 0

I am relatively new to bioinformatics. I have learned a lot from this site, but I can't find a solution to an issue I am having.

I am trying to create a TxDb from a certain list of genes.

I have tried several different options to no avail. I tried converting the list to a GRanges object, but I could never create the TxDb from it because no metadata was ever captured. I also tried creating it from Ensembl and, most recently, directly from BioMart.

I input the list, converted the gene symbols to Ensembl IDs, converted those to a character vector, and finally tried to create the TxDb, but the transcript IDs are reported as invalid.

Long story short: How can I create a custom TxDb from a certain list of genes?

Also (second question): the list of 70 genes generates almost 700 Ensembl IDs, and I am not sure why.

Below is my code after loading packages:

list <- c("AMOTL2",
"ANKRD1",
"ANLN",
"ARHGAP29",
"AXL",
"NA",
"BIRC5",
"CCRN4L",
"CDC20",
"CDK6",
"CDKN2C",
"CENPF",
"COL4A3",
"CRIM1",
"CTGF",
"CYR61",
"CYR61",
"DAB2",
"DDAH1",
"ASAP1",
"DLC1",
"DUSP1",
"DUT",
"ECT2",
"EMP2",
"ETV5",
"FGF2",
"FLNA",
"FSCN1",
"FSTL1",
"GADD45B",
"GAS2L3",
"GAS6",
"GGH",
"GKAP1",
"GLIS2",
"GLS",
"HEXB",
"HMMR",
"AGFG2",
"ITGB2",
"ITGB5",
"LHFP",
"MACF1",
"MARCKS",
"MDFIC",
"MSRB3",
"MYO1C",
"NDRG1",
"PDLIM2",
"PHGDH",
"PMP22",
"SCHIP1",
"SDPR",
"SERPINE1",
"SERTAD4",
"SFRS2IP",
"SGK1",
"SH2D4A",
"SHCBP1",
"SLIT2",
"STMN1",
"TGFB2",
"TGM2",
"THBS1",
"TK1",
"TNNT2",
"TNS1",
"TOP2A",
"TSPAN3")

ids <- getBM(attributes="ensembl_transcript_id", filters = "hgnc_symbol", values = list, mart= ensembl)
ids
ids.c <- as.character(ids)
ids.c
yap_taz.c <- as.character(yap_taz)

txdb_YT <- makeTxDbFromBiomart(biomart="ensembl",
                               dataset="hsapiens_gene_ensembl",
                               transcript_ids=ids.c,
                               circ_seqs=NULL,
                               host="www.ensembl.org",
                               port=80,
                               taxonomyId=NA,
                               miRBaseBuild=NA)

Download and preprocess the 'transcripts' data frame ...
Error in .makeBiomartTranscripts(filter, mart, transcript_ids, recognized_attribs, :
  invalid transcript ids:
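
One plausible reading of the error above, and of the second question, can be sketched in base R with a mock data frame. The assumption is that getBM() returned a data.frame and that as.character() was applied to the whole data.frame rather than to its ID column. The symbols, the placeholder IDs, and the extra hgnc_symbol column are illustrative only; they are not real Ensembl output.

```r
# Mock stand-in for a getBM() result (assumption: one row per transcript).
# The IDs are placeholders, not real Ensembl transcript IDs.
ids <- data.frame(
  hgnc_symbol = c("AXL", "AXL", "AXL", "TK1", "TK1"),
  ensembl_transcript_id = c("ENST_A1", "ENST_A2", "ENST_A3",
                            "ENST_T1", "ENST_T2"),
  stringsAsFactors = FALSE
)

# as.character() on a data.frame deparses each column into ONE string,
# so this is a length-1 vector that looks like 'c("ENST_A1", ...)'.
bad <- as.character(ids["ensembl_transcript_id"])
length(bad)

# Extracting the column itself gives one element per transcript ID.
good <- ids$ensembl_transcript_id
length(good)

# A gene usually has several transcripts, so transcript IDs
# outnumber the gene symbols that were queried.
tx_per_gene <- table(ids$hgnc_symbol)
tx_per_gene
```

If that assumption holds, passing the extracted column (here `good`) as transcript_ids would at least hand makeTxDbFromBiomart a proper character vector, and requesting ensembl_gene_id instead of ensembl_transcript_id would give roughly one ID per gene.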

Tags: R, maketxdbfrombiomart, maketxdbfromgranges, rstudio, ensembl

Written 6.4 years ago by rtwest; updated 2.0 years ago by Shreyash

Why do you want a TxDb for just a set of genes? It's simple enough to use a full sized one and subset after the fact.
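
A minimal sketch of subsetting after the fact, assuming a UCSC knownGene-style TxDb (where gene_id is an Entrez ID) and an OrgDb package such as org.Hs.eg.db to translate symbols. The helper name tx_for_symbols is hypothetical; mapIds() and the filter argument of transcripts() are the existing AnnotationDbi and GenomicFeatures calls it relies on.

```r
# Hypothetical helper: return the transcripts of a full-sized TxDb that
# belong to a set of gene symbols. Assumes the TxDb uses Entrez gene IDs.
tx_for_symbols <- function(txdb, orgdb, symbols) {
  symbols <- unique(symbols)  # drop duplicates such as a repeated "CYR61"

  # Translate symbols to Entrez IDs; symbols that cannot be mapped
  # (for example the string "NA" or retired symbols) come back as NA.
  entrez <- AnnotationDbi::mapIds(orgdb, keys = symbols,
                                  column = "ENTREZID", keytype = "SYMBOL",
                                  multiVals = "first")
  entrez <- entrez[!is.na(entrez)]

  # Let the TxDb do the filtering; the result is a GRanges of transcripts.
  GenomicFeatures::transcripts(txdb,
                               columns = c("tx_name", "gene_id"),
                               filter = list(gene_id = unname(entrez)))
}

# Usage (requires the annotation packages to be installed):
# library(TxDb.Hsapiens.UCSC.hg19.knownGene)
# library(org.Hs.eg.db)
# tx <- tx_for_symbols(TxDb.Hsapiens.UCSC.hg19.knownGene, org.Hs.eg.db,
#                      c("AXL", "TK1", "TOP2A"))
```

This returns a GRanges rather than a new TxDb, which is usually enough for downstream range operations. Older symbols in the list may need the ALIAS keytype instead of SYMBOL.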

Reply by James W. MacDonald, 6.4 years ago

I agree with James. There is no need to create a new TxDb database. You could simply subset an EnsDb database to your list of input genes (even better if you have Ensembl IDs). Assuming you have your Ensembl gene IDs in a variable called ensids:

library(EnsDb.Hsapiens.v86)
edb <- filter(EnsDb.Hsapiens.v86, filter = ~ gene_id == ensids)

On that edb you can then call the same functions you would use on a TxDb (such as genes, exonsBy, etc.), and you would always get results for just the genes you provided.

The package I loaded above contains annotations from Ensembl version 86. If you want more recent annotations, you can download the EnsDb from AnnotationHub.

Reply by Johannes Rainer, 6.4 years ago

I know I am almost five years too late to this, but I was hoping to get this to work with kpPlotGenes from karyoploteR, only to realize that the input has to be a TxDb object.

> filterIds <- ranges.df[ranges.df$coverage == 0,]$names
> edb <- filter(EnsDb.Hsapiens.v75, filter = ~ gene_id == filterIds)
> kpPlotGenes(kp, edb, r0 = 0.2, r1 = 0.3, gene.name.cex = 0.8, data.panel = 2, gene.margin = 0, col = "darkblue", gene.names.col = "black", gene.name.position = "top", avoid.overlapping = TRUE, plot.transcripts.structure = FALSE, plot.transcripts = FALSE)

Error in data$genes: $ operator not defined for this S4 class

I was wondering if we could filter the TxDb object directly?

Reply by Shreyash, 2.0 years ago

Thank you. I was overthinking it and going about it the wrong way. I appreciate the guidance.

Reply by rtwest, 6.4 years ago