Question

Ensembl dataset version

st3472 • 0

@a46e72d2

United States

Hello,

I am analyzing bulk RNA-seq data and used kallisto to pseudoalign my reads to the transcriptome. I then used tximport to assign Ensembl gene names to my counts. Comparing my current results with data that were processed 4 years ago, I noticed that the older estimated gene counts table has ~50,000 genes, while the new one has about half as many. Is there a way to see which version of the gene annotation I am using? Could the difference in the total number of genes be caused by an update to the Ensembl dataset I am using?

I load the Ensembl dataset with the code below:

mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl", host = "uswest.ensembl.org", ensemblRedirect = FALSE)

I also noticed that the estimated gene counts from 4 years ago contain thousands of gene names like AC253536.2 (they all start with AC), but the version I am using now does not output any gene names like this. Does anyone know why those were removed?

Thank you

tximport rna rnaseqGene biomaRt • 1.2k views

updated 3.3 years ago by ATpoint ★ 4.8k • written 3.3 years ago by st3472 • 0

score 0 · Answer 1 · 2022-01-05

ATpoint ★ 4.8k

@atpoint-13662

Germany

Cross-posted and answered: https://www.biostars.org/p/9504434/
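For readers landing here, the following is a minimal sketch of how one might check which Ensembl release biomaRt is connected to, pin a fixed release, and compare the gene IDs of an old and a new count table. It is not taken from the linked answer. The helper names (`ensembl_release_info`, `connect_release`, `compare_gene_ids`) are hypothetical and introduced only for illustration; the biomaRt calls inside them need the biomaRt package and network access.

```r
# Sketch only: the helper names below are hypothetical, not part of biomaRt.
# The biomaRt calls need the Bioconductor package biomaRt and network access.

# Report which Ensembl release a default connection points at.
ensembl_release_info <- function() {
  list(
    current  = biomaRt::listEnsembl(),         # available marts with their release label
    archives = biomaRt::listEnsemblArchives()  # archived releases with dates and URLs
  )
}

# Connect to a fixed release so the annotation does not change between runs.
# 'release' is an Ensembl release number chosen by the user.
connect_release <- function(release) {
  biomaRt::useEnsembl(
    biomart = "genes",
    dataset = "hsapiens_gene_ensembl",
    version = release
  )
}

# Compare the Ensembl gene IDs of two count tables after dropping
# version suffixes (for example ".15" in "ENSG00000000003.15").
compare_gene_ids <- function(old_ids, new_ids) {
  old_ids <- unique(sub("\\.[0-9]+$", "", old_ids))
  new_ids <- unique(sub("\\.[0-9]+$", "", new_ids))
  list(
    only_old = setdiff(old_ids, new_ids),
    only_new = setdiff(new_ids, old_ids),
    shared   = intersect(old_ids, new_ids)
  )
}
```

Calling `ensembl_release_info()` should show the release behind the default connection, and `connect_release()` with the release number that was current at the time of the older analysis (which has to be looked up in the archive list) would let both data sets be annotated against the same version. `compare_gene_ids(rownames(old_counts), rownames(new_counts))`, where `old_counts` and `new_counts` stand in for the two gene count tables, then lists which IDs are missing from either table. Note that it is meant for Ensembl gene IDs, not gene names such as AC253536.2, where stripping the suffix would change the name.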
