Specifying tx2gene with tximeta
@aimeehanson-23841

Hi!

I'm working with RNA-seq data quantified with Salmon (hg38 index) and have successfully read my quant.sf files into R using the recommended approach:

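## Read quant.sf files into a SummarizedExperiment (se) using tximeta
library(tximeta)
## coldata: data frame with 'files' and 'names' columns, one row per sample
se <- tximeta(coldata)
## Summarise transcript-level quantifications to gene level
gse <- summarizeToGene(se)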

Subsequently, I've been trying to decide what to do with genes that have multiple Ensembl gene IDs (because they also appear on alternate haplotypes/patches in the hg38 reference) prior to DESeq2 analysis. My attempted workaround has been to generate a tx2gene table in which transcripts on alternate haplotypes, which would otherwise map to different gene IDs, are assigned the Ensembl gene ID of the gene on the reference chromosome. That way, counts from all transcripts of a gene are summarised under a single gene ID.
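To make the intended collapsing concrete, here is a minimal base R sketch on toy data. All transcript IDs, gene IDs, counts, and the `alt_to_primary` lookup are made up for illustration. It only sums counts, whereas tximport also deals with abundances and effective lengths when summarising to gene level, so treat it as a picture of the idea rather than a replacement for tximport.

```r
# Toy transcript-level counts: 4 transcripts x 2 samples (made-up numbers).
tx_counts <- matrix(
  c(10, 5, 3, 8,    # sample1
    12, 4, 2, 9),   # sample2
  nrow = 4,
  dimnames = list(
    c("ENST_A1", "ENST_A2", "ENST_A1_alt", "ENST_B1"),
    c("sample1", "sample2")
  )
)

# Hypothetical transcript-to-gene table; the third transcript sits on an
# alternate haplotype and therefore carries a different gene ID.
tx2gene_toy <- data.frame(
  TXNAME = rownames(tx_counts),
  GENEID = c("ENSG_A", "ENSG_A", "ENSG_A_alt", "ENSG_B"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup from alt-haplotype gene ID to the ID on the
# reference chromosome.
alt_to_primary <- c(ENSG_A_alt = "ENSG_A")

# Switch alt-haplotype gene IDs to their primary counterpart.
is_alt <- tx2gene_toy$GENEID %in% names(alt_to_primary)
tx2gene_toy$GENEID[is_alt] <- unname(alt_to_primary[tx2gene_toy$GENEID[is_alt]])

# Sum transcript counts within each (switched) gene ID.
gene_ids <- tx2gene_toy$GENEID[match(rownames(tx_counts), tx2gene_toy$TXNAME)]
gene_counts <- rowsum(tx_counts, group = gene_ids)
print(gene_counts)
```

With the switched table, the alt-haplotype transcript contributes to the same row as the two reference-chromosome transcripts instead of forming a gene row of its own.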

After generating a tx2gene reference (tx2gene.switch), I've had success doing this with tximport:

gse.noalt <- tximport::tximport(coldata$files, type = "salmon", tx2gene = tx2gene.switch, ignoreTxVersion = TRUE)

However, I would still like to import the very useful metadata that is accessible when using tximeta.

Simply passing my table to summarizeToGene fails:

> gse.noalt <- summarizeToGene(se, tx2gene = tx2gene.switch)
loading existing EnsDb created: 2020-06-09 08:29:00
obtaining transcript-to-gene mapping from database
loading existing gene ranges created: 2020-06-10 00:50:52
Error in .local(object, ...) : 
  formal argument "tx2gene" matched by multiple actual arguments

Presumably this is because the default tx2gene information is sourced from the annotation matched to the Salmon index. Is there a way for me to obtain data in the format produced by tximeta followed by summarizeToGene, but with my own transcript-to-gene mapping? More generally, is it wise to collapse transcript counts derived from genes on alternate haplotypes in the same way that counts from alternate isoforms are collapsed for gene-level analyses?
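The wording of the error is generic R behaviour rather than anything specific to tximeta: it appears whenever a function receives the same named argument twice. Below is a minimal sketch of one way that can happen, assuming (this is not checked against the tximeta source) that the method builds its own tx2gene from the database and also forwards `...` to an inner function. `inner_summarize` and `wrapper_summarize` are hypothetical stand-ins, not tximeta functions.

```r
# Hypothetical inner function that expects a tx2gene table.
inner_summarize <- function(x, tx2gene) {
  nrow(tx2gene)
}

# Hypothetical wrapper: it supplies its own tx2gene AND forwards `...`,
# so a user-supplied tx2gene arrives a second time via `...`.
wrapper_summarize <- function(x, ...) {
  db_tx2gene <- data.frame(TXNAME = "tx1", GENEID = "gene1")
  inner_summarize(x, tx2gene = db_tx2gene, ...)
}

# Passing tx2gene from outside triggers the duplicate-argument error;
# the message is captured instead of stopping the script.
msg <- tryCatch(
  wrapper_summarize(1, tx2gene = data.frame(TXNAME = "tx1", GENEID = "geneX")),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this reading is right, the user-supplied table never gets a chance to override the database-derived one, which would fit the log lines printed just before the error.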

Thanks in advance for any help!!

Aimee

tximeta tximport
@mikelove

If you are working with human data, I'd just recommend using GENCODE, which is a standard annotation set and does not have the issue of duplicate genes on haplotype chromosomes. I switched over to primarily using GENCODE a few years ago for human data.
