Question

DEXSeq-Error in FUN(X[[i]], ...) : subscript out of bounds

1

Entering edit mode

Mthabisi Moyo ▴ 40

@mthabisi-moyo-9721

Last seen 3 months ago

United States

I am trying to use DEXSeq to analyze a mouse paired-end stranded RNA-seq data (poly-A RNA capture). I Aligned the reads using STAR and obtained the reference genome and annotation file (which I collapsed with dexseq_prepare_annotation.py) from GENCODE (release m11). I used HTSeq to count exons (using command "htseq-count -m union -s reverse -i gene_id -t exon) and over 80% mapped to a feature.

I set up my countFiles, collapsed annotation file and sampleTable in R:
countFiles <- list.files(directory, pattern=".txt", full.names = TRUE)
sampleTable <- data.frame(row.names = c("WT_RNA_1", "WT_RNA_2", "WT_RNA_3", "KO_RNA_1", "KO_RNA_2", "KO_RNA_3"), condition = c("WT", "WT", "WT", "KO", "KO", "KO"), libType = c("paired-end", "paired-end", "paired-end", "paired-end", "paired-end", "paired-end"))

When I try to create a DEXSeqdataset it gives me an out of bounds error:
dxd <- DEXSeqDataSetFromHTSeq(countFiles, sampleData = sampleTable, design= ~sample + exon + condition:exon, flattenedfile = flattenedFile)
Error in FUN(X[[i]], ...) : subscript out of bounds

I have looked over the count files and the collapsed annotation and I can't seem to figure out where the mismatch between these files is occurring. What is curious is that I do not see this error when I use DEXSeq to analyze RNA-seq data from human cell lines. Is there something I am missing regarding alignment and counting of the mouse data set?

Thank you for your help

dexseq • 2.6k views

ADD COMMENT • link 8.4 years ago Mthabisi Moyo ▴ 40

0

Entering edit mode

Hi, it's probable that either the gtf file or that one or more of the count files are corrupted. Could you check that count files have all the same number of lines and that the gtf file has the same number of exonic regions as the number of lines of the count files?

Alejandro

ADD REPLY • link 8.4 years ago Alejandro Reyes ★ 1.9k

0

Entering edit mode

Hi Alejandro,

It turns out it was likely the GENCODE gtf file. Switched to the mouse annotation file on Ensembl and it worked well. It is still odd to me considering the count files were created using the GENCODE gtf so I thought it would still result with the same number of lines but I guess not.

Mthabisi

ADD REPLY • link 8.4 years ago Mthabisi Moyo ▴ 40