Hello, I'm using the RNASeqGeneEdgeRQL workflow to process some RNASeq data. I tried to create the required count matrix with the featureCounts function from Rsubread, using an external reference genome. This is the code I used:
library(Rsubread)
fastqPath <- list.files("JustThePathofMyFastQFolder", pattern="\\.fastq$", full.names=TRUE)
all.bam <- sub("\\.fastq$", ".bam", fastqPath)
fc <- featureCounts(all.bam, annot.ext="ThePathofMyGtfFile", isGTFAnnotationFile=TRUE, nthreads=16, GTF.featureType="gene", GTF.attrType="gene_id", allowMultiOverlap=TRUE, fraction=TRUE, isPairedEnd=FALSE)
I had previously edited my GTF file (after converting a .GFF3 file with the rtracklayer package) so that only the genes are included, because otherwise featureCounts would return an error. I replaced 'ID' with 'gene_id' and kept only the lines referring to gene features, as follows:
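For readers who want to reproduce that editing step, here is a rough base R sketch of what it amounts to. It is only an illustration under stated assumptions: the helper name `keep_gene_lines` and the example annotation lines are made up, and it assumes that `ID "..."` is the first attribute in the ninth column, which may not match the actual file.

```r
# Hypothetical helper (not part of Rsubread or rtracklayer):
# keep only "gene" features and rename the leading ID attribute to gene_id.
keep_gene_lines <- function(gtf_lines) {
  # Drop comment/header lines that start with '#'
  body <- gtf_lines[!grepl("^#", gtf_lines)]
  # Split each remaining line into its tab-separated columns
  fields <- strsplit(body, "\t", fixed = TRUE)
  # Column 3 holds the feature type; keep well-formed "gene" rows only
  is_gene <- vapply(
    fields,
    function(f) length(f) >= 9 && f[3] == "gene",
    logical(1)
  )
  genes <- body[is_gene]
  # Assumption: the ID attribute comes first in column 9, right after a tab
  sub("\tID \"", "\tgene_id \"", genes, fixed = TRUE)
}

# Made-up example lines, not taken from the real annotation
example_lines <- c(
  "##gff-version 3",
  "chr1\tsrc\tgene\t100\t500\t.\t+\t.\tID \"gene1\"; Name \"abc\";",
  "chr1\tsrc\tmRNA\t100\t500\t.\t+\t.\tID \"rna1\"; Parent \"gene1\";"
)
filtered <- keep_gene_lines(example_lines)
print(filtered)
```

With real data, the input would come from `readLines()` on the converted file and the result would be written back with `writeLines()`.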
The old file: https://i.ibb.co/ThQrThC/File-old.png
The new file: https://i.ibb.co/ykz9rpF/File.png
The previous .bam files were successfully created, but the counting step seems to fail.
While running the featureCounts script, this is exactly what happens:
It says that 0% of the alignments were successfully assigned, and the running time is dramatically low:
Then, when I try to view the output by typing fc, this is what appears on my screen:
What may the problem be? Thanks a lot!
Here is my sessionInfo():
I have removed the RNASeqGeneEdgeRQL tag from your question, and I've also edited the title, because what you are doing has no overlap with the RNASeqGeneEdgeRQL workflow.