According to the manual, the featureCounts function in Rsubread sets countPrimaryAlignmentsOnly=FALSE by default. Setting countPrimaryAlignmentsOnly=TRUE should use fewer alignments, so I expected the counts to be lower as well. Instead I got more counts in all samples, sometimes many more. For example, this is what I got without setting countPrimaryAlignmentsOnly (i.e. FALSE):
ENSG00000227232.5 1 1 2 2 1 1
But when I set countPrimaryAlignmentsOnly=TRUE, I got this:
ENSG00000227232.5 190 217 210 256 235 254
I used Rsubread 1.18.0 in R 3.2.1. The samples are single-end RNA-seq data, and the mapper was STAR 2.5.1b. The other featureCounts() settings were isGTFAnnotationFile=TRUE, nthreads=16 and strandSpecific=2; everything else was left at its default. I used an external human GTF file from GENCODE (v24, with patches and scaffolds).
The code and session info are below. Did I do anything wrong? I'm glad to provide further info if needed. Thanks.
library(Rsubread)

fc <- featureCounts(files=c('G1/G1.STAR.genome.bam',
                            'G2/G2.STAR.genome.bam',
                            'G3/G3.STAR.genome.bam',
                            's1/s1.STAR.genome.bam',
                            's2/s2.STAR.genome.bam',
                            's3/s3.STAR.genome.bam'),
                    countPrimaryAlignmentsOnly=TRUE,
                    annot.ext='gencode.v24.chr_patch_hapl_scaff.annotation.gtf',
                    isGTFAnnotationFile=TRUE,
                    nthreads=16,
                    strandSpecific=2)

colnames(fc$counts) <- c('G1', 'G2', 'G3', 's1', 's2', 's3')

write.table(fc$counts, file='count.txt', quote=FALSE, sep='\t',
            col.names=TRUE, row.names=TRUE)

> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: CentOS release 6.4 (Final)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rsubread_1.18.0

loaded via a namespace (and not attached):
[1] tools_3.2.1
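A possible sanity check alongside the call above is to see how many alignments in each BAM file are actually marked as secondary, since "primary" versus "secondary" is recorded in the SAM FLAG field (bit 0x100 in the SAM specification). The following is only a minimal base-R sketch of that idea: classify_flags() is a hypothetical helper, not part of Rsubread, and the flag values are made up for illustration rather than taken from these samples. It does not by itself explain the difference in counts.

```r
# SAM FLAG bits, as defined in the SAM format specification
FLAG_UNMAPPED      <- 0x4L
FLAG_SECONDARY     <- 0x100L
FLAG_SUPPLEMENTARY <- 0x800L

# Hypothetical helper (not an Rsubread function): label each alignment
# record by its FLAG value.
classify_flags <- function(flags) {
  flags <- as.integer(flags)
  unmapped      <- bitwAnd(flags, FLAG_UNMAPPED) != 0L
  secondary     <- bitwAnd(flags, FLAG_SECONDARY) != 0L
  supplementary <- bitwAnd(flags, FLAG_SUPPLEMENTARY) != 0L
  ifelse(unmapped, "unmapped",
         ifelse(secondary, "secondary",
                ifelse(supplementary, "supplementary", "primary")))
}

# Made-up example flags: 0 and 16 are primary (forward/reverse strand),
# 256 and 272 are secondary, 2048 is supplementary, 4 is unmapped.
example_flags <- c(0L, 16L, 256L, 272L, 2048L, 4L)
flag_class <- classify_flags(example_flags)

# Tabulate how many records fall into each class
print(table(flag_class))
```

For real data, the FLAG column could be extracted first, for example with samtools view G1/G1.STAR.genome.bam | cut -f2, and then read into R and passed to the helper.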
I see. Thank you so much for your detailed explanation.