Hi,
I am having some problems understanding the settings for counting multi-mapping reads using featureCounts
Previously we did the following:
- Aligned with HISAT2
- Used featureCounts from Subread 1.5.0 (for consistency with another paper)
featureCounts -p -C -B -f -T 5 --primary \
-a examples.gtf \
-o output.counts \
input.bam >> featurecounts.txt 2>&1
This did count multimapping reads once each
Alternatively using
featureCounts -p -C -B -f -T 5 \
-a examples.gtf \
-o output.counts \
input.bam >> featurecounts.txt 2>&1
Did not count multimapping reads
(We know that they were/were not being counted from a small amplicon sequencing dataset with few reads)
Now due to some strange gzip error that suddenly appeared using Subread 1.5.0 when the cluster hardware was changed by our admin (GZIP error 2 with any BAM file) we have upgraded to Subread 1.6.4
So now we do the following:
- Align with HISAT2
- Use featureCounts from Subread 1.6.4
However if we run
featureCounts -p -C -B -f -T 5 --primary \
-a examples.gtf \
-o output.counts \
input.bam >> featurecounts.txt 2>&1
OR
featureCounts -p -C -B -f -T 5 \
-a examples.gtf \
-o output.counts \
input.bam >> featurecounts.txt 2>&1
We get the same output...
- Are the multi-mapping reads being counted when
--primary
is used in version 1.6.4? - Has there been a change in the usage of
--primary
in different versions or has something else happened? - How do we specify that we do or do not wish to count multi-mapping reads in 1.6.4? - Some research has lead us to the
-M
option but the manual says that--primary
disregards it - Why are we getting the GZIP error 2 with 1.5.0? - we have tried freshly downloaded TCGA BAMs
Thank you so much for your help!