Rsubread featureCounts with byReadGroup
1
0
Entering edit mode
Ge Tan ▴ 20
@ge-tan-7918
Last seen 3.0 years ago
Switzerland

I have Rsubread 1.26.1 installed in order to do featureCounts with byReadGroup=TRUE. But featureCounts complains with "No read groups are found; no output is generated".

|| Process BAM file merge_alignments.bam...                                   ||
||    Single-end reads are included.                                          ||
||    Assign reads to features...                                             ||
||    No read groups are found; no output is generated.                       ||
||    Running time : 0.48 minutes

My input bam files do have read group tag. Here's some lines of the heade and reads.

@RG ID:20170822.B-E11_m_Mut-10-1_A2B5_R1  SM:20170822.B-E11_m_Mut-10-1_A2B5_R1
@RG ID:20170822.B-E11_m_Mut-10-2_A2B5_R1  SM:20170822.B-E11_m_Mut-10-2_A2B5_R1
@RG ID:20170822.B-E11_m_Mut-10-3_A2B5_R1  SM:20170822.B-E11_m_Mut-10-3_A2B5_R1
@RG ID:20170822.B-E11_m_Mut-10-4_A2B5_R1  SM:20170822.B-E11_m_Mut-10-4_A2B5_R1
K00206:101:HH7C2BBXX:1:1105:5274:39506  256 1 3055515 0 124M2S  * 0 0 TAAATTAAAAGTTTTAGATGCTTGCCAGAAACTGTTAGAAAATTTTGGATTTTAATCTTGGTTTGACAAGCTACCTCTTCTTACAAGCAGGAAAGGAAAGAAAGGAATAAAAGAAAATGGATTTTA  AAFA<<J-AA<<7<JJJJJFJJJJFJJFFJJJJJJJJFJFFJJJJJFFFJJJJJJJJJJ-AJJJ7FJFFFFJJAJJAJJJJJJFFFAFFFJJJAFFFFJFFAJJFJJJJJFJJJJJJAJA-7A<A<  MD:Z:0A1T0T0A0A2G0T0T2A0G1T0G0C0T0T0G0C0C0A0G1A1C0T0G1T0A0G1A2T0T2G0G0A0T0T2A0A0T0C1T0G0G0T0T1G0A0C1A0G0C0T0A0C0C0T2T0C1T0A0C1A0G0C0A0G0G0A0A1G0G0A0A1G1A1G0G0A0A0T1A2G1A2T0G0G0A0T0T2A0  PG:Z:STAR RG:Z:20170822.B-E11_m_Mut-10-4_A2B5_R1  NH:i:7  NM:i:89 UQ:i:3294 AS:i:122
K00206:101:HH7C2BBXX:1:2214:17239:45572 256 1 3076930 0 124M2S  * 0 0 GTATTAGTCAGGGTTCTCTAGAGTCACAGAACTTATGGACAGTCTCTAGATAGTAAAGGAATTTATTGATGACTTACAGTCGGCAGCCCAATTCCTTACAATTGTTCAGTCGCAGCTGTGACTGGA  AAAAAJJFAFJJAAJJJJJJJJJ-<FJJJJJJAAJJJJJJJJ-FJJFJ<FAJJJJJJJJJJJFJJJJJJJJJJ7JFJFJAAJFJJJJJJJJJFJFAFFJJJJJJFJFJ<FFFF7FF<AJJJ-AFA-  MD:Z:0A1T0A0G0T0C0A0G0G1T0T0C3A0G2T0C0A2G1A0C0T0T0A1G0G0A0C1G0T0C3A0G1T1G0T0A0A1G0G0A0A0T0T1A1T0G0A0T0G0A0C0T0T0A0C1G0T0C0G0G0C0A0G0C0C1A0A0T0T0C0C0T0T0A0C1A0T0T0G1T0C0A0G0T0C0G1A0G0C0T0G2A0A0T0G0G0A0  PG:Z:STAR RG:Z:20170822.B-E11_m_Mut-10-4_A2B5_R1  NH:i:20 NM:i:97 UQ:i:3668 AS:i:120
K00206:101:HH7C2BBXX:1:1124:3234:47788  0 1 3077228 0 124M2S  * 0 0 GATTAAAGGTGTGTTCCTTAAACTCGGAGATTCAATCTTCTGGAATCCATAGCCACTATGGCTCAAGATCTCCAAACCAAGATCCAGATAAGGATCTCCAAGCCTCCAGATAAGGGTCACTGGTGA  AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJFAFJJJJ7FJJAJJJJJJJJFJFJJJJJJJJFJFJJJJJJJJJJJJFFJJJJJJJJFJJJJJJJAJJJJJA7<<JJA  MD:Z:0T0T0A0A1G0G0T4T0C0C0T0T0A0A1C0T1G0G0A2T0T0C0A0A0T0C1T0C1G0G0A0A0T0C0C0A0T1G0C0C0A1T0A1G0G0C0T1A0A0G1T0C2C0A0A1C0C0A0A0G1T0C0C0A0G1T1A0G0G0A0T0C2C0A0A0G0C0C0T1C0A0G1T1A0G0G1T0C0A1T0G0G0T1A0  PG:Z:STAR RG:Z:20170822.B-E11_m_Mut-10-3_A2B5_R1  NH:i:38 NM:i:94 UQ:i:3732 AS:i:122

Ge

rsubread featurecounts • 2.2k views
ADD COMMENT
0
Entering edit mode
Yang Liao ▴ 450
@yang-liao-6075
Last seen 9 days ago
Australia

Hi Ge Tan,

I used Rsubread_1.26.1 to assign the lines you provided, and the read groups were correctly treated by featureCounts:

x<-featureCounts(c('user-RG.bam','user-RG.sam') , annot.inbuilt = "hg19", byReadGroup = T)
NCBI RefSeq annotation for hg19 (build 37.2) is used.

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 SAM file  1 BAM file                           ||
||                           S user-RG.bam                                    ||
||                           S user-RG.sam                                    ||
||                                                                            ||
||      Dir for temp files : .                                                ||
||                 Threads : 1                                                ||
||                   Level : meta-feature level                               ||
||              Paired-end : no                                               ||
||         Strand specific : no                                               ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file /wehisan/general/system/bioinf-software/bioinfsof ... ||
||    Features : 225074                                                       ||
||    Meta-features : 25702                                                   ||
||    Chromosomes/contigs : 52                                                ||
||                                                                            ||
|| Process BAM file user-RG.bam...                                            ||
||    Single-end reads are included.                                          ||
||    Assign reads to features...                                             ||
||    Total read groups : 2                                                   ||
||    Total reads : 3                                                         ||
||    Successfully assigned reads : 0 (0.0%)                                  ||
||    Running time : 0.00 minutes                                             ||
||                                                                            ||
|| Process SAM file user-RG.sam...                                            ||
||    Single-end reads are included.                                          ||
||    Assign reads to features...                                             ||
||    Total read groups : 2                                                   ||
||    Total reads : 3                                                         ||
||    Successfully assigned reads : 0 (0.0%)                                  ||
||    Running time : 0.00 minutes                                             ||
||                                                                            ||
||                         Read assignment finished.                          ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

> x$counts[1:3,]
          user.RG.bam.20170822.B.E11_m_Mut.10.4_A2B5_R1
653635                                                0
100422834                                             0
645520                                                0
          user.RG.bam.20170822.B.E11_m_Mut.10.3_A2B5_R1
653635                                                0
100422834                                             0
645520                                                0
          user.RG.sam.20170822.B.E11_m_Mut.10.4_A2B5_R1
653635                                                0
100422834                                             0
645520                                                0
          user.RG.sam.20170822.B.E11_m_Mut.10.3_A2B5_R1
653635                                                0
100422834                                             0
645520       

Can you provide your full command for running featureCounts?

Cheers,

Yang

 

 

ADD COMMENT
0
Entering edit mode

Many thanks Yang! I used external Ensembl annotation. Here's the minimal command.

featureCounts("merge_alignments_sorted.bam", annot.inbuilt=NULL, annot.ext="/srv/GT/reference/Mus_musculus/Ensembl/GRCm38.p5/Annotation/Release_89-2017-05-31/Genes/genes.gtf", isGTFAnnotationFile=TRUE, GTF.featureType="exon", GTF.attrType="gene_id", byReadGroup=TRUE)

It seems I figured out the problem. My bam file was derived from picard MergeBamAlignment. Somehow featurecounts doesn't like the bam from picard tools. When I convert the bam to sam, and sam back to bam with samtools, then both bam and sam files work.

You think it might be the issue?

Ge

ADD REPLY
0
Entering edit mode

It still works when I converted the SAM file to BAM by using Picard (SortSam).

Is it possible if you can send us the BAM file you're using? 

Cheers,

Yang

# /picard-tools-2.9.4/bin/SortSam  I=user-RG.bam o=user-RG-Picard.bam SO=coordinate

> x<-featureCounts(c('user-RG.bam','user-RG-Picard.bam'), annot.ext="ensembl-for_3tests-shuf.GTF",  annot.inbuilt=NULL , isGTFAnnotationFile=TRUE, GTF.featureType="exon", GTF.attrType="gene_id", byReadGroup=TRUE)
        ==========     _____ _    _ ____  _____  ______          _____
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 1.26.1

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 2 BAM files                                      ||
||                           S user-RG.bam                                    ||
||                           S user-RG-Picard.bam                             ||
||                                                                            ||
||      Dir for temp files : .                                                ||
||                 Threads : 1                                                ||
||                   Level : meta-feature level                               ||
||              Paired-end : no                                               ||
||         Strand specific : no                                               ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file ensembl-for_3tests-shuf.GTF ...                       ||
||    Features : 1302592                                                      ||
||    Meta-features : 63152                                                   ||
||    Chromosomes/contigs : 255                                               ||
||                                                                            ||
|| Process BAM file user-RG.bam...                                            ||
||    Single-end reads are included.                                          ||
||    Assign reads to features...                                             ||
||    Total read groups : 2                                                   ||
||    Total reads : 3                                                         ||
||    Successfully assigned reads : 0 (0.0%)                                  ||
||    Running time : 0.00 minutes                                             ||
||                                                                            ||
|| Process BAM file user-RG-Picard.bam...                                     ||
||    Single-end reads are included.                                          ||
||    Assign reads to features...                                             ||
||    Total read groups : 2                                                   ||
||    Total reads : 3                                                         ||
||    Successfully assigned reads : 0 (0.0%)                                  ||
||    Running time : 0.00 minutes                                             ||
||                                                                            ||
||                         Read assignment finished.                          ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//


ADD REPLY
0
Entering edit mode

You can download the bam from https://1drv.ms/u/s!Ar4vJvgFbg4Ki8IqTmvwZ1kh8pGgXA

Many thanks!

Ge

ADD REPLY
0
Entering edit mode

Thanks! I've downloaded the data, and found that featureCounts had a bug in the routine for fixing the BAM format. It did not keep the read group names when fixing the BAM files from Picard or STAR. 

This bug has been fixed. The next released version should work on your BAM files.

Cheers,

Yang

ADD REPLY
0
Entering edit mode

Super!

Many thanks.

Ge

ADD REPLY

Login before adding your answer.

Traffic: 565 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6