Error in Rsubread featureCounts
1
0
Entering edit mode
Zhaoxu • 0
@d13c6171
Last seen 2.7 years ago
United States

Hi there,

Excellent package! I am using it to do RNA-seq. But I encountered a small problem when using featureCounts(). The code is as follows:


featureCounts(
"A1.raw_1.fastq.gz.subjunc.BAM",
annot.inbuilt = NULL,
annot.ext = "GCF_015227675.2_mRatBN7.2_genomic.gtf",
isGTFAnnotationFile=TRUE,
isPairedEnd=TRUE,

nthreads = 8
    )

And it returns this:

        ==========     _____ _    _ ____  _____  ______          _____  
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 2.8.0

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                                                                            ||
||                           A1.raw_1.fastq.gz.subjunc.BAM                    ||
||                                                                            ||
||              Paired-end : yes                                              ||
||        Count read pairs : yes                                              ||
||              Annotation : GCF_015227675.2_mRatBN7.2_genomic.gtf (GTF)      ||
||      Dir for temp files : .                                                ||
||                 Threads : 8                                                ||
||                   Level : meta-feature level                               ||
||      Multimapping reads : counted                                          ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
\\============================================================================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file GCF_015227675.2_mRatBN7.2_genomic.gtf ...             ||
||    Features : 1144963                                                      ||
||    Meta-features : 34322                                                   ||
||    Chromosomes/contigs : 80                                                ||
||                                                                            ||
|| Process BAM file A1.raw_1.fastq.gz.subjunc.BAM...                          ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 41444160                                             ||
||    Successfully assigned alignments : 33409266 (80.6%)                     ||
||    Running time : 0.89 minutes                                             ||
||                                                                            ||
|| Write the final count table.                                               ||
|| Write the read assignment summary.                                         ||
||                                                                            ||
\\============================================================================//

ERROR while rich displaying an object: Error in paste(capture.output(print(obj)), collapse = "\n"): result would exceed 2^31-1 bytes

Traceback:
1. FUN(X[[i]], ...)
2. tryCatch(withCallingHandlers({
 .     if (!mime %in% names(repr::mime2repr)) 
 .         stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
 .     rpr <- repr::mime2repr[[mime]](obj)
 .     if (is.null(rpr)) 
 .         return(NULL)
 .     prepare_content(is.raw(rpr), rpr)
 . }, error = error_handler), error = outer_handler)
3. tryCatchList(expr, classes, parentenv, handlers)
4. tryCatchOne(expr, names, parentenv, handlers[[1L]])
5. doTryCatch(return(expr), name, parentenv, handler)
6. withCallingHandlers({
 .     if (!mime %in% names(repr::mime2repr)) 
 .         stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
 .     rpr <- repr::mime2repr[[mime]](obj)
 .     if (is.null(rpr)) 
 .         return(NULL)
 .     prepare_content(is.raw(rpr), rpr)
 . }, error = error_handler)
7. repr::mime2repr[[mime]](obj)
8. repr_text.default(obj)
9. paste(capture.output(print(obj)), collapse = "\n")

It's my first time doing RNA-seq, and I really appreciate your help. Thank you.

Rsubread • 3.9k views
ADD COMMENT
2
Entering edit mode
@james-w-macdonald-5106
Last seen 4 hours ago
United States

You should capture the output from featureCounts in a new object. If you do not do that, it will just print the output to your screen.

It appears that the function finishes without problems, but it also appears that you are trying to use the repr package to capture something, and that is giving you the error. I can't come up with a reason to use repr in this context. There is no reason to output any of these data, but instead you should be using with downstream tools like edgeR or DESeq2.

ADD COMMENT
0
Entering edit mode

Hi James,

Thank you so much for your advice! it worked perfectly after capturing the output to a new object.

Can I ask another question? I am stuck on how to input multiple files to featureCounts() or subjunc(). If I write this code:

featurefiles<-featureCounts(
"B3.raw_1.fastq.gz.subjunc.BAM","B2.raw_1.fastq.gz.subjunc.BAM",
annot.inbuilt = NULL,
annot.ext = "GCF_015227675.2_mRatBN7.2_genomic.gtf",
isGTFAnnotationFile=TRUE,
isPairedEnd=TRUE,

nthreads = 8
    )

it will return this:

        ==========     _____ _    _ ____  _____  ______          _____  
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 2.8.0

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                                                                            ||
||                           B3.raw_1.fastq.gz.subjunc.BAM                    ||
||                                                                            ||
||              Paired-end : yes                                              ||
||        Count read pairs : yes                                              ||
||              Annotation : GCF_015227675.2_mRatBN7.2_genomic.gtf (GTF)      ||
||      Dir for temp files : .                                                ||
||                 Threads : 8                                                ||
||                   Level : meta-feature level                               ||
||      Multimapping reads : counted                                          ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
\\============================================================================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file GCF_015227675.2_mRatBN7.2_genomic.gtf ...             ||
ERROR: no features were loaded in format GTF. The annotation format can be specified by the 'isGTFAnnotationFile' option, and the required feature type can be specified by the 'GTF.featureType' option.


No counts were generated.
Error in .stop_quietly(): 
Traceback:

1. featureCounts("B3.raw_1.fastq.gz.subjunc.BAM", "B2.raw_1.fastq.gz.subjunc.BAM", 
 .     annot.inbuilt = NULL, annot.ext = "GCF_015227675.2_mRatBN7.2_genomic.gtf", 
 .     isGTFAnnotationFile = TRUE, isPairedEnd = TRUE, nthreads = 8)
2. .stop_quietly()
3. stop()

Thank you again for your kind help.

ADD REPLY
2
Entering edit mode

Your R command seems correct, although featureCounts by default expects the GTF file to contain at least one line of "exon" annotation (i.e. the value in the 3rd column in the GTF file is exon). It also expects to see the gene_id tag in the key-value pairs of the exon lines. If your GTF file does not satisfy these conditions, featureCounts will report this error message. If you want to use a GTF file that contains no exons but other types of annotations, you may use the GTF.featureType option of featureCounts to specify the type you want (i.e. the wanted value in the 3rd column). Or, if your GTF file does not have gene_id tags, you may use the GTF.attrType option to specify the name of tag that you want to use as the name of features (e.g. gene_name).

Here is an example of a line of GTF annotation that featureCounts by default accepts (from Ensembl). It is an exon annotation and has a gene_id tag.

1       havana  exon    14404   14501   .       -       .       gene_id "ENSG00000227232"; gene_version "5" ......
ADD REPLY
0
Entering edit mode

Hi Yang,

Thank you so much for your reply. The GTF file is downloaded from NCBI. And it contains other tags like gene and transcript along with exon. The first a few lines are posted below:

#gtf-version 2.2
#!genome-build mRatBN7.2
#!genome-build-accession NCBI_Assembly:GCF_015227675.2
#!annotation-source NCBI Rattus norvegicus Annotation Release 108
NC_051336.1 Gnomon  gene    40042   56215   .   -   .   gene_id "LOC102552844"; transcript_id ""; db_xref "GeneID:102552844"; db_xref "RGD:7551986"; gbkey "Gene"; gene "LOC102552844"; gene_biotype "pseudogene"; pseudo "true"; 
NC_051336.1 BestRefSeq%2CGnomon gene    76909   85762   .   +   .   gene_id "Vom2r3"; transcript_id ""; db_xref "GeneID:502213"; db_xref "RGD:1565892"; description "vomeronasal 2 receptor, 3"; gbkey "Gene"; gene "Vom2r3"; gene_biotype "protein_coding"; gene_synonym "RGD1565892"; 
NC_051336.1 BestRefSeq  transcript  76909   85762   .   +   .   gene_id "Vom2r3"; transcript_id "NM_001099460.1"; db_xref "GeneID:502213"; exception "annotated by transcript or proteomic data"; gbkey "mRNA"; gene "Vom2r3"; inference "similar to RNA sequence, mRNA (same species):RefSeq:NM_001099460.1"; note "The RefSeq transcript has 4 substitutions compared to this genomic sequence"; product "vomeronasal 2 receptor, 3"; transcript_biotype "mRNA"; 
NC_051336.1 BestRefSeq  exon    76909   77114   .   +   .   gene_id "Vom2r3"; transcript_id "NM_001099460.1"; db_xref "GeneID:502213"; exception "annotated by transcript or proteomic data"; gene "Vom2r3"; inference "similar to RNA sequence, mRNA (same species):RefSeq:NM_001099460.1"; note "The RefSeq transcript has 4 substitutions compared to this genomic sequence"; product "vomeronasal 2 receptor, 3"; transcript_biotype "mRNA"; exon_number "1"; 
NC_051336.1 BestRefSeq  exon    79753   80035   .   +   .   gene_id "Vom2r3"; transcript_id "NM_001099460.1"; db_xref "GeneID:502213"; exception "annotated by transcript or proteomic data"; gene "Vom2r3"; inference "similar to RNA sequence, mRNA (same species):RefSeq:NM_001099460.1"; note "The RefSeq transcript has 4 substitutions compared to this genomic sequence"; product "vomeronasal 2 receptor, 3"; transcript_biotype "mRNA"; exon_number "2"; 

Can I use GTF.featureType = "exon" for this GTF file?

And if I only enter one filename in the files, it will work out successfully.

B3<-featureCounts(
"B3.raw_1.fastq.gz.subjunc.BAM",
annot.inbuilt = NULL,
annot.ext = "GCF_015227675.2_mRatBN7.2_genomic.gtf",
isGTFAnnotationFile=TRUE,
isPairedEnd=TRUE,

nthreads = 8
    )


        ==========     _____ _    _ ____  _____  ______          _____  
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 2.8.0

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                                                                            ||
||                           B3.raw_1.fastq.gz.subjunc.BAM                    ||
||                                                                            ||
||              Paired-end : yes                                              ||
||        Count read pairs : yes                                              ||
||              Annotation : GCF_015227675.2_mRatBN7.2_genomic.gtf (GTF)      ||
||      Dir for temp files : .                                                ||
||                 Threads : 8                                                ||
||                   Level : meta-feature level                               ||
||      Multimapping reads : counted                                          ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
\\============================================================================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file GCF_015227675.2_mRatBN7.2_genomic.gtf ...             ||
||    Features : 1144963                                                      ||
||    Meta-features : 34322                                                   ||
||    Chromosomes/contigs : 80                                                ||
||                                                                            ||
|| Process BAM file B3.raw_1.fastq.gz.subjunc.BAM...                          ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 87080909                                             ||
||    Successfully assigned alignments : 70095471 (80.5%)                     ||
||    Running time : 1.42 minutes                                             ||
||                                                                            ||
|| Write the final count table.                                               ||
|| Write the read assignment summary.                                         ||
||                                                                            ||
\\============================================================================//

But if I enter more than one filename in the files, it will encounter a problem.

B3<-featureCounts(
"B3.raw_1.fastq.gz.subjunc.BAM","B2.raw_1.fastq.gz.subjunc.BAM",
annot.inbuilt = NULL,
annot.ext = "GCF_015227675.2_mRatBN7.2_genomic.gtf",
isGTFAnnotationFile=TRUE,
isPairedEnd=TRUE,

nthreads = 8
    )


        ==========     _____ _    _ ____  _____  ______          _____  
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 2.8.0

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                                                                            ||
||                           B3.raw_1.fastq.gz.subjunc.BAM                    ||
||                                                                            ||
||              Paired-end : yes                                              ||
||        Count read pairs : yes                                              ||
||              Annotation : GCF_015227675.2_mRatBN7.2_genomic.gtf (GTF)      ||
||      Dir for temp files : .                                                ||
||                 Threads : 8                                                ||
||                   Level : meta-feature level                               ||
||      Multimapping reads : counted                                          ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
\\============================================================================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file GCF_015227675.2_mRatBN7.2_genomic.gtf ...             ||
ERROR: no features were loaded in format GTF. The annotation format can be specified by the 'isGTFAnnotationFile' option, and the required feature type can be specified by the 'GTF.featureType' option.


No counts were generated.
Error in .stop_quietly(): 
Traceback:

1. featureCounts("B3.raw_1.fastq.gz.subjunc.BAM", "B2.raw_1.fastq.gz.subjunc.BAM", 
 .     annot.inbuilt = NULL, annot.ext = "GCF_015227675.2_mRatBN7.2_genomic.gtf", 
 .     isGTFAnnotationFile = TRUE, isPairedEnd = TRUE, nthreads = 8)
2. .stop_quietly()
3. stop()

I think I must enter the filenames in the wrong way. I really appreciate it if you can give me some advice on how to enter multiple filenames in the files.

Thank you again for your kind reply.

ADD REPLY
1
Entering edit mode

The GTF seems to be OK, but in R, the unnamed parameters are given to the 1st, 2nd, 3rd, ... options of the function. For running the featureCounts function with multiple input BAM files, you need to put them in a vector as the first parameter:

featurefiles<-featureCounts(
 c("B3.raw_1.fastq.gz.subjunc.BAM","B2.raw_1.fastq.gz.subjunc.BAM"), ...)

Otherwise featureCounts will treat the second file name as the second parameter.

ADD REPLY
0
Entering edit mode

Hi Yang,

Thank you so much for answering such a basic question for me. It worked perfectly.

ADD REPLY

Login before adding your answer.

Traffic: 408 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6