Question: Error when running QoRTs for JunctionSeq
1
0
Entering edit mode
n.tear ▴ 20
@ntear-23651
Last seen 3.4 years ago
United Kingdom

Hi,

I am trying to run QoRTs on my STAR aligned data however I have been getting errors I dont understand. I have attempted to re-sort my files with samtools and have increased the memort to no avail.

I have included my full pipeline below:

#STAR alignment

#Indexing

STAR --runThreadN 6 \
--runMode genomeGenerate \
--genomeDir STARindex \
--genomeFastaFiles /mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.dna.primary_assembly.fa \
--sjdbGTFfile /mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.93.chr_patch_hapl_scaff.gtf \
--sjdbOverhang 149

#cd to location of raw data
#aligning

for f in `ls *.fq.gz | sed 's/_[12].fq.gz//g' | \
sort -u`; do echo $f && \
STAR --genomeDir STARindex \
--runThreadN 6 \
--readFilesIn <(gunzip -c ${f}_1.fq.gz ${f}_2.fq.gz) \
--outFileNamePrefix /mnt/z/NathanHaffordTear/STARalignments/STARalignment.hg38v93/${f}.sorted \
--outSAMtype BAM SortedByCoordinate \
--outSAMunmapped Within \
--outSAMattributes Standard; done

#sorting again to make sure

for f in `ls *.bam | sed 's/.bam//g' | \
sort -u`; do echo $f && \ 
samtools sort -@ 4 ${f}.bam -o ${f}.samtoolsort.bam;
done

#JunctionSeq
#cd to analysis output location

java -Xmx16G -jar /usr/local/QoRTs.jar QC --stranded --runFunctions writeKnownSplices,writeNovelSplices,writeSpliceExon /mnt/z/NathanHaffordTear/STARalignments/STARalignment.hg38v93/C18.sortedAligned.sortedByCoord.out.samtoolsort.bam /mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.93.chr_patch_hapl_scaff.gtf RawCounts/

#I have done this with both ensembl and gencode alignments with and without samtools sorting with the same errors

QoRTs .log file:

Starting QoRTs v1.3.6 (Compiled Tue Sep 25 11:21:46 EDT 2018)
Starting time: (Tue Feb 16 17:38:53 GMT 2021)
INPUT_COMMAND(QC)
  INPUT_ARG(infile)=/mnt/z/NathanHaffordTear/STARalignments/STARalignment.hg38v93/C18.sortedAligned.sortedByCoord.out.samtoolsort.bam
  INPUT_ARG(gtffile)=/mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.93.chr_patch_hapl_scaff.gtf
  INPUT_ARG(outdir)=RawCounts/
  INPUT_ARG(stranded)=true
  INPUT_ARG(runFunctions)=List(writeKnownSplices, writeNovelSplices, writeSpliceExon)
Created Log File: RawCounts//QC.jpnPkkWzfKwU.log
Warning: run-in-progress file "RawCounts//QC.QORTS_RUNNING" already exists. Is there another QoRTs job running?
Starting QC
[Time: 2021-02-16 17:38:53] [Mem usage: [21MB / 536MB]] [Elapsed Time: 00:00:00.0001]
QoRTs is Running in paired-end mode.
QoRTs is Running in any-sorted mode.
No parameter --genomeFA found.
NOTE: Function "writeKnownSplices" requires function "JunctionCalcs". Adding "JunctionCalcs" to the active function list...
Running functions: JunctionCalcs, writeKnownSplices, writeNovelSplices, writeSpliceExon 

Checking first 10000 reads. Checking SAM file for formatting errors...
   Stats on the first 10000 reads:
        Num Reads Primary Map:    3159
        Num Reads Paired-ended:   0
        Num Reads mapped pair:    0
        Num Pair names found:     0
        Num Pairs matched:        0
        Read Seq length:          150 to 150
        Unclipped Read length:    150 to 150
        Final maxReadLength:      150
        maxPhredScore:            37
        minPhredScore:            2
   WARNING WARNING WARNING! Running in paired-end mode, but reads appear to be single-end! Errors may follow.
           Strongly recommend using the '--singleEnded' option
   Note: Data appears to be paired-ended.
   Warning: Have not found any matched read-pairs in the first 10000 reads. Is data paired-end? Is data sorted?
   Sorting Note: Reads appear to be grouped by read-pair, probably sorted by name
   Sorting Note: Reads are sorted by position 
Done checking first 10000 reads. WARNINGS FOUND!
SAMRecord Reader Generated. Read length: 150.
[Time: 2021-02-16 17:38:56] [Mem usage: [315MB / 536MB]] [Elapsed Time: 00:00:02.0624]
> Init JunctionCalcs utility
Compiling flat feature annotation, internally in memory...
FlatteningGtf: starting...(2021-02-16 17:38:56)
    FlatteningGtf: gtf file read complete.(2021-02-16 17:40:24)
    FlatteningGtf: Splice Junction Map read.(2021-02-16 17:40:26)
    FlatteningGtf: gene Sets generated.(2021-02-16 17:40:28)
    FlatteningGtf: Aggregate Sets built.
    FlatteningGtf: Compiling Aggregate Info . . . (2021-02-16 17:40:28)
    FlatteningGtf: Finished Compiling Aggregate Info. (2021-02-16 17:40:28)
    FlatteningGtf: Iterating through the step-vector...(2021-02-16 17:40:28)
    FlatteningGtf: Adding the aggregate genes themselves...(2021-02-16 17:40:30)
    FlatteningGtf: Iterating through the splice junctions...(2021-02-16 17:40:31)
    FlatteningGtf: Sorting the aggregate genes...(2021-02-16 17:40:33)
    FlatteningGtf: Folding the FlatGtfLine iterator...(2021-02-16 17:40:34)
    FlatteningGtf: Features Built.(2021-02-16 17:40:34)
Internal flat feature annotation compiled!
length of knownSpliceMap after instantiation: 393388
length of knownCountMap after instantiation: 393388
> Init MinorUtils Utility
QC Utilities Generated!
[Time: 2021-02-16 17:40:39] [Mem usage: [1914MB / 9GB]] [Elapsed Time: 00:01:46.0350]
============================FATAL_ERROR============================
QoRTs encountered a FATAL ERROR. For general help, use command:
          java -jar path/to/jar/QoRTs.jar --man
============================FATAL_ERROR============================
Error info:
Exception in thread "main" java.lang.IllegalStateException: Inappropriate call if not paired read        
        at net.sf.samtools.SAMRecord.requireReadPaired(SAMRecord.java:655)
        at net.sf.samtools.SAMRecord.getFirstOfPairFlag(SAMRecord.java:713)
        at internalUtils.commonSeqUtils$$anon$4.next(commonSeqUtils.scala:1020)
        at internalUtils.commonSeqUtils$$anon$4.next(commonSeqUtils.scala:978)
        at internalUtils.stdUtils$IteratorProgressReporter$$anon$14.next(stdUtils.scala:969)
        at scala.collection.Iterator.foreach(Iterator.scala:929)
        at scala.collection.Iterator.foreach$(Iterator.scala:929)
        at internalUtils.stdUtils$IteratorProgressReporter$$anon$14.foreach(stdUtils.scala:963)
        at qcUtils.runAllQC$.runOnSeqFile(runAllQC.scala:1300)
        at qcUtils.runAllQC$.run(runAllQC.scala:970)
        at qcUtils.runAllQC$allQC_runner.run(runAllQC.scala:680)
        at runner.runner$.main(runner.scala:98)
        at runner.runner.main(runner.scala)

If anyone can assist me I would be very gateful

Nathan

DEXSeq JunctionSeq RNAseq QoRTs • 1.4k views
ADD COMMENT
2
Entering edit mode
Ian Sealy ▴ 20
@iansealy
Last seen 3.7 years ago

Hi,

The error from QoRTs is:

Exception in thread "main" java.lang.IllegalStateException: Inappropriate call if not paired read

And further up it says:

   WARNING WARNING WARNING! Running in paired-end mode, but reads appear to be single-end! Errors may follow.
           Strongly recommend using the '--singleEnded' option

QoRTs assumes you've using paired-end reads unless you specify otherwise, but your BAM file appears not to be, so it thinks there might a problem.

In your STAR alignment step, you've got:

--readFilesIn <(gunzip -c ${f}_1.fq.gz ${f}_2.fq.gz) \

From STAR's point of view, this is just one file (although in reality it's two files concatenated together). You need to specify two files by changing this to:

--readFilesIn <(gunzip -c ${f}_1.fq.gz) <(gunzip -c ${f}_2.fq.gz) \

Or:

--readFilesIn ${f}_1.fq.gz ${f}_2.fq.gz --readFilesCommand "gunzip -c" \

Or:

--readFilesIn ${f}_1.fq.gz ${f}_2.fq.gz --readFilesCommand zcat \

Hope that helps.

ADD COMMENT
0
Entering edit mode

This anwer solved my problem! Thank you for the insight and help!

ADD REPLY

Login before adding your answer.

Traffic: 469 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6