Single end STAR Chimeric.out.junction fails to read with chimera::importFusionData
wresch • 0
@wresch-7286
Last seen 10.0 years ago
United States

Hi,

I have a 1x50 nt Illumina RNA-seq data set that was only intended for gene-level expression analysis. I was asked to look for fusion transcripts even though I emphasized that the sensitivity for finding such fusions would be lousy. I decided to start with a STAR (alignment to mm9) -> chimera workflow. Each sample reports a small number of fusion reads, and only a few fusions have 5 or more supporting reads as determined by awk. My guess is that they are all false positives. Here is one example from a Chimeric.out.junction file (full file: https://s3.amazonaws.com/idata.drgang.net/temp/Chimeric.out.junction):

chr3    138267132       +       chr2    181382061       +       0       0       0       DFXGT8Q1:294:C5A6EACXX:8:1105:20793:74047       138267108       24M26S  181382062       24S26M
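For readers who want to inspect such a file without chimera, here is a minimal base R sketch that labels the fields of the record above and tallies supporting reads per junction, similar in spirit to the awk count mentioned in the post. It assumes the 14-column, header-less, tab-separated layout seen in the example line; the column names and the helpers `read_junctions` and `count_support` are illustrative labels, not part of STAR or chimera.

```r
# Illustrative column labels (assumption: 14-column layout, no header line).
junction_cols <- c("donor_chr", "donor_pos", "donor_strand",
                   "acceptor_chr", "acceptor_pos", "acceptor_strand",
                   "junction_type", "repeat_left", "repeat_right",
                   "read_name", "seg1_start", "seg1_cigar",
                   "seg2_start", "seg2_cigar")
junction_classes <- c("character", "integer", "character",
                      "character", "integer", "character",
                      "integer", "integer", "integer",
                      "character", "integer", "character",
                      "integer", "character")

# Hypothetical helper: accepts either file = "path" or text = "...".
read_junctions <- function(...) {
  read.table(..., sep = "\t", header = FALSE, quote = "", comment.char = "",
             col.names = junction_cols, colClasses = junction_classes,
             stringsAsFactors = FALSE)
}

# Hypothetical helper: number of reads per distinct junction.
count_support <- function(jx, min_support = 1L) {
  support <- aggregate(read_name ~ donor_chr + donor_pos + donor_strand +
                         acceptor_chr + acceptor_pos + acceptor_strand,
                       data = jx, FUN = length)
  names(support)[ncol(support)] <- "n_reads"
  support[support$n_reads >= min_support, , drop = FALSE]
}

# The example record from the post, rebuilt with explicit tabs.
example_line <- paste(c("chr3", "138267132", "+", "chr2", "181382061", "+",
                        "0", "0", "0",
                        "DFXGT8Q1:294:C5A6EACXX:8:1105:20793:74047",
                        "138267108", "24M26S", "181382062", "24S26M"),
                      collapse = "\t")

jx <- read_junctions(text = example_line)
print(count_support(jx))
```

For a real file, `read_junctions(file = "Chimeric.out.junction")` followed by `count_support(jx, min_support = 5L)` would keep only junctions with at least 5 reads. This only counts rows; it does not reproduce whatever filtering chimera applies internally.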

chimera::importFusionData("star", "path/to/file", org = "mm", min.support = 1)

returns NULL and complains:

The input file does not have any spanning read.
Your fusion lacking of spanning reads are most probably artifacts
The analysis of fusions lacking spanning reads is not supported.

I'm new to fusion transcript detection, so this may be a stupid question, but the read above looks like a spanning read to me, right? So what is wrong with what I'm doing?
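One way to sanity-check that reading of the example record is to compare the segment coordinates with the reported junction positions. The sketch below is base R and rests on assumptions taken from the STAR manual as I understand it: columns 2 and 5 are the first intron base at the donor and acceptor, columns 11 to 14 are the start and CIGAR of each aligned segment, and a junction type of -1 in column 7 marks a junction that lies between mates rather than inside a read. Whether chimera uses the same definition of "spanning" is not confirmed anywhere in this thread. The helper `matched_bases` and the field names are hypothetical.

```r
# Hypothetical helper: total number of aligned (M) bases in a CIGAR string.
matched_bases <- function(cigar) {
  m <- regmatches(cigar, gregexpr("[0-9]+(?=M)", cigar, perl = TRUE))[[1]]
  sum(as.integer(m))
}

# Values copied from the example record in the post (both segments on +).
rec <- list(donor_pos = 138267132L, acceptor_pos = 181382061L,
            junction_type = 0L,
            seg1_start = 138267108L, seg1_cigar = "24M26S",
            seg2_start = 181382062L, seg2_cigar = "24S26M")

# Last aligned base of the first segment.
seg1_end <- rec$seg1_start + matched_bases(rec$seg1_cigar) - 1L

# The first segment should stop right before the donor position, and the
# second segment should start right after the acceptor position.
ends_at_donor         <- (seg1_end + 1L) == rec$donor_pos
starts_after_acceptor <- (rec$seg2_start - 1L) == rec$acceptor_pos

# Assumption: -1 would mean the junction falls between mates.
junction_inside_read <- rec$junction_type != -1L

print(c(ends_at_donor = ends_at_donor,
        starts_after_acceptor = starts_after_acceptor,
        junction_inside_read = junction_inside_read))
```

Under those assumptions the read is split 24/26 across chr3 and chr2 exactly at the reported positions, which is consistent with the read crossing the junction itself. With single-end data there are no mates, so a junction between mates should not occur in the first place.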

Thanks in advance for any help

Wolfgang

 

> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] chimera_1.8.4                          
 [2] TxDb.Hsapiens.UCSC.hg19.knownGene_3.0.0
 [3] GenomicFeatures_1.18.2                 
 [4] BSgenome.Hsapiens.UCSC.hg19_1.4.0      
 [5] BSgenome_1.34.0                        
 [6] rtracklayer_1.26.2                     
 [7] org.Hs.eg.db_3.0.0                     
 [8] RSQLite_1.0.0                          
 [9] DBI_0.3.1                              
[10] AnnotationDbi_1.28.1                   
[11] GenomicAlignments_1.2.1                
[12] Rsamtools_1.18.2                       
[13] Biostrings_2.34.0                      
[14] XVector_0.6.0                          
[15] GenomicRanges_1.18.3                   
[16] GenomeInfoDb_1.2.3                     
[17] IRanges_2.0.0                          
[18] S4Vectors_0.4.0                        
[19] Biobase_2.26.0                         
[20] BiocGenerics_0.12.1                    

loaded via a namespace (and not attached):
 [1] base64enc_0.1-2    BatchJobs_1.5      BBmisc_1.8         BiocParallel_1.0.0
 [5] biomaRt_2.22.0     bitops_1.0-6       brew_1.0-6         checkmate_1.5.0   
 [9] codetools_0.2-9    digest_0.6.4       fail_1.2           foreach_1.4.2     
[13] iterators_1.0.7    RCurl_1.95-4.5     sendmailR_1.2-1    stringr_0.6.2     
[17] tools_3.1.1        XML_3.98-1.1       zlibbioc_1.12.0   

 

chimera STAR • 2.6k views
@raffaele-calogero-294
Last seen 9.1 years ago
Italy/Turin/University of Torino

Hi Wolfgang,

You are the second person to report this problem with importing STAR data into chimera.

We have identified the problem in the C++ parser that counts the reads in the Chimeric.out.junction file and we are going to fix it by next week in version 1.8.5.

Thanks for highlighting the problem.

Raffaele

 

 


Hi Raffaele,

great. Thanks for the fast reply and the upcoming fix.

Wolfgang

@raffaele-calogero-294
Last seen 9.1 years ago
Italy/Turin/University of Torino

Hi,

We have committed chimera 1.8.5 to the Bioconductor repository. We think we have fixed the issue encountered when importing STAR data.

It should be available for download within 24 hours.

Cheers

Raf
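
To confirm that the installed copy of chimera is at least the release named above, the version can be compared in R. This is a minimal sketch: `has_star_fix` is a hypothetical helper, and the threshold 1.8.5 is taken from this post only. It says nothing about other import problems.

```r
# Hypothetical helper: TRUE if a version string is at least the release
# that this thread says contains the parser fix.
has_star_fix <- function(version, fixed_in = "1.8.5") {
  package_version(version) >= package_version(fixed_in)
}

# Only query the installed version if chimera is present in the library.
if (nzchar(system.file(package = "chimera"))) {
  installed <- as.character(utils::packageVersion("chimera"))
  message("chimera ", installed, ", fix present: ", has_star_fix(installed))
} else {
  message("chimera is not installed in this library")
}
```

For example, `has_star_fix("1.8.4")` is FALSE for the version listed in the sessionInfo() output of the original question.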


 

Hi Raf,

I was receiving the same error message using chimera 1.6 with STAR data when I found this post via Google. I installed version 1.8.5 and received this error message:

tmp <- importFusionData('star',"Chimeric.out.junction",org="mm", min.support=1)

chrM is removed from fusion acceptor

chrM is removed from fusion donor

The input file does not seems to have any fusion.
Please contact the developers.

Is this the same issue with the C++ parser or something else?

thanks in advance

 

rcaloger ▴ 500
@rcaloger-1888
Last seen 9.9 years ago
European Union

I think the problem is related to the version of the human genome you have used.

Are you using hg38?

The current stable version only allows the use of hg19. This issue is solved in the devel version.

Could you please try the devel version?

If the problem is not solved with the devel version, could you please send me the STAR output so that I can understand the issue?

Cheers

Raf

 
