CRISPRseek Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") interpretation
1
0
Entering edit mode
xenon ▴ 40
@xenon-21520
Last seen 3.9 years ago

I am trying to implement the same analysis of offTargetAnalysis() but without a BSgenome object. This is being done by using offTargetAnalysisWithoutBSgenome(). But I get an error at the end of the analysis. Am I missing any parameter that's critical for the offTargetAnalysisWithoutBSgenome() run ?

Note : The error comes only when I use the offTargetAnalysisWithoutBSgenome(). It does not come when I am using offTargetAnalysis with BSgenome.Hsapiens.UCSC.hg19 and TxDb.Hsapiens.UCSC.hg19.knownGene. I tried using TxDb.Hsapiens.UCSC.hg19.knownGene instead of using a txdb object generated from hg19.refGene.gtf in offTargetAnalysisWithoutBSgenome() , but the error comes even in this case.

Here's the offTargetAnalysis call that I want to implement in offTargetAnalysisWithoutBSgenome :-


################################################################
#loading necessary pacakges
################################################################
library(Biostrings)
library(CRISPRseek)
library(GenomicFeatures)

################################################################
#loading the 1 guide rna sequence from a set which are present as 23 base pair sequences 
################################################################
temp_gRNA = readDNAStringSet("/home/user/Documents/case2/gRNAs.fa", nrec = 1)

################################################################
#removing the PAM sequence from the end and keeping th rest of the 20 bp
################################################################
temp_gRNA = subseq(temp_gRNA, start = 1, width = 20)

################################################################
#offTargetAnalysis call
################################################################
res = offTargetAnalysis(inputFilePath = temp_gRNA, findgRNAs = FALSE, findgRNAsWithREcutOnly = FALSE,findPairedgRNAOnly = FALSE,chromToSearch = "all", outputDir = "/home/user/Documents/case4/", overwrite = TRUE, gRNAoutputName = "temp_guide_rna_scores" , enable.multicore = TRUE, n.cores.max = 5, BSgenomeName =  BSgenome.Hsapiens.UCSC.hg19, txdb = TxDb.Hsapiens.UCSC.hg19.knownGene)

Here's the offTargetAnalysisWithoutBSgenome implementation:-

 ################################################################
#loading necessary pacakges
################################################################
library(Biostrings)
library(CRISPRseek)
library(GenomicFeatures)

################################################################
#loading the 1 guide rna sequence from a set which are present as 23 base pair sequences 
################################################################
temp_gRNA = readDNAStringSet("/home/user/Documents/case2/gRNAs.fa", nrec = 1)

################################################################
#removing the PAM sequence from the end and keeping th rest of the 20 bp
################################################################
temp_gRNA = subseq(temp_gRNA, start = 1, width = 20)

################################################################
#making txdb object
#gtf source : http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/genes/hg19.refGene.gtf.gz
################################################################
human_txdb = makeTxDbFromGFF("/home/user/Documents/genome/hg19/hg19.refGene.gtf")

################################################################
#running off target analysis with hg19 genome fasta file
#genome source : http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz
################################################################
res = offTargetAnalysisWithoutBSgenome(inputFilePath = temp_gRNA, findgRNAs = FALSE, findgRNAsWithREcutOnly = FALSE,findPairedgRNAOnly = FALSE,chromToSearch = "all", outputDir = "/home/user/Documents/crisprseek_sample_output_from_forcrisprseek_venv", overwrite = TRUE, gRNAoutputName = "temp_guide_rna_scores" , enable.multicore = TRUE, n.cores.max = 12,useBSgenome = FALSE, genomeSeqFile = "/home/user/Documents/genome/hg19/hg19.fa" , txdb = human_txdb)

I am not able to interpret the error which comes at the end : "Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") : In range 12: 'end' must be >= 'start' - 1"

Here's the run output :

# Validating input ...
# >>> Finding all hits in sequence chr1 ...
.
.
.
# >>> Finding all hits in sequence chr18_gl000207_random ...
# >>> DONE searching
# Building feature vectors for scoring ...
# Calculating scores ...
# Annotating, filtering and generating reports ...
# Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") : 
#   In range 12: 'end' must be >= 'start' - 1.
# In addition: There were 50 or more warnings (use warnings() to see the first 50)

Session info:

sessionInfo( )
─ Session info ─────────────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 3.6.1 (2019-07-05)
 os       CentOS Linux 8 (Core)       
 system   x86_64, linux-gnu           
 ui       RStudio                     
 language (EN)                        
 collate  en_IN.UTF-8                 
 ctype    en_IN.UTF-8                 
 tz       America/New_York            
 date     2020-11-22                  

─ Packages ─────────────────────────────────────────────────────────────────────────
 package                           * version  date       lib source        
 ade4                                1.7-16   2020-10-28 [1] CRAN (R 3.6.1)
 AnnotationDbi                     * 1.48.0   2019-10-29 [1] Bioconductor  
 askpass                             1.0      2019-01-02 [1] CRAN (R 3.6.0)
 assertthat                          0.2.1    2019-03-21 [1] CRAN (R 3.6.0)
 backports                           1.1.4    2019-04-10 [1] CRAN (R 3.6.0)
 Biobase                           * 2.46.0   2019-10-29 [1] Bioconductor  
 BiocFileCache                       1.10.2   2019-11-08 [1] Bioconductor  
 BiocGenerics                      * 0.32.0   2019-10-29 [1] Bioconductor  
 BiocManager                         1.30.10  2019-11-16 [1] CRAN (R 3.6.1)
 BiocParallel                        1.20.1   2019-12-21 [1] Bioconductor  
 biomaRt                             2.42.1   2020-03-26 [1] Bioconductor  
 Biostrings                        * 2.54.0   2019-10-29 [1] Bioconductor  
 bit                                 4.0.4    2020-08-04 [1] CRAN (R 3.6.1)
 bit64                               4.0.5    2020-08-30 [1] CRAN (R 3.6.1)
 bitops                              1.0-6    2013-08-17 [1] CRAN (R 3.6.1)
 blob                                1.2.1    2020-01-20 [1] CRAN (R 3.6.1)
 BSgenome                          * 1.54.0   2019-10-29 [1] Bioconductor  
 BSgenome.Hsapiens.UCSC.hg19       * 1.4.0    2020-11-08 [1] Bioconductor  
 callr                               3.5.1    2020-10-13 [1] CRAN (R 3.6.1)
 cli                                 2.1.0    2020-10-12 [1] CRAN (R 3.6.1)
 crayon                              1.3.4    2017-09-16 [1] CRAN (R 3.6.0)
 CRISPRseek                        * 1.30.0   2020-10-27 [1] Bioconductor  
 curl                                3.3      2019-01-10 [1] CRAN (R 3.6.0)
 data.table                          1.12.2   2019-04-07 [1] CRAN (R 3.6.0)
 DBI                                 1.1.0    2019-12-15 [1] CRAN (R 3.6.1)
 dbplyr                              2.0.0    2020-11-03 [1] CRAN (R 3.6.1)
 DelayedArray                        0.12.3   2020-04-09 [1] Bioconductor  
 desc                                1.2.0    2018-05-01 [1] CRAN (R 3.6.1)
 devtools                            2.3.2    2020-09-18 [1] CRAN (R 3.6.1)
 digest                              0.6.18   2018-10-10 [1] CRAN (R 3.6.0)
 dplyr                               1.0.2    2020-08-18 [1] CRAN (R 3.6.1)
 ellipsis                            0.3.1    2020-05-15 [1] CRAN (R 3.6.1)
 fansi                               0.4.0    2018-10-05 [1] CRAN (R 3.6.0)
 fs                                  1.5.0    2020-07-31 [1] CRAN (R 3.6.1)
 generics                            0.0.2    2018-11-29 [1] CRAN (R 3.6.0)
 GenomeInfoDb                      * 1.22.1   2020-03-27 [1] Bioconductor  
 GenomeInfoDbData                    1.2.2    2020-11-07 [1] Bioconductor  
 GenomicAlignments                   1.22.1   2019-11-12 [1] Bioconductor  
 GenomicFeatures                   * 1.38.2   2020-02-15 [1] Bioconductor  
 GenomicRanges                     * 1.38.0   2019-10-29 [1] Bioconductor  
 glue                                1.4.2    2020-08-27 [1] CRAN (R 3.6.1)
 hash                                2.2.6.1  2019-03-04 [1] CRAN (R 3.6.1)
 hms                                 0.5.3    2020-01-08 [1] CRAN (R 3.6.1)
 httr                                1.4.2    2020-07-20 [1] CRAN (R 3.6.1)
 IRanges                           * 2.20.2   2020-01-13 [1] Bioconductor  
 jsonlite                            1.7.1    2020-09-07 [1] CRAN (R 3.6.1)
 lattice                             0.20-38  2018-11-04 [1] CRAN (R 3.6.0)
 lifecycle                           0.2.0    2020-03-06 [1] CRAN (R 3.6.1)
 magrittr                            1.5      2014-11-22 [1] CRAN (R 3.6.0)
 MASS                                7.3-51.3 2019-03-31 [1] CRAN (R 3.6.0)
 Matrix                              1.2-17   2019-03-22 [1] CRAN (R 3.6.0)
 matrixStats                         0.57.0   2020-09-25 [1] CRAN (R 3.6.1)
 memoise                             1.1.0    2017-04-21 [1] CRAN (R 3.6.1)
 openssl                             1.3      2019-03-22 [1] CRAN (R 3.6.0)
 pillar                              1.4.6    2020-07-10 [1] CRAN (R 3.6.1)
 pkgbuild                            1.1.0    2020-07-13 [1] CRAN (R 3.6.1)
 pkgconfig                           2.0.2    2018-08-16 [1] CRAN (R 3.6.0)
 pkgload                             1.1.0    2020-05-29 [1] CRAN (R 3.6.1)
 prettyunits                         1.0.2    2015-07-13 [1] CRAN (R 3.6.0)
 processx                            3.4.4    2020-09-03 [1] CRAN (R 3.6.1)
 progress                            1.2.0    2018-06-14 [1] CRAN (R 3.6.0)
 ps                                  1.4.0    2020-10-07 [1] CRAN (R 3.6.1)
 purrr                               0.3.4    2020-04-17 [1] CRAN (R 3.6.1)
 R6                                  2.4.0    2019-02-14 [1] CRAN (R 3.6.0)
 rappdirs                            0.3.1    2016-03-28 [1] CRAN (R 3.6.1)
 Rcpp                                1.0.1    2019-03-17 [1] CRAN (R 3.6.0)
 RCurl                               1.98-1.2 2020-04-18 [1] CRAN (R 3.6.1)
 remotes                             2.2.0    2020-07-21 [1] CRAN (R 3.6.1)
 reticulate                          1.18     2020-10-25 [1] CRAN (R 3.6.1)
 rhdf5                               2.30.1   2019-11-26 [1] Bioconductor  
 Rhdf5lib                            1.8.0    2019-10-29 [1] Bioconductor  
 rlang                               0.4.8    2020-10-08 [1] CRAN (R 3.6.1)
 rprojroot                           1.3-2    2018-01-03 [1] CRAN (R 3.6.1)
 Rsamtools                           2.2.3    2020-02-23 [1] Bioconductor  
 RSQLite                             2.2.1    2020-09-30 [1] CRAN (R 3.6.1)
 rstudioapi                          0.11     2020-02-07 [1] CRAN (R 3.6.1)
 rtracklayer                       * 1.46.0   2019-10-29 [1] Bioconductor  
 S4Vectors                         * 0.24.4   2020-04-09 [1] Bioconductor  
 seqinr                              4.2-4    2020-10-10 [1] CRAN (R 3.6.1)
 sessioninfo                         1.1.1    2018-11-05 [1] CRAN (R 3.6.1)
 stringi                             1.4.3    2019-03-12 [1] CRAN (R 3.6.0)
 stringr                             1.4.0    2019-02-10 [1] CRAN (R 3.6.0)
 SummarizedExperiment                1.16.1   2019-12-19 [1] Bioconductor  
 testthat                            3.0.0    2020-10-31 [1] CRAN (R 3.6.1)
 tibble                              3.0.4    2020-10-12 [1] CRAN (R 3.6.1)
 tidyselect                          1.1.0    2020-05-11 [1] CRAN (R 3.6.1)
 TxDb.Hsapiens.UCSC.hg19.knownGene * 3.2.2    2020-11-08 [1] Bioconductor  
 usethis                             1.6.3    2020-09-17 [1] CRAN (R 3.6.1)
 vctrs                               0.3.4    2020-08-29 [1] CRAN (R 3.6.1)
 withr                               2.3.0    2020-09-22 [1] CRAN (R 3.6.1)
 XML                                 3.99-0.3 2020-01-20 [1] CRAN (R 3.6.1)
 XVector                           * 0.26.0   2019-10-29 [1] Bioconductor  
 zlibbioc                            1.32.0   2019-10-29 [1] Bioconductor  

[1] /home/user/anaconda3/envs/for_crisprseek/lib/R/library
CRISPRseek • 3.1k views
ADD COMMENT
0
Entering edit mode

Hello Julie Zhu, Any pointers will be very helpful.

ADD REPLY
0
Entering edit mode

Hi Xenon,

Can you try to set annotateExon = FALSE to see if you still get the same error? Thanks! If you still encounter the same error, please post the input gRNA file "temp_gRNA" and hg19.fa. Thanks!

Best regards, Julie

ADD REPLY
0
Entering edit mode

I continue to see the same error with annotateExon = FALSE. I have used the genome, from the following link http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz

And in my temp_gRNA , there's only 1 string of length 20 TTAAAGGTTTATACCTTCCC

I even ran the analysis without txdb = human_txdb. Still got the same error. I thought as we are not annotating the exons , txdb parameter might not be required.

ADD REPLY
0
Entering edit mode

Hi Durbar,

Thanks for performing more tests!

I ran the following command without encountering any error. I am wondering whether you have downloaded the most recent version of CRISPRseek at https://bioconductor.org/packages/release/bioc/html/CRISPRseek.html.

res = offTargetAnalysisWithoutBSgenome(inputFilePath = temp_gRNA, findgRNAs = FALSE, findgRNAsWithREcutOnly = FALSE, findPairedgRNAOnly = FALSE,chromToSearch = "chr15", outputDir = “testDurbar”, overwrite = TRUE, gRNAoutputName = “DurbargRNA” , enable.multicore = FALSE, n.cores.max = 1,useBSgenome = FALSE, genomeSeqFile = "hg19.fa" , annotateExon = FALSE, txdb = human_txdb, max.mismatch = 1)

I also tried to run the analysis with BSgenome and found out that there is no perfect match in hg19 for the sequence you provided (TAAAGGTTTATACCTTCCC) which is consistent with the results using blat at https://genome.ucsc.edu/cgi-bin/hgBlat. Is this intended? Thanks!

library(BSgenome.Hsapiens.UCSC.hg19)

library(TxDb.Hsapiens.UCSC.hg19.knownGene)

res2 = offTargetAnalysis(inputFilePath = temp_gRNA, findgRNAs = FALSE, findgRNAsWithREcutOnly = FALSE, findPairedgRNAOnly = FALSE,chromToSearch = "chr15", outputDir = "testDurbar-BSgenome", overwrite = TRUE, gRNAoutputName = "DurbargRNA" , enable.multicore = FALSE, n.cores.max = 1, BSgenomeName = BSgenome.Hsapiens.UCSC.hg19, txdb = TxDb.Hsapiens.UCSC.hg19.knownGene, max.mismatch = 1)

$offtarget name gRNAPlusPAM OffTargetSequence inExon inIntron entrez_id score n.mismatch mismatch.distance2PAM 1 fromDurbar TTAAAGGTTTATACCTTCCCNGG TTAAAGGTTTATACCTTCACCAG TRUE 83660 31.5 1 2 alignment isCanonicalPAM forViewInUCSC strand chrom chromStart chromEnd extendedSequence gRNAefficacy 1 ..................A. 0 chr15:62941533-62941555 - chr15 62941533 62941555 ATCTTTAAAGGTTTATACCTTCACCAGCAC 0.1825474

BTW, to make your test run faster, you can specify max.mismatch = 1, and chromToSearch = "chr15" since there is only one match found in chr15 with at most 1 mismatch.

Best regards,

Julie

ADD REPLY
1
Entering edit mode

Hi Durbar,

I did more testing and discovered that there is indeed a bug when using non-BSgenome sequence files for fetching sequence. The error only occurs if more than one chromosome is searched. I fixed the bug and you can obtain the updated package with git clone https://github.com/LihuaJulieZhu/CRISPRseek.git.

I need to update the Bioconductor release and dev version yet.

Thanks for finding and reporting the bug!

Best regards,

Julie

ADD REPLY
0
Entering edit mode

This is working ! :)

ADD REPLY
1
Entering edit mode
@james-w-macdonald-5106
Last seen 3 days ago
United States

The simple answer is that you are trying to generate an IRanges object that has a start larger than the end.

> IRanges(6, 2)
Error in .Call2("C_solve_user_SEW0", start, end, width, PACKAGE = "IRanges") : 
  In range 1: 'end' must be >= 'start' - 1.

Which can also happen when creating a GRanges object, which is more likely to be happening

> GRanges("chr1", IRanges(6, 2))
Error in .Call2("C_solve_user_SEW0", start, end, width, PACKAGE = "IRanges") : 
  In range 1: 'end' must be >= 'start' - 1.

It's not likely that anybody can give you more help than that, because it's probably something to do with your data. You could either run offTargetAnalysisWithoutBSgenome under the debugger (see ?debug) or you could set options(error = recover) and then explore things when you hit the error. I personally find the former to be easier to do/understand than the latter, but ymmv.

ADD COMMENT

Login before adding your answer.

Traffic: 489 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6