Hello,
I recently downloaded CRISPRseek and got the vignette tests to work (specifically Scenario 5: Target and off-target analysis for user specified gRNAs); the output looks wonderful. When I tried to replicate the results with a test file of my own, however, I got the error described in the post title. Here is the code I ran, followed by its output:
library(CRISPRseek)
library(BSgenome.Hsapiens.UCSC.hg19)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(org.Hs.eg.db)

input_path = 'Path_to_pwd/Input.fasta'
output_dir = 'Path_to_pwd/CRISPRSeek_Output'
REpatternFile = system.file('extdata', 'NEBenzymes.fa', package = 'CRISPRseek')

offTargetAnalysis(inputFilePath = input_path, findgRNAsWithREcutOnly = FALSE,
    REpatternFile = REpatternFile, findPairedgRNAOnly = FALSE, findgRNAs = FALSE,
    BSgenomeName = Hsapiens, txd = TxDb.Hsapiens.UCSC.hg19.knownGene,
    orgAnn = org.Hs.egSYMBOL, max.mismatch = 0,
    outputDir = output_dir, overwrite = TRUE)

Validating input ...
>>> Finding all hits in sequence chr1 ...
>>> DONE searching
>>> Finding all hits in sequence chr2 ...
>>> DONE searching
>>> Finding all hits in sequence chr3 ...
>>> DONE searching
.
.
.
>>> DONE searching
>>> Finding all hits in sequence chrUn_gl000249 ...
>>> DONE searching
Building feature vectors for scoring ...
Calculating scores ...
Error in weights %*% mismatch.pos : non-conformable arguments
In addition: Warning message:
In dir.create(outputDir) : 'path_to_pwd\CRISPRSeek_Output' already exists
Here is the output of the traceback() function:
> traceback()
2: getOfftargetScore(featureVectors, weights = weights)
1: offTargetAnalysis(inputFilePath = input_path, findgRNAsWithREcutOnly = FALSE,
       REpatternFile = REpatternFile, findPairedgRNAOnly = FALSE, findgRNAs = FALSE,
       BSgenomeName = Hsapiens, txd = TxDb.Hsapiens.UCSC.hg19.knownGene,
       orgAnn = org.Hs.egSYMBOL, max.mismatch = 0,
       outputDir = output_dir, overwrite = TRUE)
Here is a sample of Input.fasta (613 gRNA sequences in total):
>Ref.A1.RW.92.92RW008.AB253421_0
TGGAAGGGCTAATTTACTCC
>Ref.A1.RW.92.92RW008.AB253421_1
GGAAGGGCTAATTTACTCCA
>Ref.A1.RW.92.92RW008.AB253421_2
GAAGGGCTAATTTACTCCAA
>Ref.A1.RW.92.92RW008.AB253421_3
AAGGGCTAATTTACTCCAAG
Here is the output of the sessionInfo() command:
> sessionInfo(package = "CRISPRseek")
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
character(0)

other attached packages:
[1] CRISPRseek_1.12.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12                            GenomeInfoDb_1.8.7
 [3] XVector_0.12.1                          methods_3.3.1
 [5] GenomicFeatures_1.24.5                  bitops_1.0-6
 [7] utils_3.3.1                             tools_3.3.1
 [9] grDevices_3.3.1                         zlibbioc_1.18.0
[11] biomaRt_2.28.0                          digest_0.6.12
[13] bit_1.1-12                              RSQLite_2.0
[15] memoise_1.1.0                           tibble_1.3.3
[17] BSgenome_1.40.1                         TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[19] pkgconfig_2.0.1                         rlang_0.1.1
[21] DBI_0.7                                 parallel_3.3.1
[23] rtracklayer_1.32.2                      Biostrings_2.40.2
[25] S4Vectors_0.10.3                        graphics_3.3.1
[27] datasets_3.3.1                          stats_3.3.1
[29] IRanges_2.6.1                           stats4_3.3.1
[31] ade4_1.7-6                              bit64_0.9-7
[33] base_3.3.1                              data.table_1.10.4
[35] Biobase_2.32.0                          hash_2.2.6
[37] BSgenome.Hsapiens.UCSC.hg19_1.4.0       AnnotationDbi_1.34.4
[39] XML_3.98-1.9                            BiocParallel_1.6.6
[41] seqinr_3.4-5                            blob_1.1.0
[43] org.Hs.eg.db_3.3.0                      Rsamtools_1.24.0
[45] GenomicAlignments_1.8.4                 BiocGenerics_0.18.0
[47] GenomicRanges_1.24.3                    SummarizedExperiment_1.2.3
[49] RCurl_1.95-4.8
I tried searching around on BioStars, Bioconductor forums, and Stack, but I could not find an answer.
Thank you very much!
Thank you, Julie! Updating CRISPRSeek fixed the issue. The gRNAs are not human sequences, so that would make sense.
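For anyone who lands here with the same message: `%*%` is R's matrix product, and "non-conformable arguments" means the dimensions of the two operands do not line up. The sketch below reproduces the message in base R with made-up objects. The names `weights`, `ok_features` and `bad_features` are purely illustrative and are not CRISPRseek internals, so this only shows what the message means, not what went wrong inside getOfftargetScore (which, in my case, went away after updating the package).

```r
# Illustration only: these objects are invented for the example and do not
# come from CRISPRseek.
weights      <- rep(0.5, 20)                     # one weight per position of a 20-nt gRNA
ok_features  <- matrix(0, nrow = 20, ncol = 3)   # 20 positions x 3 candidate sites
bad_features <- matrix(0, nrow = 19, ncol = 3)   # one row short of the weights vector

# A length-20 vector times a 20 x 3 matrix works: the vector is treated as 1 x 20.
scores <- weights %*% ok_features
print(dim(scores))

# With 19 rows the dimensions no longer match, and %*% raises an error.
# tryCatch captures the error message so the script keeps running.
msg <- tryCatch(
  {
    weights %*% bad_features
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If you see this error from a package function rather than your own code, comparing `packageVersion("CRISPRseek")` against the current release is a cheap first check.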