Hi Julie,
I am using the offTargetAnalysis function with the CFD scoring method (CRISPRseek version 1.16.0 from Bioconductor) to select a few best-scoring gRNAs for each of a large set of sequences. In the past, I have used the combination of gRNAefficacy and top10OfftargetTotalScore in the summary file as selection criteria. I wonder whether there is an updated version of CRISPRseek that provides a single score combining the gRNA efficacy and the off-target scores for each gRNA.
On another note, in several recent jobs I set annotateExon = TRUE, but I did not see any annotation in either the "inExon" or the "inIntron" column of the OfftargetAnalysis.xls file. Did I miss something? The script I used is copied below:
results <- offTargetAnalysis("exon.fasta",
findgRNAsWithREcutOnly = FALSE,
annotateExon = TRUE,
findPairedgRNAOnly = FALSE,
exportAllgRNAs = "fasta",
txdb = txdb, max.mismatch = 3,
BSgenomeName = ss,
outputDir = ".", overwrite = TRUE,
scoring.method = "CFDscore")
Thank you very much for your help!
Joyce
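For readers who want to combine the two summary columns mentioned above, one possible approach is to rank gRNAs on each criterion and add the ranks. The sketch below is only an illustration, not a CRISPRseek feature: the table is a toy stand-in with made-up values, the gRNA_id column and the rank-sum rule are assumptions, and only the column names gRNAefficacy and top10OfftargetTotalScore come from the post.

```r
# Toy stand-in for the summary table; all values are made up for illustration.
# gRNA_id is a hypothetical identifier column.
summary_tbl <- data.frame(
  gRNA_id = c("gR1", "gR2", "gR3", "gR4"),
  gRNAefficacy = c(0.62, 0.35, 0.71, 0.66),
  top10OfftargetTotalScore = c(1.8, 0.4, 5.2, 0.9),
  stringsAsFactors = FALSE
)

# Rank each criterion separately:
# higher efficacy is better, lower off-target total is better.
eff_rank <- rank(-summary_tbl$gRNAefficacy)
off_rank <- rank(summary_tbl$top10OfftargetTotalScore)

# Assumed combined criterion: sum of the two ranks (lower is better).
summary_tbl$combined_rank <- eff_rank + off_rank

# Sort so the best candidates come first, then keep the top two.
best <- summary_tbl[order(summary_tbl$combined_rank), ]
top2 <- head(best, 2)
print(top2)
```

A rank sum weights both criteria equally; a weighted sum or a hard cutoff on one column before ranking on the other may suit a given screen better.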
Hi Julie,
I don't find the exon annotation in the OfftargetAnalysis.xls file. I list the information you requested below:
(1) > sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /n/app/openblas/0.2.19/lib/libopenblas_core2p-r0.2.19.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] Rcpp_0.12.13 AnnotationDbi_1.38.2
[3] XVector_0.16.0 GenomicAlignments_1.12.2
[5] GenomicRanges_1.28.6 BiocGenerics_0.22.1
[7] zlibbioc_1.22.0 IRanges_2.10.5
[9] BiocParallel_1.10.1 bit_1.1-12
[11] lattice_0.20-35 rlang_0.1.2
[13] blob_1.1.0 GenomeInfoDb_1.12.3
[15] tools_3.4.1 grid_3.4.1
[17] SummarizedExperiment_1.6.5 parallel_3.4.1
[19] Biobase_2.36.2 DBI_0.7
[21] matrixStats_0.52.2 bit64_0.9-7
[23] digest_0.6.12 tibble_1.3.4
[25] Matrix_1.2-10 GenomeInfoDbData_0.99.0
[27] rtracklayer_1.36.6 S4Vectors_0.14.7
[29] bitops_1.0-6 RCurl_1.95-4.8
[31] biomaRt_2.32.1 memoise_1.1.0
[33] RSQLite_2.0 DelayedArray_0.2.7
[35] compiler_3.4.1 Rsamtools_1.28.0
[37] GenomicFeatures_1.28.5 Biostrings_2.44.2
[39] XML_3.98-1.9 stats4_3.4.1
(2) test sequence
"TTACTGCTGTTGACAAGTTGGTTTAAGGGACAAAACTTTAAGTGTTAAAGCCACCTCAACAATTGATTGGACTTTTTCGTTTAATTT"
(3) txdb: https://s3.amazonaws.com/ffcf3-11.1/txdb
(4) ss: https://s3.amazonaws.com/ffcf3-11.1/ss
Thanks!
Best,
Joyce
Joyce,
I ran the off-target analysis with your test sequence and the human genome, and I was able to generate the inExon information. Here is the code snippet.
Best regards,
Julie
library(CRISPRseek)
library("BSgenome.Hsapiens.UCSC.hg19")
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(org.Hs.eg.db)
outputDir <- getwd()
inputFilePath <- DNAStringSet("TTACTGCTGTTGACAAGTTGGTTTAAGGGACAAAACTTTAAGTGTTAAAGCCACCTCAACAATTGATTGGACTTTTTCGTTTAATTT")
results <- offTargetAnalysis(inputFilePath,
findgRNAsWithREcutOnly = TRUE,
findPairedgRNAOnly = FALSE,
annotatePaired = FALSE,
scoring.method = "CFDscore",
gRNAoutputName = "test",
BSgenomeName = Hsapiens, chromToSearch = "chr6",
txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
orgAnn = org.Hs.egSYMBOL, max.mismatch = 2,
outputDir = outputDir, overwrite = TRUE)
results$offtarget
name gRNAPlusPAM OffTargetSequence inExon inIntron entrez_id gene score n.mismatch
1 NA_gR23f CTGTTGACAAGTTGGTTTAANGG CTGTTGACAAGTTGGTTTGAGGG TRUE 6908 TBP 0.375 1
2 NA_gR64f AAGCCACCTCAACAATTGATNGG AAGCCACCTCTATAATTGATTGG TRUE 6908 TBP 0.215385 2
3 NA_gR57r AGTCCAATCAATTGTTGAGGNGG AGTCCAATCAATTATAGAGGTGG TRUE 6908 TBP 0.681818 2
mismatch.distance2PAM alignment isCanonicalPAM forViewInUCSC strand chrom chromStart chromEnd
1 2 ..................G. 1 chr6:170881636-170881658 + chr6 170881636 170881658
2 10,8 ..........T.T....... 1 chr6:170881677-170881699 + chr6 170881677 170881699
3 7,5 .............A.A.... 1 chr6:170881680-170881702 - chr6 170881680 170881702
extendedSequence gRNAefficacy
1 ACTGCTGTTGACAAGTTGGTTTGAGGGAGA 0.1005271
2 GTTAAAGCCACCTCTATAATTGATTGGACT 0.2647941
3 AAAAAGTCCAATCAATTATAGAGGTGGCTT 0.3408939
Thank you, Julie. I found the source of our problem: in our GFF file, the chromosome names were not in the form chr1, chr2, etc. After making the proper modifications, I was able to get the exon annotations. Thanks again for your help!
Joyce
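The mismatch described above can be checked before running the analysis by comparing the chromosome names used by the annotation with those used by the genome. The following is a minimal base R sketch under stated assumptions: check_seqnames and add_chr_prefix are hypothetical helpers, and the two name vectors are toy examples. With real objects they would typically come from seqlevels(txdb) and seqnames(ss).

```r
# Hypothetical helper: annotation chromosome names that the genome lacks.
check_seqnames <- function(annotation_names, genome_names) {
  setdiff(unique(annotation_names), genome_names)
}

# Hypothetical helper: prepend "chr" to any name that does not already have it.
add_chr_prefix <- function(x) {
  ifelse(grepl("^chr", x), x, paste0("chr", x))
}

# Toy example vectors (assumed, not taken from the files in this thread).
gff_chroms <- c("1", "2", "X", "6")
genome_chroms <- c("chr1", "chr2", "chr6", "chrX")

# Before the fix, none of the annotation names match the genome.
missing_before <- check_seqnames(gff_chroms, genome_chroms)

# After adding the prefix, every annotation name is found in the genome.
fixed_chroms <- add_chr_prefix(gff_chroms)
missing_after <- check_seqnames(fixed_chroms, genome_chroms)

print(missing_before)
print(missing_after)
```

A simple prefix does not cover every naming difference (for example, mitochondrial or scaffold names), so it is worth inspecting whatever remains in missing_after before rebuilding the txdb.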
Joyce,
Glad to help! Great that you found the cause of the problem. Thanks for letting me know!
Best regards,
Julie