Manipulating random CRISPR sequences
shane • 0
@shane-9068
Last seen 9.1 years ago
United States

I'm working on a small workflow to look at random CRISPR guide sequences. Essentially, I'm generating all of the putative CRISPR sites on a particular chromosome and would like to be able to manipulate them a little. Ultimately, my goal is to find CRISPR guides that cut at multiple locations in the genome. I know these may not occur frequently, but they certainly occur in repetitive regions of the genome such as rDNA. I've managed to get R to give me the locations and sequences of all the sites on chromosome 1, but that's where I'm stuck.
1) How can I force the output "table" to be written in a tab-delimited format that could theoretically go into Excel if it weren't so huge? I've toyed around with the data.frame and write.table commands, but haven't had much success. These are a bit confusing for a beginner.
2) Can I take the output and have R find the sequences that are duplicated (i.e. in the far-right column)? Can it bin them into groups depending on the number of times a particular pattern is repeated?
3) Since this should be a more manageable list, how do I send these duplicated sequences to a tab-delimited file? In other words, can I create a setup where I end up with a list of CRISPR guide sequences that occur two or more times on this particular chromosome?

4) Can I expand this to work on the whole genome? (I tried to simplify to start.)

 

The Script:

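    # 23-mer pattern: 21 wildcard bases followed by GG
    p1 <- "nnnnnnnnnnnnnnnnnnnnngg"

    library(BSgenome.Hsapiens.UCSC.hg38)

    chr1 <- Hsapiens[["chr1"]]

    # R is case-sensitive: the null object is NULL, not null
    masks(chr1) <- NULL

    # fixed = "subject": ambiguity codes in the pattern act as wildcards
    allsites <- matchPattern(p1, chr1, fixed = "subject")

    allsites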

The output:

    Views on a 248956422-letter DNAString subject
    subject: NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN...NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    views:
                   start       end width
           [1]     10451     10473    23 [AACCCTAACCCTAACCCTCGCGG]
           [2]     10464     10486    23 [ACCCTCGCGGTACCCTCAGCCGG]
           [3]     10477     10499    23 [CCTCAGCCGGCCCGCCCGCCCGG]
           [4]     10478     10500    23 [CTCAGCCGGCCCGCCCGCCCGGG]
           [5]     10490     10512    23 [GCCCGCCCGGGTCTGACCTGAGG]
           ...       ...       ...   ... ...
    [12491308] 248946388 248946410    23 [AGGGTTAGGGTTAGGGTTAAGGG]
    [12491309] 248946393 248946415    23 [TAGGGTTAGGGTTAAGGGTTAGG]
    [12491310] 248946394 248946416    23 [AGGGTTAGGGTTAAGGGTTAGGG]
    [12491311] 248946399 248946421    23 [TAGGGTTAAGGGTTAGGGTTAGG]
    [12491312] 248946400 248946422    23 [AGGGTTAAGGGTTAGGGTTAGGG]
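
Questions 1 to 3 can be sketched in base R once the hits are in a data frame. The block below is a minimal sketch, not a definitive implementation: it uses a small made-up `hits` table in place of the real 12.5 million rows, and the names `hits`, `n_sites`, `bins` and the two file names are assumptions introduced for illustration. The commented-out lines show how the same table could be built from `allsites` with the `start()`, `end()` and `as.character()` accessors.

```r
# Toy stand-in for the real hits. With the objects from the script above,
# the equivalent table could be built with:
#   hits <- data.frame(start    = start(allsites),
#                      end      = end(allsites),
#                      sequence = as.character(allsites))
hits <- data.frame(
  start = c(10451L, 10464L, 20001L, 30001L, 40001L, 50001L),
  end   = c(10473L, 10486L, 20023L, 30023L, 40023L, 50023L),
  sequence = c("AACCCTAACCCTAACCCTCGCGG", "ACCCTCGCGGTACCCTCAGCCGG",
               "AACCCTAACCCTAACCCTCGCGG", "ACCCTCGCGGTACCCTCAGCCGG",
               "AACCCTAACCCTAACCCTCGCGG", "GCCCGCCCGGGTCTGACCTGAGG"),
  stringsAsFactors = FALSE
)

# Question 1: write every hit as a tab-delimited file
# (hypothetical file name, written to a temporary directory here).
all_file <- file.path(tempdir(), "chr1_all_sites.tsv")
write.table(hits, file = all_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Question 2: count how often each 23-mer occurs and attach the count to each hit.
counts <- table(hits$sequence)
hits$n_sites <- as.integer(counts[hits$sequence])

# Bin the repeated sequences by the number of times they occur.
repeated <- counts[counts >= 2]
bins <- split(names(repeated), as.integer(repeated))

# Question 3: keep only hits whose sequence occurs two or more times,
# most frequent first, and write them to a second tab-delimited file.
dup_hits <- hits[hits$n_sites >= 2, ]
dup_hits <- dup_hits[order(-dup_hits$n_sites, dup_hits$sequence, dup_hits$start), ]
dup_file <- file.path(tempdir(), "chr1_repeated_sites.tsv")
write.table(dup_hits, file = dup_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

`bins` is a list keyed by repeat count, so `bins[["3"]]` holds the sequences seen three times. The full chromosome 1 table exceeds Excel's limit of 1,048,576 rows, so only the filtered file is likely to open there.
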
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 13 months ago
United States

Shane,

 

Please try the CRISPRseek package: https://www.bioconductor.org/packages/release/bioc/html/CRISPRseek.html and its vignette, https://www.bioconductor.org/packages/release/bioc/vignettes/CRISPRseek/inst/doc/CRISPRseek.pdf

In the vignette, see section 2.6, "Scenario 6. Quick gRNA finding without target or off-target analysis".

The package is described in http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0108424

Please let me know if this fits your needs.

Best regards,

Julie
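
For question 4, if the hand-rolled approach from the question is kept instead, the general shape is to repeat the search per sequence, tag each hit with its chromosome, and count duplicates across the combined table. The block below is only a sketch of that loop under stated assumptions: it swaps `matchPattern()` for a base R regular expression so that it runs without Bioconductor, it uses a made-up two-sequence `genome`, and `find_sites`, `all_hits`, `multi` and `n_sites` are hypothetical names.

```r
# Hypothetical helper: all 23-mers ending in GG on the forward strand of one
# sequence. The zero-width lookahead lets gregexpr() report overlapping hits.
# Restricting the 21 leading bases to A/C/G/T is a choice made for this sketch.
find_sites <- function(seq_name, seq) {
  starts <- as.integer(gregexpr("(?=[ACGT]{21}GG)", seq, perl = TRUE)[[1]])
  if (starts[1] == -1L) {
    # gregexpr() signals "no match" with -1: return an empty table
    return(data.frame(chrom = character(0), start = integer(0),
                      end = integer(0), sequence = character(0),
                      stringsAsFactors = FALSE))
  }
  data.frame(chrom = rep(seq_name, length(starts)),
             start = starts,
             end = starts + 22L,
             sequence = substring(seq, starts, starts + 22L),
             stringsAsFactors = FALSE)
}

# Toy "genome": two short made-up sequences carrying the same 23-mer.
genome <- c(chrA = "TTAACCCTAACCCTAACCCTCGCGGTT",
            chrB = "GAACCCTAACCCTAACCCTCGCGGAA")

# Search each sequence, then stack the per-sequence tables.
all_hits <- do.call(rbind, Map(find_sites, names(genome), genome))
rownames(all_hits) <- NULL

# Count each 23-mer across all sequences and keep those seen two or more times.
counts <- table(all_hits$sequence)
all_hits$n_sites <- as.integer(counts[all_hits$sequence])
multi <- all_hits[all_hits$n_sites >= 2, ]
```

On the real genome the same loop would presumably run over `seqnames(Hsapiens)`, calling `matchPattern(p1, Hsapiens[[chr]], fixed = "subject")` for each `chr`; the hg38 package also contains alternate and unplaced sequences, which may need filtering out. Note that both this sketch and the original script scan the forward strand only, so sites on the reverse strand (which read as CC followed by 21 bases on the forward strand) are not counted.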

mousheng xu ▴ 10
@mousheng-xu-2280
Last seen 7.1 years ago

Hi Julie,

Just a quick question: the CRISPRseek user's guide mainly covers how to use the package for gRNA design. Could you please explain how to use it for GUIDE-seq, as you've done at http://mccb.umassmed.edu/GUIDE-seq/ for the Python package?

Thanks!

-- Mo

Mousheng,

You would want to download the GUIDEseq package for GUIDE-seq analysis.

Best,

Julie

 

Hi Julie,

GUIDEseq installed successfully. However, the first step requires .bed and .bam files as input, while all we have are raw .fastq files from a HiSeq run.

How should I proceed?

Thanks!

-- Mo

Please take a look at http://mccb.umassmed.edu/GUIDE-seq/readme.txt. BTW, please use a different tag for GUIDEseq questions. Thanks!

Best,

Julie
mousheng xu ▴ 10
@mousheng-xu-2280
Last seen 7.1 years ago

OK. Started a new thread with GUIDE-seq in the question title. Thanks.
