SOLiD color space data
Neel Aluru ▴ 460
@neel-aluru-3760
Last seen 7.9 years ago
United States
Dear Bioc Users,

I have a quick question about SOLiD sequencing analysis. Does Bioconductor have any packages that can handle SOLiD color space data? I want to do some preliminary analysis, such as converting the reads to FASTQ (Sanger) format and producing a FASTA file of unique reads. Right now I am using the Perl scripts that come with the aligners (BWA/bowtie). I went through the BioC mailing lists and some associated papers, and they mention that support will be extended to SOLiD in the future. If you know of any such packages, could you please share them?

Thank you very much in advance.

Sincerely,
Neel

Neel Aluru
Postdoctoral Scholar
Biology Department
Woods Hole Oceanographic Institution
Woods Hole, MA 02543
USA
508-289-3607
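As an illustration of the "FASTA of unique reads" step asked about here, the following is a minimal Python sketch that collapses identical reads and emits one FASTA record per distinct sequence. The function names, the `read` prefix, and the `_xN` count suffix in the header are assumptions made for this example, not a convention of any particular tool.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse identical read strings into (sequence, count) pairs.

    Pairs are ordered from most to least abundant; ties keep the
    order in which the sequences were first seen.
    """
    return Counter(reads).most_common()


def to_fasta(collapsed, prefix="read"):
    """Format (sequence, count) pairs as FASTA text.

    The header layout ">{prefix}{index}_x{count}" is hypothetical.
    """
    lines = []
    for i, (seq, n) in enumerate(collapsed, start=1):
        lines.append(f">{prefix}{i}_x{n}")
        lines.append(seq)
    return "\n".join(lines) + "\n"


# Toy color space reads (primer base followed by colors).
reads = ["T0120", "T3311", "T0120", "T0120", "T3311", "T2222"]
collapsed = collapse_reads(reads)
fasta_text = to_fasta(collapsed)
print(fasta_text, end="")
```

Collapsing works the same way on color space and base space strings, because it only compares reads for exact equality.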
@martin-morgan-1513
Last seen 4 months ago
United States
On 12/11/2010 06:12 PM, Neel Aluru wrote:
> Does Bioconductor have any packages that can handle SOLiD color space data?

Hi Neel -- I don't think there are any Bioc packages that handle color space. It would seem that one would want to carry color space through alignment / variant calling / ..., rather than converting to FASTQ?

Martin

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793
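To make the trade-off concrete, here is a minimal Python sketch of SOLiD's two-base encoding. It assumes the usual mapping in which each color is the XOR of the 2-bit codes of adjacent bases (A=0, C=1, G=2, T=3); the function names and the toy sequence are made up for illustration. It shows that a single miscalled color changes every downstream base once the read is naively translated to base space, which is one reason to keep reads in color space through alignment.

```python
BASES = "ACGT"  # index in this string doubles as the 2-bit base code


def encode(bases, primer="T"):
    """Base space -> color space: primer base, then one color per adjacent pair."""
    prev = primer
    colors = []
    for b in bases:
        colors.append(str(BASES.index(prev) ^ BASES.index(b)))
        prev = b
    return primer + "".join(colors)


def decode(cs_read):
    """Color space -> base space by walking from the primer base.

    This is the naive translation; it has no way to detect color errors.
    """
    prev = cs_read[0]
    out = []
    for color in cs_read[1:]:
        prev = BASES[BASES.index(prev) ^ int(color)]
        out.append(prev)
    return "".join(out)


original = "ACGGTCA"
cs = encode(original)            # "T3130121"
with_error = "T3100121"          # third color miscalled (3 -> 0)

print(cs)
print(decode(cs))                # ACGGTCA
print(decode(with_error))        # ACCCAGT: every base after the error differs
```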
Thanks, Martin. I have also read and heard about doing everything in color space. But all the short read aligners convert the csfasta files to FASTQ before mapping to the genome.

Neel
On 12/11/2010 06:26 PM, Neel Aluru wrote:
> Thanks, Martin. I have also read and heard about doing everything in color space. But all the short read aligners convert the csfasta files to FASTQ before mapping to the genome.

Hi Neel -- I'm not a color space wiz, but see

http://solidsoftwaretools.com/gf/
http://solid.community.appliedbiosystems.com/index.jspa

Martin
Thanks, Martin. I have been looking at it.

Neel
Hi Neel,

On Sat, Dec 11, 2010 at 9:26 PM, Neel Aluru <naluru at whoi.edu> wrote:
> Thanks, Martin. I have also read and heard about doing everything in color space. But all the short read aligners convert the csfasta files to FASTQ before mapping to the genome.

I don't actually think this is true. I've just had to deal with some SOLiD data, and bowtie does everything in color space (you can give it the csfasta and qual files separately; look for their -Q flag). Note that you have to align to a special color space index.

I'd also be very surprised if BWA didn't handle color space reads, since (i) you can build a color-space-specific index (in the bwa index ... command), and (ii) their paper says they handle color space reads. I think the FASTQ file you would use as input to BWA would just have the sequences in color space.

The SHRiMP aligner also does color space ...

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
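For readers who have the csfasta and qual files separately and need a single color space FASTQ, the following is a rough Python sketch of the merge. It is not the converter shipped with any aligner. It assumes one sequence line per record, `#` comment lines, records in the same order in both files, and Sanger (Phred+33) quality encoding, with negative quality values (used for missing color calls) clamped to 0. Some aligner-specific converters apply further transformations, such as re-coding the colors as letters, so check the aligner's documentation for the layout it expects.

```python
def parse_records(lines):
    """Yield (name, payload) pairs from csfasta- or qual-style lines.

    Assumes each record is a '>' header followed by a single payload line;
    blank lines and '#' comment lines are skipped.
    """
    name = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            name = line[1:]
        else:
            yield name, line


def to_csfastq(csfasta_lines, qual_lines):
    """Merge csfasta and qual records into color space FASTQ text."""
    out = []
    # zip() stops at the shorter input; a stricter version would also
    # check that both files contain the same number of records.
    for (name, seq), (qname, quals) in zip(parse_records(csfasta_lines),
                                           parse_records(qual_lines)):
        if name != qname:
            raise ValueError(f"record mismatch: {name!r} vs {qname!r}")
        # Phred+33 encoding; negative qualities are clamped to 0.
        qual_str = "".join(chr(max(int(q), 0) + 33) for q in quals.split())
        out.append(f"@{name}\n{seq}\n+\n{qual_str}\n")
    return "".join(out)


# Toy input; the read name and values are invented for the example.
csfasta = ["# Title: demo", ">1_51_38_F3", "T01203"]
qual = ["# Title: demo", ">1_51_38_F3", "20 18 25 -1 30"]
fastq_text = to_csfastq(csfasta, qual)
print(fastq_text, end="")
```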
Hi Steve,

Thanks for your comments. Yes, I used BWA and it works in color space as well. As a first step, I trimmed the adaptors using cutadapt and wrote the output in BWA's FASTQ format (I believe that is csfastq). Then I indexed the genome in color space and mapped the reads. The mapping didn't work, which made me wonder whether my files are in the wrong format.

Do you do small RNA analysis?

Thank you,
Neel
Hi,

On Sun, Dec 12, 2010 at 2:14 PM, Neel Aluru <naluru at whoi.edu> wrote:
> My mapping didn't work and that made me think if my files are in wrong format.
>
> Do you do small RNA analysis?

I don't at the moment, sorry. I hadn't come across cutadapt before; thanks for pointing it out.

Unfortunately, I'm not sure how to help you smoke out your problem. Are you sure you are trimming the correct adapter sequence?

If you expect your reads to be of a minimum length, maybe you can keep just the first X base pairs of each read and try to align them that way? (Maybe there's something wrong with the 3' trimming.)

-steve
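The "keep only the first X positions" check suggested here could be sketched in Python as below. This is only an illustration under stated assumptions: each read is a primer base followed by color digits, and the quality string has exactly one character per color (some files may lay qualities out differently, in which case the function raises an error rather than guessing). The function name and the toy read are hypothetical.

```python
def truncate_cs_read(seq, qual, n_colors):
    """Keep the primer base plus the first n_colors colors of a read.

    Assumes seq is a primer base followed by colors, and qual holds
    one quality character per color.
    """
    primer, colors = seq[0], seq[1:]
    if len(qual) != len(colors):
        raise ValueError("expected one quality value per color")
    return primer + colors[:n_colors], qual[:n_colors]


# Toy read: primer base T, ten colors, ten quality characters.
seq, qual = "T0120311230", "IIIIIHHGF#"
short_seq, short_qual = truncate_cs_read(seq, qual, 6)
print(short_seq)   # T012031
print(short_qual)  # IIIIIH
```

If the truncated reads align while the adapter-trimmed reads do not, that would point towards the trimming step rather than the file format.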
rcaloger ▴ 500
@rcaloger-1888
Last seen 9.8 years ago
European Union
Hi Neel,

I work with both Illumina and SOLiD data. As an aligner I use SHRiMP:
http://compbio.cs.toronto.edu/shrimp
It indexes the genome in color space. After mapping, you can easily load the mapped data into R.

Cheers,
Raffaele

--
Prof. Raffaele A. Calogero
Bioinformatics and Genomics Unit
MBC Centro di Biotecnologie Molecolari
Via Nizza 52, Torino 10126
tel. ++39 0116706457
Fax ++39 0116706487
Mobile ++39 3333827080
email: raffaele.calogero at unito.it
raffaele[dot]calogero[at]gmail[dot]com
www: http://www.bioinformatica.unito.it
Thank you all for the suggestions. I really appreciate it.

Raffaele, I used BWA and it also indexes the genome in color space. Are you doing small RNA analysis with SOLiD sequencing? If so, what are the steps in your analysis pipeline? Which file format (SAM or BAM) are you loading into R?

Thank you,
Neel