Visualization of alignments with mismatch bases
Julian Gehring ★ 1.3k
@julian-gehring-5818
Last seen 5.5 years ago
Hi,

Is there a good way to visualize aligned reads with mismatched bases (SNPs) along the genome, similar to what the standard genome browsers show?

'ggbio' came to mind, since plotting a pile-up of reads with it is straightforward. However, overlaying the mismatched bases on each read does not seem nearly as easy.

Has anyone found a good way to do this, or another solution that works within R/Bioconductor?

Best wishes
Julian
• 1.7k views
Tengfei Yin ▴ 420
@tengfei-yin-4323
Last seen 8.6 years ago
(In reply to Julian Gehring's message of Thu, Feb 28, 2013, 11:46 AM.)

Hi Julian,

You are right: ggbio currently only supports a summary of mismatches, shown as a coverage plot or bar chart (see ?stat_mismatch). It looks like what you want is a detailed visualization of short-read alignments, with the mismatched bases shown directly on the reads.

It is possible, but not easy, to do this manually. You need two GRanges objects, one for the alignments and one for the SNPs (mismatches), and you plot them layer by layer. The tricky part is assigning each read a fixed stepping level, so that each SNP can be plotted at the right position. I would NOT recommend doing this; it is probably not worth the time. I need to implement this feature in some easier way.

The difficulty is that there are different modes:
1. show reads as gray rectangles and color each mismatch as a segment;
2. show each SNP as nucleotide text (A/C/T/G);
3. show the full sequence detail for each alignment.
Which one is appropriate depends on the zoom level and even on coverage, and I guess most of the time you do not want to see the bases for every read.

Just curious, for future ggbio development: are those the modes you want? Are you using only BAM files here, with no VCF files involved? You mentioned 'SNP', but I think what you mean is mismatch?

PS: I cannot speak for other tools. The only thing I know is that the SRAdb package looks like it can load your data into IGV.

Thanks
Tengfei

--
Tengfei Yin
MCDB PhD student
1620 Howe Hall, 2274,
Iowa State University
Ames, IA, 50011-2274
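The manual approach described in the reply above (give every read a fixed stepping level, then draw each mismatch on the level of the read it belongs to) can be sketched independently of any plotting package. The following is only an illustration under stated assumptions, not ggbio's implementation: the names `Read`, `assign_levels` and `place_mismatches` are hypothetical, coordinates are assumed to be 1-based and inclusive, and the greedy "lowest free level" rule is one common way to stack intervals. On the R side, `disjointBins()` in IRanges computes a comparable bin assignment; whether ggbio uses it internally is not stated in this thread.

```python
from dataclasses import dataclass


@dataclass
class Read:
    """Hypothetical minimal alignment record (1-based, inclusive coordinates)."""
    name: str
    start: int
    end: int


def assign_levels(reads):
    """Give each read a fixed stepping level so reads on one level never overlap.

    Greedy rule: visit reads by start position and put each one on the
    lowest level whose last read ends before this read starts.
    Returns a dict mapping read name -> level (0 is the bottom row).
    """
    level_end = []  # end position of the last read placed on each level
    levels = {}
    for read in sorted(reads, key=lambda r: (r.start, r.end)):
        for i, end in enumerate(level_end):
            if end < read.start:
                level_end[i] = read.end
                levels[read.name] = i
                break
        else:
            # No existing level is free, so open a new one on top.
            level_end.append(read.end)
            levels[read.name] = len(level_end) - 1
    return levels


def place_mismatches(mismatches, levels):
    """Turn (read_name, position, base) records into (x, level, base) points.

    Each mismatch inherits the level of its read, so it lands on that
    read's rectangle when both layers share the same axes.
    """
    return [(pos, levels[name], base) for name, pos, base in mismatches]


if __name__ == "__main__":
    reads = [Read("r1", 1, 10), Read("r2", 5, 14),
             Read("r3", 11, 20), Read("r4", 12, 22)]
    levels = assign_levels(reads)
    print(levels)
    print(place_mismatches([("r2", 7, "T"), ("r3", 15, "G")], levels))
```

With the levels fixed, mode (1) amounts to drawing one gray rectangle per read at its level and one colored segment per `(x, level, base)` point; mode (2) would draw the base as text at the same coordinates.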
Hi Tengfei,

What you describe covers what I have tried so far (overlapping tracks, really a hacky task). These modes would be very handy, especially in combination with the other capabilities of 'ggbio'. All three would be good to have; I would consider (1) the most useful for me at the moment. Incorporating more detail into the alignments (e.g. SNV information from a VCF file) would require linking the BAM alignments to the VCF records.

I am not aware of any R package with this functionality, but I see large potential for it. Currently I drive IGV in batch mode, but this is far from a good long-term solution.

Best wishes
Julian
(In reply to Julian Gehring's message of Thu, Feb 28, 2013, 2:31 PM.)

Thanks, Julian, for your feedback. I will keep your feature request in mind and will think about implementing something like different modes/types in the geom_alignment or stat_mismatch functions in ggbio.

Tengfei
(In reply to Tengfei Yin's message of 02/28/2013, 10:05 PM.)

Hi Tengfei,

Thanks a lot for taking this into account! I'm sure many users would find it useful.

Do you have any code drafts showing how to plot alignments and mismatches? I know this will be hard, but at the moment I would be willing to give it a try.

Best wishes
Julian