Distinguish Synonymous vs. Non-synonymous SNPs
Wu, Xiwei
Dear list,

I have a list of SNPs with chromosome locations, and I am trying to see which ones are non-synonymous. Are there any packages or functions that can distinguish synonymous from non-synonymous SNPs? I searched the list archives but could not find anything related. Thanks in advance.

Xiwei
Michael Dondrup
Hi Xiwei,

do you mean SNPs that result in non-synonymous vs. synonymous coding changes? If so, a BioMart query might do the job, and the package biomaRt could be used to run the query from within R. BioMart has a filter for the different consequence types of SNPs, one of which is NON_SYNONYMOUS_CODING. You can check which filter seems appropriate.

This lengthy URL represents a possible query in the BioMart web interface:

http://www.biomart.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=hsapiens_snp.default.snp.refsnp_id|hsapiens_snp.default.snp.chr_name|hsapiens_snp.default.snp.chrom_start|hsapiens_snp.default.snp.consequence_type_tv|hsapiens_snp.default.snp.ensembl_type|hsapiens_snp.default.snp.ensembl_peptide_shift|hsapiens_snp.default.snp.phenotype_description|hsapiens_snp.default.snp.phenotype_name&FILTERS=hsapiens_snp.default.filters.consequence_type."NON_SYNONYMOUS_CODING"&VISIBLEPANEL=resultspanel

It should be possible to set identical parameters in the Bioconductor package biomaRt, although I haven't tried this yet.

Best
Michael
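A rough sketch of what such a query might look like from within R, in case it helps. This was not part of the original reply and is untested against a live server. The function name `query_snp_consequences` is made up for illustration, and the mart, dataset, attribute and filter names are assumptions based on the current Ensembl variation mart, so please check them with `listAttributes()` and `listFilters()` before relying on them. Current Ensembl releases also report the consequence as a Sequence Ontology term such as `missense_variant` rather than NON_SYNONYMOUS_CODING. A query like this only returns variants that are already catalogued in the database.

```r
# Sketch only: look up Ensembl consequence types for SNPs in a region.
# Needs the Bioconductor package biomaRt and network access when called.
# ASSUMPTION: the attribute and filter names below match the Ensembl
# variation mart you connect to; verify with biomaRt::listAttributes()
# and biomaRt::listFilters() on the mart object.
query_snp_consequences <- function(chr, start, end) {
  snp_mart <- biomaRt::useEnsembl(biomart = "snps", dataset = "hsapiens_snp")
  biomaRt::getBM(
    attributes = c("refsnp_id", "chr_name", "chrom_start",
                   "consequence_type_tv"),
    filters    = c("chr_name", "start", "end"),
    values     = list(chr, start, end),
    mart       = snp_mart
  )
}

# Hypothetical usage (not run here; the coordinates are placeholders):
# res <- query_snp_consequences(8, 148350, 148420)
# res[res$consequence_type_tv == "missense_variant", ]
```

The query is wrapped in a function so that nothing contacts the server until it is called explicitly.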
Wu, Xiwei

Michael,

Thanks a lot for your help. I will give it a try. Does it also work for novel SNPs that are not in dbSNP? The SNPs I have come from a sequencing project, and many of them are not in dbSNP.

Xiwei
I'm not sure BioC is the tool you want to use. Have you tried something like BioPerl?

http://www.bioperl.org/Core/Latest/bioscripts.html#scripts_utilities_pairwise_kaks_pls

"Takes DNA sequences as input, aligns them as proteins, projects the alignment back into DNA and estimates the Ka (non-synonymous) and Ks (synonymous) substitutions."
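The synonymous/non-synonymous distinction that such counts rest on comes down to whether a base change alters the encoded amino acid. Below is a minimal base-R sketch of that comparison, not the BioPerl script's method. It assumes the standard genetic code and that you already know the reference codon and the SNP's position within it; working those out from genome coordinates and a gene model is not covered here. `genetic_code` and `classify_snp` are made-up names for illustration.

```r
# Build the standard genetic code (NCBI translation table 1) in base R.
# expand.grid varies its first column fastest, so the codons come out in
# TCAG order: TTT, TTC, TTA, TTG, TCT, ...
bases <- c("T", "C", "A", "G")
grid <- expand.grid(third = bases, second = bases, first = bases,
                    stringsAsFactors = FALSE)
codons <- paste0(grid$first, grid$second, grid$third)
amino_acids <- strsplit(
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
  "")[[1]]
genetic_code <- setNames(amino_acids, codons)

# Hypothetical helper: classify a single-base change within one codon.
# ref_codon:    reference codon on the coding strand, e.g. "GAA"
# pos_in_codon: 1, 2 or 3
# alt_base:     the observed alternative base, e.g. "G"
classify_snp <- function(ref_codon, pos_in_codon, alt_base) {
  alt_codon <- ref_codon
  substr(alt_codon, pos_in_codon, pos_in_codon) <- alt_base
  ref_aa <- genetic_code[[ref_codon]]
  alt_aa <- genetic_code[[alt_codon]]
  if (ref_aa == alt_aa) "synonymous" else "non-synonymous"
}

classify_snp("GAA", 3, "G")  # GAA (E) -> GAG (E)
classify_snp("GAA", 1, "A")  # GAA (E) -> AAA (K)
```

Because this works from the sequence alone, it does not depend on the SNP being present in dbSNP. A change that creates a stop codon is also reported as non-synonymous by this sketch.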
Wu, Xiwei

Subject: [BioC] FindOverlaps Problem

Dear all,

I found that the findOverlaps function does not work properly if the space levels do not match exactly between subject and query. Has anyone noticed the same problem? I am using R-2.10.0 and IRanges-1.4.0. Has this problem been fixed in the development version?

Thanks.

Xiwei
Michael Lawrence

Need input... What is the expected result, and what actually happened? Please also include your sessionInfo().
Wu, Xiwei

Michael,

Here is one example. Please let me know if I have missed anything. Thanks.

> a <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 9, 5, 6, 3, 7), sep=""))
> b <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 18, 5, 21, 3, "X"), sep=""))
> as.matrix(findOverlaps(a, a))
     query subject
[1,]     1       1
[2,]     2       2
[3,]     3       3
[4,]     4       4
[5,]     5       5
[6,]     6       6
[7,]     7       7
> as.matrix(findOverlaps(a, b))
     query subject
[1,]     1       1
[2,]     2       3
[3,]     3       5
[4,]     4       5
> a[4]
RangedData with 1 row and 0 value columns across 1 space
        space    ranges |
  <character> <IRanges> |
1        chr5   [1, 10] |
> b[5]
RangedData with 1 row and 0 value columns across 1 space
        space    ranges |
  <character> <IRanges> |
1        chr3   [1, 10] |
> sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=C
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] rtracklayer_1.6.0                  RCurl_1.3-1
[3] bitops_1.0-4.1                     BSgenome.Hsapiens.UCSC.hg18_1.3.15
[5] ShortRead_1.4.0                    lattice_0.17-26
[7] BSgenome_1.14.0                    Biostrings_2.14.0
[9] IRanges_1.4.0

loaded via a namespace (and not attached):
[1] Biobase_2.6.0 grid_2.10.0   hwriter_1.1   tools_2.10.0  XML_2.6-0

Xiwei
Vincent Carey

It seems to me that it is working correctly, but you can't assume that the order of the ranges at construction time is the order in the resulting object: the spaces are ordered lexicographically by name. That makes the result challenging to interpret, but if you look at the values of a and b before interpreting your findOverlaps result, it starts to make sense.

> a <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 9, 5, 6, 3, 7), sep=""))
> a
RangedData with 7 rows and 0 value columns across 7 spaces
        space    ranges |
  <character> <IRanges> |
1       chr10   [1, 10] |
2        chr2   [1, 10] |
3        chr3   [1, 10] |
4        chr5   [1, 10] |
5        chr6   [1, 10] |
6        chr7   [1, 10] |
7        chr9   [1, 10] |
> b <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 18, 5, 21, 3, "X"), sep=""))
> b
RangedData with 7 rows and 0 value columns across 7 spaces
        space    ranges |
  <character> <IRanges> |
1       chr10   [1, 10] |
2       chr18   [1, 10] |
3        chr2   [1, 10] |
4       chr21   [1, 10] |
5        chr3   [1, 10] |
6        chr5   [1, 10] |
7        chrX   [1, 10] |
> findOverlaps(a,b)
RangesMatchingList of length 7
names(7): chr10 chr2 chr3 chr5 chr6 chr7 chr9
> as.matrix(.Last.value)
     query subject
[1,]     1       1
[2,]     2       3
[3,]     3       5
[4,]     4       6
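The reordering described above can be reproduced without IRanges. The following is a small base-R sketch, not the package's actual implementation, that sorts the space names character by character and derives the index pairs one would expect. It relies on every space in this example holding the same single range [1, 10], so two rows overlap exactly when their space names are equal. The variable names are made up for illustration.

```r
# Space (chromosome) names in the order they were passed to the constructor.
a_input <- paste("chr", c(2, 10, 9, 5, 6, 3, 7), sep = "")
b_input <- paste("chr", c(2, 10, 18, 5, 21, 3, "X"), sep = "")

# Lexicographic ordering: "chr10" sorts before "chr2" because "1" < "2".
# method = "radix" sorts as in the C locale, independent of the session locale.
a_sorted <- sort(a_input, method = "radix")
b_sorted <- sort(b_input, method = "radix")

# All ranges are [1, 10] here, so rows overlap exactly when the names match.
shared <- intersect(a_sorted, b_sorted)
expected <- cbind(query   = match(shared, a_sorted),
                  subject = match(shared, b_sorted))

print(a_sorted)
print(b_sorted)
print(expected)
```

Row 4 of the query (chr5) pairs with row 6 of the subject (chr5), not with row 5 (chr3).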
Vincent Carey

Ah, my result was with devel, and in fact it does not agree with yours, but it seems correct in this instance.

> sessionInfo()
R version 2.11.0 Under development (unstable) (2010-01-07 r50940)
i386-apple-darwin9.8.0

locale:
[1] C

attached base packages:
[1] grid      stats     graphics  grDevices datasets  tools     utils
[8] methods   base

other attached packages:
 [1] GenomeGraphs_1.7.1  biomaRt_2.3.0       leeBamSet_0.0.8
 [4] Rsamtools_0.1.24    BSgenome_1.15.4     Biostrings_2.15.18
 [7] IRanges_1.5.31      org.Sc.sgd.db_2.3.5 RSQLite_0.7-3
[10] DBI_0.2-4           AnnotationDbi_1.9.0 Biobase_2.7.0
[13] weaver_1.13.0       codetools_0.2-2     digest_0.4.1

loaded via a namespace (and not attached):
[1] RCurl_1.3-0 XML_2.6-0
Michael Lawrence

On Sat, Jan 30, 2010 at 1:33 PM, Vincent Carey wrote:
> ah -- my result was with devel, and in fact it does not agree with
> yours, but it seems correct in this instance.

I think there was a bug in the as.matrix() method on the overlap result that was fixed relatively recently. This probably explains the difference. Thanks for looking into this, Vince.

Michael