rtracklayer export.wig variableStep problem
1
0
Entering edit mode
Vince Schulz ▴ 160
@vince-schulz-3553
Last seen 21 days ago
United States
Hi, I can no longer generate variableStep wig files in R2.14 (or 2.13 or 2.12) but it works in 2.11. Example below: library(IRanges) library(rtracklayer) dummy <- file() # dummy file connection for demo track2 <- RangedData(IRanges(start = c(1, 72, 74, 76, 77, 94,108,151,152), end = c(71, 73, 75, 76, 93, 107, 150, 151, 152)), score = c(3,4,5,8,9,10,11,14,15), space = c("chr1", "chr1", "chr1", "chr1", "chr1", "chr1","chr1","chr1","chr1")) wig <- export.wig(track2, dummy, dataFormat = "variableStep") Error in FUN(X[i], ...) : The span must be uniform for Wiggle export. Consider bedGraph or bigWig as alternatives. Note that is does work if the regions are all of the same size, but then what is the point of variableStep? Could this possibly be fixed in R2.15 (if it isn't already)? Another question, apologies that it is unrelated, probably obvious, and not bioconductor specific: How do I see code for S4 functions? What I tried below is unhelpfully recursive: getMethod(export.ucsc) Method Definition: function (object, con, subformat = c("auto", "gff1", "wig", "bed", "bed15", "bedGraph"), append = FALSE, ...) { cl <- class(object) track <- try(as(object, "RangedData"), silent = TRUE) if (class(track) == "try-error") { track <- try(as(object, "RangedDataList"), silent = TRUE) if (class(track) == "try-error") stop("cannot export object of class '", cl, "'") } export.ucsc(track, con = con, subformat = subformat, append = append, ...) } <environment: namespace:rtracklayer=""> Signatures: object con target "ANY" "ANY" defined "ANY" "ANY" Thanks, Vince sessionInfo() R version 2.14.2 (2012-02-29) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] IRanges_1.12.5 rtracklayer_1.14.4 RCurl_1.91-1 bitops_1.0-4.1 loaded via a namespace (and not attached): [1] Biostrings_2.22.0 BSgenome_1.22.0 GenomicRanges_1.6.7 [4] tools_2.14.2 XML_3.9-4 zlibbioc_1.0.0
• 2.0k views
ADD COMMENT
0
Entering edit mode
@michael-lawrence-3846
Last seen 2.9 years ago
United States
On Fri, Mar 23, 2012 at 8:59 AM, Vincent Schulz <vincent.schulz@yale.edu>wrote: > Hi, > > I can no longer generate variableStep wig files in R2.14 (or 2.13 or 2.12) > but it works in 2.11. Example below: > > library(IRanges) > library(rtracklayer) > dummy <- file() # dummy file connection for demo > track2 <- RangedData(IRanges(start = c(1, 72, 74, 76, 77, 94,108,151,152), > end = c(71, 73, 75, 76, 93, 107, 150, 151, 152)), > score = c(3,4,5,8,9,10,11,14,15), > space = c("chr1", "chr1", "chr1", "chr1", "chr1", > "chr1","chr1","chr1","chr1")) > wig <- export.wig(track2, dummy, dataFormat = "variableStep") > > Error in FUN(X[i], ...) : > The span must be uniform for Wiggle export. Consider bedGraph or bigWig > as alternatives. > > Note that is does work if the regions are all of the same size, but then > what is the point of variableStep? Could this possibly be fixed in R2.15 > (if it isn't already)? > > The variableStep format means that the separation of intervals is non-uniform. The width must be uniform. Back in R 2.11, rtracklayer was exporting invalid WIG files (by creating multiple blocks of different spans; UCSC told me this was not legal). > Another question, apologies that it is unrelated, probably obvious, and > not bioconductor specific: How do I see code for S4 functions? What I > tried below is unhelpfully recursive: > > getMethod(export.ucsc) > Method Definition: > > function (object, con, subformat = c("auto", "gff1", "wig", "bed", > "bed15", "bedGraph"), append = FALSE, ...) > { > cl <- class(object) > track <- try(as(object, "RangedData"), silent = TRUE) > if (class(track) == "try-error") { > track <- try(as(object, "RangedDataList"), silent = TRUE) > if (class(track) == "try-error") > stop("cannot export object of class '", cl, "'") > } > export.ucsc(track, con = con, subformat = subformat, append = append, > ...) > } > <environment: namespace:rtracklayer=""> > > Signatures: > object con > target "ANY" "ANY" > defined "ANY" "ANY" > > See ?selectMethod. > Thanks, > > Vince > > sessionInfo() > R version 2.14.2 (2012-02-29) > Platform: x86_64-pc-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] IRanges_1.12.5 rtracklayer_1.14.4 RCurl_1.91-1 bitops_1.0-4.1 > > loaded via a namespace (and not attached): > [1] Biostrings_2.22.0 BSgenome_1.22.0 GenomicRanges_1.6.7 > [4] tools_2.14.2 XML_3.9-4 zlibbioc_1.0.0 > > ______________________________**_________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.et="" hz.ch="" mailman="" listinfo="" bioconductor=""> > Search the archives: http://news.gmane.org/gmane.** > science.biology.informatics.**conductor<http: news.gmane.org="" gmane.="" science.biology.informatics.conductor=""> > [[alternative HTML version deleted]]
ADD COMMENT
0
Entering edit mode
hey, speaking of which -- I always have to convert vstep WIG files to bedGraph (e.g. for Cui & Zhao's CD133/CD36 ChIPseq results) in order to import them. Is that pretty much the norm or is something wrong? I've tried renaming them, etc. but conversion to bedGraph gives me better results. Also, I wrote a function to directly import MACS output, does anyone want it in rtracklayer? On Fri, Mar 23, 2012 at 10:20 AM, Michael Lawrence < lawrence.michael@gene.com> wrote: > On Fri, Mar 23, 2012 at 8:59 AM, Vincent Schulz <vincent.schulz@yale.edu> >wrote: > > > Hi, > > > > I can no longer generate variableStep wig files in R2.14 (or 2.13 or > 2.12) > > but it works in 2.11. Example below: > > > > library(IRanges) > > library(rtracklayer) > > dummy <- file() # dummy file connection for demo > > track2 <- RangedData(IRanges(start = c(1, 72, 74, 76, 77, > 94,108,151,152), > > end = c(71, 73, 75, 76, 93, 107, 150, 151, 152)), > > score = c(3,4,5,8,9,10,11,14,15), > > space = c("chr1", "chr1", "chr1", "chr1", "chr1", > > "chr1","chr1","chr1","chr1")) > > wig <- export.wig(track2, dummy, dataFormat = "variableStep") > > > > Error in FUN(X[i], ...) : > > The span must be uniform for Wiggle export. Consider bedGraph or bigWig > > as alternatives. > > > > Note that is does work if the regions are all of the same size, but then > > what is the point of variableStep? Could this possibly be fixed in R2.15 > > (if it isn't already)? > > > > > The variableStep format means that the separation of intervals is > non-uniform. The width must be uniform. Back in R 2.11, rtracklayer was > exporting invalid WIG files (by creating multiple blocks of different > spans; UCSC told me this was not legal). > > > > Another question, apologies that it is unrelated, probably obvious, and > > not bioconductor specific: How do I see code for S4 functions? What I > > tried below is unhelpfully recursive: > > > > getMethod(export.ucsc) > > Method Definition: > > > > function (object, con, subformat = c("auto", "gff1", "wig", "bed", > > "bed15", "bedGraph"), append = FALSE, ...) > > { > > cl <- class(object) > > track <- try(as(object, "RangedData"), silent = TRUE) > > if (class(track) == "try-error") { > > track <- try(as(object, "RangedDataList"), silent = TRUE) > > if (class(track) == "try-error") > > stop("cannot export object of class '", cl, "'") > > } > > export.ucsc(track, con = con, subformat = subformat, append = append, > > ...) > > } > > <environment: namespace:rtracklayer=""> > > > > Signatures: > > object con > > target "ANY" "ANY" > > defined "ANY" "ANY" > > > > > See ?selectMethod. > > > > Thanks, > > > > Vince > > > > sessionInfo() > > R version 2.14.2 (2012-02-29) > > Platform: x86_64-pc-linux-gnu (64-bit) > > > > locale: > > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > > [7] LC_PAPER=C LC_NAME=C > > [9] LC_ADDRESS=C LC_TELEPHONE=C > > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > > > attached base packages: > > [1] stats graphics grDevices utils datasets methods base > > > > other attached packages: > > [1] IRanges_1.12.5 rtracklayer_1.14.4 RCurl_1.91-1 > bitops_1.0-4.1 > > > > loaded via a namespace (and not attached): > > [1] Biostrings_2.22.0 BSgenome_1.22.0 GenomicRanges_1.6.7 > > [4] tools_2.14.2 XML_3.9-4 zlibbioc_1.0.0 > > > > ______________________________**_________________ > > Bioconductor mailing list > > Bioconductor@r-project.org > > https://stat.ethz.ch/mailman/**listinfo/bioconductor< > https://stat.ethz.ch/mailman/listinfo/bioconductor> > > Search the archives: http://news.gmane.org/gmane.** > > science.biology.informatics.**conductor< > http://news.gmane.org/gmane.science.biology.informatics.conductor> > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > -- *A model is a lie that helps you see the truth.* * * Howard Skipper<http: cancerres.aacrjournals.org="" content="" 31="" 9="" 1173.full.pdf=""> [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
On Fri, Mar 23, 2012 at 10:34 AM, Tim Triche, Jr. <tim.triche@gmail.com>wrote: > hey, speaking of which -- I always have to convert vstep WIG files to > bedGraph (e.g. for Cui & Zhao's CD133/CD36 ChIPseq results) in order to > import them. > > Is that pretty much the norm or is something wrong? I've tried renaming > them, etc. but conversion to bedGraph gives me better results. > > Would you please provide a little more detail? What goes wrong? > Also, I wrote a function to directly import MACS output, does anyone want > it in rtracklayer? > > How widely used is MACS? I am not sure that I can commit to maintain parsers for every tool's custom format. If it is becoming a standard for representing ChIP-seq peak calls, then I'll consider it. Thanks, Michael > > On Fri, Mar 23, 2012 at 10:20 AM, Michael Lawrence < > lawrence.michael@gene.com> wrote: > >> On Fri, Mar 23, 2012 at 8:59 AM, Vincent Schulz <vincent.schulz@yale.edu>> >wrote: >> >> > Hi, >> > >> > I can no longer generate variableStep wig files in R2.14 (or 2.13 or >> 2.12) >> > but it works in 2.11. Example below: >> > >> > library(IRanges) >> > library(rtracklayer) >> > dummy <- file() # dummy file connection for demo >> > track2 <- RangedData(IRanges(start = c(1, 72, 74, 76, 77, >> 94,108,151,152), >> > end = c(71, 73, 75, 76, 93, 107, 150, 151, 152)), >> > score = c(3,4,5,8,9,10,11,14,15), >> > space = c("chr1", "chr1", "chr1", "chr1", "chr1", >> > "chr1","chr1","chr1","chr1")) >> > wig <- export.wig(track2, dummy, dataFormat = "variableStep") >> > >> > Error in FUN(X[i], ...) : >> > The span must be uniform for Wiggle export. Consider bedGraph or bigWig >> > as alternatives. >> > >> > Note that is does work if the regions are all of the same size, but then >> > what is the point of variableStep? Could this possibly be fixed in >> R2.15 >> > (if it isn't already)? >> > >> > >> The variableStep format means that the separation of intervals is >> non-uniform. The width must be uniform. Back in R 2.11, rtracklayer was >> exporting invalid WIG files (by creating multiple blocks of different >> spans; UCSC told me this was not legal). >> >> >> > Another question, apologies that it is unrelated, probably obvious, and >> > not bioconductor specific: How do I see code for S4 functions? What I >> > tried below is unhelpfully recursive: >> > >> > getMethod(export.ucsc) >> > Method Definition: >> > >> > function (object, con, subformat = c("auto", "gff1", "wig", "bed", >> > "bed15", "bedGraph"), append = FALSE, ...) >> > { >> > cl <- class(object) >> > track <- try(as(object, "RangedData"), silent = TRUE) >> > if (class(track) == "try-error") { >> > track <- try(as(object, "RangedDataList"), silent = TRUE) >> > if (class(track) == "try-error") >> > stop("cannot export object of class '", cl, "'") >> > } >> > export.ucsc(track, con = con, subformat = subformat, append = append, >> > ...) >> > } >> > <environment: namespace:rtracklayer=""> >> > >> > Signatures: >> > object con >> > target "ANY" "ANY" >> > defined "ANY" "ANY" >> > >> > >> See ?selectMethod. >> >> >> > Thanks, >> > >> > Vince >> > >> > sessionInfo() >> > R version 2.14.2 (2012-02-29) >> > Platform: x86_64-pc-linux-gnu (64-bit) >> > >> > locale: >> > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >> > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 >> > [7] LC_PAPER=C LC_NAME=C >> > [9] LC_ADDRESS=C LC_TELEPHONE=C >> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >> > >> > attached base packages: >> > [1] stats graphics grDevices utils datasets methods base >> > >> > other attached packages: >> > [1] IRanges_1.12.5 rtracklayer_1.14.4 RCurl_1.91-1 >> bitops_1.0-4.1 >> > >> > loaded via a namespace (and not attached): >> > [1] Biostrings_2.22.0 BSgenome_1.22.0 GenomicRanges_1.6.7 >> > [4] tools_2.14.2 XML_3.9-4 zlibbioc_1.0.0 >> > >> > ______________________________**_________________ >> > Bioconductor mailing list >> > Bioconductor@r-project.org >> > https://stat.ethz.ch/mailman/**listinfo/bioconductor< >> https://stat.ethz.ch/mailman/listinfo/bioconductor> >> > Search the archives: http://news.gmane.org/gmane.** >> > science.biology.informatics.**conductor< >> http://news.gmane.org/gmane.science.biology.informatics.conductor> >> > >> >> [[alternative HTML version deleted]] >> >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > > > -- > *A model is a lie that helps you see the truth.* > * > * > Howard Skipper<http: cancerres.aacrjournals.org="" content="" 31="" 9="" 1173.full.pdf=""> > > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
On Sat, Mar 24, 2012 at 5:40 AM, Michael Lawrence <lawrence.michael@gene.com> wrote: > > Would you please provide a little more detail? What goes wrong? > Hmm. Have you changed something recently? It used to grind away for a long time and then return an empty RangedData, but now if I symlink: R> foo = unlist(import('CD36.H3K4me3.vstep.wig', asRangedData=FALSE)) R> foo GRanges with 668730 ranges and 1 elementMetadata col: seqnames ranges strand | score <rle> <iranges> <rle> | <numeric> R Track chr1 [ 55801, 56000] * | 1 R Track1 chr1 [521001, 521200] * | 1 R Track2 chr1 [554401, 554600] * | 1 R Track3 chr1 [559801, 560000] * | 1 R Track4 chr1 [702601, 702800] * | 1 R Track5 chr1 [703001, 703200] * | 1 R Track6 chr1 [703201, 703400] * | 3 R Track7 chr1 [703401, 703600] * | 3 R Track8 chr1 [703601, 703800] * | 23 ... ... ... ... ... ... Well that's pretty cool! I don't know if you made this change recently but if so, thanks! > Also, I wrote a function to directly import MACS output, does anyone want >> it in rtracklayer? >> >> > How widely used is MACS? I am not sure that I can commit to maintain > parsers for every tool's custom format. If it is becoming a standard for > representing ChIP-seq peak calls, then I'll consider it. > Pretty widely used. The Farnham lab (whom we work with) prefers SoleSearch, and Brad Bernstein's lab usually seems to like their own (i.e. Kellis') methods, but a lot of others (e.g. Wysocka, Liu, Rao, sometimes Zhao, etc.) start out with, or just use, MACS output and put it up on GEO. Anyways, here's the code. Needless to say it isn't quite as optimized as rtracklayer I/O, but so it goes. importMACS <- function(filename, genome='hg18') { macs = read.table(filename, sep="\t", header=TRUE, stringsAsFactors=FALSE) names(macs)[ ncol(macs) ] = 'FDR' names(macs)[ ncol(macs)-3 ] = 'score' GR = df2GR(macs[, c('chr','start','end')]) elementMetadata(GR)$score = macs$score if(!is.null(genome)) genome(GR) = genome return(GR) } This is one of a number of odds and ends in the 'cRank' package that I've been working on (since someone else used 'methLab' for their crappy GUI). > > Thanks, > Michael > Thank you, --t -- *A model is a lie that helps you see the truth.* * * Howard Skipper<http: cancerres.aacrjournals.org="" content="" 31="" 9="" 1173.full.pdf=""> [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
On Sun, Mar 25, 2012 at 10:25 AM, Tim Triche, Jr. <tim.triche@gmail.com>wrote: > On Sat, Mar 24, 2012 at 5:40 AM, Michael Lawrence < > lawrence.michael@gene.com> wrote: >> >> Would you please provide a little more detail? What goes wrong? >> > > Hmm. Have you changed something recently? It used to grind away for a > long time and then return an empty RangedData, but now if I symlink: > > R> foo = unlist(import('CD36.H3K4me3.vstep.wig', asRangedData=FALSE)) > R> foo > GRanges with 668730 ranges and 1 elementMetadata col: > seqnames ranges strand | score > <rle> <iranges> <rle> | <numeric> > R Track chr1 [ 55801, 56000] * | 1 > R Track1 chr1 [521001, 521200] * | 1 > R Track2 chr1 [554401, 554600] * | 1 > R Track3 chr1 [559801, 560000] * | 1 > R Track4 chr1 [702601, 702800] * | 1 > R Track5 chr1 [703001, 703200] * | 1 > R Track6 chr1 [703201, 703400] * | 3 > R Track7 chr1 [703401, 703600] * | 3 > R Track8 chr1 [703601, 703800] * | 23 > ... ... ... ... ... ... > > Well that's pretty cool! I don't know if you made this change recently > but if so, thanks! > > I recently checked in a major overall of rtracklayer I/O, with a full suite of unit tests. Looks like it was an improvement. > > > >> Also, I wrote a function to directly import MACS output, does anyone want >>> it in rtracklayer? >>> >>> >> How widely used is MACS? I am not sure that I can commit to maintain >> parsers for every tool's custom format. If it is becoming a standard for >> representing ChIP-seq peak calls, then I'll consider it. >> > > Pretty widely used. The Farnham lab (whom we work with) prefers > SoleSearch, and Brad Bernstein's lab usually seems to like their own (i.e. > Kellis') methods, but a lot of others (e.g. Wysocka, Liu, Rao, sometimes > Zhao, etc.) start out with, or just use, MACS output and put it up on GEO. > > Anyways, here's the code. Needless to say it isn't quite as optimized as > rtracklayer I/O, but so it goes. > > importMACS <- function(filename, genome='hg18') { > macs = read.table(filename, sep="\t", header=TRUE, > stringsAsFactors=FALSE) > names(macs)[ ncol(macs) ] = 'FDR' > names(macs)[ ncol(macs)-3 ] = 'score' > GR = df2GR(macs[, c('chr','start','end')]) > elementMetadata(GR)$score = macs$score > if(!is.null(genome)) genome(GR) = genome > return(GR) > } > > So looking at this page: http://liulab.dfci.harvard.edu/MACS/00README.html There are a lot of different outputs. It sounds like you are dealing with #1, which at least originally an XLS (maybe you are converting this to a tab separated file?). Or is it something else? It sounds like #1 might be a BED3 + N format file. > This is one of a number of odds and ends in the 'cRank' package that I've > been working on (since someone else used 'methLab' for their crappy GUI). > > > >> >> Thanks, >> Michael >> > > Thank you, > > --t > > > > -- > *A model is a lie that helps you see the truth.* > * > * > Howard Skipper<http: cancerres.aacrjournals.org="" content="" 31="" 9="" 1173.full.pdf=""> > > [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 756 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6