writeVcf bug
1
0
Entering edit mode
@richard-pearson-5255
Last seen 10.2 years ago
Hi I've recently upgraded to R 3.0.0, but am still having a (now slightly different) problem with writeVcf. When there is only one element in geno(vcf), each variant gets a separate row for each sample. For example, in the following code, out1.vcf has 5 variants, each with genotypes for 3 samples, but out2.vcf has 15 variants, each with genotypes for just one sample. library(VariantAnnotation) fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation") in1 <- readVcf(fl, "hg19") writeVcf(in1, "out1.vcf") in2 <- readVcf(fl, "hg19", ScanVcfParam(geno="GT")) writeVcf(in2, "out2.vcf") > sessionInfo() R version 3.0.0 (2013-04-03) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] VariantAnnotation_1.6.2 Rsamtools_1.12.1 Biostrings_2.28.0 GenomicRanges_1.12.2 IRanges_1.18.0 BiocGenerics_0.6.0 devtools_1.2 BiocInstaller_1.10.0 loaded via a namespace (and not attached): [1] AnnotationDbi_1.22.1 Biobase_2.20.0 biomaRt_2.16.0 bitops_1.0-5 BSgenome_1.28.0 DBI_0.2-5 digest_0.6.3 evaluate_0.4.3 GenomicFeatures_1.12.0 httr_0.2 [11] memoise_0.1 RCurl_1.95-4.1 RSQLite_0.11.3 rtracklayer_1.20.1 stats4_3.0.0 stringr_0.6.2 tools_3.0.0 whisker_0.1 XML_3.96-1.1 zlibbioc_1.6.0 Cheers Richard On 29/11/2012 17:36, Richard Pearson wrote: > Hi Michael > > My devel sessionInfo is below. It seems biocLite has given me version > 1.5.17of the package, but I see that .makeVcfGenolooks different in the > latest from SVN (1.5.19) - so I'm sure you're right and I just need to > wait for things to filter through. Apologies for the noise. > > Every time I upgrade VariantAnnotation it is an even better package than > before - many thanks for all the hard work! > > Richard > > > > sessionInfo() > R Under development (unstable) (2012-11-27 r61172) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 > LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 > LC_MESSAGES=en_GB.UTF-8 LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods base > > other attached packages: > [1] VariantAnnotation_1.5.17 Rsamtools_1.11.11 Biostrings_2.27.7 > GenomicRanges_1.11.15 IRanges_1.17.15 BiocGenerics_0.5.1 > devtools_0.8 BiocInstaller_1.9.4 > > loaded via a namespace (and not attached): > [1] AnnotationDbi_1.21.7 Biobase_2.19.1 biomaRt_2.15.0 > bitops_1.0-5 BSgenome_1.27.1 DBI_0.2-5 digest_0.6.0 > evaluate_0.4.2 GenomicFeatures_1.11.5 httr_0.2 > [11] memoise_0.1 plyr_1.7.1 RCurl_1.95-3 > RSQLite_0.11.2 rtracklayer_1.19.6 stats4_2.16.0 > stringr_0.6.1 tools_2.16.0 whisker_0.1 XML_3.95-0.1 > [21] zlibbioc_1.5.0 > > > On 29/11/2012 17:22, Michael Lawrence wrote: >> I am pretty sure this bug has been fixed in devel. Would you mind >> pasting your devel session info? >> >> Thanks, >> Michael >> >> >> On Thu, Nov 29, 2012 at 8:44 AM, Richard Pearson >> <rpearson at="" well.ox.ac.uk="" <mailto:rpearson="" at="" well.ox.ac.uk="">> wrote: >> >> Hi >> >> I've noticed a bug with writeVcf when trying to write out >> relatively simple vcf files, e.g. a file containing only GT in the >> FORMAT fields. The following is a (hopefully) reproducible example >> based on examples in ?writeVcf >> >> library(VariantAnnotation) >> fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation") >> out1.vcf <- tempfile() >> out2.vcf <- tempfile() >> in1 <- readVcf(fl, "hg19") >> writeVcf(in1, out1.vcf) >> in2 <- readVcf(fl, "hg19", ScanVcfParam(geno="GT")) >> writeVcf(in2, out2.vcf) >> >> This last line gives me: >> >> Error in row(genoMat) : >> a matrix-like object is required as argument to 'row' >> >> Looking at the following code snippet from .makeVcfGeno, it seems >> genoMat is first getting created as a matrix, but then changed to >> something that is not a matrix by "genoMat <- genoMatFlat". The >> error then comes from the "row(genoMat)" on the last line shown. I >> don't really understand what is going on in this code so can't >> offer a patch I'm afraid. >> >> genoMat <- matrix(unlist(as.list(geno), use.names = FALSE, >> recursive = FALSE), nsub * nrec, length(geno)) >> genoMatFlat <- as.character(unlist(genoMat)) >> genoMatFlat[is.na <http: is.na="">(genoMatFlat)] <- "." >> if (is.list(genoMat)) { >> genoMatList <- relist(genoMatFlat, PartitioningByEnd(genoMat)) >> genoMatFlat <- .pasteCollapse(genoMatList, ",") >> genoMat <- matrix(genoMatFlat, nrow(genoMat), ncol(genoMat)) >> } >> else genoMat <- genoMatFlat >> formatMatPerSub <- matrix(rep(t(formatMat), nsub), nsub * >> nrec, length(geno), byrow = TRUE) >> keep <- !is.na <http: is.na="">(formatMatPerSub) >> genoListBySub <- seqsplit(genoMat[keep], row(genoMat)[keep]) >> >> FWIW, as a temporary fix I made an extra, slightly more complex >> FORMAT field using the following: >> >> geno(in2)[["DUMMY"]] <- >> matrix(sapply(as.vector(geno(in2)[["GT"]]), function(x) list(c(x, >> x))), ncol=ncol(geno(in2)[["GT"]]), >> dimnames=dimnames(geno(in2)[["GT"]])) >> writeVcf(in2, out2.vcf) >> >> I've checked and it seems this bug is also present in latest >> devel. Any chance of a fix? >> >> Thanks >> >> Richard >> >> > sessionInfo() >> R version 2.15.1 (2012-06-22) >> Platform: x86_64-unknown-linux-gnu (64-bit) >> >> locale: >> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 >> LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 >> LC_MESSAGES=en_GB.UTF-8 LC_PAPER=C LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] VariantAnnotation_1.4.5 Rsamtools_1.10.2 Biostrings_2.26.2 >> GenomicRanges_1.10.5 IRanges_1.16.4 BiocGenerics_0.4.0 >> devtools_0.8 BiocInstaller_1.8.3 >> >> loaded via a namespace (and not attached): >> [1] AnnotationDbi_1.20.3 Biobase_2.18.0 biomaRt_2.14.0 >> bitops_1.0-5 BSgenome_1.26.1 DBI_0.2-5 digest_0.6.0 >> evaluate_0.4.2 GenomicFeatures_1.10.1 httr_0.2 >> [11] memoise_0.1 parallel_2.15.1 plyr_1.7.1 >> RCurl_1.95-3 RSQLite_0.11.2 rtracklayer_1.18.1 >> stats4_2.15.1 stringr_0.6.1 tools_2.15.1 whisker_0.1 >> [21] XML_3.95-0.1 zlibbioc_1.4.0 >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > >
VariantAnnotation VariantAnnotation VariantAnnotation VariantAnnotation • 1.4k views
ADD COMMENT
0
Entering edit mode
@valerie-obenchain-4275
Last seen 2.9 years ago
United States
Hi Richard, Thanks for reporting this. I've gone back and completely rewritten the code that parses/packages the geno data for writeVcf(). I didn't do a good job with this the first time around and it's now much cleaner and streamlined. Patched versions are 1.6.3 and 1.7.8 and should be available via biocLite() tomorrow (Friday) after ~9am PST. You mentioned you're 'still' having problems with writeVcf(). Are there other problems that haven't been addressed (fixed) yet? Valerie On 04/18/2013 03:55 AM, Richard Pearson wrote: > Hi > > I've recently upgraded to R 3.0.0, but am still having a (now slightly > different) problem with writeVcf. When there is only one element in > geno(vcf), each variant gets a separate row for each sample. For > example, in the following code, out1.vcf has 5 variants, each with > genotypes for 3 samples, but out2.vcf has 15 variants, each with > genotypes for just one sample. > > library(VariantAnnotation) > fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation") > in1 <- readVcf(fl, "hg19") > writeVcf(in1, "out1.vcf") > in2 <- readVcf(fl, "hg19", ScanVcfParam(geno="GT")) > writeVcf(in2, "out2.vcf") > > > sessionInfo() > R version 3.0.0 (2013-04-03) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 > LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 > LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods base > > other attached packages: > [1] VariantAnnotation_1.6.2 Rsamtools_1.12.1 Biostrings_2.28.0 > GenomicRanges_1.12.2 IRanges_1.18.0 BiocGenerics_0.6.0 > devtools_1.2 BiocInstaller_1.10.0 > > loaded via a namespace (and not attached): > [1] AnnotationDbi_1.22.1 Biobase_2.20.0 biomaRt_2.16.0 > bitops_1.0-5 BSgenome_1.28.0 DBI_0.2-5 digest_0.6.3 > evaluate_0.4.3 GenomicFeatures_1.12.0 httr_0.2 > [11] memoise_0.1 RCurl_1.95-4.1 RSQLite_0.11.3 > rtracklayer_1.20.1 stats4_3.0.0 stringr_0.6.2 tools_3.0.0 > whisker_0.1 XML_3.96-1.1 zlibbioc_1.6.0 > > Cheers > > Richard > > > On 29/11/2012 17:36, Richard Pearson wrote: >> Hi Michael >> >> My devel sessionInfo is below. It seems biocLite has given me version >> 1.5.17of the package, but I see that .makeVcfGenolooks different in the >> latest from SVN (1.5.19) - so I'm sure you're right and I just need to >> wait for things to filter through. Apologies for the noise. >> >> Every time I upgrade VariantAnnotation it is an even better package than >> before - many thanks for all the hard work! >> >> Richard >> >> >> > sessionInfo() >> R Under development (unstable) (2012-11-27 r61172) >> Platform: x86_64-unknown-linux-gnu (64-bit) >> >> locale: >> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 >> LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 >> LC_MESSAGES=en_GB.UTF-8 LC_PAPER=C LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] parallel stats graphics grDevices utils datasets methods >> base >> >> other attached packages: >> [1] VariantAnnotation_1.5.17 Rsamtools_1.11.11 Biostrings_2.27.7 >> GenomicRanges_1.11.15 IRanges_1.17.15 BiocGenerics_0.5.1 >> devtools_0.8 BiocInstaller_1.9.4 >> >> loaded via a namespace (and not attached): >> [1] AnnotationDbi_1.21.7 Biobase_2.19.1 biomaRt_2.15.0 >> bitops_1.0-5 BSgenome_1.27.1 DBI_0.2-5 digest_0.6.0 >> evaluate_0.4.2 GenomicFeatures_1.11.5 httr_0.2 >> [11] memoise_0.1 plyr_1.7.1 RCurl_1.95-3 >> RSQLite_0.11.2 rtracklayer_1.19.6 stats4_2.16.0 >> stringr_0.6.1 tools_2.16.0 whisker_0.1 XML_3.95-0.1 >> [21] zlibbioc_1.5.0 >> >> >> On 29/11/2012 17:22, Michael Lawrence wrote: >>> I am pretty sure this bug has been fixed in devel. Would you mind >>> pasting your devel session info? >>> >>> Thanks, >>> Michael >>> >>> >>> On Thu, Nov 29, 2012 at 8:44 AM, Richard Pearson >>> <rpearson at="" well.ox.ac.uk="" <mailto:rpearson="" at="" well.ox.ac.uk="">> wrote: >>> >>> Hi >>> >>> I've noticed a bug with writeVcf when trying to write out >>> relatively simple vcf files, e.g. a file containing only GT in the >>> FORMAT fields. The following is a (hopefully) reproducible example >>> based on examples in ?writeVcf >>> >>> library(VariantAnnotation) >>> fl <- system.file("extdata", "ex2.vcf", >>> package="VariantAnnotation") >>> out1.vcf <- tempfile() >>> out2.vcf <- tempfile() >>> in1 <- readVcf(fl, "hg19") >>> writeVcf(in1, out1.vcf) >>> in2 <- readVcf(fl, "hg19", ScanVcfParam(geno="GT")) >>> writeVcf(in2, out2.vcf) >>> >>> This last line gives me: >>> >>> Error in row(genoMat) : >>> a matrix-like object is required as argument to 'row' >>> >>> Looking at the following code snippet from .makeVcfGeno, it seems >>> genoMat is first getting created as a matrix, but then changed to >>> something that is not a matrix by "genoMat <- genoMatFlat". The >>> error then comes from the "row(genoMat)" on the last line shown. I >>> don't really understand what is going on in this code so can't >>> offer a patch I'm afraid. >>> >>> genoMat <- matrix(unlist(as.list(geno), use.names = FALSE, >>> recursive = FALSE), nsub * nrec, length(geno)) >>> genoMatFlat <- as.character(unlist(genoMat)) >>> genoMatFlat[is.na <http: is.na="">(genoMatFlat)] <- "." >>> if (is.list(genoMat)) { >>> genoMatList <- relist(genoMatFlat, >>> PartitioningByEnd(genoMat)) >>> genoMatFlat <- .pasteCollapse(genoMatList, ",") >>> genoMat <- matrix(genoMatFlat, nrow(genoMat), >>> ncol(genoMat)) >>> } >>> else genoMat <- genoMatFlat >>> formatMatPerSub <- matrix(rep(t(formatMat), nsub), nsub * >>> nrec, length(geno), byrow = TRUE) >>> keep <- !is.na <http: is.na="">(formatMatPerSub) >>> genoListBySub <- seqsplit(genoMat[keep], row(genoMat)[keep]) >>> >>> FWIW, as a temporary fix I made an extra, slightly more complex >>> FORMAT field using the following: >>> >>> geno(in2)[["DUMMY"]] <- >>> matrix(sapply(as.vector(geno(in2)[["GT"]]), function(x) list(c(x, >>> x))), ncol=ncol(geno(in2)[["GT"]]), >>> dimnames=dimnames(geno(in2)[["GT"]])) >>> writeVcf(in2, out2.vcf) >>> >>> I've checked and it seems this bug is also present in latest >>> devel. Any chance of a fix? >>> >>> Thanks >>> >>> Richard >>> >>> > sessionInfo() >>> R version 2.15.1 (2012-06-22) >>> Platform: x86_64-unknown-linux-gnu (64-bit) >>> >>> locale: >>> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 >>> LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 >>> LC_MESSAGES=en_GB.UTF-8 LC_PAPER=C LC_NAME=C >>> [9] LC_ADDRESS=C LC_TELEPHONE=C >>> LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C >>> >>> attached base packages: >>> [1] stats graphics grDevices utils datasets methods base >>> >>> other attached packages: >>> [1] VariantAnnotation_1.4.5 Rsamtools_1.10.2 Biostrings_2.26.2 >>> GenomicRanges_1.10.5 IRanges_1.16.4 BiocGenerics_0.4.0 >>> devtools_0.8 BiocInstaller_1.8.3 >>> >>> loaded via a namespace (and not attached): >>> [1] AnnotationDbi_1.20.3 Biobase_2.18.0 biomaRt_2.14.0 >>> bitops_1.0-5 BSgenome_1.26.1 DBI_0.2-5 digest_0.6.0 >>> evaluate_0.4.2 GenomicFeatures_1.10.1 httr_0.2 >>> [11] memoise_0.1 parallel_2.15.1 plyr_1.7.1 >>> RCurl_1.95-3 RSQLite_0.11.2 rtracklayer_1.18.1 >>> stats4_2.15.1 stringr_0.6.1 tools_2.15.1 whisker_0.1 >>> [21] XML_3.95-0.1 zlibbioc_1.4.0 >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >>> >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
Postdoctoral Research Position - Phylogenetic analysis of gene expression data A postdoctoral position is available in the Center for Computational and Molecular Biology at Brown University in Providence, Rhode Island. The successful candidate will develop new statistical methods and computational tools for the phylogenetic analysis of gene expression data, and will work collaboratively with the labs of Casey Dunn (http://dunnlab.org), Jean Wu (http://www.stat.brown.edu/zwu/), and Rossi Luo (https://sites.google.com/site/xirossiluo/). Research will focus on extending and implementing the ideas presented in the preprint available at: http://arxiv.org/abs/1302.2978 Funding is available for one year, with the possibility of additional funding for subsequent years. The position is available immediately. Applicants should hold a PhD in Statistics/Biostatistics, Math, Computer Science, Evolutionary Biology or related fields. Prior experience with the analysis of RNA-seq data is preferred. The candidate should have a strong background in statistics, and at least basic computational experience (eg Python, R, Unix, and git skills). Please apply be sending a cover letter, CV, and contact information for three references to Casey Dunn (casey_dunn at brown.edu). General inquiries are welcome. The position is available immediately.
ADD REPLY
0
Entering edit mode
Hi Valerie That's great, thanks. Will check out the new code when available. Sorry, I worded this poorly. I simply meant that the problem I had previously seen with the example code ("Error in row(genoMat)") had gone away, but I had noticed a new problem with this same code. There are no other problems that I am aware of that haven't been addressed. Best wishes Richard On 18/04/2013 23:03, Valerie Obenchain wrote: > Hi Richard, > > Thanks for reporting this. I've gone back and completely rewritten the > code that parses/packages the geno data for writeVcf(). I didn't do a > good job with this the first time around and it's now much cleaner and > streamlined. Patched versions are 1.6.3 and 1.7.8 and should be > available via biocLite() tomorrow (Friday) after ~9am PST. > > You mentioned you're 'still' having problems with writeVcf(). Are > there other problems that haven't been addressed (fixed) yet? > > Valerie > > On 04/18/2013 03:55 AM, Richard Pearson wrote: >> Hi >> >> I've recently upgraded to R 3.0.0, but am still having a (now slightly >> different) problem with writeVcf. When there is only one element in >> geno(vcf), each variant gets a separate row for each sample. For >> example, in the following code, out1.vcf has 5 variants, each with >> genotypes for 3 samples, but out2.vcf has 15 variants, each with >> genotypes for just one sample. >> >> library(VariantAnnotation) >> fl <- system.file("extdata", "ex2.vcf", >> package="VariantAnnotation") >> in1 <- readVcf(fl, "hg19") >> writeVcf(in1, "out1.vcf") >> in2 <- readVcf(fl, "hg19", ScanVcfParam(geno="GT")) >> writeVcf(in2, "out2.vcf") >> >> > sessionInfo() >> R version 3.0.0 (2013-04-03) >> Platform: x86_64-unknown-linux-gnu (64-bit) >> >> locale: >> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 >> LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 >> LC_PAPER=C LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] parallel stats graphics grDevices utils datasets >> methods base >> >> other attached packages: >> [1] VariantAnnotation_1.6.2 Rsamtools_1.12.1 Biostrings_2.28.0 >> GenomicRanges_1.12.2 IRanges_1.18.0 BiocGenerics_0.6.0 >> devtools_1.2 BiocInstaller_1.10.0 >> >> loaded via a namespace (and not attached): >> [1] AnnotationDbi_1.22.1 Biobase_2.20.0 biomaRt_2.16.0 >> bitops_1.0-5 BSgenome_1.28.0 DBI_0.2-5 digest_0.6.3 >> evaluate_0.4.3 GenomicFeatures_1.12.0 httr_0.2 >> [11] memoise_0.1 RCurl_1.95-4.1 RSQLite_0.11.3 >> rtracklayer_1.20.1 stats4_3.0.0 stringr_0.6.2 tools_3.0.0 >> whisker_0.1 XML_3.96-1.1 zlibbioc_1.6.0 >> >> Cheers >> >> Richard >> >> >> On 29/11/2012 17:36, Richard Pearson wrote: >>> Hi Michael >>> >>> My devel sessionInfo is below. It seems biocLite has given me version >>> 1.5.17of the package, but I see that .makeVcfGenolooks different in the >>> latest from SVN (1.5.19) - so I'm sure you're right and I just need to >>> wait for things to filter through. Apologies for the noise. >>> >>> Every time I upgrade VariantAnnotation it is an even better package >>> than >>> before - many thanks for all the hard work! >>> >>> Richard >>> >>> >>> > sessionInfo() >>> R Under development (unstable) (2012-11-27 r61172) >>> Platform: x86_64-unknown-linux-gnu (64-bit) >>> >>> locale: >>> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 >>> LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 >>> LC_MESSAGES=en_GB.UTF-8 LC_PAPER=C LC_NAME=C >>> [9] LC_ADDRESS=C LC_TELEPHONE=C >>> LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C >>> >>> attached base packages: >>> [1] parallel stats graphics grDevices utils datasets methods >>> base >>> >>> other attached packages: >>> [1] VariantAnnotation_1.5.17 Rsamtools_1.11.11 Biostrings_2.27.7 >>> GenomicRanges_1.11.15 IRanges_1.17.15 BiocGenerics_0.5.1 >>> devtools_0.8 BiocInstaller_1.9.4 >>> >>> loaded via a namespace (and not attached): >>> [1] AnnotationDbi_1.21.7 Biobase_2.19.1 biomaRt_2.15.0 >>> bitops_1.0-5 BSgenome_1.27.1 DBI_0.2-5 digest_0.6.0 >>> evaluate_0.4.2 GenomicFeatures_1.11.5 httr_0.2 >>> [11] memoise_0.1 plyr_1.7.1 RCurl_1.95-3 >>> RSQLite_0.11.2 rtracklayer_1.19.6 stats4_2.16.0 >>> stringr_0.6.1 tools_2.16.0 whisker_0.1 XML_3.95-0.1 >>> [21] zlibbioc_1.5.0 >>> >>> >>> On 29/11/2012 17:22, Michael Lawrence wrote: >>>> I am pretty sure this bug has been fixed in devel. Would you mind >>>> pasting your devel session info? >>>> >>>> Thanks, >>>> Michael >>>> >>>> >>>> On Thu, Nov 29, 2012 at 8:44 AM, Richard Pearson >>>> <rpearson at="" well.ox.ac.uk="" <mailto:rpearson="" at="" well.ox.ac.uk="">> wrote: >>>> >>>> Hi >>>> >>>> I've noticed a bug with writeVcf when trying to write out >>>> relatively simple vcf files, e.g. a file containing only GT in >>>> the >>>> FORMAT fields. The following is a (hopefully) reproducible >>>> example >>>> based on examples in ?writeVcf >>>> >>>> library(VariantAnnotation) >>>> fl <- system.file("extdata", "ex2.vcf", >>>> package="VariantAnnotation") >>>> out1.vcf <- tempfile() >>>> out2.vcf <- tempfile() >>>> in1 <- readVcf(fl, "hg19") >>>> writeVcf(in1, out1.vcf) >>>> in2 <- readVcf(fl, "hg19", ScanVcfParam(geno="GT")) >>>> writeVcf(in2, out2.vcf) >>>> >>>> This last line gives me: >>>> >>>> Error in row(genoMat) : >>>> a matrix-like object is required as argument to 'row' >>>> >>>> Looking at the following code snippet from .makeVcfGeno, it seems >>>> genoMat is first getting created as a matrix, but then changed to >>>> something that is not a matrix by "genoMat <- genoMatFlat". The >>>> error then comes from the "row(genoMat)" on the last line >>>> shown. I >>>> don't really understand what is going on in this code so can't >>>> offer a patch I'm afraid. >>>> >>>> genoMat <- matrix(unlist(as.list(geno), use.names = FALSE, >>>> recursive = FALSE), nsub * nrec, length(geno)) >>>> genoMatFlat <- as.character(unlist(genoMat)) >>>> genoMatFlat[is.na <http: is.na="">(genoMatFlat)] <- "." >>>> if (is.list(genoMat)) { >>>> genoMatList <- relist(genoMatFlat, >>>> PartitioningByEnd(genoMat)) >>>> genoMatFlat <- .pasteCollapse(genoMatList, ",") >>>> genoMat <- matrix(genoMatFlat, nrow(genoMat), >>>> ncol(genoMat)) >>>> } >>>> else genoMat <- genoMatFlat >>>> formatMatPerSub <- matrix(rep(t(formatMat), nsub), nsub * >>>> nrec, length(geno), byrow = TRUE) >>>> keep <- !is.na <http: is.na="">(formatMatPerSub) >>>> genoListBySub <- seqsplit(genoMat[keep], row(genoMat)[keep]) >>>> >>>> FWIW, as a temporary fix I made an extra, slightly more complex >>>> FORMAT field using the following: >>>> >>>> geno(in2)[["DUMMY"]] <- >>>> matrix(sapply(as.vector(geno(in2)[["GT"]]), function(x) list(c(x, >>>> x))), ncol=ncol(geno(in2)[["GT"]]), >>>> dimnames=dimnames(geno(in2)[["GT"]])) >>>> writeVcf(in2, out2.vcf) >>>> >>>> I've checked and it seems this bug is also present in latest >>>> devel. Any chance of a fix? >>>> >>>> Thanks >>>> >>>> Richard >>>> >>>> > sessionInfo() >>>> R version 2.15.1 (2012-06-22) >>>> Platform: x86_64-unknown-linux-gnu (64-bit) >>>> >>>> locale: >>>> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 >>>> LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 >>>> LC_MESSAGES=en_GB.UTF-8 LC_PAPER=C LC_NAME=C >>>> [9] LC_ADDRESS=C LC_TELEPHONE=C >>>> LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C >>>> >>>> attached base packages: >>>> [1] stats graphics grDevices utils datasets methods base >>>> >>>> other attached packages: >>>> [1] VariantAnnotation_1.4.5 Rsamtools_1.10.2 Biostrings_2.26.2 >>>> GenomicRanges_1.10.5 IRanges_1.16.4 BiocGenerics_0.4.0 >>>> devtools_0.8 BiocInstaller_1.8.3 >>>> >>>> loaded via a namespace (and not attached): >>>> [1] AnnotationDbi_1.20.3 Biobase_2.18.0 biomaRt_2.14.0 >>>> bitops_1.0-5 BSgenome_1.26.1 DBI_0.2-5 digest_0.6.0 >>>> evaluate_0.4.2 GenomicFeatures_1.10.1 httr_0.2 >>>> [11] memoise_0.1 parallel_2.15.1 plyr_1.7.1 >>>> RCurl_1.95-3 RSQLite_0.11.2 rtracklayer_1.18.1 >>>> stats4_2.15.1 stringr_0.6.1 tools_2.15.1 whisker_0.1 >>>> [21] XML_3.95-0.1 zlibbioc_1.4.0 >>>> >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>> >>>> >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >>> >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor > >
ADD REPLY

Login before adding your answer.

Traffic: 606 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6