Minimal reproducible example:
library(VariantAnnotation)
con <- url("https://raw.githubusercontent.com/timflutre/rutilstimflutre/master/inst/extdata/example.vcf")
vcf.txt <- readLines(con)
close(con)
vcf.file <- "example.vcf"
writeLines(vcf.txt, vcf.file)
vcf <- readVcf(vcf.file)
geno(vcf)$GT
which returns:
       ind1  ind2  ind3
snp1   "0/0" "0/1" "1/1"
snp2   "0/1" "."   "."
indel1 "0/0" "0/1" "1/1"
However, "." should be "./.", as in the input file and in the VCF format specification. Or am I missing something?
PS: here is my sessionInfo():
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] VariantAnnotation_1.20.2   Rsamtools_1.26.1
 [3] Biostrings_2.42.0          XVector_0.14.0
 [5] SummarizedExperiment_1.4.0 Biobase_2.34.0
 [7] GenomicRanges_1.26.1       GenomeInfoDb_1.10.1
 [9] IRanges_2.8.1              S4Vectors_0.12.0
[11] BiocGenerics_0.20.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8              AnnotationDbi_1.36.0     GenomicAlignments_1.10.0
 [4] zlibbioc_1.20.0          BiocParallel_1.8.1       BSgenome_1.42.0
 [7] lattice_0.20-34          tools_3.3.2              grid_3.3.2
[10] DBI_0.5-1                digest_0.6.10            Matrix_1.2-8
[13] rtracklayer_1.34.1       bitops_1.0-6             biomaRt_2.30.0
[16] RCurl_1.95-4.8           memoise_1.0.0            RSQLite_1.1-2
[19] compiler_3.3.2           GenomicFeatures_1.26.0   XML_3.98-1.5
Hi,
Yes, readVcf() currently ignores ploidy: all missing values are represented by a single '.'. This is a good suggestion and we'll make the change. There are several things ahead of this on the TODO list, but we'll get to it as soon as we can.
Valerie
That would be great, thanks!
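In the meantime, one possible workaround is to patch the genotype matrix after reading the file. The following is only a minimal sketch in base R: it assumes every sample is diploid and that a lone "." in GT always stands for a fully missing call. The toy matrix `gt` is built by hand to mirror the geno(vcf)$GT output above, so the snippet runs without VariantAnnotation; with a real file one would start from `gt <- geno(vcf)$GT` instead.

```r
# Toy genotype matrix mirroring the geno(vcf)$GT output shown above
# (hand-built for illustration; not read from the actual VCF file).
gt <- matrix(c("0/0", "0/1", "1/1",
               "0/1", ".",   ".",
               "0/0", "0/1", "1/1"),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("snp1", "snp2", "indel1"),
                             c("ind1", "ind2", "ind3")))

# ASSUMPTION: all samples are diploid, so a single "." is rewritten
# as the diploid missing genotype "./.".
gt[gt == "."] <- "./."

print(gt)
```

For data with other ploidies (haploid calls, for instance) this blanket replacement would be wrong, since "." is then the correct missing genotype.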