tony j
@tony-j-14276
I am trying to run predictCoding on a small set of variants spanning the whole genome, but it fails with a "sequence not found" error:
> predictCoding(vcf, txdb_hg19, Hsapiens)
Error in .getOneSeqFromBSgenomeMultipleSequences(x, names[i], start[i], : sequence ^1$ not found
The seqlevel naming conventions of the TxDb and the VCF match, and my VCF ranges fall within the sequence lengths:
> seqlevels(txdb_hg19)
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "X" "Y"
> seqlevels(my_vcf)
[1] "1" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "2" "20" "21" "22" "3" "4" "5" "6" "7" "8" "9" "X" "Y"
> which(end(my_vcf) > seqlengths(txdb_hg19)[as.character(seqnames(my_vcf))])
named integer(0)
Please let me know what other details would aid in troubleshooting.
Thanks in advance for any direction!
TJ
> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 15063)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
 [1] SNPlocs.Hsapiens.dbSNP142.GRCh37_0.99.5 BSgenome.Hsapiens.UCSC.hg19_1.4.0 BSgenome_1.44.2 rtracklayer_1.36.6 org.Hs.eg.db_3.4.1
 [6] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.28.5 AnnotationDbi_1.38.2 biomaRt_2.32.1 plyr_1.8.4
[11] VariantAnnotation_1.22.3 Rsamtools_1.28.0 Biostrings_2.44.2 XVector_0.16.0 SummarizedExperiment_1.6.5
[16] DelayedArray_0.2.7 matrixStats_0.52.2 Biobase_2.36.2 GenomicRanges_1.28.6 GenomeInfoDb_1.12.3
[21] IRanges_2.10.5 S4Vectors_0.14.7 BiocGenerics_0.22.1 variants_0.99.129254

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.13 compiler_3.4.2 bitops_1.0-6 tools_3.4.2 zlibbioc_1.22.0 digest_0.6.12 bit_1.1-12
 [8] RSQLite_2.0 memoise_1.1.0 tibble_1.3.4 lattice_0.20-35 pkgconfig_2.0.1 rlang_0.1.2 Matrix_1.2-11
[15] DBI_0.7 GenomeInfoDbData_0.99.0 bit64_0.9-7 grid_3.4.2 cgdv17_0.14.0 XML_3.98-1.9 BiocParallel_1.10.1
[22] PolyPhen.Hsapiens.dbSNP131_1.0.2 blob_1.1.0 GenomicAlignments_1.12.2 RCurl_1.95-4.8
Yep - figured it out almost immediately after posting. The Hsapiens seqlevels were not set to "NCBI":
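A minimal sketch of that fix, for anyone hitting the same error. It assumes Hsapiens comes from BSgenome.Hsapiens.UCSC.hg19 (as in the sessionInfo above) and that the installed BSgenome version supports the seqlevelsStyle() replacement method on genome objects. vcf_levels is a hypothetical stand-in for seqlevels(my_vcf), typed out from the output in the question.

```r
library(BSgenome.Hsapiens.UCSC.hg19)  # provides the Hsapiens genome object
library(GenomeInfoDb)                 # seqlevelsStyle(), seqlevels()

# Stand-in for seqlevels(my_vcf): NCBI-style names without a "chr" prefix.
vcf_levels <- c(as.character(1:22), "X", "Y")

# The UCSC hg19 genome names its sequences "chr1", "chr2", ...,
# so the VCF/TxDb names are not found in it.
seqlevelsStyle(Hsapiens)
setdiff(vcf_levels, seqlevels(Hsapiens))

# Rename the genome's sequences to NCBI style so all three objects agree.
seqlevelsStyle(Hsapiens) <- "NCBI"
setdiff(vcf_levels, seqlevels(Hsapiens))  # should now be empty

# With the styles aligned, the original call can be retried:
# predictCoding(vcf, txdb_hg19, Hsapiens)
```

The check in the question compared only the TxDb and the VCF; predictCoding also pulls sequence from the genome object passed as the third argument, so its seqlevels need to match as well.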