I couldn't get location annotation for some SNPs by using "VariantAnnotation" package.
For some SNP ids, eventhough their GRanges data are normal as others, they couldn't be annotated by using "locateVariants" function. Here is the code I ran. The link for "abnorm_snpid.txt" is https://drive.google.com/open?id=0BwtKdDxl2iyQX2lQZVZsbmJQWWs, the link for "rd_abnorm.rds" is https://drive.google.com/open?id=0BwtKdDxl2iyQSklvNGlla3F6UGs
abnorm_snpid <- readLines("abnorm_snpid.txt") rd_abnorm <- readRDS("rd_abnorm.rds") library(BSgenome.Hsapiens.UCSC.hg38) fa <- BSgenome.Hsapiens.UCSC.hg38 download.file("ftp://ftp.ensembl.org/pub/release-90/gtf/homo_sapiens/Homo_sapiens.GRCh38.90.gtf.gz", "Homo_sapiens.GRCh38.90.gtf.gz") library(GEOquery) gunzip("Homo_sapiens.GRCh38.90.gtf.gz") library(GenomicFeatures); library(VariantAnnotation) txdb <- makeTxDbFromGFF(file="Homo_sapiens.GRCh38.90.gtf", format="gtf", dataSource="Ensembl", organism="Homo sapiens") allvar <- locateVariants(rd_abnorm, txdb, AllVariants()) # allvar is empty, no location annotation # GRanges object with 0 ranges and 9 metadata columns: # seqnames ranges strand | LOCATION LOCSTART LOCEND QUERYID TXID CDSID GENEID # <Rle> <IRanges> <Rle> | <factor> <integer> <integer> <integer> <integer> <IntegerList> <character> # PRECEDEID FOLLOWID # <CharacterList> <CharacterList> # ------- # seqinfo: no sequences ## NOTE: the SNP ids are in dbSNP build 150, which uses hg38 (GRCh38) as reference genome. sessionInfo() # R version 3.4.1 (2017-06-30) # Platform: x86_64-pc-linux-gnu (64-bit) # Running under: Ubuntu 16.04.1 LTS # # Matrix products: default # BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0 # LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0 # # locale: # [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 # [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C # [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C # # attached base packages: # [1] parallel stats4 stats graphics utils datasets grDevices methods base # # other attached packages: # [1] VariantAnnotation_1.20.3 Rsamtools_1.26.2 SummarizedExperiment_1.4.0 # [4] GenomicFeatures_1.26.4 AnnotationDbi_1.36.2 GEOquery_2.40.0 # [7] Biobase_2.34.0 BiocInstaller_1.24.0 BSgenome.Hsapiens.UCSC.hg38_1.4.1 # [10] BSgenome_1.42.0 rtracklayer_1.34.2 Biostrings_2.42.1 # [13] XVector_0.14.1 GenomicRanges_1.26.4 GenomeInfoDb_1.10.3 # [16] IRanges_2.8.2 S4Vectors_0.12.2 BiocGenerics_0.20.0 # [19] setwidth_1.0-4 colorout_1.1-2 # # loaded via a namespace (and not attached): # [1] Rcpp_0.12.12 compiler_3.4.1 bitops_1.0-6 tools_3.4.1 # [5] zlibbioc_1.20.0 biomaRt_2.30.0 digest_0.6.12 bit_1.1-12 # [9] RSQLite_2.0 memoise_1.1.0 tibble_1.3.4 lattice_0.20-35 # [13] pkgconfig_2.0.1 rlang_0.1.2 Matrix_1.2-11 DBI_0.7 # [17] yaml_2.1.14 httr_1.3.1 bit64_0.9-7 grid_3.4.1 # [21] R6_2.2.2 XML_3.98-1.9 BiocParallel_1.8.2 blob_1.1.0 # [25] GenomicAlignments_1.10.1 RCurl_1.95-4.8
Could any one please check and explain to me where the problem is? Thanks!
You need to make your Google Drive links publicly accessible.
Hi Nanda,
Sorry about that. I have now figured out how to make the links publicly accessible. Could you please check it again? Thanks!
Yuzhu