HLA genes missing from TxDb.Hsapiens.UCSC.hg38.knownGene
@kamil-slowikowski-6901
Why are these genes missing?
HLA-A
https://www.ncbi.nlm.nih.gov/gene/3105
HLA-B
https://www.ncbi.nlm.nih.gov/gene/3106
For example, I found that EnsDb.Hsapiens.v86 has them, but that package provides a different type of object (an EnsDb rather than a TxDb), so it is not compatible with the same functions.
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
regions <- genes(TxDb.Hsapiens.UCSC.hg38.knownGene)
regions$gene_id[1:5]
#> [1] "1" "10" "100" "1000" "100009613"
ids <- c("HLA-A" = "3105", "HLA-B" = "3106", "VIM" = "7431")
ids %in% regions$gene_id
#> [1] FALSE FALSE TRUE
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] reprex_0.3.0
[2] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[3] EnsDb.Hsapiens.v86_2.99.0
[4] ensembldb_2.8.1
[5] AnnotationFilter_1.8.0
[6] karyoploteR_1.10.5
[7] regioneR_1.16.5
[8] TxDb.Hsapiens.UCSC.hg38.knownGene_3.4.6
[9] GenomicFeatures_1.36.4
[10] AnnotationDbi_1.46.1
[11] Biobase_2.44.0
[12] GenomicRanges_1.36.1
[13] GenomeInfoDb_1.20.0
[14] IRanges_2.18.2
[15] S4Vectors_0.22.1
[16] BiocGenerics_0.30.0
[17] devtools_2.2.1
[18] usethis_1.5.1
Created on 2020-01-12 by the reprex package (v0.3.0)
@kaylainterdonato-17327
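Try setting the single.strand.genes.only argument of genes() to FALSE; then you will be able to access these genes. With the default (TRUE), genes() drops any gene that has exons on both strands or on more than one sequence, and HLA-A and HLA-B are also annotated on the chr6 alt scaffolds, as the output below shows. With FALSE, genes() returns a GRangesList (one element per gene) instead of a GRanges.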
> library(TxDb.Hsapiens.UCSC.hg38.knownGene)
> regions <- genes(TxDb.Hsapiens.UCSC.hg38.knownGene, single.strand.genes.only = FALSE)
> names(regions)[1:5]
[1] "1" "10" "100" "1000" "10000"
> ids <- c("HLA-A" = "3105", "HLA-B" = "3106", "VIM" = "7431")
> ids %in% names(regions)
[1] TRUE TRUE TRUE
> which(names(regions) == "3105")
[1] 13534
> which(names(regions) == "3106")
[1] 13535
> regions[13534]
GRangesList object of length 1:
$`3105`
GRanges object with 8 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr6 29941260-29945884 +
[2] chr6_GL000250v2_alt 1147348-1267669 +
[3] chr6_GL000251v2_alt 1369022-1489345 +
[4] chr6_GL000252v2_alt 1144955-1265390 +
[5] chr6_GL000253v2_alt 1195801-1200442 +
[6] chr6_GL000254v2_alt 1286636-1289979 +
[7] chr6_GL000255v2_alt 1196218-1264879 +
[8] chr6_GL000256v2_alt 1239073-1307767 +
-------
seqinfo: 595 sequences (1 circular) from hg38 genome
> regions[13535]
GRangesList object of length 1:
$`3106`
GRanges object with 6 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr6 31269491-31357188 -
[2] chr6_GL000251v2_alt 2750417-2837604 -
[3] chr6_GL000253v2_alt 2578541-2665857 -
[4] chr6_GL000254v2_alt 2612219-2699230 -
[5] chr6_GL000255v2_alt 2524921-2612898 -
[6] chr6_GL000256v2_alt 2571447-2659467 -
-------
seqinfo: 595 sequences (1 circular) from hg38 genome
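If downstream functions expect a flat GRanges like the one genes() returns by default, one possible sketch is to unlist the GRangesList and keep only the ranges on the primary chromosomes. The variable names (all_genes, hits, primary) and the decision to discard the alt-scaffold ranges are assumptions made for illustration, not part of the original question.

```r
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(GenomicRanges)

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

# Return every gene as a GRangesList, including genes that span
# several sequences (e.g. chr6 plus its alt scaffolds).
all_genes <- genes(txdb, single.strand.genes.only = FALSE)

ids <- c("HLA-A" = "3105", "HLA-B" = "3106", "VIM" = "7431")

# Flatten the selected genes into a single GRanges; each range is
# named after the Entrez gene ID of the list element it came from.
hits <- unlist(all_genes[ids])

# Assumption: only the primary assembly is of interest, so ranges on
# alt scaffolds (chr6_GL000250v2_alt, ...) are dropped.
primary_chroms <- paste0("chr", c(1:22, "X", "Y", "M"))
primary <- hits[as.character(seqnames(hits)) %in% primary_chroms]

# Mirror the gene_id metadata column that genes() provides by default.
primary$gene_id <- names(primary)
primary
```

Whether dropping the alt scaffolds is appropriate depends on the analysis; for work focused on HLA haplotypes you may prefer to keep them and work with the GRangesList directly.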
Hope this helps!