HLA genes missing from TxDb.Hsapiens.UCSC.hg38.knownGene
@kamil-slowikowski-6901

Why are these genes missing?

HLA-A https://www.ncbi.nlm.nih.gov/gene/3105

HLA-B https://www.ncbi.nlm.nih.gov/gene/3106

I found that EnsDb.Hsapiens.v86 has them, but that package provides an EnsDb object rather than a TxDb object, so it is not compatible with the same functions.

library(TxDb.Hsapiens.UCSC.hg38.knownGene)

regions <- genes(TxDb.Hsapiens.UCSC.hg38.knownGene)

regions$gene_id[1:5]
#> [1] "1"         "10"        "100"       "1000"      "100009613"

ids <- c("HLA-A" = "3105", "HLA-B" = "3106", "VIM" = "7431")

ids %in% regions$gene_id
#> [1] FALSE FALSE  TRUE
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] reprex_0.3.0
 [2] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
 [3] EnsDb.Hsapiens.v86_2.99.0
 [4] ensembldb_2.8.1
 [5] AnnotationFilter_1.8.0
 [6] karyoploteR_1.10.5
 [7] regioneR_1.16.5
 [8] TxDb.Hsapiens.UCSC.hg38.knownGene_3.4.6
 [9] GenomicFeatures_1.36.4
[10] AnnotationDbi_1.46.1
[11] Biobase_2.44.0
[12] GenomicRanges_1.36.1
[13] GenomeInfoDb_1.20.0
[14] IRanges_2.18.2
[15] S4Vectors_0.22.1
[16] BiocGenerics_0.30.0
[17] devtools_2.2.1
[18] usethis_1.5.1

Created on 2020-01-12 by the reprex package (v0.3.0)

@kaylainterdonato-17327

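By default, genes() drops any gene that cannot be represented by a single genomic range, that is, a gene annotated on both strands or on more than one reference sequence. HLA-A and HLA-B are annotated on chr6 and on several chr6 alternate contigs, so they are dropped. Try setting the single.strand.genes.only argument of genes() to FALSE to keep them. The result is then a GRangesList rather than a GRanges, so look genes up with names(regions) instead of regions$gene_id.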

> library(TxDb.Hsapiens.UCSC.hg38.knownGene)
> regions <- genes(TxDb.Hsapiens.UCSC.hg38.knownGene, single.strand.genes.only = FALSE)
> names(regions)[1:5]
[1] "1"     "10"    "100"   "1000"  "10000"
> ids <- c("HLA-A" = "3105", "HLA-B" = "3106", "VIM" = "7431")
> ids %in% names(regions)
[1] TRUE TRUE TRUE
> which(names(regions) == "3105")
[1] 13534
> which(names(regions) == "3106")
[1] 13535
> regions[13534]
GRangesList object of length 1:
$`3105`
GRanges object with 8 ranges and 0 metadata columns:
                 seqnames            ranges strand
                    <Rle>         <IRanges>  <Rle>
  [1]                chr6 29941260-29945884      +
  [2] chr6_GL000250v2_alt   1147348-1267669      +
  [3] chr6_GL000251v2_alt   1369022-1489345      +
  [4] chr6_GL000252v2_alt   1144955-1265390      +
  [5] chr6_GL000253v2_alt   1195801-1200442      +
  [6] chr6_GL000254v2_alt   1286636-1289979      +
  [7] chr6_GL000255v2_alt   1196218-1264879      +
  [8] chr6_GL000256v2_alt   1239073-1307767      +
  -------
  seqinfo: 595 sequences (1 circular) from hg38 genome

> regions[13535]
GRangesList object of length 1:
$`3106`
GRanges object with 6 ranges and 0 metadata columns:
                 seqnames            ranges strand
                    <Rle>         <IRanges>  <Rle>
  [1]                chr6 31269491-31357188      -
  [2] chr6_GL000251v2_alt   2750417-2837604      -
  [3] chr6_GL000253v2_alt   2578541-2665857      -
  [4] chr6_GL000254v2_alt   2612219-2699230      -
  [5] chr6_GL000255v2_alt   2524921-2612898      -
  [6] chr6_GL000256v2_alt   2571447-2659467      -
  -------
  seqinfo: 595 sequences (1 circular) from hg38 genome

Hope this helps!
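
If you only need the primary-assembly location of each gene, the lookup above can be done by name rather than by numeric index, and the alternate-contig ranges can be dropped afterwards. This is a minimal sketch, assuming the GenomicFeatures version from this thread, where genes() with single.strand.genes.only = FALSE returns a GRangesList named by Entrez gene ID. The variable names hits, flat and primary are only illustrative.

```r
# Sketch only: assumes genes(..., single.strand.genes.only = FALSE) returns a
# GRangesList named by Entrez gene ID, as in the session shown above.
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(GenomeInfoDb)

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
regions <- genes(txdb, single.strand.genes.only = FALSE)

ids <- c("HLA-A" = "3105", "HLA-B" = "3106", "VIM" = "7431")

# Select genes by Entrez ID (list names) instead of by position,
# so the code does not depend on the ordering of the list.
hits <- regions[unname(ids)]

# Flatten the list into a single GRanges; each range keeps the
# Entrez ID of its gene as its name, which we copy into a column.
flat <- unlist(hits)
flat$gene_id <- names(flat)

# Drop ranges on alternate contigs such as chr6_GL000250v2_alt,
# keeping only those on the standard chromosomes.
primary <- keepStandardChromosomes(flat, pruning.mode = "coarse")
primary
```

The resulting object is a plain GRanges with a gene_id column, so it can be passed to functions that expect the default output of genes().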
