Annotation: Use of org.Hs.egCHR to map Gene Entrez to Chromosome
Bine ▴ 50
@bine-23912
Last seen 8 months ago
UK

Dear all,

I want to find out which chromosome a gene is on. I am trying to adapt the code below, which uses the Entrez Gene ID to get the chromosome. I already have my dataset d1 with one column containing the Entrez Gene IDs, but I can't work out how to adapt the code for that:

library(org.Hs.eg.db)

## select() interface:
## Objects in this package can be accessed using the select() interface
## from the AnnotationDbi package. See ?select for details.

## Bimap interface:
x <- org.Hs.egCHR
# Get the entrez gene identifiers that are mapped to a chromosome
mapped_genes <- mappedkeys(x)
# Convert to a list
xx <- as.list(x[mapped_genes])
if (length(xx) > 0) {
  # Get the CHR for the first five genes
  print(xx[1:5])
  # Get the first one
  print(xx[[1]])
}

Can anyone help me?

Thank you, Bine

org.Hs.egCHR org.Hs.eg.db • 1.1k views
@james-w-macdonald-5106
Last seen 7 hours ago
United States

The OrgDb packages do contain gene location data, but that is intended to change, so you shouldn't rely on it.

Instead, you can use select() on a TxDb package, with the Entrez Gene IDs as GENEID keys.

> library(org.Hs.eg.db)
> library(TxDb.Hsapiens.UCSC.hg38.knownGene)
> z <- head(keys(org.Hs.eg.db))
> z
[1] "1"  "2"  "3"  "9"  "10" "11"

## CDS
> select(TxDb.Hsapiens.UCSC.hg38.knownGene, z, "CDSCHROM", "GENEID")
'select()' returned 1:1 mapping between keys and columns
  GENEID CDSCHROM
1      1    chr19
2      2    chr12
3      3     <NA>
4      9     chr8
5     10     chr8
6     11     <NA>

## Transcript
> select(TxDb.Hsapiens.UCSC.hg38.knownGene, z, "TXCHROM", "GENEID")
'select()' returned 1:1 mapping between keys and columns
  GENEID TXCHROM
1      1   chr19
2      2   chr12
3      3   chr12
4      9    chr8
5     10    chr8
6     11    <NA>

## Exons
> select(TxDb.Hsapiens.UCSC.hg38.knownGene, z, "EXONCHROM", "GENEID")
'select()' returned 1:1 mapping between keys and columns
  GENEID EXONCHROM
1      1     chr19
2      2     chr12
3      3     chr12
4      9      chr8
5     10      chr8
6     11      <NA>
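
To apply the same lookup to a whole column, such as the d1 dataset from the question, something like the following might work. This is only a minimal sketch: the column name ENTREZID and the stand-in contents of d1 are assumptions, since the question does not show the real data.

```r
# Sketch only: ENTREZID and the contents of d1 are hypothetical stand-ins.
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

# Stand-in for the real dataset; replace with your own d1.
d1 <- data.frame(ENTREZID = c("1", "2", "3", "9", "10", "11"),
                 stringsAsFactors = FALSE)

# select() needs character keys; as.character() is a no-op for this stand-in.
ids <- as.character(d1$ENTREZID)

# Look up the transcript chromosome for each unique Entrez Gene ID.
chr <- select(txdb, keys = unique(ids), columns = "TXCHROM", keytype = "GENEID")

# match() picks one chromosome per row of d1, even if select() returns
# several rows for a gene; IDs without a mapping stay NA.
d1$CHR <- chr$TXCHROM[match(ids, chr$GENEID)]
d1
```

Using match() rather than merge() keeps the rows of d1 in their original order and avoids duplicating rows when a gene maps to more than one chromosome.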

Or, probably a more modern approach, get the gene ranges with genes():

> zz <- genes(TxDb.Hsapiens.UCSC.hg38.knownGene, single.strand.genes.only = FALSE)
> zz
GRangesList object of length 27363:
$`1`
GRanges object with 1 range and 0 metadata columns:
      seqnames            ranges strand
         <Rle>         <IRanges>  <Rle>
  [1]    chr19 58345178-58362751      -
  -------
  seqinfo: 595 sequences (1 circular) from hg38 genome

$`10`
GRanges object with 1 range and 0 metadata columns:
      seqnames            ranges strand
         <Rle>         <IRanges>  <Rle>
  [1]     chr8 18391282-18401218      +
  -------
  seqinfo: 595 sequences (1 circular) from hg38 genome

$`100`
GRanges object with 1 range and 0 metadata columns:
      seqnames            ranges strand
         <Rle>         <IRanges>  <Rle>
  [1]    chr20 44619522-44652233      -
  -------
  seqinfo: 595 sequences (1 circular) from hg38 genome

...
<27360 more elements>

## Subscript
> zz[z]
Error: subscript contains invalid names

## But we already knew that from above, no? Gene ID 11 returned NA every time,
## so it is not in the TxDb. Filter the IDs first:
> z <- z[z %in% names(zz)]
> unlist(zz[z])
GRanges object with 5 ranges and 0 metadata columns:
     seqnames            ranges strand
        <Rle>         <IRanges>  <Rle>
   1    chr19 58345178-58362751      -
   2    chr12   9067664-9116229      -
   3    chr12   9228533-9275817      -
   9     chr8 18170477-18223689      +
  10     chr8 18391282-18401218      +
  -------
  seqinfo: 595 sequences (1 circular) from hg38 genome
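
The ranges can also be written back into a data frame such as d1 from the question. The sketch below assumes a hypothetical column named ENTREZID and uses the default genes() call, which returns a plain GRanges with one range per gene and drops genes that cannot be represented by a single range, so those genes end up as NA here.

```r
# Sketch only: ENTREZID and the contents of d1 are hypothetical stand-ins.
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

d1 <- data.frame(ENTREZID = c("1", "2", "3", "9", "10", "11"),
                 stringsAsFactors = FALSE)

# One range per gene, named by Entrez Gene ID.
g <- genes(txdb)

# Position of each d1 ID in g; NA when the ID is not in the TxDb.
idx <- match(as.character(d1$ENTREZID), names(g))

# Indexing with NA yields NA, so unmapped genes stay missing.
d1$CHR   <- as.character(seqnames(g))[idx]
d1$START <- start(g)[idx]
d1$END   <- end(g)[idx]
d1
```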

Thank you very much, I will try it later.
