Hi,
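I create a TxDb from UCSC for the genome hg19 and then retrieve all genes via 'genes()'. When I restrict the seqlevels to chr1-chr22, chrX and chrY, more genes are returned for some of these chromosomes than without the restriction, which clearly should not be the case. See the code below for a reproducible example; the last command prints the unrestricted counts minus the restricted counts, so a negative value means more genes after restricting. Thanks a lot for any help sorting this out.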
> library(GenomicFeatures)
> TxDb.refseq.hg19 <- makeTxDbFromUCSC(genome="hg19", tablename="refGene")
> my.chr <- c(paste0('chr', seq(1,22)), 'chrX', 'chrY')
> genes.all_contigs <- genes(TxDb.refseq.hg19)
> table.genes.all_contigs <- table(seqnames(genes.all_contigs))
> seqlevels(TxDb.refseq.hg19) <- my.chr
> genes.chr <- genes(TxDb.refseq.hg19)
> table.genes.chr <- table(seqnames(genes.chr))
> table.genes.all_contigs[my.chr] - table.genes.chr[my.chr]
 chr1  chr2  chr3  chr4  chr5  chr6  chr7  chr8  chr9 chr10 chr11 chr12
   -1     0    -4    -6     0  -229     0     0     0     0     0    -3
chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22  chrX  chrY
    0     0     0     0   -23     0    -1    -1    -3     0     0     0
> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.3 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=de_DE.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicFeatures_1.22.13 AnnotationDbi_1.32.3 Biobase_2.30.0 GenomicRanges_1.22.4 GenomeInfoDb_1.6.3
[6] IRanges_2.4.7 S4Vectors_0.8.11 BiocGenerics_0.16.1
loaded via a namespace (and not attached):
[1] XVector_0.10.0 zlibbioc_1.16.0 GenomicAlignments_1.6.3 BiocParallel_1.4.3 tools_3.2.3
[6] SummarizedExperiment_1.0.2 DBI_0.3.1 lambda.r_1.1.7 futile.logger_1.4.1 rtracklayer_1.30.2
[11] futile.options_1.0.0 bitops_1.0-6 RCurl_1.95-4.7 biomaRt_2.26.1 RSQLite_1.0.0
[16] Biostrings_2.38.4 Rsamtools_1.22.0 XML_3.98-1.3
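One way to narrow the discrepancy down is to list the gene IDs that show up only after the seqlevels are restricted, and to see which chromosomes they fall on. The following is a minimal sketch, not a diagnosis: it repeats the setup from the session above (it needs network access to UCSC), it assumes the GRanges returned by genes() is named by gene ID, and the names txdb and extra.ids are introduced here purely for illustration.

```r
library(GenomicFeatures)

# Same setup as in the session above (txdb is an illustrative name)
txdb <- makeTxDbFromUCSC(genome="hg19", tablename="refGene")
my.chr <- c(paste0("chr", seq(1, 22)), "chrX", "chrY")

# Genes with all contigs present, then with seqlevels restricted
genes.all_contigs <- genes(txdb)
seqlevels(txdb) <- my.chr
genes.chr <- genes(txdb)

# Gene IDs returned only in the restricted case
# (assumes names() of the result are gene IDs; extra.ids is illustrative)
extra.ids <- setdiff(names(genes.chr), names(genes.all_contigs))
length(extra.ids)

# Chromosomes on which these additional genes are reported
table(seqnames(genes.chr[extra.ids]))
```

If the per-chromosome counts of extra.ids match the negative differences shown above, the additional genes are ones that genes() does not return at all while the other contigs are present, rather than genes that moved between chromosomes.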
Hi Mike,
thanks a lot for this great and detailed answer!
Best, Kajetan