Map the enriched chromosome bands to entrez genes
Xi Zhao ▴ 20
@xi-zhao-4245
Last seen 10.4 years ago
Dear list,

I'm struggling to retrieve the full list of Entrez Gene IDs for each of the enriched chromosome bands (obtained with "GOstats").

revmap(org.Hs.egCHR) doesn't give the Entrez IDs for the chromosome arms, only for the whole chromosome:

    mget(c("16", "1q"), revmap(org.Hs.egCHR), ifnotfound=NA)  # Map between Entrez Gene IDs and Chromosomes
    $`1q`
    [1] NA

revmap(org.Hs.egMAP) only gives a few of the genes located on that chromosome (or did I do it wrong?):

    mget(c("16", "1q"), revmap(org.Hs.egMAP), ifnotfound=NA)  # Map between Entrez Gene Identifiers and cytogenetic maps/bands
    $`16`
    [1] "8720"
    $`1q`
    [1] "4030"      "7254"      "100113374" "100302291" "100313962"

Any suggestion or hint is appreciated!

Kindest regards,
Xi
• 904 views
@wolfgang-huber-3550
Last seen 5 months ago
EMBL European Molecular Biology Laborat…
Dear Xi,

try this:

    library("org.Hs.egCHR")
    uu = as.list( revmap(org.Hs.egMAP) )
    print(uu)
    i = grep("^1q", names(uu))
    uu[i]

    length(unique(unlist(uu[i])))
    # [1] 1796

    > sessionInfo()
    R version 2.12.0 Under development (unstable) (2010-09-07 r52876)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=la_AU.utf8       LC_NUMERIC=C
     [3] LC_TIME=la_AU.utf8        LC_COLLATE=la_AU.utf8
     [5] LC_MONETARY=C             LC_MESSAGES=la_AU.utf8
     [7] LC_PAPER=la_AU.utf8       LC_NAME=C
     [9] LC_ADDRESS=C              LC_TELEPHONE=C
    [11] LC_MEASUREMENT=la_AU.utf8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] org.Hs.eg.db_2.4.1   RSQLite_0.9-2        DBI_0.2-5
    [4] AnnotationDbi_1.11.4 Biobase_2.9.0        fortunes_1.3-7

    loaded via a namespace (and not attached):
    [1] tools_2.12.0

--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
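To see what the `grep("^1q", ...)` step does without loading any annotation package, here is a minimal sketch on a toy list. The list `toy` is only a stand-in for `as.list(revmap(org.Hs.egMAP))`: the band names and most IDs are a small excerpt from this thread, while the `11q13` entry and its ID are hypothetical, added to show what the pattern does not match.

```r
# Toy stand-in for as.list(revmap(org.Hs.egMAP)); NOT the real annotation data.
toy <- list(
  "1q"       = c("4030", "7254"),
  "1q12"     = c("2369", "9557"),
  "1q12-q21" = c("10262", "10903"),
  "16q"      = c("8136", "140454"),
  "11q13"    = c("idX")              # hypothetical band entry and placeholder ID
)

# "^1q" anchors the match at the start of the name, so it picks up "1q" and
# every band name beginning with "1q", but neither "16q" nor "11q13".
i <- grep("^1q", names(toy))
names(toy)[i]
# [1] "1q"       "1q12"     "1q12-q21"

# Pool the IDs over all matching bands and drop duplicates.
pooled <- unique(unlist(toy[i], use.names = FALSE))
length(pooled)
# [1] 6
```

Looking up the single name `"1q"` would return only the first element of this list, which is why the pooled count is so much larger than the count for `1q` alone.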
Dear Huber,

Thanks for replying. I still have a problem retrieving the Entrez genes on a chromosome arm, such as 1q or 16q.

By running the sample code you gave, I retrieved 5 genes located on "1q", but isn't "1q" supposed to refer to the whole q arm of chromosome 1, which should harbor far more than 5 genes?

    uu = as.list( revmap(org.Hs.egMAP) )
    i = grep("^1q", names(uu))
    uu[i]
    $`1q`
    [1] "4030"      "7254"      "100113374" "100302291" "100313962"

Looking at 16q in the GOstats results, there are 357 genes from 16q (that appear on my array), but revmap(org.Hs.egMAP) only gives 5 genes on 16q. Does the notation "16q" in the package "org.Hs.eg.db" not mean the whole q arm of chr 16, but only a single cytoband?

         id       Pvalue OddsRatio  ExpCount Count Size
    Chr 16q 1.210938e-88 20.332155 10.038699   113  357

    > get("16q", revmap(org.Hs.egMAP))
    [1] "8136"      "140454"    "171013"    "100125393" "100303743"

And I guess by library("org.Hs.egCHR") you meant library("org.Hs.eg.db")?

Thanks again!
Xi

    R version 2.11.1 (2010-05-31)
    x86_64-apple-darwin9.8.0

    locale:
    [1] C

    attached base packages:
    [1] tcltk     grid      stats     graphics  grDevices utils     datasets  methods
    [9] base

    other attached packages:
     [1] humanCHRLOC_2.1.6    GO.db_2.4.1          org.Hs.eg.db_2.4.1
     [4] qvalue_1.22.0        GOstats_2.14.0       RSQLite_0.9-1
     [7] DBI_0.2-5            graph_1.26.0         Category_2.14.0
    [10] AnnotationDbi_1.10.1 Biobase_2.8.0        ggplot2_0.8.8
    [13] proto_0.3-8          reshape_0.8.3        plyr_1.0.3

    loaded via a namespace (and not attached):
    [1] GSEABase_1.10.0   RBGL_1.24.0       XML_3.1-0         annotate_1.26.0
    [5] genefilter_1.30.0 splines_2.11.1    survival_2.35-8   tools_2.11.1
    [9] xtable_1.5-6
Dear Xi,

yes, the code should be:

    library("org.Hs.eg.db")
    uu = as.list( revmap(org.Hs.egMAP) )
    i = grep("^1q", names(uu))
    uu[i]

I get a list with 106 elements, and 1796 unique Entrez IDs:

    > head(uu[i])
    $`1q`
    [1] "4030"      "7254"      "100113374" "100302291" "100313962"

    $`1q12`
     [1] "2369"      "9557"      "9659"      "9939"      "11243"     "27444"
     [7] "29765"     "114814"    "171419"    "401131"    "644450"    "100270895"

    $`1q12-q21`
    [1] "10262" "10903"

    $`1q12-q21.2`
    [1] "25832"

    $`1q12-q22`
    [1] "6063"

    $`1q12-q23`
    [1] "1805" "4002" "4209"

    > length(unique(unlist(uu[i])))
    [1] 1796

My sessionInfo() is the same as in my previous message, i.e. the same version of org.Hs.eg.db, 2.4.1, as you use.

I have no idea why you only get 5 Entrez IDs. What is the value of 'i' in your session after running the code above? Can you try again, from a clean R session, just to make sure there are no typos or remnants of previous expressions?

And, yes, I do think that the notation `1q12` means that the gene is on chromosome arm 1q, and that such a gene is not annotated a second time in the `1q` list element. The creator of the "org.Hs.eg.db" package might have more insight here.

Best wishes
Wolfgang

--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
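If the reading above is right, a lookup of the single name `16q` returns only the genes annotated at arm level, and the arm-wide gene set has to be pooled over every band name that starts with `16q`. The following is a rough sketch of that idea, not a function from any package: `genes_on_arm` is a hypothetical helper, and `toy` is made-up data apart from the three `16q` IDs quoted in this thread.

```r
# Hypothetical helper: pool all IDs whose band name starts with the given arm.
# startsWith() matches a literal prefix, so no regular-expression escaping is needed.
genes_on_arm <- function(band_map, arm) {
  hit <- startsWith(names(band_map), arm)
  unique(unlist(band_map[hit], use.names = FALSE))
}

# Toy stand-in for as.list(revmap(org.Hs.egMAP)); the "id*" values are placeholders.
toy <- list(
  "16q"     = c("8136", "140454", "171013"),
  "16q12.1" = c("idA", "idB"),
  "16q22"   = c("idB", "idC"),   # "idB" is repeated on purpose
  "16p13.3" = c("idD"),
  "1q12"    = c("2369")
)

length(toy[["16q"]])             # arm-level element only
# [1] 3
length(genes_on_arm(toy, "16q")) # pooled over all 16q bands, duplicates removed
# [1] 6
```

With the real data one would presumably pass `as.list(revmap(org.Hs.egMAP))` as `band_map`. Note that a plain prefix match would miss any band label that starts on the other arm or at the centromere and extends into the arm of interest, so the pooled set should be treated as a lower bound if such labels occur.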