hyperGTest interpretation of 'under' and finding all Entrez associated with a GO term
Mary Kindall ▴ 70
@mary-kindall-5600
Last seen 10.2 years ago

1. How do I interpret the result of 'hyperGTest' if the argument "testDirection='under'" is supplied? The hyperGTest result is given below.

2. Is there a way to find all Entrez IDs associated with a particular GO term (for example, GO:0034660)?

3. How do I find the list of all Entrez IDs that have a GO BP/CC/MF term? My gene universe 'uGenes' initially had a total of ~22,000 genes, whereas the result below shows it to be 19,703. This means that some of the genes in my universe did not have an associated GO term. How can I find those that were left out?

> library(GOstats)
> hypTest <- new("GOHyperGParams", ontology = "BP", geneIds = dGenes, universeGeneIds = uGenes,
+                annotation = "org.Mm.eg.db", pvalueCutoff =0.05, testDirection='under')
Warning message:
In makeValidParams(.Object) : removing geneIds not in universeGeneIds
> hypRes <- hyperGTest(hypTest)
> hypRes
Gene to GO BP  test for under-representation
10243 GO BP ids tested (13 have p < 0.05)
Selected gene set size: 575
    Gene universe size: 19703
    Annotation package: org.Mm.eg
> goRes <- summary(hypRes)
> dim(goRes)
[1] 13  7
> head(goRes)
      GOBPID       Pvalue OddsRatio  ExpCount Count Size                    Term
1 GO:0034660 0.0009096559 0.0000000  6.858093     0  235 ncRNA metabolic process
2 GO:0006396 0.0012500319 0.2704202 14.212303     4  487          RNA processing
3 GO:0034470 0.0043152380 0.0000000  5.340557     0  183        ncRNA processing
4 GO:0006457 0.0146830925 0.0000000  4.144039     0  142         protein folding
5 GO:0071843 0.0255629678 0.1774194  5.457291     1  187 cellular component biogenesis at cellular level
6 GO:0022613 0.0345792941 0.1897753  5.107090     1  175 ribonucleoprotein complex biogenesis

-------------
Mary Kindall
Yorktown Heights, NY
USA
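
For the statistical side of question 1, here is a minimal base-R sketch of the kind of hypergeometric calculation an under-representation test performs, using the numbers printed for GO:0034660 above. It assumes the non-conditional test reduces to a lower-tail `phyper` call; the variable names are illustrative and are not part of GOstats.

```r
# Numbers taken from the hyperGTest output above (GO:0034660).
universe_size <- 19703  # "Gene universe size"
selected_size <- 575    # "Selected gene set size"
term_size     <- 235    # universe genes annotated to the term ("Size")
count         <- 0      # selected genes annotated to the term ("Count")

# Expected number of selected genes in the term if selection were random.
exp_count <- selected_size * term_size / universe_size

# Lower-tail probability of seeing `count` or fewer annotated genes.
# Assumption: this mirrors what testDirection = "under" reports.
p_under <- phyper(count, term_size, universe_size - term_size,
                  selected_size, lower.tail = TRUE)

print(exp_count)
print(p_under)
```

Read this way, a small p-value means the selected set contains fewer genes from the term than random draws from the universe would typically give (here 0 observed against roughly 7 expected).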

Annotation GO gostats • 2.4k views

(Duplicate question)

@steve-lianoglou-2771
Last seen 20 months ago
United States
Hi,

On Mon, Nov 12, 2012 at 9:30 AM, Mary Kindall <mary.kindall at gmail.com> wrote:
> 1. How do I interpret the result of 'hyperGTest' if the argument
> "testDirection='under'" is supplied? The hypergtest result is given below.

Are you looking for help in interpreting this test in a statistical sense, or a biological one?

> 2. Is there a way to find all Entrez IDs associated with a particular GO
> term (For example, GO:0034660).

Here is the "old school" way -- I bet there is now a nicer way to do this with the Homo.sapiens package (for instance). But let's say you wanted all human genes that are annotated with SMAD binding ("GO:0046332"):

R> library(org.Hs.eg.db)
R> head(org.Hs.egGO2ALLEGS[["GO:0046332"]])
  IDA   IDA   IDA   IDA   IDA   IDA
 "90"  "91"  "94" "650" "657" "658"

The names of the vector tell you the evidence code for the term -> gene association.

> 3. How do I find list of all Entrez IDs which has a GO BP/CC/MF term? My
> gene universe 'uGenes' initially had a total of ~22,000 genes, where as the
> result below show it to be 19,703. This mean that some of my genes in the
> universe did not have an associated GO term. How to find those which were
> left out?

Once you have your GO term of interest, this is essentially the same question (2), no?

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
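
The lookup pattern for question 2, together with the "left out" genes of question 3, can be sketched in base R on a toy mapping. Here `go2allegs` is a hypothetical stand-in for `as.list(org.Mm.egGO2ALLEGS)`, and all GO and Entrez IDs are made up for illustration; this is not real annotation data.

```r
# Hypothetical stand-in for as.list(org.Mm.egGO2ALLEGS): each GO ID maps to
# a character vector of Entrez IDs whose names are evidence codes.
go2allegs <- list(
  "GO:0000001" = c(IDA = "11", IEA = "11", ISS = "12"),
  "GO:0000002" = c(IDA = "13", IMP = "14")
)

# Hypothetical gene universe; "15" and "16" have no GO annotation here.
uGenes <- c("11", "12", "13", "15", "16")

# Question 2: all Entrez IDs for one term. A gene can appear once per
# evidence code, so deduplicate before counting.
term_genes <- unique(unname(go2allegs[["GO:0000001"]]))

# Question 3: universe genes with at least one annotated term, and the
# ones that were left out.
annotated <- unique(unlist(go2allegs, use.names = FALSE))
kept      <- intersect(uGenes, annotated)
left_out  <- setdiff(uGenes, annotated)

print(term_genes)
print(kept)
print(left_out)
```

Note that the real mapping spans BP, CC, and MF terms, so reproducing the 19,703 figure from the BP test would likely require restricting it to BP terms first.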
Thanks, Steve.

I figured out how to solve the 2nd and 3rd problems, but I am still unable to interpret the "under" concept with hyperGTest. I can interpret the result statistically, but how do you interpret it biologically?

Mary Kindall
Hi Steve,

Thanks for the reply. I am able to find all Entrez IDs for a particular GO term; however, the 'Count' and 'Size' in the hyperGTest result do not match those numbers.

For example, I got this as a result of my hyperGTest:

"GOBPID"     "Pvalue"               "OddsRatio"        "ExpCount"         "Count" "Size" "Term"
"GO:0032501" "9.62590502703856e-31" "2.84380308273042" "116.003908034309" "235"   "3975" "multicellular organismal process"

Count = 235
Size = 3975

When I do:

> length(org.Mm.egGO2ALLEGS[["GO:0032501"]])
[1] 8163
> length(org.Mm.egGO2EG[["GO:0032501"]])
[1] 0

Neither of them equals 3975.

Thanks,
Mary Kindall
Hi,

On Mon, Nov 12, 2012 at 12:28 PM, Mary Kindall <mary.kindall at gmail.com> wrote:
> When I do :
> > length(org.Mm.egGO2ALLEGS[["GO:0032501"]])
> [1] 8163
> > length(org.Mm.egGO2EG[["GO:0032501"]])
> [1] 0
>
> None of them equal to 3975.

What if you only take the subset of genes annotated with "GO:0032501" that is in the universe you specified?

`org.Mm.egGO2ALLEGS[["GO:0032501"]]` will return all genes annotated with this term in the "mouse universe". In your previous email, it seems as if your background gene set is stored in `uGenes`, so does:

    length(intersect(org.Mm.egGO2ALLEGS[["GO:0032501"]], uGenes))

get you closer?

-steve
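
A toy illustration of why the raw `length()` can exceed the reported Size: the mapping can list a gene more than once (one entry per evidence code), and it includes genes outside the chosen universe. The IDs below are invented, and `term_ids` is a hypothetical stand-in for `org.Mm.egGO2ALLEGS[["GO:0032501"]]`.

```r
# Hypothetical stand-in for org.Mm.egGO2ALLEGS[["GO:0032501"]];
# gene "101" appears twice, under two different evidence codes.
term_ids <- c(IDA = "101", IEA = "101", IMP = "102", ISS = "103", IEA = "104")

# Hypothetical universe: "103" and "104" are not part of it.
uGenes <- c("101", "102", "105")

raw_length    <- length(term_ids)                     # counts duplicate entries
unique_length <- length(unique(term_ids))             # one entry per gene
size_in_univ  <- length(intersect(term_ids, uGenes))  # restricted to the universe

print(c(raw = raw_length, unique = unique_length, in_universe = size_in_univ))
```

Because `intersect()` already drops duplicates, the last count is the one most comparable to the Size column, assuming a non-conditional test.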
Hi Mary,

On 11/12/2012 12:28 PM, Mary Kindall wrote:
> When I do :
> > length(org.Mm.egGO2ALLEGS[["GO:0032501"]])
> [1] 8163
> > length(org.Mm.egGO2EG[["GO:0032501"]])
> [1] 0
>
> None of them equal to 3975.

Did you use a conditional test? If so, the number of terms under consideration is conditional on those that were significant at a 'lower' level in the DAG for that term.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099