GOstats gene set size selection
alex lam RI ▴ 30
@alex-lam-ri-2752
Last seen 10.2 years ago
Dear colleagues,

I have been following the GOstats vignette to test for GO term association. Is it possible to set limits on the number of selected genes in a GO term and on the size of that term on my Affy chip?

For example, can I tell hyperGTest to skip testing a GO term if the number of significant genes in that term is under, say, 3, or if there are more than 400 genes of that GO term on the chip?

Currently, many of my significant GO terms are not very specific. Because I am trying to incorporate GOstats into an expression QTL (eQTL) genome scan, I get a lot of output. Ideally, I would like to filter out these terms before the test rather than screen the results afterwards. Is there such an option in hyperGTest?

Many thanks for your advice,
Alex

> sessionInfo()
R version 2.6.2 Patched (2008-03-24 r44882)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] splines tools stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] GOstats_2.4.0 Category_2.4.0 genefilter_1.16.0
[4] survival_2.34 RBGL_1.14.0 annotate_1.16.1
[7] xtable_1.5-2 GO.db_2.0.2 AnnotationDbi_1.0.6
[10] RSQLite_0.6-8 DBI_0.2-4 Biobase_1.16.3
[13] graph_1.16.1

loaded via a namespace (and not attached):
[1] cluster_1.11.10

--------------------------------------------
Alex C. Lam
Roslin Institute (Edinburgh)
GO GOstats • 1.0k views
@sean-maceachern-2684
Last seen 10.2 years ago
Hi Alex,

I'm not too sure if this helps with your question, but I'll put my two cents in...

I am working with chickens and trying to create a large list of genes for an eQTL study from an initial simple microarray design that compares resistant vs susceptible birds. Because I found only a small number of differentially expressed genes, I have attempted to increase the size of my list by examining significant GO terms. Most of the GO terms I have pulled out using hyperGTest are not very helpful because they are so broad.

I have found the Category package a little more helpful. KEGG pathways are a little more specific, and you can create an adjacency matrix and use rowSums() to filter your dataset. I think you can also treat GO terms as categories if you need to. It might be a little off topic, but it could be worth looking at.

Cheers,
Sean
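The adjacency-matrix idea above can be sketched in base R as follows. This is only a toy illustration: the gene names, category names, memberships, and the `minSize`/`maxSize` cutoffs are all made up, and the matrix is built by hand rather than with any Category function. It assumes rows are categories and columns are genes on the chip, so that rowSums() gives the size of each category.

```r
# Toy incidence matrix: rows are categories, columns are genes on the chip.
# All names and memberships here are invented for illustration.
genes <- paste0("gene", 1:8)
membership <- list(
  pathA = c("gene1", "gene2"),
  pathB = c("gene1", "gene2", "gene3", "gene4"),
  pathC = genes
)
Amat <- t(sapply(membership, function(g) as.integer(genes %in% g)))
colnames(Amat) <- genes

# Number of chip genes annotated to each category.
catSize <- rowSums(Amat)

# Keep categories that are neither too small nor too broad
# (hypothetical cutoffs; choose values that suit your chip).
minSize <- 3
maxSize <- 6
AmatFiltered <- Amat[catSize >= minSize & catSize <= maxSize, , drop = FALSE]
rownames(AmatFiltered)
```

With these toy memberships only pathB survives, since pathA is too small and pathC covers every gene. The same rowSums() filter should apply unchanged to a real category-by-gene matrix.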
rgentleman ★ 5.5k
@rgentleman-7725
Last seen 9.6 years ago
United States
Hi,

alex lam (RI) wrote:
> For example, can I tell hyperGTest to skip testing a GO term if the
> number of significant genes in that term is under, say, 3, or if there
> are more than 400 genes of that GO term on the chip?

It is not possible to skip the testing, but you can skip the reporting. This works only for small gene sets; there is no upper limit on gene set size, although I may have time to add one. You can also filter out GO categories by p-value. Please have a look at the vignette, which discusses this in some detail.

> Currently I found many of my significant GO terms not very specific. As
> I am trying to incorporate GOstats to an expression QTL (eQTL) genome
> scan, I get a lot of output. Therefore, ideally I would like to filter
> out these terms before test rather than screening the results after
> test. Is there such an option with hyperGTest?

The vignette and the code for summary should give you some reasonable options for filtering the results.

Best wishes,
Robert

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org
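Since there is no upper size limit at the reporting stage, one workaround is to filter the summary table yourself after the test. The sketch below is a hedged illustration, not part of GOstats: `filterTerms` is a hypothetical helper, and `res` is an invented stand-in for the data frame that summary() returns, assuming it carries `Pvalue`, `Count` (selected genes in the term), and `Size` (genes of that term on the chip) columns. The default cutoffs are the 3 and 400 from the question.

```r
# Invented stand-in for the data frame returned by summary() on a
# hyperGTest result. The rows and term IDs are made up; only the
# column layout is meant to resemble the real table.
res <- data.frame(
  GOBPID = c("termA", "termB", "termC", "termD"),
  Pvalue = c(0.0004, 0.002, 0.008, 0.03),
  Count  = c(2, 5, 40, 12),
  Size   = c(6, 35, 950, 120),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep terms with enough selected genes, that are
# not too broad on the chip, and that pass the p-value cutoff.
filterTerms <- function(tab, minCount = 3, maxSize = 400, pvalue = 0.01) {
  keep <- tab$Count >= minCount & tab$Size <= maxSize & tab$Pvalue < pvalue
  tab[keep, , drop = FALSE]
}

kept <- filterTerms(res)
kept$GOBPID
```

Here only termB is kept: termA has too few selected genes, termC is too large, and termD fails the p-value cutoff. Note that this changes only what is reported. Every term is still tested, so any multiple-testing consideration applies to the full set of terms.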