Question about the appropriate universe for GOHyperG or hyperGtable
Scott Ochsner ▴ 300
@scott-ochsner-599
Hello BioC,

I have a list of 1164 differentially expressed probe sets extracted from an experiment done with mouse4302 chips. To arrive at the list, I first filtered the log2 expression values by removing probe sets with values below 6. This left me with 24654 of the 45101 probe sets. Next I used limma to model treatment effects and used an FDR-adjusted fit2$F.p.value to extract those genes displaying differential expression in at least one of two contrasts (p<0.001). This left me with 2612 probe sets. My list of 1164 consists of the probe sets, within the significant set of 2612, that are upregulated in both contrasts.

I would like to use GOHyperG or hyperGtable to evaluate overrepresented BP GO terms within the 1164 list. Should my BP GO universe come from the 45101, 24654, or 2612 probe set groups?

I hope this is clear.

Thanks for any help,

Scott A. Ochsner, Ph.D.
Baylor College of Medicine, One Baylor Plaza, N810, Houston, TX 77030
GO mouse4302 probe • 834 views
@james-w-macdonald-5106
Hi Scott,

Ochsner, Scott A wrote:
> I would like to use GOHyperG or hyperGtable to evaluate
> overrepresented BP GO terms within the 1164 list. Should my BP GO
> universe come from the 45101, 24654, or 2612 probe set groups?

I think you could make a convincing argument for either the 45101 or 24654 probe set groups, but personally I would go with the 45101. My rationale would be that all 45101 probe sets were measured, so any of them could theoretically have been significant. It doesn't matter IMO that some were removed because of low expression rather than a statistical test.

HTH,

Jim

--
James W. MacDonald, M.S.
Biostatistician, Affymetrix and cDNA Microarray Core, University of Michigan Cancer Center
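To get a feel for how much the choice of universe can move the result, here is a minimal sketch of the hypergeometric upper-tail calculation that this kind of over-representation test rests on. It is plain Python using only the standard library, not the GOstats code. The universe sizes (45101, 24654) and the list size (1164) are taken from the question; the per-term counts (300, 200 and 30) are hypothetical numbers made up for illustration, and a real analysis would normally count genes rather than probe sets.

```python
from math import comb


def hypergeom_upper_tail(universe, annotated, selected, overlap):
    """P(X >= overlap) when `selected` items are drawn without replacement
    from `universe` items, of which `annotated` carry the GO term."""
    max_overlap = min(annotated, selected)
    total = comb(universe, selected)
    favourable = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    # Python divides arbitrarily large integers exactly before rounding to float.
    return favourable / total


# Hypothetical GO term: 30 of the 1164 selected probe sets carry it.
# Assume 300 annotated probe sets on the whole chip, 200 after filtering.
p_full = hypergeom_upper_tail(universe=45101, annotated=300, selected=1164, overlap=30)
p_filtered = hypergeom_upper_tail(universe=24654, annotated=200, selected=1164, overlap=30)

print("whole chip as universe:   ", p_full)
print("filtered set as universe: ", p_filtered)
```

With these made-up counts the same 30 hits look more surprising against the whole chip than against the filtered set, because the expected overlap by chance is smaller in the larger universe. How the annotated count changes under filtering depends on the term, so the direction is not guaranteed in general.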
Hi,

James W. MacDonald wrote:
> I think you could make a convincing argument for either the 45101 or
> 24654 probe set groups, but personally I would go with the 45101. My
> rationale would be that all 45101 probe sets were measured, so any of
> them could theoretically have been significant. It doesn't matter IMO
> that some were removed because of low expression rather than a
> statistical test.

First, let me suggest that you not filter on value (a log2 cutoff of 6 is generally not optimal), but rather on variability. Genes that show little variability across your conditions carry no information (for any phenotype).

Second, you need to be careful about duplicates: at some point you must reduce all probes to a single representative for each Entrez Gene ID (see the GOstats vignette for more details). If you do not, you end up doing something pretty odd. My preference here is to take the most extreme test statistic (since I don't think all probes are reliable and I am looking for evidence in favor of joint behavior; other approaches are also valid, but you need to pick something). You also need to remove those that have no mappings for the ontology you are going to use (MF, BP or CC). That should happen either for all genes on the chip, or for those that survive your non-specific filtering.

Third, given that, Jim is correct, and I know of no sound statistical argument for preferring one over the other. My own approach is to take the 24654 (of course I would have a different number if I used the approach described above), mostly because I think a lot of the probes on the chip are not measuring anything in my cells, and if I were richer I could have designed a purpose-built chip. But Jim's argument is also valid. In most of these cases there is no right answer, but you need to choose something that you agree with philosophically.

best wishes

Robert

--
Robert Gentleman, PhD
Program in Computational Biology, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center
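The two preprocessing steps described above (filter on variability, then keep one probe set per gene by the most extreme test statistic) can be sketched roughly as follows. This is plain Python for illustration only, not the GOstats vignette code. The probe and gene identifiers, the expression values and the cutoff are all hypothetical, and using the interquartile range as the variability measure is an assumption, since the reply only says "variability".

```python
from statistics import quantiles


def filter_by_iqr(expression, min_iqr):
    """Return the probe sets whose interquartile range across samples
    is at least `min_iqr` (IQR is an assumed choice of variability measure)."""
    kept = set()
    for probe, values in expression.items():
        q1, _, q3 = quantiles(values, n=4)
        if q3 - q1 >= min_iqr:
            kept.add(probe)
    return kept


def collapse_to_genes(probe_stats, probe_to_gene):
    """For each gene ID keep the probe set with the largest absolute
    test statistic. Probe sets without a gene mapping are dropped."""
    best = {}
    for probe, stat in probe_stats.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue
        if gene not in best or abs(stat) > abs(best[gene][1]):
            best[gene] = (probe, stat)
    return best


# Hypothetical log2 expression values (one list of samples per probe set).
expression = {
    "probe_a": [5.0, 6.0, 9.0, 10.0],
    "probe_b": [7.0, 7.0, 7.0, 7.0],   # flat across conditions
    "probe_c": [4.0, 8.0, 4.0, 8.0],
}
variable_probes = filter_by_iqr(expression, min_iqr=0.5)

# Hypothetical test statistics and probe-to-gene mapping.
probe_stats = {"probe_a": 2.1, "probe_c": -5.3, "probe_d": 9.9}
probe_to_gene = {"probe_a": "gene_1", "probe_c": "gene_1"}
per_gene = collapse_to_genes(probe_stats, probe_to_gene)

print(sorted(variable_probes))
print(per_gene)
```

The gene-level universe would then be the genes that survive both steps and also have at least one annotation in the chosen ontology; that last lookup depends on the annotation package and is left out here.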