short time-course design. Any suggestion?
@stecalzatiscaliit-259
Hi everybody.

I'm looking at a small experiment with 12 Affymetrix chips: 3 different cell lines, each measured at 4 time points (0, 2, 8 and 24 hours). So far I have:

1) computed mas5 expression values;
2) selected about 1500 genes (out of ~22000) using GO annotations for the biological processes (BP) of possible interest;
3) kept the genes with at least 25% Present calls (I know this is quite arbitrary);
4) run an ANOVA using gls with a compound symmetry correlation structure;
5) corrected the p-values either using p.adjust(...,"fdr") or by computing q-values.

I actually get few "significant" genes, mostly with low fold change (relative to time 0) and overall low expression intensities. Any objection to all this, and/or any suggestion for improvement?

Thanks in advance,
Ste
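For readers unsure what step 5 does, here is a minimal sketch of the Benjamini-Hochberg adjustment, which is what `p.adjust(..., "fdr")` computes in R. It is written in plain Python purely for illustration; the function name `bh_adjust` and the example p-values are hypothetical and do not come from the poster's data.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Visit the p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        candidate = pvalues[idx] * m / rank
        # Enforce monotonicity: an adjusted value never exceeds
        # the adjusted value of any larger p-value.
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for five genes (illustrative only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.6]
example_adj = bh_adjust(example_p)
print(example_adj)
```

Because each p-value is scaled by the number of tests divided by its rank, the adjustment becomes harsher as the number of genes tested grows, which is the trade-off discussed in the replies.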
@matthew-hannah-621
Not too in depth, but in my view the analysis would be improved by using GCRMA or RMA and ignoring the P/A calls, then doing an unbiased analysis and looking at the GO annotations of the differentially expressed genes afterwards. I can't really advise on the ANOVA, but I guess limma would be worth a look.

HTH,
Matt
Naomi Altman
@naomi-altman-380
You appear to have no replicates. Without replication you cannot do any statistical analysis such as ANOVA or limma.

--Naomi

Naomi S. Altman
Associate Professor, Bioinformatics Consulting Center
Dept. of Statistics, Penn State University
a) Are you interested in the difference between cell lines over time, OR
b) are you treating the different cell lines as biological replicates?

Assuming the latter, you have a one-way ANOVA with time as the main factor and 3 replicates at each time point.

I would suggest you try RMA and GC-RMA on the whole dataset first and truncate your list later. The truncation at step 2 ignores more than 90% of the genes, and your number of true positives will be quite low.

You can use GO tools (I think Bioconductor has some packages to handle these) on the final gene list to see if your favourite pathway is involved.
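The one-way layout described above (time as the only factor, the 3 cell lines as replicates) can be sketched as follows. This is a plain-Python illustration, not the `gls` model from the original post: it ignores any correlation among observations from the same cell line, and the function name `one_way_f` and the expression values are made up for the example.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    df_between = k - 1
    df_within = n - k
    if df_within == 0:
        # One observation per group: no residual variation to test against.
        raise ValueError("no replicates: residual degrees of freedom are zero")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical log2 expression of one gene: one list per time point
# (0, 2, 8, 24 h), one value per cell line.
time_groups = [
    [7.0, 7.2, 6.8],
    [7.5, 7.7, 7.3],
    [8.0, 8.2, 7.8],
    [7.1, 7.3, 6.9],
]
f_stat, df_between, df_within = one_way_f(time_groups)
print(f_stat, df_between, df_within)
```

With 4 time points and 3 cell lines there are 3 and 8 degrees of freedom. If instead every cell line and time point combination is treated as its own group, the 12 chips leave zero residual degrees of freedom and the function raises an error, which is the "no replicates" situation raised earlier in the thread.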
On Tue, Jul 20, 2004 at 07:00:37PM +0100, Adaikalavan Ramasamy wrote:
> a) Are you interested in the difference in cell lines over times OR
> b) are you treating the different cell lines as biological replicates
>
> Assuming the latter, you have a oneway anova with time as a main factor
> and 3 replicates at each time point.

That's right. Sorry, my description was not that clear. This is what I did: an ANOVA with time as the main factor, but assuming a correlation structure among the observations.

> I would suggest you try RMA and GC-RMA on the whole dataset first and
> truncating your list later. The truncation at step 2 ignores more than
> 90% of the genes and your number of true positives will be quite low.

1) Using all the genes (or most of them, after some non-specific filtering such as on the lowest expression value across samples and on the CV) leads to such a large number of comparisons that, after correction, none appears to be significant. Nevertheless I could use this as an exploratory approach, i.e. to rank genes.

2) Prefiltering using an "a priori" biological framework would mean (but please correct me if I'm wrong) asking a different question: among the genes related to some biological process I'm interested in, which are actually differentially expressed?

Why should I use RMA? E.g. with a very naive approach (i.e. computing F statistics without considering the correlation among observations, using arrayMagic, which is faster!) I find that the mas5 values give higher F values (a simple qqplot can help). Also, the overall analysis doesn't improve using rma. I know of affycomp but I have never used it. I'll try.

Thanks.
Ste

> You can use GO tools (I think BioConductor have some packages to handle
> these) on the final gene list to see if your favourite pathway is
> involved.

--
Stefano Calza
Sezione di Statistica Medica, Dip. di Scienze Biomediche e Biotecnologie
Università degli Studi di Brescia - Italy
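The kind of non-specific filtering mentioned in point 1 could look roughly like the sketch below. The exact rule is not given in the post, so this assumes one plausible reading: keep a gene only if its lowest value across samples is above a floor and its coefficient of variation (CV) is above a cutoff. The function name, thresholds and expression values are all hypothetical.

```python
from statistics import mean, stdev


def nonspecific_filter(expr, min_level, min_cv):
    """Return the gene names that pass both filters, in input order.

    expr maps a gene name to its expression values (raw scale, not log).
    """
    kept = []
    for gene, values in expr.items():
        if min(values) < min_level:
            continue  # expressed too weakly in at least one sample
        mu = mean(values)
        if mu <= 0:
            continue  # CV is undefined for a non-positive mean
        cv = stdev(values) / mu
        if cv >= min_cv:
            kept.append(gene)
    return kept


# Hypothetical raw-scale intensities for three genes across four chips.
example_expr = {
    "gene_a": [120.0, 150.0, 400.0, 380.0],  # variable and well expressed
    "gene_b": [30.0, 900.0, 850.0, 40.0],    # falls below the floor
    "gene_c": [500.0, 510.0, 495.0, 505.0],  # almost constant
}
kept_genes = nonspecific_filter(example_expr, min_level=50.0, min_cv=0.2)
print(kept_genes)
```

Neither filter looks at the time labels, so it reduces the number of tests without using the information the ANOVA is testing for; the thresholds themselves remain a judgment call.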