RE: Bioconductor Digest, Vol 8, Issue 15


Baker, Stephen ▴ 160

@baker-stephen-469


Gary Churchill at the Jackson Labs in Maine has an R program on his website for performing mixed-model ANOVA on microarray data. The only problem with it is that it uses least squares to fit the model (which would include a within-subjects factor for the time effect), so it requires that there are no missing data points and that all subjects are measured at the same time points. This is because the least squares solution involves inverting a matrix, and missing data would make that matrix less than full rank.

An alternative approach, which wouldn't be done in R, would be to use PROC MIXED in the SAS stats package. This uses maximum likelihood to fit mixed models and works well. If you really want to try to do it in R, Yudi Pawitan at the Dept. of Stats at University of Cork in Ireland has a book and a set of R programs which would give you a leg up on it:

http://statistics.ucc.ie/staff/yudi/likelihood/index.htm

-.- -.. .---- .--. ..-.
Stephen P. Baker, MScPH, PhD (ABD)          (508) 856-2625
Sr. Biostatistician- Information Services
Lecturer in Biostatistics                   (775) 254-4885 fax
Graduate School of Biomedical Sciences
University of Massachusetts Medical School, Worcester
55 Lake Avenue North                        stephen.baker@umassmed.edu
Worcester, MA 01655 USA

-----Original Message-----
From: bioconductor-request@stat.math.ethz.ch [mailto:bioconductor-request@stat.math.ethz.ch]
Sent: Wednesday, October 08, 2003 10:22 AM
To: bioconductor@stat.math.ethz.ch
Subject: Bioconductor Digest, Vol 8, Issue 15

Send Bioconductor mailing list submissions to
    bioconductor@stat.math.ethz.ch

To subscribe or unsubscribe via the World Wide Web, visit
    https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
or, via email, send a message with subject or body 'help' to
    bioconductor-request@stat.math.ethz.ch

You can reach the person managing the list at
    bioconductor-owner@stat.math.ethz.ch

When replying, please edit your Subject line so it is more specific than "Re: Contents of Bioconductor digest..."

Today's Topics:

   1. Re: "Rgraphviz" installation problem (Martin Maechler)
   2. Re: Rdbi (Vincent Carey 525-2265)
   3. Re: Rdbi (John Zhang)
   4. Affy: Present calls in an eset (Arne.Muller@aventis.com)
   5. order restricted inference (Stefano Barbi)
   6. Re: Affy: Present calls in an eset (A.J. Rossini)
   7. RE: Affy: Present calls in an eset (Arne.Muller@aventis.com)
   8. Histogram and boxplot of MM data (donghu@itsa.ucsf.edu)
   9. marrayClass - how to extract expr values for a gene (Naomi Altman)
  10. time-course experiments (edoardo missiaglia)

----------------------------------------------------------------------

Message: 1
Date: Wed, 8 Oct 2003 12:19:12 +0200
From: Martin Maechler <maechler@stat.math.ethz.ch>
Subject: Re: [BioC] "Rgraphviz" installation problem
To: "Weiming Zhang" <weiming.zhang@uchsc.edu>
Cc: bioconductor@stat.math.ethz.ch
Message-ID: <16259.58528.241239.832103@gargle.gargle.HOWL>
Content-Type: text/plain; charset=us-ascii

>>>>> "Weiming" == Weiming Zhang <weiming.zhang@uchsc.edu>
>>>>>     on 07 Oct 2003 10:52:06 -0600 writes:

    Weiming> Hi,
    Weiming> Problem solved. I changed linux kernel-headers from 2.4.20 to 2.4.9 and
    Weiming> it worked.

    Weiming> Thank you all for the help, especially Robert.

Interesting.... but quite problematic, since 2.4.20 is not a beta
Linux kernel but a "production" one, and a very widely used one too:
Redhat 9's kernel (at least here) *is* 2.4.20.

So I assume the graphviz need to be notified about this as well, right?
Martin

    Weiming> On Mon, 2003-10-06 at 20:21, Vincent Carey 525-2265 wrote:
    >> > from common.h:10,
    >> > from Rgraphviz.c:1:
    >> > /usr/include/bits/local_lim.h:36:26: linux/limits.h: No such file or
    >> > directory
    >> > In file included from Rgraphviz.c:1:
    >> > common.h:19:20: render.h: No such file or directory
    >> > common.h:20:19: graph.h: No such file or directory
    >> > common.h:21:22: dotprocs.h: No such file or directory
    >> > common.h:22:24: neatoprocs.h: No such file or directory
    >> > common.h:23:20: adjust.h: No such file or directory
    >> > Rgraphviz.c:2:20: circle.h: No such file or directory
    >>
    >> these errors suggest that the graphviz installation from
    >> RPM is not supporting development level resources.
    >>
    >> > -IboostIncl -ftemplate-depth-30 -fPIC -g -O2 -c bfsBGL.cpp -o bfsBGL.o
    >> > In file included from /usr/include/bits/posix1_lim.h:126,
    >> > from /usr/include/limits.h:144,
    >> > from
    >> > /usr/lib/gcc-lib/i386-redhat-linux/2.96/include/limits.h:130,
    >> > from
    >> > /usr/lib/gcc-lib/i386-redhat-linux/2.96/include/syslimits.h:7,
    >> > from
    >> > /usr/lib/gcc-lib/i386-redhat-linux/2.96/include/limits.h:11,
    >> > from boostIncl/boost/config/suffix.hpp:26,
    >> > from boostIncl/boost/config.hpp:57,
    >> > from RBGL.h:4,
    >> > from bfsBGL.cpp:5:
    >> > /usr/include/bits/local_lim.h:36:26: linux/limits.h: No such file or
    >>
    >> these errors suggest inadequacy of your linux installation.
    >> the missing files are basic development resources. you may
    >> have chosen an "end-user-only" distribution, or the linux
    >> installation is nonstandard.
    >>
    >> are you able to build R from source? i suspect not.
    >>

------------------------------

Message: 2
Date: Wed, 8 Oct 2003 08:12:17 -0400 (EDT)
From: Vincent Carey 525-2265 <stvjc@channing.harvard.edu>
Subject: Re: [BioC] Rdbi
To: Kasper Daniel Hansen <k.hansen@biostat.ku.dk>
Cc: bioconductor@stat.math.ethz.ch
Message-ID: <pine.gso.4.40.0310080806390.19036-100000@capecod.bwh.harvard.edu>
Content-Type: TEXT/PLAIN; charset=US-ASCII

> I'm using R 1.7.1 and Bioconductor release 1.2 under Redhat Linux 8.
>
> The package SAGElyzer (which is version 1.1.17 in BioC 1.2) requires
> the library Rdbi, when running under UNIX. This package is not
> available from CRAN anymore. You may still get it from
> http://rdbi.sourceforge.net/. Rdbi seems to have been replaced by the
> package DBI, which is available from CRAN. I have looked at the
> development sources for SAGElyzer (v 1.2.4) and it seems to have the
> same requirements as v1.1.17.
>
> Questions/thoughts:
> 1) Is DBI sufficiently developed to replace Rdbi? If so, I guess it
> needs to be fixed in SAGElyzer.

DBI is pretty mature, the problem is that no one has written
DBI-compliant drivers for postgres. ROracle and RMySQL from
CRAN do satisfy DBI.

> 2) Otherwise I think a mention of this issue ought to be placed in the
> FAQ since Rdbi seems to have disappeared from CRAN.

agreed. And anyone with an interest in/time to write a
DBI-compliant RPostgres is encouraged to do this!
------------------------------

Message: 3
Date: Wed, 8 Oct 2003 08:39:57 -0400 (EDT)
From: John Zhang <jzhang@jimmy.harvard.edu>
Subject: Re: [BioC] Rdbi
To: K.Hansen@biostat.ku.dk, bioconductor@stat.math.ethz.ch
Message-ID: <200310081239.IAA17449@blaise.dfci.harvard.edu>
Content-Type: TEXT/plain; charset=us-ascii

>From: Kasper Daniel Hansen <k.hansen@biostat.ku.dk>
>To: bioconductor@stat.math.ethz.ch
>Date: Wed, 8 Oct 2003 12:00:18 +0200
>
>Questions/thoughts:
>1) Is DBI sufficiently developed to replace Rdbi? If so, I guess it needs to
>be fixed in SAGElyzer.

I am evaluating DBI and will make a decision on that soon.

>2) Otherwise I think a mention of this issue ought to be placed in the FAQ
>since Rdbi seems to have disappeared from CRAN.

I have included the source for Rdbi and Rdbi.PgSQL in the vignette. Thank you for your comments.

>--
>Kasper Daniel Hansen, Research Assistant
>Department of Biostatistics, University of Copenhagen

Jianhua Zhang
Department of Biostatistics
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084

------------------------------

Message: 4
Date: Wed, 8 Oct 2003 14:54:56 +0200
From: <arne.muller@aventis.com>
Subject: [BioC] Affy: Present calls in an eset
To: <bioconductor@stat.math.ethz.ch>
Message-ID: <c80ecafa2acc1b45be45d133ed660ade410b0d@crbsmxsusr04.pharma.aventis.com>
Content-Type: text/plain; charset="iso-8859-1"

Hello,

I'm quite new to Bioconductor/affy, and I was wondering if there's a simple way to include the absent/present call for a gene in the output file generated with write.exprs(eset, file='boo') in the affy package.

The eset was generated with

eset <- expresso(cel, bgcorrect.method = 'rma', normalize.method = 'qspline', pmcorrect.method = 'pmonly', summary.method='liwong')

For further analyses I'd like to exclude genes that are absent in all chips.

thanks a lot for your help,

Arne

------------------------------

Message: 5
Date: Wed, 8 Oct 2003 14:59:11 +0200
From: "Stefano Barbi" <stefanobarbi@libero.it>
Subject: [BioC] order restricted inference
To: <bioconductor@stat.math.ethz.ch>
Message-ID: <002101c38d9b$f7c95500$81081b9d@BARBI>
Content-Type: text/plain; charset="iso-8859-1"

Dear all,

I wonder if anyone has implemented, or intends to implement, the procedure described in Peddada et al., "Gene selection and clustering for time-course and dose-response microarray experiments using order-restricted inference", in the Bioconductor or R environment. If not, do you know if there are other packages available for R dealing with order-restricted inference? Lastly, I would appreciate any suggestions of other approaches to conveniently classify time profiles.
Thank you in advance.

Best wishes,
Stefano.

------------------------------

Message: 6
Date: Wed, 08 Oct 2003 06:33:01 -0700
From: rossini@blindglobe.net (A.J. Rossini)
Subject: Re: [BioC] Affy: Present calls in an eset
To: <arne.muller@aventis.com>
Cc: bioconductor@stat.math.ethz.ch
Message-ID: <85llrvn9bm.fsf@blindglobe.net>
Content-Type: text/plain; charset=us-ascii

<arne.muller@aventis.com> writes:

> Hello,
>
> I'm quite new to Bioconductor/affy, and I was wondering if there's a
> simple way to include the absent/present call for a gene in the
> output file generated with write.exprs(eset, file='boo') in the affy
> package.
>
> The eset was generated with
>
> eset <- expresso(cel, bgcorrect.method = 'rma', normalize.method =
> 'qspline', pmcorrect.method = 'pmonly',
> summary.method='liwong')
>
> For further analyses I'd like to exclude genes that are absent in all
> chips.

That's tough. It isn't clear what a sensible definition of absent is. Or present.

Do you mean "expressed"? "Differentially expressed"? "Sort of differentially expressed but not too weakly expressed"? For any of these, you'll need a precise definition (there isn't any in Bioconductor), and you can compute your own.

(I know that MAS will make these calls; I'm only familiar with Rosetta Resolver's variant, and they don't really make sense to me -- to be precise, I know numerically how they are derived, but fail to see why they realistically connect biologically or technologically without a great deal of assumptions and a wild imagination).
best,
-tony

--
rossini@u.washington.edu http://www.analytics.washington.edu/
Biomedical and Health Informatics   University of Washington
Biostatistics, SCHARP/HVTN   Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC (M/W): 206-667-7025 FAX=206-667-4812 | use Email

------------------------------

Message: 7
Date: Wed, 8 Oct 2003 15:47:42 +0200
From: <arne.muller@aventis.com>
Subject: RE: [BioC] Affy: Present calls in an eset
To: <rossini@u.washington.edu>
Cc: bioconductor@stat.math.ethz.ch
Message-ID: <c80ecafa2acc1b45be45d133ed660ade410b0e@crbsmxsusr04.pharma.aventis.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

I get your point about interpreting absent/present calls. Technically it's a nice feature, because one can just discard the majority of the genes on the chip for further analysis. In fact I think absent/present calls make sense in terms of biology, since just a fraction of the genes are really expressed at a time. How to express this numerically is a different story (and I guess a difficult one).

With MAS the calls can be calculated anyway, can't they? So it'd be nice (at least for "completeness") to add a "mascall" method to the exprSet objects generated by affy. What do you think?

By the way, if you ignore the call, do you set an arbitrary intensity cutoff later in your analysis, or do you just rely on the statistics (anova p-value or whatever)?

regards,

Arne
------------------------------

Message: 8
Date: Tue, 07 Oct 2003 17:18:14 PDT
From: donghu@itsa.ucsf.edu
Subject: [BioC] Histogram and boxplot of MM data
To: bioconductor@stat.math.ethz.ch
Message-ID: <200310080018.h980IEv19068@itsa.ucsf.edu>
Content-Type: text/plain

Hi,

In Bioconductor, what data does "hist" or "boxplot" use by default? Is it PM data? How can I make similar plots with MM data?

Thanks.

Donglei Hu

------------------------------

Message: 9
Date: Wed, 08 Oct 2003 09:18:22 -0400
From: Naomi Altman <naomi@stat.psu.edu>
Subject: [BioC] marrayClass - how to extract expr values for a gene
To: bioconductor@stat.math.ethz.ch
Message-ID: <6.0.0.22.2.20031008091634.01c52d10@stat.psu.edu>
Content-Type: text/plain; charset="us-ascii"; format=flowed

On spotted arrays with duplicate spots for each gene, I would like to extract all the normalized expression values for each gene. How can I do this?

Thanks,

Naomi S. Altman                      814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                  814-863-7114 (fax)
Penn State University                814-865-1348 (Statistics)
University Park, PA 16802-2111

------------------------------

Message: 10
Date: Wed, 8 Oct 2003 12:02:36 +0200 (CEST)
From: edoardo missiaglia <edo_missiaglia@yahoo.it>
Subject: [BioC] time-course experiments
To: bioconductor@stat.math.ethz.ch
Message-ID: <20031008100236.12628.qmail@web11701.mail.yahoo.com>
Content-Type: text/plain; charset=iso-8859-1

Dear all,

I am now working on some time-course experiments and I have applied some classical statistical methods to them to identify genes that change their expression between time points. However, I have read a few papers (such as Peddada et al., Gene selection and clustering for time-course and dose-response microarray experiments using order-restricted inference; Guo, X. et al., Statistical significance analysis of longitudinal gene expression data; etc.) that describe specific methods for the analysis of this type of data.

Unfortunately my background (I am a biologist) makes it difficult to turn the algorithms reported in these papers into something usable in R. At the same time, I could not find packages in Bioconductor that address this kind of problem (there is only GeneTS, written by Korbinian Strimmer, which is useful for cyclic time-course experiments). I was wondering if anybody has already developed a package or some functions usable in R, specifically designed for time-course experiments, that take the particular structure of this data into account. Otherwise, is there anybody interested in developing something from scratch?

Thank you very much in advance for your help.

Best wishes,

edoardo

------------------------------

End of Bioconductor Digest, Vol 8, Issue 15
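
To make the rank remark at the top of this reply concrete, here is a small sketch in base R. It is not taken from Gary Churchill's program; the factor names (group, time), the 2 x 3 layout and the choice of which cell goes missing are all made up for illustration. It shows one way missing data can break a least squares fit: when every array in one design cell is lost, the interaction column for that cell is all zeros and X'X can no longer be inverted.

```r
# Hypothetical 2 x 3 design: 'group' and 'time' are made-up factor names,
# with 3 replicate arrays per cell.
design <- expand.grid(rep   = 1:3,
                      group = factor(c("ctrl", "trt")),
                      time  = factor(c("0h", "4h", "8h")))

# Complete data: the interaction model has 6 columns and full rank.
X_full <- model.matrix(~ group * time, data = design)

# Lose every array in one cell (trt at 8h). Factor levels are kept,
# so the design matrix still has 6 columns.
incomplete <- design[!(design$group == "trt" & design$time == "8h"), ]
X_miss <- model.matrix(~ group * time, data = incomplete)

rank_full <- qr(X_full)$rank
rank_miss <- qr(X_miss)$rank
print(c(columns = ncol(X_miss), rank_full = rank_full, rank_miss = rank_miss))

# Inverting X'X directly, as in the textbook normal equations, now fails.
inv <- tryCatch(solve(crossprod(X_miss)), error = function(e) e)
print(inherits(inv, "error"))
```

Losing only some of the replicates in a cell would not reduce the rank in this toy layout; it is the completely empty cell that does it. Likelihood-based mixed-model fits are the usual way around the balance requirement.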

Microarray Clustering Cancer affy graph SAGElyzer Rgraphviz Rdbi RBGL • 1.3k views

updated 21.1 years ago by Douglas Bates ▴ 180 • written 21.1 years ago by Baker, Stephen ▴ 160


Douglas Bates ▴ 180

@douglas-bates-5


Although you didn't say so, I presume you were replying to the message on time-course experiments. (You quoted 10 different messages from a digest in your reply.)

Is there a reason for not using the lme function in the nlme package to obtain the maximum likelihood or REML estimates for the mixed-effects model? Even though I am one of the authors of lme I am not being coy in asking this. I'm not sure exactly what the mixed-effects model would be and it is possible that the model could be fit easily in SAS PROC MIXED but not with lme. If that is the case then we (Saikat DebRoy and I) could take this into account as we redesign lme for R.

"Baker, Stephen" <stephen.baker@umassmed.edu> writes:

> Gary Churchill at the Jackson Labs in Maine has an R program on his
> website for performing mixed models ANOVA on microarray data. The only
> problem with this is it uses least squares to fit the model (which would
> include a within-subjects factor for the time effect) and would requires
> that there are no missing data points and all subjects being measured at
> the same time points. This is because the least squares solution
> involves inverting a matrix and missing data would make it not of full
> rank.
>
> An alternative approach which wouldn't be done in R would be to use PROC
> MIXED in the SAS stats package. This uses maximum likelihood to fit
> mixed models and works well. If you really want to try to do it in R,
> Yudi Pawitan at Dept. of Stats at University of Cork in Ireland has a
> book and a set of R programs which would give you a leg up on it:
>
> http://statistics.ucc.ie/staff/yudi/likelihood/index.htm

> Date: Wed, 8 Oct 2003 12:02:36 +0200 (CEST)
> From: edoardo missiaglia <edo_missiaglia@yahoo.it>
> Subject: [BioC] time-course experiments
> To: bioconductor@stat.math.ethz.ch
> Message-ID: <20031008100236.12628.qmail@web11701.mail.yahoo.com>
> Content-Type: text/plain; charset=iso-8859-1
>
> Dear all,
>
> I am now working on some time-course experiments and I
> have applied to them some classical statistic methods
> to identify genes that change their expression between
> time points. However I have read few papers (such as
> Peddada et al. Gene selection and clustering for
> time-course and dose-response microarray experiments
> using order-restricted inference; GUO, X et al
> Statistical significance analysis of longitudinal gene expression data;
> etc..) where they describe specific methods for the analysis of this
> type of data. Unfortunately my background (I am biologist) make
> difficult to transform the algorithms reported in these papers in
> something usable in R. In the same time, I could not find packages in
> bioconductor that face this kind of problems ( there is only GeneTS
> written by Korbinian Strimmer, that is useful in a cyclic time-course
> experiment). I was wondering if anybody has already developed a package
> or some functions usable in R specifically designed for time-course
> experiment that consider the particular structure of this data.
> Otherwise is there anybody interest in developing something from
> scratch? Thank you very much in advance for your help.
>
> Best wishes,
>
> edoardo

--
Douglas Bates                            bates@stat.wisc.edu
Statistics Department                    608/262-2598
University of Wisconsin - Madison        http://www.stat.wisc.edu/~bates/
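
For anyone who wants to try the lme route, a minimal sketch follows. The thread never pins down the actual model, so everything specific here is an assumption: the simulated single-gene data, the column names (subject, time, expr), the linear time trend and the random intercept per subject. The point is only that lme accepts unbalanced data (some subjects missing some time points) and can return either REML or ML estimates.

```r
library(nlme)  # recommended package that ships with R

set.seed(1)

# Hypothetical single-gene time course: 8 subjects, nominal times 0, 2, 4, 8 h.
d <- expand.grid(subject = factor(1:8), time = c(0, 2, 4, 8))
subject_effect <- rnorm(8, sd = 0.5)
d$expr <- 7 + 0.3 * d$time +
  subject_effect[as.integer(d$subject)] +
  rnorm(nrow(d), sd = 0.2)

# Make the data unbalanced by dropping six observations.
d <- d[-c(3, 10, 17, 18, 25, 31), ]

# Random intercept per subject, linear trend in time (both assumed).
fit_reml <- lme(expr ~ time, random = ~ 1 | subject, data = d)  # REML is the default
fit_ml   <- update(fit_reml, method = "ML")                     # maximum likelihood

print(fixef(fit_reml))
print(fixef(fit_ml))
```

In a real experiment the fixed and random parts would follow the design (treatment groups, array or dye effects, possibly time as a factor), and the fit would be repeated gene by gene.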

written 21.1 years ago by Douglas Bates ▴ 180
