intraspotCorrelation vs duplicateCorrelation
1
0
Entering edit mode
@gordon-smyth
Last seen 3 minutes ago
WEHI, Melbourne, Australia
Dear Naomi, duplicateCorrelation() calls mixedModel2Fit() (in the statmod package), which in turn calls glmgam.fit(). glmgam.fit() implements a Levenberg-Marquardt modification to the Fisher scoring algorithm to prevent divergence, so it is extremely reliable, much more so than a call to glm() would be. On the other hand, intraspotCorrelation() calls remlscore() (also in the statmod package), which does ordinary Fisher scoring for a related model without any modification to prevent divergence. So it is entirely possible that duplicateCorrelation() will work for a dataset for which intraspotCorrelation() does not. However, to get a NULL result from intraspotCorrelation(), the model fit would have to fail for every single gene in your dataset. Just failing for one or two would not cause a problem. Best wishes Gordon > Date: Thu, 24 Feb 2011 20:20:12 -0500 > From: Naomi Altman <naomi at="" stat.psu.edu=""> > To: bioconductor at r-project.org > Subject: [BioC] intraspotCorrelation vs duplicateCorrelation > Message-ID: <6.2.3.4.1.20110224200727.01ffea98 at imap.stat.psu.edu> > Content-Type: text/plain; charset="us-ascii"; format=flowed > > I was having problems with intraspotCorrelation for a single channel > analysis, so I decided to reorganize the data to use > duplicateCorrelation. I thought I would get the same answer, but I > did not. I hope someone can tell me why. (As you will see below, I > like the answer with duplicateCorrelation better!) > > MAp is the normalized 2-channel data. > I have a targets list called targetp with the treatment > information. The code is below. > > ############################################################## > # separate channel approach > ############################################################# > targetSingle <- targetsA2C(targetp) > rep=c(0,0,1,1,0,0,1,1,0,0,1,1) > designp=model.matrix(~-1+targetSingle$Target+rep,levels=unique(c(cy3 ,cy5))) > corfit =intraspotCorrelation(MAp, designp) > corfit$cor > NULL > ########################################################## > # the warning was > #In remlscore(y, X, Z) : reml: Max iterations exceeded > ########################################################## > > ############################################################ > # approach treating channels as separate arrays > ############################################################ > RGp=RG.MA(MAp) > Rp=log2(RGp$R) > Gp=log2(RGp$G) > exprP=cbind(Rp[,1],Gp[,1],Rp[,2],Gp[,2],Rp[,3],Gp[,3],Rp[,4],Gp[,4], Rp[,5],Gp[,5],Rp[,6],Gp[,6]) > corfitP = duplicateCorrelation(exprP, designp, block = > factor(c(1,1,2,2,3,3,4,4,5,5,6,6))) > corfitP$cor > [1] 0.4921254 > > sessionInfo() > R version 2.11.1 (2010-05-31) > i386-pc-mingw32 > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United > States.1252 LC_MONETARY=English_United States.1252 > [4] LC_NUMERIC=C LC_TIME=English_United > States.1252 > > attached base packages: > [1] tools stats graphics grDevices > utils datasets methods base > > other attached packages: > [1] statmod_1.4.8 rat2302cdf_2.6.0 > limma_3.4.5 affy_1.26.1 Biobase_2.8.0 > > loaded via a namespace (and not attached): > [1] affyio_1.16.0 preprocessCore_1.10.0 > > > Thanks, > Naomi ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
• 994 views
ADD COMMENT
0
Entering edit mode
Naomi Altman ★ 6.0k
@naomi-altman-380
Last seen 3.7 years ago
United States
Dear Gordon, Thanks for the information. There are 50 warnings: In remlscore(y, X, Z) : reml: Max iterations exceeded Since 50 is the maximum number stored by R in my current implementation, it does look like the model fit must have failed for every single gene. Thank you for your the explanation. --Naomi At 10:35 PM 2/25/2011, Gordon K Smyth wrote: >Dear Naomi, > >duplicateCorrelation() calls mixedModel2Fit() (in the statmod >package), which in turn calls glmgam.fit(). glmgam.fit() implements >a Levenberg-Marquardt modification to the Fisher scoring algorithm >to prevent divergence, so it is extremely reliable, much more so >than a call to glm() would be. On the other hand, >intraspotCorrelation() calls remlscore() (also in the statmod >package), which does ordinary Fisher scoring for a related model >without any modification to prevent divergence. So it is entirely >possible that duplicateCorrelation() will work for a dataset for >which intraspotCorrelation() does not. > >However, to get a NULL result from intraspotCorrelation(), the model >fit would have to fail for every single gene in your dataset. Just >failing for one or two would not cause a problem. > >Best wishes >Gordon > >>Date: Thu, 24 Feb 2011 20:20:12 -0500 >>From: Naomi Altman <naomi at="" stat.psu.edu=""> >>To: bioconductor at r-project.org >>Subject: [BioC] intraspotCorrelation vs duplicateCorrelation >>Message-ID: <6.2.3.4.1.20110224200727.01ffea98 at imap.stat.psu.edu> >>Content-Type: text/plain; charset="us-ascii"; format=flowed >> >>I was having problems with intraspotCorrelation for a single channel >>analysis, so I decided to reorganize the data to use >>duplicateCorrelation. I thought I would get the same answer, but I >>did not. I hope someone can tell me why. (As you will see below, I >>like the answer with duplicateCorrelation better!) >> >>MAp is the normalized 2-channel data. >>I have a targets list called targetp with the treatment >>information. The code is below. >> >>############################################################## >># separate channel approach >>############################################################# >>targetSingle <- targetsA2C(targetp) >>rep=c(0,0,1,1,0,0,1,1,0,0,1,1) >>designp=model.matrix(~-1+targetSingle$Target+rep,levels=unique(c(cy3 ,cy5))) >>corfit =intraspotCorrelation(MAp, designp) >>corfit$cor >>NULL >>########################################################## >># the warning was >>#In remlscore(y, X, Z) : reml: Max iterations exceeded >>########################################################## >> >>############################################################ >># approach treating channels as separate arrays >>############################################################ >>RGp=RG.MA(MAp) >>Rp=log2(RGp$R) >>Gp=log2(RGp$G) >>exprP=cbind(Rp[,1],Gp[,1],Rp[,2],Gp[,2],Rp[,3],Gp[,3],Rp[,4],Gp[,4], Rp[,5],Gp[,5],Rp[,6],Gp[,6]) >>corfitP = duplicateCorrelation(exprP, designp, block = >>factor(c(1,1,2,2,3,3,4,4,5,5,6,6))) >>corfitP$cor >>[1] 0.4921254 >> >>sessionInfo() >>R version 2.11.1 (2010-05-31) >>i386-pc-mingw32 >> >>locale: >>[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United >>States.1252 LC_MONETARY=English_United States.1252 >>[4] LC_NUMERIC=C LC_TIME=English_United >>States.1252 >> >>attached base packages: >>[1] tools stats graphics grDevices >>utils datasets methods base >> >>other attached packages: >>[1] statmod_1.4.8 rat2302cdf_2.6.0 >>limma_3.4.5 affy_1.26.1 Biobase_2.8.0 >> >>loaded via a namespace (and not attached): >>[1] affyio_1.16.0 preprocessCore_1.10.0 >> >> >>Thanks, >>Naomi > >_____________________________________________________________________ _ >The information in this email is confidential and inten...{{dropped:7}}
ADD COMMENT

Login before adding your answer.

Traffic: 680 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6