edgeR, comparing models
1
0
Entering edit mode
Alpesh Querer ▴ 220
@alpesh-querer-4895
Last seen 10.1 years ago
Hello edgeR patrons, I have RNA-seq data for multiple samples with biological replicates, I want to look at the goodness of fit for fitting Poisson and NB models used by edgeR for common, trended and tag-wise dispersion. Does setting dispersion=0 in glmFit use the Poisson model? Also I am using the following code to generate and compare qqplots for the models(figure 2 of the most recent edgeR manuscript),please let me know if I am using the appropriate method. Also how to plot the blue points (genes with poor fit)? y<-estimateGLMCommonDisp(y,design) fitcommon <- glmFit(y,design) y <- estimateGLMTrendedDisp(y,design) fittrended <- glmFit(y,design) y <- estimateGLMTagwiseDisp(y,design) fittagwise <- glmFit(y,design) gcom <- gof(fitcommon) gtren <- gof(fittrended) gtag <- gof(fittagwise) zcom <- zscoreGamma(gcom$gof.statistics,shape=gcom$df/2,scale=2) ztren <- zscoreGamma(gtren$gof.statistics,shape=gtren$df/2,scale=2) ztag <- zscoreGamma(gtag$gof.statistics,shape=gtag$df/2,scale=2) pcom=zcom[is.finite(zcom) ] ptren=ztren[is.finite(ztren) ] ptag=ztag[is.finite(ztag) ] par(mfrow=c(3,1)) qqnorm(pcom) qqline(pcom) qqnorm(ptren) qqline(ptren) qqnorm(ptag) qqline(ptag) > sessionInfo() R version 2.15.0 (2012-03-30) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] splines stats graphics grDevices utils datasets methods [8] base other attached packages: [1] edgeR_2.6.3 limma_3.12.0 BiocInstaller_1.4.4 loaded via a namespace (and not attached): [1] tools_2.15.0 Thanks, Al [[alternative HTML version deleted]]
edgeR edgeR • 1.4k views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia
Dear Alpesh, What you're doing looks correct. However, if you are doing this analysis (Figure 2 from http://nar.oxfordjournals.org/content/40/10/4288 ) to decide whether you need to use tagwise dispersion for your data, note that we have never observed a real data set for which tagwise estimation is not required. Hence it seems unnecessary to make these plots as a routine diagnostic. Yes, dispersion=0 is Poisson. To plot highlighted blue points, see ?points as well as xy <- qqnorm(pcom) etc. Best wishes Gordon > Date: Mon, 21 May 2012 10:15:37 -0500 > From: Alpesh Querer <alpeshq at="" gmail.com=""> > To: Bioconductor mailing list <bioconductor at="" r-project.org=""> > Subject: [BioC] edgeR, comparing models > > Hello edgeR patrons, > > I have RNA-seq data for multiple samples with biological replicates, I want > to look at the goodness of fit for > fitting Poisson and NB models used by edgeR for common, trended and > tag-wise dispersion. Does setting dispersion=0 > in glmFit use the Poisson model? Also I am using the following code to > generate and compare qqplots for the models(figure 2 of > the most recent edgeR manuscript),please let me know if I am using the > appropriate method. Also how to plot the blue points (genes with poor fit)? > > y<-estimateGLMCommonDisp(y,design) > > fitcommon <- glmFit(y,design) > > y <- estimateGLMTrendedDisp(y,design) > > fittrended <- glmFit(y,design) > > y <- estimateGLMTagwiseDisp(y,design) > > > fittagwise <- glmFit(y,design) > > gcom <- gof(fitcommon) > gtren <- gof(fittrended) > gtag <- gof(fittagwise) > > zcom <- zscoreGamma(gcom$gof.statistics,shape=gcom$df/2,scale=2) > ztren <- zscoreGamma(gtren$gof.statistics,shape=gtren$df/2,scale=2) > ztag <- zscoreGamma(gtag$gof.statistics,shape=gtag$df/2,scale=2) > > > pcom=zcom[is.finite(zcom) ] > ptren=ztren[is.finite(ztren) ] > ptag=ztag[is.finite(ztag) ] > > par(mfrow=c(3,1)) > > qqnorm(pcom) > qqline(pcom) > > qqnorm(ptren) > qqline(ptren) > > qqnorm(ptag) > qqline(ptag) > > >> sessionInfo() > R version 2.15.0 (2012-03-30) > Platform: i386-pc-mingw32/i386 (32-bit) > > locale: > [1] LC_COLLATE=English_United States.1252 > [2] LC_CTYPE=English_United States.1252 > [3] LC_MONETARY=English_United States.1252 > [4] LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] splines stats graphics grDevices utils datasets methods > [8] base > > other attached packages: > [1] edgeR_2.6.3 limma_3.12.0 BiocInstaller_1.4.4 > > loaded via a namespace (and not attached): > [1] tools_2.15.0 > > > Thanks, > Al ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD COMMENT
0
Entering edit mode
Dear Alpesh, I've commited a change to edgeR today, so that you can now produce the qq-plot of the goodness of fit statistics by gof(fit, plot=TRUE) Best wishes Gordon --------------------------------------------- Professor Gordon K Smyth, Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Vic 3052, Australia. smyth at wehi.edu.au http://www.wehi.edu.au http://www.statsci.org/smyth On Wed, 23 May 2012, Gordon K Smyth wrote: > Dear Alpesh, > > What you're doing looks correct. > > However, if you are doing this analysis (Figure 2 from > http://nar.oxfordjournals.org/content/40/10/4288 ) to decide whether you need > to use tagwise dispersion for your data, note that we have never observed a > real data set for which tagwise estimation is not required. Hence it seems > unnecessary to make these plots as a routine diagnostic. > > Yes, dispersion=0 is Poisson. > > To plot highlighted blue points, see ?points as well as > > xy <- qqnorm(pcom) > > etc. > > Best wishes > Gordon > >> Date: Mon, 21 May 2012 10:15:37 -0500 >> From: Alpesh Querer <alpeshq at="" gmail.com=""> >> To: Bioconductor mailing list <bioconductor at="" r-project.org=""> >> Subject: [BioC] edgeR, comparing models >> >> Hello edgeR patrons, >> >> I have RNA-seq data for multiple samples with biological replicates, I want >> to look at the goodness of fit for >> fitting Poisson and NB models used by edgeR for common, trended and >> tag-wise dispersion. Does setting dispersion=0 >> in glmFit use the Poisson model? Also I am using the following code to >> generate and compare qqplots for the models(figure 2 of >> the most recent edgeR manuscript),please let me know if I am using the >> appropriate method. Also how to plot the blue points (genes with poor fit)? >> >> y<-estimateGLMCommonDisp(y,design) >> >> fitcommon <- glmFit(y,design) >> >> y <- estimateGLMTrendedDisp(y,design) >> >> fittrended <- glmFit(y,design) >> >> y <- estimateGLMTagwiseDisp(y,design) >> >> >> fittagwise <- glmFit(y,design) >> >> gcom <- gof(fitcommon) >> gtren <- gof(fittrended) >> gtag <- gof(fittagwise) >> >> zcom <- zscoreGamma(gcom$gof.statistics,shape=gcom$df/2,scale=2) >> ztren <- zscoreGamma(gtren$gof.statistics,shape=gtren$df/2,scale=2) >> ztag <- zscoreGamma(gtag$gof.statistics,shape=gtag$df/2,scale=2) >> >> >> pcom=zcom[is.finite(zcom) ] >> ptren=ztren[is.finite(ztren) ] >> ptag=ztag[is.finite(ztag) ] >> >> par(mfrow=c(3,1)) >> >> qqnorm(pcom) >> qqline(pcom) >> >> qqnorm(ptren) >> qqline(ptren) >> >> qqnorm(ptag) >> qqline(ptag) >> >> >>> sessionInfo() >> R version 2.15.0 (2012-03-30) >> Platform: i386-pc-mingw32/i386 (32-bit) >> >> locale: >> [1] LC_COLLATE=English_United States.1252 >> [2] LC_CTYPE=English_United States.1252 >> [3] LC_MONETARY=English_United States.1252 >> [4] LC_NUMERIC=C >> [5] LC_TIME=English_United States.1252 >> >> attached base packages: >> [1] splines stats graphics grDevices utils datasets methods >> [8] base >> >> other attached packages: >> [1] edgeR_2.6.3 limma_3.12.0 BiocInstaller_1.4.4 >> >> loaded via a namespace (and not attached): >> [1] tools_2.15.0 >> >> >> Thanks, >> Al > ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD REPLY

Login before adding your answer.

Traffic: 300 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6