Question

edgeR warning message when running Trended Dispersion

Natasha ▴ 440

@natasha-4640

Last seen 10.6 years ago

Dear List,

I am also trying edgeR on my data (3 groups, 2 reps each). Bacterial samples.

design
  condition pair
1 Cont      1
2 Cont      3
3 Trt1      1
4 Trt1      3
5 Trt2      1
6 Trt2      3

However, when I run the following code I get a warning message, and I wanted to know its significance for downstream analysis.

----------
y = DGEList(counts=gene.counts, group=group)
str(y)
y$samples

dim(y$counts) #5578 6

keep = rowSums(cpm(y)>10) >= 3
table(keep)
#FALSE TRUE
# 1064 4514

y.filt = y[keep, ]
y.filt$samples$lib.size = colSums(y.filt$counts)
y.filt = calcNormFactors(y.filt)

## Design Matrix
design = model.matrix(~pair+group)
colnames(design) = gsub("group","",colnames(design))
design

## Estimating Dispersion
y.filt = estimateGLMCommonDisp(y.filt, design, verbose=T)
#Disp = 0.03799 , BCV = 0.1949
y.filt = estimateGLMTrendedDisp(y.filt,design)
#Warning message:
#In binGLMDispersion(y, design, min.n = min.n, offset = offset, method = method.bin, :
# With 4514 genes and setting the parameter minimum number (min.n) of genes per bin to 500, there are only 5 bins. Using 5 bins here means that the minimum number of genes in each of the 5 bins is in fact 515. This number of bins and minimum number of genes per bin may not be sufficient for reliable estimation of a trend on the dispersions.
y.filt = estimateGLMTagwiseDisp(y.filt,design)
--------------
sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=C LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] gdata_2.12.0 WriteXLS_2.3.0 edgeR_2.6.10 limma_3.14.3

loaded via a namespace (and not attached):
[1] gtools_2.7.0
-------

Any help, suggestion and advice much appreciated.

Many Thanks,
Natasha

edgeR • 3.1k views

updated 11.9 years ago by Gordon Smyth 52k • written 11.9 years ago by Natasha ▴ 440

score 0 · Answer 1 · 2013-05-15

Dear List,

Sorry for the repost; I realised I forgot to add some info. I have also attached a BCV plot.

I am also trying edgeR on my data (3 groups, 2 reps each). Bacterial samples.

However, when I run the following code I get a warning message, and I wanted to know its significance for downstream analysis.

----------
> Targets
  group pair
1 Cont  1
2 Cont  3
3 Trt1  1
4 Trt1  3
5 Trt2  1
6 Trt2  3

> y = DGEList(counts=gene.counts, group=group)
> dim(y$counts) #5578 6

> keep = rowSums(cpm(y)>10) >= 3
> table(keep)
#FALSE TRUE
# 1064 4514

> y.filt = y[keep, ]
> y.filt$samples$lib.size = colSums(y.filt$counts)
> y.filt = calcNormFactors(y.filt)
> y.filt$samples
       group lib.size norm.factors
Cont_1 Cont  1356517  0.9656755
Cont_3 Cont  1414900  1.1070829
Trt1_1 Trt1  1382278  1.0074343
Trt1_3 Trt1  1470642  1.0018683
Trt2_1 Trt2  1379381  0.8713614
Trt2_3 Trt2  1383889  1.0635623

> design = model.matrix(~pair+group)
> y.filt = estimateGLMCommonDisp(y.filt, design, verbose=T)
#Disp = 0.03799 , BCV = 0.1949
> y.filt = estimateGLMTrendedDisp(y.filt,design)
#Warning message:
#In binGLMDispersion(y, design, min.n = min.n, offset = offset, method = method.bin, :
# With 4514 genes and setting the parameter minimum number (min.n) of genes per bin to 500, there are only 5 bins. Using 5 bins here means that the minimum number of genes in each of the 5 bins is in fact 515. This number of bins and minimum number of genes per bin may not be sufficient for reliable estimation of a trend on the dispersions.
> y.filt = estimateGLMTagwiseDisp(y.filt,design)
> plotBCV(y.filt) # Plot attached
--------------
sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=C LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] gdata_2.12.0 WriteXLS_2.3.0 edgeR_2.6.10 limma_3.14.3

loaded via a namespace (and not attached):
[1] gtools_2.7.0
-------

Any help, suggestion and advice much appreciated.

Many Thanks,
Natasha

-------------- next part --------------
A non-text attachment was scrubbed...
Name: BCVplots.png
Type: image/png
Size: 18115 bytes
Desc: BCVplots.png
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20130515/1b4aef0e/attachment.png>
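For readers following along, the paired design in the Targets table can be reproduced in base R. This is only a minimal sketch: it assumes `pair` and `group` are coded as factors (the post does not show how they were created), and the `targets` data frame is a hypothetical stand-in for the poster's sample sheet.

```r
# Hypothetical reconstruction of the Targets table from the post.
targets <- data.frame(
  group = factor(c("Cont", "Cont", "Trt1", "Trt1", "Trt2", "Trt2")),
  pair  = factor(c(1, 3, 1, 3, 1, 3))
)

# Additive model: a pair effect plus a treatment effect.
design <- model.matrix(~ pair + group, data = targets)

# Same column renaming as in the post: strip the "group" prefix.
colnames(design) <- gsub("group", "", colnames(design))
print(design)

# Residual degrees of freedom left over for estimating dispersion.
resid_df <- nrow(design) - qr(design)$rank
print(resid_df)
```

With six libraries and four coefficients (intercept, one pair effect, two treatment effects), two residual degrees of freedom remain, which is worth keeping in mind when reading the dispersion estimates.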

score 0 · Answer 2 · 2013-05-16

Gordon Smyth 52k

@gordon-smyth

Last seen 54 minutes ago

WEHI, Melbourne, Australia

Dear Natasha,

Please follow the posting guide

http://www.bioconductor.org/help/mailing-list/posting-guide/

and "Ensure that you are using the latest Bioconductor release".

Your software is two bioconductor releases behind.

Best wishes
Gordon

11.9 years ago Gordon Smyth 52k

Dear Prof. Smyth,

Thank you for your reply.

Yes, I eventually did upgrade and realised that the warning message was no longer there.

However, if I may ask a related question: I decided to try the same data (but as an unpaired analysis) and got an error at the tagwise dispersion step (estimateTagwiseDisp), using the latest version of R.

Code:
> y2.filt = y[keep, ]
> design2 = model.matrix(~group)

> y2.filt = estimateCommonDisp(y2.filt, design2, verbose=T)
#Disp = 0.60835 , BCV = 0.78

> y2.filt = estimateTagwiseDisp(y2.filt,design2)
#Error in prior.n/ntags * m0 : non-conformable arrays

I do not understand the error above. I also tried the trended dispersion and got another error:

> y3.filt = y2.filt
> y3.filt = estimateTrendedDisp(y3.filt,design2)
#Error in estimateTrendedDisp(y3.filt, design2) :
# object 'dispersion' not found

Many Thanks,
Natasha

11.9 years ago Natasha ▴ 440

Hi Natasha,

On Thu, May 16, 2013 at 2:01 AM, Natasha Sahgal wrote:
>> y2.filt = estimateTagwiseDisp(y2.filt,design2)
> #Error in prior.n/ntags * m0 : non-conformable arrays

What is the output of `dim(y2.filt$counts)` and `dim(design2)`?

-steve

--
Steve Lianoglou
Computational Biologist
Department of Bioinformatics and Computational Biology
Genentech

11.9 years ago Steve Lianoglou ★ 13k

Hi Steve,

Thanks for your reply. The information you asked for is below:

> dim(y2.filt$counts)
[1] 4514 6
> dim(design2)
[1] 6 3

Many Thanks,
Natasha
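The two dimensions reported above can be compared directly: a design matrix lines up with a count matrix when it has one row per library. The base-R sketch below uses placeholder matrices of the reported sizes; the helper name `design_matches_counts` is hypothetical and is not an edgeR function. The last step only shows where R's generic "non-conformable arrays" message comes from (element-wise arithmetic on arrays of different shape); it makes no claim about what edgeR does internally.

```r
# Placeholder objects with the dimensions reported in the thread.
counts  <- matrix(0L, nrow = 4514, ncol = 6)
design2 <- matrix(0,  nrow = 6,    ncol = 3)

# Hypothetical helper: one design row is needed per column (library) of counts.
design_matches_counts <- function(counts, design) {
  ncol(counts) == nrow(design)
}

ok <- design_matches_counts(counts, design2)
print(ok)

# Element-wise arithmetic on arrays of different shape raises the same
# generic R error text that appears in the post.
msg <- tryCatch(counts * design2, error = function(e) conditionMessage(e))
print(msg)
```

Since the check passes (6 libraries, 6 design rows), the shapes of the two objects are consistent with each other, which suggests the error arises from how the design matrix is being used rather than from its size.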

11.9 years ago Natasha ▴ 440

Dear Natasha,

You are trying to pass a design matrix to "classic" edgeR commands (estimateCommonDisp etc) that do not accept a design matrix as an argument and, not surprisingly, this results in an error. If you want to specify a design matrix, regardless of the design, then you must use estimateGLMCommonDisp etc instead. Please refer to the help pages for these functions to see what arguments can be passed.

BTW, in the current version of edgeR, there is a simpler interface available if you wish to use it which subsumes both the classic and GLM estimation routines. You can use:

y.filt <- estimateDisp(y.filt)

and this will work with or without a design matrix, and will compute the common, trended and tagwise dispersions all in one step.

Best wishes
Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
http://www.statsci.org/smyth
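The distinction between the two families of functions can be sketched as follows. This is an illustrative example under stated assumptions, not the poster's analysis: the counts are simulated with `rnbinom`, the object names (`y.classic`, `y.glm`, `y.all`) are made up for the example, and the edgeR part is skipped if the package is not installed.

```r
# Simulated stand-in for the poster's data: 5000 genes, 3 groups x 2 replicates.
set.seed(1)
group  <- factor(rep(c("Cont", "Trt1", "Trt2"), each = 2))
ngenes <- 5000
mu     <- 2^runif(ngenes, 4, 10)   # one mean per gene, recycled across samples
counts <- matrix(rnbinom(ngenes * 6, mu = mu, size = 20), ncol = 6)

if (requireNamespace("edgeR", quietly = TRUE)) {
  library(edgeR)
  y <- DGEList(counts = counts, group = group)
  y <- calcNormFactors(y)

  # Classic route: no design matrix; groups are taken from y$samples$group.
  y.classic <- estimateCommonDisp(y)
  y.classic <- estimateTagwiseDisp(y.classic)

  # GLM route: the design matrix is passed explicitly to each step.
  design <- model.matrix(~ group)
  y.glm <- estimateGLMCommonDisp(y, design)
  y.glm <- estimateGLMTrendedDisp(y.glm, design)
  y.glm <- estimateGLMTagwiseDisp(y.glm, design)

  # Combined interface: common, trended and tagwise dispersions in one call.
  y.all <- estimateDisp(y, design)
}
```

The help pages (`?estimateCommonDisp`, `?estimateGLMCommonDisp`, `?estimateDisp`) remain the authority on which arguments each function accepts in the installed version.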

11.9 years ago Gordon Smyth 52k

Dear Prof. Smyth,

Thank you for your response.

True, I did not realise my mistake of passing a design matrix to the classic method! I guess, in trying various combinations I managed to confuse myself.

It worked now!

Many Thanks,
Natasha

11.9 years ago Natasha ▴ 440