Background correction (Normexp+offset)

Kachroo, Priyanka ▴ 60

@kachroo-priyanka-4292

Last seen 10.6 years ago

Dear All,

I need your help with some two-color microarray data analysis. After filtering by p-value and a fold-change cutoff of 1.5, I am left with very few differentially expressed genes. I use the normexp method for background correction with an offset of 50 (the commonly recommended value).

1. With offset=50, I get 11 downregulated and 31 upregulated genes.
2. With offset=25, I get 14 downregulated and 46 upregulated genes.
3. With offset=10, I get 20 downregulated and 93 upregulated genes.

I read on the limma/Bioconductor forum that a boxplot of the foreground and background intensities (green and red channels) should help decide whether background correction is needed. I made that boxplot but do not know how to interpret it. I could not attach the MA plots for offset values 10 and 25 to this email. Can someone guide me on how to interpret MA plots after background correction and on which offset value to use?

This is what the moderator writes about choosing the offset value:

"You can judge a good value for the offset by inspection of the MA-plots. If you really want a quantitative way to judge this, look at the component fit$df.prior after you use the eBayes() function in limma. The better you stabilize the variances, the larger will be df.prior and the greater will be the power to detect DE genes. Hence the offset which maximises df.prior is, in a sense, optimal."

When I run my code and type fit$df.prior, I get a value of 1.481457. How does this number help me decide on the offset?

Priyanka Kachroo
Graduate Assistant Research
Texas A&M University

Updated 13.8 years ago by Juan C Oliveros Collazos ▴ 190 • written 13.8 years ago by Kachroo, Priyanka ▴ 60

Juan C Oliveros Collazos ▴ 190

@juan-c-oliveros-collazos-2665

Last seen 10.6 years ago

Hi Priyanka,

I am pretty sure that the differences between cases 1, 2 and 3 are due to lowly expressed genes. The offset only noticeably affects the results for genes with very low intensities.

I always use normexp with offset=50. Then I look at the MA plot to decide which combination of fold change and adjusted p-value works best (I highlight the genes passing each filter on the MA plot).

If you have difficulty generating MA plots and comparing different filters, you can use our online tool FIESTA:

http://bioinfogp.cnb.csic.es/tools/FIESTA

This tool was developed to help users see graphically the effect of different thresholds on their results. You just need a tab-delimited text table with the numerical results for all genes on the microarray, including A and M values and p-values.

(Apologies if this is not the proper place to talk about a non-Bioconductor tool.)

Hope that helps,
OLI

On 07/12/2011 05:40 PM, Kachroo, Priyanka wrote:
> Dear All,
>
> I needed your help with some 2-color microarray data analysis. [...]
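The claim that the offset mainly matters for low-intensity spots can be checked with simple arithmetic. The sketch below is plain Python, not limma code. It assumes only that the offset is a constant added to both background-corrected channel intensities before the log-ratio is taken; the function name `log_ratio` and the intensity values are made up for illustration.

```python
import math


def log_ratio(red, green, offset=0.0):
    """Return M = log2((red + offset) / (green + offset)).

    red and green stand for background-corrected intensities (hypothetical
    values here); offset is the constant added to both channels.
    """
    return math.log2((red + offset) / (green + offset))


# Two made-up spots with the same raw 3-fold ratio but different brightness.
low_spot = (30.0, 10.0)       # near background
high_spot = (3000.0, 1000.0)  # well above background

cutoff = math.log2(1.5)  # fold-change cutoff of 1.5 on the log2 scale

for offset in (0, 10, 25, 50):
    m_low = log_ratio(*low_spot, offset=offset)
    m_high = log_ratio(*high_spot, offset=offset)
    print(f"offset={offset:>2}  low M={m_low:.3f} (passes: {m_low >= cutoff})  "
          f"high M={m_high:.3f} (passes: {m_high >= cutoff})")
```

With these numbers the bright spot keeps a log-ratio close to log2(3) at every offset, while the dim spot shrinks toward zero and falls below the 1.5-fold cutoff at offset 50. That is consistent with the gene counts dropping as the offset grows.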

Written 13.8 years ago by Juan C Oliveros Collazos ▴ 190

Juan C Oliveros Collazos ▴ 190

@juan-c-oliveros-collazos-2665

Last seen 10.6 years ago

Hi,

You can even use offset = 0 to get more genes with higher fold changes, but many of them will have very low intensities, will not be biologically relevant, and may be artifacts that introduce noise into the functional analysis.

Unfortunately, changing the offset will not solve the problem of the small sample size. I recommend using offset = 50 (the commonly recommended value) and checking graphically, with an MA plot, the effect of different fold-change filters.

By the way, you do not need p-values to use FIESTA.

Best,
OLI

On 07/12/2011 10:30 PM, Kachroo, Priyanka wrote:
> Dear Mr. Oliveros,
>
> I appreciate you taking the time to reply to my query. However, my sample size is small, so I do not get good adjusted p-values, and as you know, functional/Gene Ontology software hardly yields any results when given only a small number of differentially expressed genes. Therefore I wanted to know how to decide which offset to use for normexp, since lower offset values yield more genes. Is there a way to determine that?
>
> Priyanka Kachroo
> Graduate Assistant Research
> Texas A&M University
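The suggestion to check fold-change filters on an MA plot can be sketched as follows. This is plain Python rather than limma or FIESTA code; the names `ma_values` and `apply_fold_filter` and the spot intensities are all hypothetical. It uses the usual two-color definitions M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2, and it reports the A value of every spot that passes the filter, which is the information one would highlight on the plot.

```python
import math


def ma_values(red, green, offset=0.0):
    """Return (M, A) for one spot after adding offset to both channels."""
    r, g = math.log2(red + offset), math.log2(green + offset)
    return r - g, (r + g) / 2.0


def apply_fold_filter(spots, offset=0.0, fold=1.5):
    """Split spots into 'up' and 'down' lists of (M, A) by |M| >= log2(fold)."""
    cutoff = math.log2(fold)
    result = {"up": [], "down": []}
    for red, green in spots:
        m, a = ma_values(red, green, offset)
        if m >= cutoff:
            result["up"].append((m, a))
        elif m <= -cutoff:
            result["down"].append((m, a))
    return result


# Hypothetical background-corrected (red, green) intensities, not real data.
spots = [(30, 10), (12, 40), (60, 25), (500, 520), (3000, 1000), (800, 2400)]

for offset in (0, 50):
    hits = apply_fold_filter(spots, offset=offset)
    print(f"offset={offset}: {len(hits['up'])} up, {len(hits['down'])} down")
    for direction, values in hits.items():
        for m, a in values:
            print(f"  {direction:>4}  M={m:+.2f}  A={a:.2f}")
```

In this toy set, the spots that pass only at offset 0 all have A below 6, so they sit at the low-intensity end of the MA plot. Listing A next to M for each filtered spot is one way to see whether extra "hits" gained from a smaller offset are mostly near-background spots.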

Written 13.8 years ago by Juan C Oliveros Collazos ▴ 190
