LIMMA: Why does eBayes expression differ from observed?

0

Entering edit mode

Edwin Groot ▴ 230

@edwin-groot-3606

Last seen 10.2 years ago

I am getting odd results from a common reference design analysis of two-colour data in LIMMA. Previously I had analyzed only simple-comparison designs. Hopefully you can help restore credibility of Bioconductor to my supervisor. Why does the AveExpr and logFC reported in topTable() differ from the replicates in my MA object? The analysis is a textbook example of comparing a series of mutants to the wild type. Wild type is always green. The summary is as follows: lmfit(MA, refdesign) eBayes(fit) topTable(fit, coef="mut1") #Regenerate the RG from MA RG.MA(MA) Backcalculating the topTable AveExpr and logFC to green, red and FC gives the following expressions as an example: WT: 31 mut1: 298 FC: 9.5 Compare that to the average of 9 WT and 3 mut1 in the RG.MA(MA) list: WT: 58 mut1: 611 FC: 10.5 A survey of other probes gives an over and underestimate of the observed RG from 1.5 to 5 times. What is the explanation for that? What should I troubleshoot, besides looking at the usual BG and FG distributions and MA plots? Regards, Edwin --- Dr. Edwin Groot, postdoctoral associate AG Laux Institut fuer Biologie III Schaenzlestr. 1 79104 Freiburg, Deutschland +49 761-2032945

limma limma • 1.4k views

ADD COMMENT • link 15.3 years ago Edwin Groot ▴ 230

0

Entering edit mode

Edwin Groot ▴ 230

@edwin-groot-3606

Last seen 10.2 years ago

On Thu, 30 Jul 2009 14:57:43 +0200 Axel.Klenk at Actelion.Com wrote: > > Dear Edwin, > > first of all, the code you have posted cannot work because it never > assigns > the results > of your computations to any variables. In order to help you, we need > to > know at least > 1) what code you have really run, --- I shall clean out the comments and sent the code out tomorrow. Two of the mutants are in chromatin remodelling factors, and are expected to derepress a significant percentage of the ORFs. One of the mutants has a relatively trivial effect on the transcriptome. Does that violate any assumptions in the fitting steps? --- > 2) your refdesign matrix, > 3) the values from topTable() and RG.MA() for at least one example > probe, > 4) HOW you backcalculated R, G, and FC from topTable(), and > 5) the obligatory output of sessionInfo() although this doesn't > really > sound like a version issue. :-) > --- > sessionInfo() R version 2.7.1 (2008-06-23) i486-pc-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US .UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_N AME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTI FICATION=C attached base packages: [1] tools stats graphics grDevices utils datasets methods [8] base other attached packages: [1] geneplotter_1.18.0 annotate_1.18.0 xtable_1.5-5 [4] AnnotationDbi_1.2.2 RSQLite_0.7-1 DBI_0.2-4 [7] lattice_0.17-10 Biobase_2.0.1 limma_2.14.7 loaded via a namespace (and not attached): [1] grid_2.7.1 KernSmooth_2.22-22 RColorBrewer_1.0-2 --- > AFAIK, eBayes() will affect the t, p, and B statistics computed from > your M > values but not > the AveExpr and logFC. > > RG.MA() backcalculates background-adjusted and normalized R and G > values, > so I am > not sure what you mean by "observed" RG -- you don't get back the > original > R, Rb, G, Gb > from an MAList via RG.MA(). > > Cheers, > > - axel > > > Axel Klenk > Research Informatician > Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 Allschwil > / > Switzerland > > > > > > "Edwin Groot" > > <edwin.groot at="" biol=""> > ogie.uni-freiburg > To > .de> bioconductor at stat.math.ethz.ch > > Sent by: > cc > bioconductor-boun > > ces at stat.math.eth > Subject > z.ch [BioC] LIMMA: Why does eBayes > > expression differ from > observed? > > > 30.07.2009 12:38 > > > > > > > > > > > > > > I am getting odd results from a common reference design analysis of > two-colour data in LIMMA. Previously I had analyzed only > simple-comparison designs. Hopefully you can help restore credibility > of Bioconductor to my supervisor. > Why does the AveExpr and logFC reported in topTable() differ from the > replicates in my MA object? > > The analysis is a textbook example of comparing a series of mutants > to > the wild type. Wild type is always green. The summary is as follows: > > lmfit(MA, refdesign) > eBayes(fit) > topTable(fit, coef="mut1") > #Regenerate the RG from MA > RG.MA(MA) > > Backcalculating the topTable AveExpr and logFC to green, red and FC > gives the following expressions as an example: > WT: 31 mut1: 298 FC: 9.5 > Compare that to the average of 9 WT and 3 mut1 in the RG.MA(MA) list: > WT: 58 mut1: 611 FC: 10.5 > > A survey of other probes gives an over and underestimate of the > observed RG from 1.5 to 5 times. > What is the explanation for that? > What should I troubleshoot, besides looking at the usual BG and FG > distributions and MA plots? > > Regards, > Edwin > --- > Dr. Edwin Groot, postdoctoral associate > AG Laux > Institut fuer Biologie III > Schaenzlestr. 1 > 79104 Freiburg, Deutschland > +49 761-2032945 > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > > > The information of this email and in any file transmitted with it is > strictly confidential and may be legally privileged. > It is intended solely for the addressee. If you are not the intended > recipient, any copying, distribution or any other use of this email > is prohibited and may be unlawful. In such case, you should please > notify the sender immediately and destroy this email. > The content of this email is not legally binding unless confirmed by > letter. > Any views expressed in this message are those of the individual > sender, except where the message states otherwise and the sender is > authorised to state them to be the views of the sender's company. For > further information about Actelion please see our website at > http://www.actelion.com > Dr. Edwin Groot, postdoctoral associate AG Laux Institut fuer Biologie III Schaenzlestr. 1 79104 Freiburg, Deutschland +49 761-2032945

ADD COMMENT • link 15.3 years ago Edwin Groot ▴ 230

0

Entering edit mode

Edwin Groot ▴ 230

@edwin-groot-3606

Last seen 10.2 years ago

On Thu, 30 Jul 2009 14:57:43 +0200 Axel.Klenk at Actelion.Com wrote: > > Dear Edwin, > > first of all, the code you have posted cannot work because it never > assigns > the results > of your computations to any variables. In order to help you, we need > to > know at least > 1) what code you have really run, > 2) your refdesign matrix, > 3) the values from topTable() and RG.MA() for at least one example > probe, > 4) HOW you backcalculated R, G, and FC from topTable(), and > 5) the obligatory output of sessionInfo() although this doesn't > really > sound like a version issue. :-) > --- Here is the code, with minimal commenting: > refdesign <- modelMatrix(targets, ref="WT") > refdesign cf1 mut1 mut1_cf1 [1,] 0 1 0 [2,] 1 0 0 [3,] 0 0 1 [4,] 0 1 0 [5,] 1 0 0 [6,] 0 0 1 [7,] 0 1 0 [8,] 1 0 0 [9,] 0 0 1 > RG <- read.maimages(targets$FileName, columns = list(G = "gProcessedSignal", Gb = "gBGMeanSignal", R = "rProcessedSignal", Rb = "rBGMeanSignal"), annotation= c("FeatureNum", "Row", "Col", "ProbeUID", "ControlType", "ProbeName", "Description", "GeneName", "SystematicName")) > i <- RG$genes$ControlType==0 > RGnc <- RG[i,] #RGnc already contains BG-subtracted and normalized intensities > MA <- MA.RG(RGnc, bc.method="none") #Intensity spread among arrays dissimilar > MAAq <- normalizeBetweenArrays(MA, method="Aquantile") > fit <- lmFit(MAAq, refdesign) > fitWT <- eBayes(fit) > RGAq <- RG.MA(MAAq) > options(digits=2) > out.mut1 <- topTable(fitWT, coef="mut1", genelist=cbind(fit$genes$GeneName,RGnc$G[,1],RGnc$R[,1],RGAq$G[,1:9],R Gnc$G[,1],RGAq$R[,1],RGAq$R[,4],RGAq$R[,7],RGnc$R[,1])) #Average WT (green) intensity > out.mut1[,2] <- (2*2^out.mut1$AveExpr)/(2^out.mut1$logFC+1) #Average mut1 (red) intensity > out.mut1[,3] <- (2*2^out.mut1$AveExpr)/(2^-out.mut1$logFC+1) > out.mut1[,2:17] <- as.numeric(as.matrix(out.mut1[,2:17])) #Calculated mean WT intensity > for (g in 1:10) out.mut1[g,13] <- mean(as.numeric(as.matrix(out.mut1[g,4:12]))) #Calculated mean mut1 intensity > for (g in 1:10) out.mut1[g,17] <- mean(as.numeric(as.matrix(out.mut1[g,14:16]))) #Fold Change > out.mut1$logFC <- 2^out.mut1$logFC > colnames(out.mut1)[1:18] <- c("GeneName","AvWT","Avmut1","WT1","WT2","WT3","WT4","WT5","WT6","WT7" , "WT8","WT9","CalcWT","mut11","mut12","mut13","Calcmut1","FoldChange") > out.mut1[9,] GeneName AvWT Avmut1 WT1 WT2 WT3 WT4 WT5 WT6 WT7 WT8 WT9 CalcWT 10943 AT1G65370 31 298 64 57 37 56 47 59 69 64 69 58 mut11 10943 423 mut12 mut13 Calcmut1 FoldChange AveExpr t P.Value adj.P.Val B 10943 683 727 611 9.5 7.4 12 1.2e-06 0.0055 4.7 --- Pardon the hack with mean(). It did not like the factor levels in the data frame. The WT and mut1 replicates, "observed RG", from the MAAq object are in WT1 to Wt9 and mut11 to mut13. In the output table out.mut1, the expression calculated from topTable (AvWT and Avmut1) is about half that of the expression calculated from the MAAq object (CalcWT and Calcmut1). The question is why are they so different? The FC calculated from the MAAq object (10.5) also differs from the FC from topTable (9.5). Thanks in advance for your help, Edwin --- > AFAIK, eBayes() will affect the t, p, and B statistics computed from > your M > values but not > the AveExpr and logFC. > > RG.MA() backcalculates background-adjusted and normalized R and G > values, > so I am > not sure what you mean by "observed" RG -- you don't get back the > original > R, Rb, G, Gb > from an MAList via RG.MA(). > > Cheers, > > - axel > > > Axel Klenk > Research Informatician > Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 Allschwil > / > Switzerland > > > > > > "Edwin Groot" > > <edwin.groot at="" biol=""> > ogie.uni-freiburg > To > .de> bioconductor at stat.math.ethz.ch > > Sent by: > cc > bioconductor-boun > > ces at stat.math.eth > Subject > z.ch [BioC] LIMMA: Why does eBayes > > expression differ from > observed? > > > 30.07.2009 12:38 > > > > > > > > > > > > > > I am getting odd results from a common reference design analysis of > two-colour data in LIMMA. Previously I had analyzed only > simple-comparison designs. Hopefully you can help restore credibility > of Bioconductor to my supervisor. > Why does the AveExpr and logFC reported in topTable() differ from the > replicates in my MA object? > > The analysis is a textbook example of comparing a series of mutants > to > the wild type. Wild type is always green. The summary is as follows: > > lmfit(MA, refdesign) > eBayes(fit) > topTable(fit, coef="mut1") > #Regenerate the RG from MA > RG.MA(MA) > > Backcalculating the topTable AveExpr and logFC to green, red and FC > gives the following expressions as an example: > WT: 31 mut1: 298 FC: 9.5 > Compare that to the average of 9 WT and 3 mut1 in the RG.MA(MA) list: > WT: 58 mut1: 611 FC: 10.5 > > A survey of other probes gives an over and underestimate of the > observed RG from 1.5 to 5 times. > What is the explanation for that? > What should I troubleshoot, besides looking at the usual BG and FG > distributions and MA plots? > > Regards, > Edwin > --- > Dr. Edwin Groot, postdoctoral associate > AG Laux > Institut fuer Biologie III > Schaenzlestr. 1 > 79104 Freiburg, Deutschland > +49 761-2032945 > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor

ADD COMMENT • link 15.3 years ago Edwin Groot ▴ 230

0

Entering edit mode

Dear Edwin, as far as I understand what you're trying to do, there are two problems: First, according to ?topTable, AveExpr is the average expression over all (18) channels whereas logFC is the fold change for the three mut1 vs, WT only and you would need to also use the average fold change over all conditions to backcalculate the mean of your original R and G. Second, I think the formula you should use for that is the one used in RG.MA(): R <- 2^(A + M/2) G <- 2^(A - M/2) and not the one you're using. Last, I'm guessing from your original column names that these are Agilent arrays and AFAIK [gr]ProcessedSignal are loess-normalized by default (among others) which may be inappropriate in a common reference design if your common reference is very different from your second channel as you have indicated -- if you want to avoid that you should start from raw data using source = "agilent" in limma's read.maimages() and then apply the preprocessing of your choice. Hope this helps, - axel Axel Klenk Research Informatician Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 Allschwil / Switzerland "Edwin Groot" <edwin.groot at="" biol="" ogie.uni-freiburg="" to="" .de=""> "Bioconductor List" Sent by: <bioconductor at="" stat.math.ethz.ch=""> bioconductor-boun cc ces at stat.math.eth z.ch Subject Re: [BioC] LIMMA: Why does eBayes expression differ from observed? 31.07.2009 11:51 On Thu, 30 Jul 2009 14:57:43 +0200 Axel.Klenk at Actelion.Com wrote: > > Dear Edwin, > > first of all, the code you have posted cannot work because it never > assigns > the results > of your computations to any variables. In order to help you, we need > to > know at least > 1) what code you have really run, > 2) your refdesign matrix, > 3) the values from topTable() and RG.MA() for at least one example > probe, > 4) HOW you backcalculated R, G, and FC from topTable(), and > 5) the obligatory output of sessionInfo() although this doesn't > really > sound like a version issue. :-) > --- Here is the code, with minimal commenting: > refdesign <- modelMatrix(targets, ref="WT") > refdesign cf1 mut1 mut1_cf1 [1,] 0 1 0 [2,] 1 0 0 [3,] 0 0 1 [4,] 0 1 0 [5,] 1 0 0 [6,] 0 0 1 [7,] 0 1 0 [8,] 1 0 0 [9,] 0 0 1 > RG <- read.maimages(targets$FileName, columns = list(G = "gProcessedSignal", Gb = "gBGMeanSignal", R = "rProcessedSignal", Rb = "rBGMeanSignal"), annotation= c("FeatureNum", "Row", "Col", "ProbeUID", "ControlType", "ProbeName", "Description", "GeneName", "SystematicName")) > i <- RG$genes$ControlType==0 > RGnc <- RG[i,] #RGnc already contains BG-subtracted and normalized intensities > MA <- MA.RG(RGnc, bc.method="none") #Intensity spread among arrays dissimilar > MAAq <- normalizeBetweenArrays(MA, method="Aquantile") > fit <- lmFit(MAAq, refdesign) > fitWT <- eBayes(fit) > RGAq <- RG.MA(MAAq) > options(digits=2) > out.mut1 <- topTable(fitWT, coef="mut1", genelist=cbind(fit$genes$GeneName,RGnc$G[,1],RGnc$R[,1],RGAq$G[,1:9],R Gnc$G[,1],RGAq$R[,1],RGAq$R[,4],RGAq$R[,7],RGnc$R[,1])) #Average WT (green) intensity > out.mut1[,2] <- (2*2^out.mut1$AveExpr)/(2^out.mut1$logFC+1) #Average mut1 (red) intensity > out.mut1[,3] <- (2*2^out.mut1$AveExpr)/(2^-out.mut1$logFC+1) > out.mut1[,2:17] <- as.numeric(as.matrix(out.mut1[,2:17])) #Calculated mean WT intensity > for (g in 1:10) out.mut1[g,13] <- mean(as.numeric(as.matrix(out.mut1[g,4:12]))) #Calculated mean mut1 intensity > for (g in 1:10) out.mut1[g,17] <- mean(as.numeric(as.matrix(out.mut1[g,14:16]))) #Fold Change > out.mut1$logFC <- 2^out.mut1$logFC > colnames(out.mut1)[1:18] <- c("GeneName","AvWT","Avmut1","WT1","WT2","WT3","WT4","WT5","WT6","WT7" , "WT8","WT9","CalcWT","mut11","mut12","mut13","Calcmut1","FoldChange") > out.mut1[9,] GeneName AvWT Avmut1 WT1 WT2 WT3 WT4 WT5 WT6 WT7 WT8 WT9 CalcWT 10943 AT1G65370 31 298 64 57 37 56 47 59 69 64 69 58 mut11 10943 423 mut12 mut13 Calcmut1 FoldChange AveExpr t P.Value adj.P.Val B 10943 683 727 611 9.5 7.4 12 1.2e-06 0.0055 4.7 --- Pardon the hack with mean(). It did not like the factor levels in the data frame. The WT and mut1 replicates, "observed RG", from the MAAq object are in WT1 to Wt9 and mut11 to mut13. In the output table out.mut1, the expression calculated from topTable (AvWT and Avmut1) is about half that of the expression calculated from the MAAq object (CalcWT and Calcmut1). The question is why are they so different? The FC calculated from the MAAq object (10.5) also differs from the FC from topTable (9.5). Thanks in advance for your help, Edwin --- > AFAIK, eBayes() will affect the t, p, and B statistics computed from > your M > values but not > the AveExpr and logFC. > > RG.MA() backcalculates background-adjusted and normalized R and G > values, > so I am > not sure what you mean by "observed" RG -- you don't get back the > original > R, Rb, G, Gb > from an MAList via RG.MA(). > > Cheers, > > - axel > > > Axel Klenk > Research Informatician > Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 Allschwil > / > Switzerland > > > > > > "Edwin Groot" > > <edwin.groot at="" biol=""> > ogie.uni-freiburg > To > .de> bioconductor at stat.math.ethz.ch > > Sent by: > cc > bioconductor-boun > > ces at stat.math.eth > Subject > z.ch [BioC] LIMMA: Why does eBayes > > expression differ from > observed? > > > 30.07.2009 12:38 > > > > > > > > > > > > > > I am getting odd results from a common reference design analysis of > two-colour data in LIMMA. Previously I had analyzed only > simple-comparison designs. Hopefully you can help restore credibility > of Bioconductor to my supervisor. > Why does the AveExpr and logFC reported in topTable() differ from the > replicates in my MA object? > > The analysis is a textbook example of comparing a series of mutants > to > the wild type. Wild type is always green. The summary is as follows: > > lmfit(MA, refdesign) > eBayes(fit) > topTable(fit, coef="mut1") > #Regenerate the RG from MA > RG.MA(MA) > > Backcalculating the topTable AveExpr and logFC to green, red and FC > gives the following expressions as an example: > WT: 31 mut1: 298 FC: 9.5 > Compare that to the average of 9 WT and 3 mut1 in the RG.MA(MA) list: > WT: 58 mut1: 611 FC: 10.5 > > A survey of other probes gives an over and underestimate of the > observed RG from 1.5 to 5 times. > What is the explanation for that? > What should I troubleshoot, besides looking at the usual BG and FG > distributions and MA plots? > > Regards, > Edwin > --- > Dr. Edwin Groot, postdoctoral associate > AG Laux > Institut fuer Biologie III > Schaenzlestr. 1 > 79104 Freiburg, Deutschland > +49 761-2032945 > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor _______________________________________________ Bioconductor mailing list Bioconductor at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor The information of this email and in any file transmitted with it is strictly confidential and may be legally privileged. It is intended solely for the addressee. If you are not the intended recipient, any copying, distribution or any other use of this email is prohibited and may be unlawful. In such case, you should please notify the sender immediately and destroy this email. The content of this email is not legally binding unless confirmed by letter. Any views expressed in this message are those of the individual sender, except where the message states otherwise and the sender is authorised to state them to be the views of the sender's company. For further information about Actelion please see our website at http://www.actelion.com

ADD REPLY • link 15.3 years ago Axel Klenk ★ 1.0k

0

Entering edit mode

Hello Axel, Thanks for looking over my code. Yes, the arrays are Agilent, and the Feature Extraction program uses either Linear or LOWESS methods for normalization. Are you suggesting that I use control transcripts for normalization, when there are large changes in the transcriptome? The hybridizations include Agilent SpikeIn controls, but I do not know how to implement this in LIMMA. What do you think of only normalizing between arrays, using a Quantile method, as illustrated in Yang and Thorne (2003)? Do you have any idea of how to tell topTable() to output AveExpr and LogFC for only the contrast specified? This point is bothering my collegue, who wanted to know why certain transcripts were not at the top of the list. He wanted to see the replicate intensities. My short answer to this quibble, is that the expression was too variable, of course! Regards, Edwin --- On Fri, 31 Jul 2009 15:50:21 +0200 Axel.Klenk at Actelion.Com wrote: > > Dear Edwin, > > as far as I understand what you're trying to do, there are two > problems: > > First, according to ?topTable, AveExpr is the average expression over > all > (18) > channels whereas logFC is the fold change for the three mut1 vs, WT > only > and > you would need to also use the average fold change over all > conditions to > backcalculate the mean of your original R and G. > > Second, I think the formula you should use for that is the one used > in > RG.MA(): > R <- 2^(A + M/2) > G <- 2^(A - M/2) > and not the one you're using. > > Last, I'm guessing from your original column names that these are > Agilent > arrays > and AFAIK [gr]ProcessedSignal are loess-normalized by default (among > others) > which may be inappropriate in a common reference design if your > common > reference > is very different from your second channel as you have indicated -- > if you > want to avoid > that you should start from raw data using source = "agilent" in > limma's > read.maimages() > and then apply the preprocessing of your choice. > > Hope this helps, > > - axel > > > Axel Klenk > Research Informatician > Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 Allschwil > / > Switzerland > > > > > > "Edwin Groot" > > <edwin.groot at="" biol=""> > ogie.uni-freiburg > To > .de> "Bioconductor List" > > Sent by: > <bioconductor at="" stat.math.ethz.ch=""> > bioconductor-boun > cc > ces at stat.math.eth > > z.ch > Subject > Re: [BioC] LIMMA: Why does > eBayes > expression differ from > observed? > 31.07.2009 11:51 > > > > > > > > > > > > > > > > On Thu, 30 Jul 2009 14:57:43 +0200 > Axel.Klenk at Actelion.Com wrote: > > > > Dear Edwin, > > > > first of all, the code you have posted cannot work because it never > > assigns > > the results > > of your computations to any variables. In order to help you, we > need > > to > > know at least > > 1) what code you have really run, > > 2) your refdesign matrix, > > 3) the values from topTable() and RG.MA() for at least one example > > probe, > > 4) HOW you backcalculated R, G, and FC from topTable(), and > > 5) the obligatory output of sessionInfo() although this doesn't > > really > > sound like a version issue. :-) > > > > --- > Here is the code, with minimal commenting: > > refdesign <- modelMatrix(targets, ref="WT") > > refdesign > cf1 mut1 mut1_cf1 > [1,] 0 1 0 > [2,] 1 0 0 > [3,] 0 0 1 > [4,] 0 1 0 > [5,] 1 0 0 > [6,] 0 0 1 > [7,] 0 1 0 > [8,] 1 0 0 > [9,] 0 0 1 > > RG <- read.maimages(targets$FileName, columns = list(G = > "gProcessedSignal", Gb = "gBGMeanSignal", R = "rProcessedSignal", Rb > = > "rBGMeanSignal"), annotation= c("FeatureNum", "Row", "Col", > "ProbeUID", > "ControlType", "ProbeName", "Description", "GeneName", > "SystematicName")) > > i <- RG$genes$ControlType==0 > > RGnc <- RG[i,] > #RGnc already contains BG-subtracted and normalized intensities > > MA <- MA.RG(RGnc, bc.method="none") > #Intensity spread among arrays dissimilar > > MAAq <- normalizeBetweenArrays(MA, method="Aquantile") > > fit <- lmFit(MAAq, refdesign) > > > fitWT <- eBayes(fit) > > RGAq <- RG.MA(MAAq) > > options(digits=2) > > out.mut1 <- topTable(fitWT, coef="mut1", > genelist=cbind(fit$genes$GeneName,RGnc$G[,1],RGnc$R[,1],RGAq$G[,1:9],R Gnc$G[,1],RGAq$R[,1],RGAq$R[,4],RGAq$R[,7],RGnc$R[,1])) > > #Average WT (green) intensity > > out.mut1[,2] <- (2*2^out.mut1$AveExpr)/(2^out.mut1$logFC+1) > #Average mut1 (red) intensity > > out.mut1[,3] <- (2*2^out.mut1$AveExpr)/(2^-out.mut1$logFC+1) > > out.mut1[,2:17] <- as.numeric(as.matrix(out.mut1[,2:17])) > #Calculated mean WT intensity > > for (g in 1:10) out.mut1[g,13] <- > mean(as.numeric(as.matrix(out.mut1[g,4:12]))) > #Calculated mean mut1 intensity > > for (g in 1:10) out.mut1[g,17] <- > mean(as.numeric(as.matrix(out.mut1[g,14:16]))) > #Fold Change > > out.mut1$logFC <- 2^out.mut1$logFC > > colnames(out.mut1)[1:18] <- > c("GeneName","AvWT","Avmut1","WT1","WT2","WT3","WT4","WT5","WT6","WT7" , > "WT8","WT9","CalcWT","mut11","mut12","mut13","Calcmut1","FoldChange") > > out.mut1[9,] > GeneName AvWT Avmut1 WT1 WT2 WT3 WT4 WT5 WT6 WT7 WT8 WT9 > CalcWT > 10943 AT1G65370 31 298 64 57 37 56 47 59 69 64 69 > 58 > mut11 > 10943 423 > mut12 mut13 Calcmut1 FoldChange AveExpr t P.Value adj.P.Val > B > 10943 683 727 611 9.5 7.4 12 1.2e-06 0.0055 > 4.7 > --- > > Pardon the hack with mean(). It did not like the factor levels in the > data frame. > The WT and mut1 replicates, "observed RG", from the MAAq object are > in > WT1 to Wt9 and mut11 to mut13. > In the output table out.mut1, the expression calculated from topTable > (AvWT and Avmut1) is about half that of the expression calculated > from > the MAAq object (CalcWT and Calcmut1). The question is why are they > so > different? > The FC calculated from the MAAq object (10.5) also differs from the > FC > from topTable (9.5). > > Thanks in advance for your help, > Edwin > --- > > > AFAIK, eBayes() will affect the t, p, and B statistics computed > from > > your M > > values but not > > the AveExpr and logFC. > > > > RG.MA() backcalculates background-adjusted and normalized R and G > > values, > > so I am > > not sure what you mean by "observed" RG -- you don't get back the > > original > > R, Rb, G, Gb > > from an MAList via RG.MA(). > > > > Cheers, > > > > - axel > > > > > > Axel Klenk > > Research Informatician > > Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 > Allschwil > > / > > Switzerland > > > > > > > > > > > > > "Edwin Groot" > > > > <edwin.groot at="" biol=""> > > > ogie.uni-freiburg > > To > > .de> > bioconductor at stat.math.ethz.ch > > > > Sent by: > > cc > > bioconductor-boun > > > > ces at stat.math.eth > > Subject > > z.ch [BioC] LIMMA: Why does > eBayes > > > > expression differ from > > observed? > > > > > > > 30.07.2009 12:38 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I am getting odd results from a common reference design analysis of > > two-colour data in LIMMA. Previously I had analyzed only > > simple-comparison designs. Hopefully you can help restore > credibility > > of Bioconductor to my supervisor. > > Why does the AveExpr and logFC reported in topTable() differ from > the > > replicates in my MA object? > > > > The analysis is a textbook example of comparing a series of mutants > > to > > the wild type. Wild type is always green. The summary is as > follows: > > > > lmfit(MA, refdesign) > > eBayes(fit) > > topTable(fit, coef="mut1") > > #Regenerate the RG from MA > > RG.MA(MA) > > > > Backcalculating the topTable AveExpr and logFC to green, red and FC > > gives the following expressions as an example: > > WT: 31 mut1: 298 FC: 9.5 > > Compare that to the average of 9 WT and 3 mut1 in the RG.MA(MA) > list: > > WT: 58 mut1: 611 FC: 10.5 > > > > A survey of other probes gives an over and underestimate of the > > observed RG from 1.5 to 5 times. > > What is the explanation for that? > > What should I troubleshoot, besides looking at the usual BG and FG > > distributions and MA plots? > > > > Regards, > > Edwin > > --- > > Dr. Edwin Groot, postdoctoral associate > > AG Laux > > Institut fuer Biologie III > > Schaenzlestr. 1 > > 79104 Freiburg, Deutschland > > +49 761-2032945 > > > > _______________________________________________ > > Bioconductor mailing list > > Bioconductor at stat.math.ethz.ch > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > Search the archives: > > http://news.gmane.org/gmane.science.biology.informatics.conductor > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > > > The information of this email and in any file transmitted with it is > strictly confidential and may be legally privileged. > It is intended solely for the addressee. If you are not the intended > recipient, any copying, distribution or any other use of this email > is prohibited and may be unlawful. In such case, you should please > notify the sender immediately and destroy this email. > The content of this email is not legally binding unless confirmed by > letter. > Any views expressed in this message are those of the individual > sender, except where the message states otherwise and the sender is > authorised to state them to be the views of the sender's company. For > further information about Actelion please see our website at > http://www.actelion.com > Dr. Edwin Groot, postdoctoral associate AG Laux Institut fuer Biologie III Schaenzlestr. 1 79104 Freiburg, Deutschland +49 761-2032945

ADD REPLY • link 15.3 years ago Edwin Groot ▴ 230

Login before adding your answer.