Paired two-color design
1
0
Entering edit mode
@january-weiner-4252
Last seen 5.5 years ago
European Union
Hello, I have a problem with the design for an experiment that I'm evaluating. There are two parts to that problem. The first part: we have two groups (A and B) after treatment, each in three replicate, and two-color arrays have been created, with a dye-swap for each replicate: Cy3 Cy5 A1 B1 B1 A1 A2 B2 B2 A2 A3 B3 B3 A3 Here is how I'm doing it currently, but I think that this might not be the optimal solution: design <- cbind( ex1 = c( -1, 1, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0 ), ex2 = c( 0, 0, 0, 0, -1, 1, 1, -1, 0, 0, 0, 0 ), ex3 = c( 0, 0, 0, 0, 0, 0, 0, 0, -1, 1, 1, -1 ) ) corfit <- intraspotCorrelation( rg, design ) # rg contains the normalized arrays fit <- lmscFit( rg, design, correlation= corfit$consensus.correlation ) cmtx <- makeContrasts( "(ex1 + ex2 + ex3)/3", levels= design ) fit <- contrasts.fit( fit, cmtx ) fit <- eBayes( fit ) Is this correct? Do I have to use intraspotCorrelation to detach the arrays? I think this should work somehow directly, but I can't figure out how. The second problem is this: we have the same setup as above, except that there are controls for A and B (before treatment). I call them ctrlA and ctrlB, respectively. There is only one biological replicate with a dye swap for each of these controls. The setup looks now like that: targets: Cy3 Cy5 ctrlA ctrlB ctrlB ctrlA A1 B1 B1 A1 A2 B2 B2 A2 A3 B3 B3 A3 I want to look for interaction between the treatment effect and the group, in other words I want the difference between (A vs controlA) - (B vs controlB). I use more or less the same approach as above (using intraspotCorrelation): t2 <- targetsA2C( t ) design <- model.matrix( ~ 0 + t$group ) colnames( design ) <- levels( t$group ) # colnames( design ) are: ctrlA, ctrlB, A1, A2, A3, B1, B2, B3 corfit <- intraSpotCorrelation( rg, design ) fit <- lmsciFit( rg, design, correlation= corfit$consensus.correlation ) cmtx <- makeContrasts( "(( A1 + A2 + A3 )/3 - ctrlA) - ( B1 + B2 + B3 )/3 - ctrlB))", levels= design ) fit <- contrasts.fit( fit, cmtx ) fit <- eBayes( fit ) Is that correct? Kind regards, J. -- -------- Dr. January Weiner 3 --------------------------------------
• 1.4k views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 5 hours ago
WEHI, Melbourne, Australia
Dear January, Are your dye-swaps technical replicates of the same RNA samples? That is, does "A1" refer to same RNA sample for the second array as does for the first? lmscFit() can't be used in conjunction with technical replicates, including dye-swaps. On the other hand, your paired design is exactly of the form discussed on page 41 of the limma User's Guide (dated 10 June 2012): http://bioconductor.org/packages/2.10/bioc/vignettes/limma/inst/doc/us ersguide.pdf or on page 56 of the limma User's Guide dated 31 July 2012: http://bioconductor.org/packages/2.11/bioc/vignettes/limma/inst/doc/us ersguide.pdf Best wishes Gordon > Date: Tue, 28 Aug 2012 08:17:39 +0200 > From: January Weiner <january.weiner at="" gmail.com=""> > To: bioconductor at r-project.org > Subject: [BioC] Paired two-color design > > Hello, > > I have a problem with the design for an experiment that I'm evaluating. > > There are two parts to that problem. > > The first part: we have two groups (A and B) after treatment, each in > three replicate, and two-color arrays have been created, with a > dye-swap for each replicate: > > Cy3 Cy5 > A1 B1 > B1 A1 > A2 B2 > B2 A2 > A3 B3 > B3 A3 > > Here is how I'm doing it currently, but I think that this might not be > the optimal solution: > > design <- cbind( ex1 = c( -1, 1, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0 ), ex2 > = c( 0, 0, 0, 0, -1, 1, 1, -1, 0, 0, 0, 0 ), ex3 = c( 0, 0, 0, 0, 0, > 0, 0, 0, -1, 1, 1, -1 ) ) > corfit <- intraspotCorrelation( rg, design ) # rg contains the normalized arrays > fit <- lmscFit( rg, design, correlation= corfit$consensus.correlation ) > cmtx <- makeContrasts( "(ex1 + ex2 + ex3)/3", levels= design ) > fit <- contrasts.fit( fit, cmtx ) > fit <- eBayes( fit ) > > Is this correct? Do I have to use intraspotCorrelation to detach the > arrays? I think this should work somehow directly, but I can't figure > out how. > > The second problem is this: we have the same setup as above, except > that there are controls for A and B (before treatment). I call them > ctrlA and ctrlB, respectively. There is only one biological replicate > with a dye swap for each of these controls. The setup looks now like > that: > > targets: > Cy3 Cy5 > ctrlA ctrlB > ctrlB ctrlA > A1 B1 > B1 A1 > A2 B2 > B2 A2 > A3 B3 > B3 A3 > > I want to look for interaction between the treatment effect and the > group, in other words I want the difference between (A vs controlA) - > (B vs controlB). > > I use more or less the same approach as above (using intraspotCorrelation): > > t2 <- targetsA2C( t ) > design <- model.matrix( ~ 0 + t$group ) > colnames( design ) <- levels( t$group ) > # colnames( design ) are: ctrlA, ctrlB, A1, A2, A3, B1, B2, B3 > > corfit <- intraSpotCorrelation( rg, design ) > fit <- lmsciFit( rg, design, correlation= corfit$consensus.correlation ) > cmtx <- makeContrasts( "(( A1 + A2 + A3 )/3 - ctrlA) - ( B1 + B2 + B3 > )/3 - ctrlB))", levels= design ) > fit <- contrasts.fit( fit, cmtx ) > fit <- eBayes( fit ) > > Is that correct? > > Kind regards, > > J. > > -- > -------- Dr. January Weiner 3 -------------------------------------- > ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD COMMENT
0
Entering edit mode
Dear Gordon, thank you for your quick answer. > Are your dye-swaps technical replicates of the same RNA samples? That is, > does "A1" refer to same RNA sample for the second array as does for the > first? Yes, they are dye-swaps only; the biological replicates are A1, A2, .. > lmscFit() can't be used in conjunction with technical replicates, including > dye-swaps. In that case, I'm totally lost. I have no idea how to bite the second part of my problem, the search for the interaction between treatment and group. > On the other hand, your paired design is exactly of the form discussed on > page 41 of the limma User's Guide (dated 10 June 2012): This is true. However, it was an attempt on the way to achieve the goal in the second part. I have no idea how to do it without separating the channels. Yes, had I designed the experiment, it would have been another design, but I can't go back in time. The two groups (mouse strains, actually) differ in their expression. However, the strains respond very differently to the treatment, and the goal is to find how these differences in response are on the gene expression level. The interaction (treatA-ctrlA)-(treatB-ctrlB) is what I want to test, but I don't know how to do it with the existing setup. Kind regards, j. -- -------- Dr. January Weiner 3 --------------------------------------
ADD REPLY
0
Entering edit mode
Dear January, I would analyse it like this: DyeSwapPair <- c(1,1,2,2,3,3,4,4) ctrlBvsA <- c(1,-1,0,0,0,0,0,0) BvsA=c(0,0,1,-1,1,-1,1,-1) design <- cbind(Dye=1,ctrlBvsA,BvsA) This produces a design matrix with a treatment BvsA effect, and a control BvsA effect, plus a probe-specific dye effect just for good measure. Then estimate the technical correlation. I'll write MA for your normalized MAList object. The value of corfit$consensus should be negative: corfit <- duplicateCorrelation(MA,design,block=DyeSwapPair) fit <- lmFit(MA,design,block=DyeSwapPair,correlation=corfit$consensus) Then test for the interaction effect: fit2 <- contrasts.fit(fit, c(0,-1,1)) fit2 <- eBayes(fit2) topTable(fit2) Best wishes Gordon On Tue, 28 Aug 2012, Gordon K Smyth wrote: >> The second problem is this: we have the same setup as above, except >> that there are controls for A and B (before treatment). I call them >> ctrlA and ctrlB, respectively. There is only one biological replicate >> with a dye swap for each of these controls. The setup looks now like >> that: >> >> targets: >> Cy3 Cy5 >> ctrlA ctrlB >> ctrlB ctrlA >> A1 B1 >> B1 A1 >> A2 B2 >> B2 A2 >> A3 B3 >> B3 A3 >> >> I want to look for interaction between the treatment effect and the >> group, in other words I want the difference between (A vs controlA) - >> (B vs controlB). >> >> I use more or less the same approach as above (using intraspotCorrelation): >> >> t2 <- targetsA2C( t ) >> design <- model.matrix( ~ 0 + t$group ) >> colnames( design ) <- levels( t$group ) >> # colnames( design ) are: ctrlA, ctrlB, A1, A2, A3, B1, B2, B3 >> >> corfit <- intraSpotCorrelation( rg, design ) >> fit <- lmsciFit( rg, design, correlation= corfit$consensus.correlation ) >> cmtx <- makeContrasts( "(( A1 + A2 + A3 )/3 - ctrlA) - ( B1 + B2 + B3 >> )/3 - ctrlB))", levels= design ) >> fit <- contrasts.fit( fit, cmtx ) >> fit <- eBayes( fit ) >> >> Is that correct? >> >> Kind regards, >> >> J. >> >> -- >> -------- Dr. January Weiner 3 -------------------------------------- >> > ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD REPLY
0
Entering edit mode
Dear Gordon, thank you for your answer. Gordon K Smyth wrote: > I would analyse it like this: That makes totally sense; however, I have one more question (sorry! and thank you for your patience). In the meanwhile, I have taken an alternative approach, as described in the second part of the chapter on technical replicates in the limma guide (http://bioconductor.org/packages/release/bioc/vignettes/limma/i nst/doc/usersguide.pdf, page 42): fit for each biological replicate separately, then create a contrast corresponding to the average of these and subtract the control: design <- cbind( ctrl= c( 1, -1, rep( 0, 6 ) ), e1= c( 0, 0, 1, -1, rep( 0, 4 ) ), e2= c( rep( 0, 4 ), 1, -1, 0, 0 ), e3= c( rep( 0, 6 ), 1, -1 ) ) cmtx <- makeContrasts( "(e1+e2+e3)/3 - ctrl", levels= design ) fit <- lmFit( MA, design ) ; fit <- contrast.fit( fit, cmtx ) ; fit <- eBayes( fit ) If I understand the text of the limma guide, these are alternative approaches and should give at least similar results. And yes, the estimated logFC are exactly the same. However, the p-values are much different. Using duplicateCorrelation causes a bunch of genes to become statistically non-significant (not the other way round). Either duplicateCorrelation is less sensitive or more specific, and I wonder which is the case. Unfortunately, this bunch of genes changes the results of the functional analysis. Also, maybe I'm lost, but after reading and thinking I don't see why I can't use intraspotCorrelation here. I'm not saying I can, in fact this gives me vastly different results I don't really trust (given the later results of the functional analysis), but I just don't see the problem, since the correlations are calculated within the arrays. I got quite used to apply that, since often I'm confronted with the following problem: Cy3 Cy5 A0 B0 B0 A0 A1 B1 B1 A1 where the job is to compare A1 with A0 and B1 with B0. (the dye swaps are technical replicates). I think that this is an unconnected design, and there is no way of doing that with a normal model, so that the channels should be analysed separately. kind regards, j. -- -------- Dr. January Weiner 3 --------------------------------------
ADD REPLY
0
Entering edit mode
Dear January, You are asking me about the analysis of technical replicates described on pages 42-43 of the limma User's Guide (current release version). These pages describe an analysis approach in which differential expression is assessed against technical variation only. This approach is far from ideal, because it is obviously preferable to evaluate DE relative to biological variation. I wrote this approach in order to give users something that would give some results for awkward designs when a completely rigorous statistical analysis was impossible or impractical. The idea was that the gene ranking might still be useful even if the p-values were too optimistic. I agree that these pages did give the impression that this approach was a valid alternative to other analyses, rather than a last resort when nothing else would work, which is what it was. A few months ago, I decided to remove this approach from the User's Guide entirely. You will see that it is gone in the User's Guide in the developmental version of limma. On Wed, 29 Aug 2012, January Weiner wrote: > Dear Gordon, thank you for your answer. > > Gordon K Smyth wrote: >> I would analyse it like this: > > That makes totally sense; however, I have one more question (sorry! > and thank you for your patience). > > In the meanwhile, I have taken an alternative approach, as described > in the second part of the chapter on technical replicates in the limma > guide (http://bioconductor.org/packages/release/bioc/vignettes/limma /inst/doc/usersguide.pdf, > page 42): fit for each biological replicate separately, then create a > contrast corresponding to the average of these and subtract the > control: > > design <- cbind( ctrl= c( 1, -1, rep( 0, 6 ) ), e1= c( 0, 0, 1, -1, > rep( 0, 4 ) ), e2= c( rep( 0, 4 ), 1, -1, 0, 0 ), e3= c( rep( 0, 6 ), > 1, -1 ) ) > cmtx <- makeContrasts( "(e1+e2+e3)/3 - ctrl", levels= design ) > fit <- lmFit( MA, design ) ; fit <- contrast.fit( fit, cmtx ) ; fit <- > eBayes( fit ) > > If I understand the text of the limma guide, these are alternative > approaches and should give at least similar results. > > And yes, the estimated logFC are exactly the same. However, the > p-values are much different. Using duplicateCorrelation causes a bunch > of genes to become statistically non-significant (not the other way > round). Either duplicateCorrelation is less sensitive or more > specific, and I wonder which is the case. Unfortunately, this bunch of > genes changes the results of the functional analysis. The approach you describe estimates the residual standard deviations entirely from the technical replicates, and hence will underestimate the biological variability, and hence will over-estimate the statistical significance of the results. The duplicateCorrelation approach takes the biological variation into account, and hence is statistically more rigorous. Rather than using the purely technical analysis, I would prefer that you handled the technical replicates via duplicateCorrelation and simply chose a more lenient FDR cutoff when making your gene list for functional analysis. > Also, maybe I'm lost, but after reading and thinking I don't see why I > can't use intraspotCorrelation here. I'm not saying I can, in fact this > gives me vastly different results I don't really trust (given the later > results of the functional analysis), but I just don't see the problem, > since the correlations are calculated within the arrays. For technical reasons, limma does not allow you to use both intraspotCorrelation and duplicateCorrelation in the same analysis. > I got quite used to apply that, since often I'm confronted with the > following problem: > > Cy3 Cy5 > A0 B0 > B0 A0 > A1 B1 > B1 A1 > > where the job is to compare A1 with A0 and B1 with B0. (the dye swaps > are technical replicates). I think that this is an unconnected design, > and there is no way of doing that with a normal model, so that the > channels should be analysed separately. Yes, that's right. My current recommendation for this analysis is to use the separate channel approach, and to simply pretend that the technical replicates are true biological replicates. This is because there is no practical way to fully account for the biological and technical levels of variability in the experiment. The technical and biological replicates will contribute equally to the residual standard errors. Not ideal, but probably the best of the choices available. Keep in mind that the p-values will be slightly too low. This makes it more important than usual that the key results are independently validated downstream. Best wishes Gordon > kind regards, > > j. > > > -- > -------- Dr. January Weiner 3 -------------------------------------- > ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD REPLY

Login before adding your answer.

Traffic: 947 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6