Problem fitting a linear model using limma
1
0
Entering edit mode
@joaquin-martinez-5060
Last seen 10.4 years ago
Dear All, My colleague Ben and I have fitted a linear model to microarray data using the limma package but we get "coefficients not estimable" for the last two columns in the design matrix during the fit. As a consequence when trying to fit our contrast matrix we get the following error message, "Error in contrasts.fit(fit, contrasts = A) : trying to take contrast of non-estimable coefficient." We have discovered that if we simply reorder the design matrix columns and contrasts matrix rows, we encounter the same error. We would very much appreciate if someone could explain to us why the not-estimable coefficients arise, and how to correct the problem. We are following the guide (section 8.2) for limma 3.10.0 using this version of R 2.14.0 (sessionInfo at end of email). We have 6 different types of samples (Puninf, P86, P163, Muninf, M86, M163), and 3 biological replicates of each (1,2,3; 4,5,6; 7,8,9; 10,11,12; 13,14,15; 16,17,18). # The targets file looks like this targets <- structure(list(SlideNumber = c(29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 51L, 46L, 47L, 48L, 50L), FileName = c("BE T22h slide 29", "BE T22h slide 30", "BE T22h slide 31", "BE T22h slide 32", "BE T22h slide 33", "BE T22h slide 34", "BE T22h slide 35", "BE T22h slide 36", "BE T22h slide 37", "BE T22h slide 38", "BE T22h slide 39", "BE T22h slide 40", "BE T22h slide 41", "BE T22h slide 42", "BE T22h slide 43", "BE T22h slide 44", "BE T22h slide 51", "BE T22h slide 46", "BE T22h slide 47", "BE T22h slide 48", "BE T22h slide 50"), Cy3 = c("Puninf.1", "Puninf.1", "P163.7", "M86.13", "Muninf.10", "P86.4", "M163.16", "Muninf.11", "P86.5", "Puninf.2", "Muninf.11", "M163.17", "M86.14", "P163.8", "Puninf.3", "Puninf.3", "P163.9", "M86.15", "Muninf.12", "P86.6", "M163.18"), Cy5 = c("Muninf.10", "P86.4", "Puninf.1", "Muninf.10", "M163.16", "M86.13", "P163.7", "Puninf.2", "Puninf.2", "P163.8", "M86.14", "Muninf.11", "P86.5", "M163.17", "Muninf.12", "P86.6", "Puninf.3", "Muninf.12", "M163.18", "M86.15", "P163.9")), .Names = c("SlideNumber", "FileName", "Cy3", "Cy5"), class = "data.frame", row.names = c(NA, -21L)) # for readability we also paste it in here... # > targets # SlideNumber FileName Cy3 Cy5 # 1 29 BE T22h slide 29 Puninf.1 Muninf.10 # 2 30 BE T22h slide 30 Puninf.1 P86.4 # 3 31 BE T22h slide 31 P163.7 Puninf.1 # 4 32 BE T22h slide 32 M86.13 Muninf.10 # 5 33 BE T22h slide 33 Muninf.10 M163.16 # 6 34 BE T22h slide 34 P86.4 M86.13 # 7 35 BE T22h slide 35 M163.16 P163.7 # 8 36 BE T22h slide 36 Muninf.11 Puninf.2 # 9 37 BE T22h slide 37 P86.5 Puninf.2 # 10 38 BE T22h slide 38 Puninf.2 P163.8 # 11 39 BE T22h slide 39 Muninf.11 M86.14 # 12 40 BE T22h slide 40 M163.17 Muninf.11 # 13 41 BE T22h slide 41 M86.14 P86.5 # 14 42 BE T22h slide 42 P163.8 M163.17 # 15 43 BE T22h slide 43 Puninf.3 Muninf.12 # 16 44 BE T22h slide 44 Puninf.3 P86.6 # 17 51 BE T22h slide 51 P163.9 Puninf.3 # 18 46 BE T22h slide 46 M86.15 Muninf.12 # 19 47 BE T22h slide 47 Muninf.12 M163.18 # 20 48 BE T22h slide 48 P86.6 M86.15 # 21 50 BE T22h slide 50 M163.18 P163.9 # The design matrix is computed like this design <- modelMatrix(targets, ref = "Puninf.1") # but here is the structure if it is helpful design <- structure(c(0, 0, 0, 0, 1, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, -1, 0, 0, 0, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 1, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, -1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 1, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, 1, 0, 0, 0, 0), .Dim = c(21L, 17L), .Dimnames = list(c("slide 29", "slide 30", "slide 31", "slide 32", "slide 33", "slide 34", "slide 35", "slide 36", "slide 37", "slide 38", "slide 39", "slide 40", "slide 41", "slide 42", "slide 43", "slide 44", "slide 51", "slide 46", "slide 47", "slide 48", "slide 50" ), c("M163.16", "M163.17", "M163.18", "M86.13", "M86.14", "M86.15", "Muninf.10", "Muninf.11", "Muninf.12", "P163.7", "P163.8", "P163.9", "P86.4", "P86.5", "P86.6", "Puninf.2", "Puninf.3"))) # Fit linear model - commented out becuase we can't really send all the data in MA # fit <- lmFit(MA, design = design) # Coefficients not estimable: Puninf.2 Puninf.3 # Make a contrast matrix: A <- makeContrasts(PuninfvsMuninf=(Muninf.10+Muninf.11+Muninf.12-Puninf.2-P uninf.3)/3, levels = design) # and now fit the contrast con.fit <- contrasts.fit(fit, contrasts = A) Error in contrasts.fit(fit, contrasts = A) : trying to take contrast of non-estimable coefficient # now change the column order in the design matrix and row order of contrast matrix # we create a new sort index - arbitrarily placing the last two columns in the original # design matrix in the middle of the new one. Ditto for the rows of the contrast matrix. n <- nrow(A) newIndex <- c(1:5, n-1, n, 6:(n-2)) designB <- design[,newIndex] B <- A[newIndex,] attributes(B) <- attributes(A) # create a new fit, but we get the same knd of error, the last two columns are # identified as non-estimable fitB <- lmFit(X$MA, design = designB) #Coefficients not estimable: P86.5 P86.6 #Warning message: #Partial NA coefficients for 16278 probe(s) As far as we can tell, it's the trailing columns that get singled out, but we don't know why. Hopefully the information about is detailed enough, but we can provide more information if necessary to help troubleshooting. Thanks in advance, Joaquin > sessionInfo() R version 2.14.0 (2011-10-31) Platform: i386-apple-darwin9.8.0/i386 (32-bit) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] limma_3.10.0 loaded via a namespace (and not attached): [1] tools_2.14.0 [[alternative HTML version deleted]]
Microarray limma Microarray limma • 1.2k views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 2 hours ago
United States
Hi Joaquin, On 3/7/2012 11:08 AM, Joaquin Martinez wrote: > Dear All, > > My colleague Ben and I have fitted a linear model to microarray data using > the limma package but we get "coefficients not estimable" for the last two > columns in the design matrix during the fit. As a consequence when trying > to fit our contrast matrix we get the following error message, "Error in > contrasts.fit(fit, contrasts = A) : trying to take contrast of > non-estimable coefficient." We have discovered that if we simply reorder > the design matrix columns and contrasts matrix rows, we encounter the same > error. We would very much appreciate if someone could explain to us why the > not-estimable coefficients arise, and how to correct the problem. > > We are following the guide (section 8.2) for limma 3.10.0 using this > version of R 2.14.0 (sessionInfo at end of email). We have 6 different > types of samples (Puninf, P86, P163, Muninf, M86, M163), and 3 biological > replicates of each (1,2,3; 4,5,6; 7,8,9; 10,11,12; 13,14,15; 16,17,18). Yes, but your targets file has numbers appended to the sample types, which makes R think they are different. Hypothetically you would have noticed this when using the modelMatrix() function: > modelMatrix(targets, ref = "Puninf.1") Found unique target names: M163.16 M163.17 M163.18 M86.13 M86.14 M86.15 Muninf.10 Muninf.11 Muninf.12 P163.7 P163.8 P163.9 P86.4 P86.5 P86.6 Puninf.1 Puninf.2 Puninf.3 This indicates that limma thinks all of your replicates are different sample types. This won't work out (as you have already found) because the model will be over specified, which simply means that you cannot estimate all the parameters you have in the model. This is the same as saying you can't determine the values of x and y in the formula x + y = 3 (but if you also know that x - y = -1, then you can solve for both x and y). Now back to the problem at hand. If you remove the numbers from your sample types, targets$Cy3 <- sapply(strsplit(targets$Cy3, "\\."), function(x) x[1]) targets$Cy5 <- sapply(strsplit(targets$Cy5, "\\."), function(x) x[1]) then you will create the correct model matrix > modelMatrix(targets, ref="Puninf") Found unique target names: M163 M86 Muninf P163 P86 Puninf M163 M86 Muninf P163 P86 [1,] 0 0 1 0 0 [2,] 0 0 0 0 1 [3,] 0 0 0 -1 0 [4,] 0 -1 1 0 0 [5,] 1 0 -1 0 0 [6,] 0 1 0 0 -1 [7,] -1 0 0 1 0 [8,] 0 0 -1 0 0 [9,] 0 0 0 0 -1 [10,] 0 0 0 1 0 [11,] 0 1 -1 0 0 [12,] -1 0 1 0 0 [13,] 0 -1 0 0 1 [14,] 1 0 0 -1 0 [15,] 0 0 1 0 0 [16,] 0 0 0 0 1 [17,] 0 0 0 -1 0 [18,] 0 -1 1 0 0 [19,] 1 0 -1 0 0 [20,] 0 1 0 0 -1 [21,] -1 0 0 1 0 Which will compare all samples to Puninf. Best, Jim > > # The targets file looks like this > > targets<- structure(list(SlideNumber = c(29L, 30L, 31L, 32L, 33L, 34L, > 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 51L, 46L, 47L, > 48L, 50L), FileName = c("BE T22h slide 29", "BE T22h slide 30", > "BE T22h slide 31", "BE T22h slide 32", "BE T22h slide 33", "BE T22h > slide 34", > "BE T22h slide 35", "BE T22h slide 36", "BE T22h slide 37", "BE T22h > slide 38", > "BE T22h slide 39", "BE T22h slide 40", "BE T22h slide 41", "BE T22h > slide 42", > "BE T22h slide 43", "BE T22h slide 44", "BE T22h slide 51", "BE T22h > slide 46", > "BE T22h slide 47", "BE T22h slide 48", "BE T22h slide 50"), > Cy3 = c("Puninf.1", "Puninf.1", "P163.7", "M86.13", "Muninf.10", > "P86.4", "M163.16", "Muninf.11", "P86.5", "Puninf.2", "Muninf.11", > "M163.17", "M86.14", "P163.8", "Puninf.3", "Puninf.3", "P163.9", > "M86.15", "Muninf.12", "P86.6", "M163.18"), Cy5 = c("Muninf.10", > "P86.4", "Puninf.1", "Muninf.10", "M163.16", "M86.13", "P163.7", > "Puninf.2", "Puninf.2", "P163.8", "M86.14", "Muninf.11", > "P86.5", "M163.17", "Muninf.12", "P86.6", "Puninf.3", "Muninf.12", > "M163.18", "M86.15", "P163.9")), .Names = c("SlideNumber", > "FileName", "Cy3", "Cy5"), class = "data.frame", row.names = c(NA, > -21L)) > > # for readability we also paste it in here... > #> targets > # SlideNumber FileName Cy3 Cy5 > # 1 29 BE T22h slide 29 Puninf.1 Muninf.10 > # 2 30 BE T22h slide 30 Puninf.1 P86.4 > # 3 31 BE T22h slide 31 P163.7 Puninf.1 > # 4 32 BE T22h slide 32 M86.13 Muninf.10 > # 5 33 BE T22h slide 33 Muninf.10 M163.16 > # 6 34 BE T22h slide 34 P86.4 M86.13 > # 7 35 BE T22h slide 35 M163.16 P163.7 > # 8 36 BE T22h slide 36 Muninf.11 Puninf.2 > # 9 37 BE T22h slide 37 P86.5 Puninf.2 > # 10 38 BE T22h slide 38 Puninf.2 P163.8 > # 11 39 BE T22h slide 39 Muninf.11 M86.14 > # 12 40 BE T22h slide 40 M163.17 Muninf.11 > # 13 41 BE T22h slide 41 M86.14 P86.5 > # 14 42 BE T22h slide 42 P163.8 M163.17 > # 15 43 BE T22h slide 43 Puninf.3 Muninf.12 > # 16 44 BE T22h slide 44 Puninf.3 P86.6 > # 17 51 BE T22h slide 51 P163.9 Puninf.3 > # 18 46 BE T22h slide 46 M86.15 Muninf.12 > # 19 47 BE T22h slide 47 Muninf.12 M163.18 > # 20 48 BE T22h slide 48 P86.6 M86.15 > # 21 50 BE T22h slide 50 M163.18 P163.9 > > # The design matrix is computed like this design<- modelMatrix(targets, > ref = "Puninf.1") > # but here is the structure if it is helpful > design<- structure(c(0, 0, 0, 0, 1, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 1, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 1, 0, -1, 0, 0, 0, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, -1, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, -1, 0, 1, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, -1, > 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, > 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 1, 0, 0, 0, -1, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > -1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, > 0, 1, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, 1, 0, 0, 0, 0), .Dim = c(21L, > 17L), .Dimnames = list(c("slide 29", "slide 30", "slide 31", > "slide 32", "slide 33", "slide 34", "slide 35", "slide 36", "slide 37", > "slide 38", "slide 39", "slide 40", "slide 41", "slide 42", "slide 43", > "slide 44", "slide 51", "slide 46", "slide 47", "slide 48", "slide 50" > ), c("M163.16", "M163.17", "M163.18", "M86.13", "M86.14", "M86.15", > "Muninf.10", "Muninf.11", "Muninf.12", "P163.7", "P163.8", "P163.9", > "P86.4", "P86.5", "P86.6", "Puninf.2", "Puninf.3"))) > > > # Fit linear model - commented out becuase we can't really send all the > data in MA > # fit<- lmFit(MA, design = design) > # Coefficients not estimable: Puninf.2 Puninf.3 > > # Make a contrast matrix: > A<- > makeContrasts(PuninfvsMuninf=(Muninf.10+Muninf.11+Muninf.12-Puninf.2 -Puninf.3)/3, > levels = design) > > # and now fit the contrast > con.fit<- contrasts.fit(fit, contrasts = A) > Error in contrasts.fit(fit, contrasts = A) : trying to take contrast of > non-estimable coefficient > > # now change the column order in the design matrix and row order of > contrast matrix > # we create a new sort index - arbitrarily placing the last two columns in > the original > # design matrix in the middle of the new one. Ditto for the rows of the > contrast matrix. > n<- nrow(A) > newIndex<- c(1:5, n-1, n, 6:(n-2)) > designB<- design[,newIndex] > B<- A[newIndex,] > attributes(B)<- attributes(A) > > # create a new fit, but we get the same knd of error, the last two columns > are > # identified as non-estimable > fitB<- lmFit(X$MA, design = designB) > #Coefficients not estimable: P86.5 P86.6 > #Warning message: > #Partial NA coefficients for 16278 probe(s) > > > As far as we can tell, it's the trailing columns that get singled out, but > we don't know why. Hopefully the information about is detailed enough, but > we can provide more information if necessary to help troubleshooting. > > Thanks in advance, > > Joaquin > >> sessionInfo() > R version 2.14.0 (2011-10-31) > Platform: i386-apple-darwin9.8.0/i386 (32-bit) > > locale: > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] limma_3.10.0 > > loaded via a namespace (and not attached): > [1] tools_2.14.0 > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD COMMENT
0
Entering edit mode
Hi Jim, Thank you for your quick reply and advice. Actually we had considered doing exactly what you suggest and remove the numbers from the sample names. My concern then is what is then the reference or baseline?, i.e. Puninf.1 or an average value of Puninf.1, Puninf.2 and Puninf.3. Also, how does removing numbers take into consideration slight value differences among biological replicate samples? In the example we are following in the limma user's guide there are 3 wild-type mice (wt1,wt2,wt3) and 3 mutant mice (mu1, mu2, mu3), where wt1 is used as the reference. I think we have the same type of study but with 6 "mouse types". Thanks, Joaquin On Wed, Mar 7, 2012 at 12:01 PM, James W. MacDonald <jmacdon@uw.edu> wrote: > Hi Joaquin, > > > On 3/7/2012 11:08 AM, Joaquin Martinez wrote: > >> Dear All, >> >> My colleague Ben and I have fitted a linear model to microarray data using >> the limma package but we get "coefficients not estimable" for the last two >> columns in the design matrix during the fit. As a consequence when trying >> to fit our contrast matrix we get the following error message, "Error in >> contrasts.fit(fit, contrasts = A) : trying to take contrast of >> non-estimable coefficient." We have discovered that if we simply reorder >> the design matrix columns and contrasts matrix rows, we encounter the same >> error. We would very much appreciate if someone could explain to us why >> the >> not-estimable coefficients arise, and how to correct the problem. >> >> We are following the guide (section 8.2) for limma 3.10.0 using this >> version of R 2.14.0 (sessionInfo at end of email). We have 6 different >> types of samples (Puninf, P86, P163, Muninf, M86, M163), and 3 biological >> replicates of each (1,2,3; 4,5,6; 7,8,9; 10,11,12; 13,14,15; 16,17,18). >> > > Yes, but your targets file has numbers appended to the sample types, which > makes R think they are different. Hypothetically you would have noticed > this when using the modelMatrix() function: > > > > modelMatrix(targets, ref = "Puninf.1") > Found unique target names: > M163.16 M163.17 M163.18 M86.13 M86.14 M86.15 Muninf.10 Muninf.11 > Muninf.12 P163.7 P163.8 P163.9 P86.4 P86.5 P86.6 Puninf.1 Puninf.2 Puninf.3 > > This indicates that limma thinks all of your replicates are different > sample types. This won't work out (as you have already found) because the > model will be over specified, which simply means that you cannot estimate > all the parameters you have in the model. This is the same as saying you > can't determine the values of x and y in the formula x + y = 3 (but if you > also know that x - y = -1, then you can solve for both x and y). > > Now back to the problem at hand. If you remove the numbers from your > sample types, > > targets$Cy3 <- sapply(strsplit(targets$Cy3, "\\."), function(x) x[1]) > targets$Cy5 <- sapply(strsplit(targets$Cy5, "\\."), function(x) x[1]) > > then you will create the correct model matrix > > > modelMatrix(targets, ref="Puninf") > Found unique target names: > M163 M86 Muninf P163 P86 Puninf > M163 M86 Muninf P163 P86 > [1,] 0 0 1 0 0 > [2,] 0 0 0 0 1 > [3,] 0 0 0 -1 0 > [4,] 0 -1 1 0 0 > [5,] 1 0 -1 0 0 > [6,] 0 1 0 0 -1 > [7,] -1 0 0 1 0 > [8,] 0 0 -1 0 0 > [9,] 0 0 0 0 -1 > [10,] 0 0 0 1 0 > [11,] 0 1 -1 0 0 > [12,] -1 0 1 0 0 > [13,] 0 -1 0 0 1 > [14,] 1 0 0 -1 0 > [15,] 0 0 1 0 0 > [16,] 0 0 0 0 1 > [17,] 0 0 0 -1 0 > [18,] 0 -1 1 0 0 > [19,] 1 0 -1 0 0 > [20,] 0 1 0 0 -1 > [21,] -1 0 0 1 0 > > Which will compare all samples to Puninf. > > Best, > > Jim > > > > > > >> # The targets file looks like this >> >> targets<- structure(list(SlideNumber = c(29L, 30L, 31L, 32L, 33L, 34L, >> 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 51L, 46L, 47L, >> 48L, 50L), FileName = c("BE T22h slide 29", "BE T22h slide 30", >> "BE T22h slide 31", "BE T22h slide 32", "BE T22h slide 33", "BE T22h >> slide 34", >> "BE T22h slide 35", "BE T22h slide 36", "BE T22h slide 37", "BE T22h >> slide 38", >> "BE T22h slide 39", "BE T22h slide 40", "BE T22h slide 41", "BE T22h >> slide 42", >> "BE T22h slide 43", "BE T22h slide 44", "BE T22h slide 51", "BE T22h >> slide 46", >> "BE T22h slide 47", "BE T22h slide 48", "BE T22h slide 50"), >> Cy3 = c("Puninf.1", "Puninf.1", "P163.7", "M86.13", "Muninf.10", >> "P86.4", "M163.16", "Muninf.11", "P86.5", "Puninf.2", "Muninf.11", >> "M163.17", "M86.14", "P163.8", "Puninf.3", "Puninf.3", "P163.9", >> "M86.15", "Muninf.12", "P86.6", "M163.18"), Cy5 = c("Muninf.10", >> "P86.4", "Puninf.1", "Muninf.10", "M163.16", "M86.13", "P163.7", >> "Puninf.2", "Puninf.2", "P163.8", "M86.14", "Muninf.11", >> "P86.5", "M163.17", "Muninf.12", "P86.6", "Puninf.3", "Muninf.12", >> "M163.18", "M86.15", "P163.9")), .Names = c("SlideNumber", >> "FileName", "Cy3", "Cy5"), class = "data.frame", row.names = c(NA, >> -21L)) >> >> # for readability we also paste it in here... >> #> targets >> # SlideNumber FileName Cy3 Cy5 >> # 1 29 BE T22h slide 29 Puninf.1 Muninf.10 >> # 2 30 BE T22h slide 30 Puninf.1 P86.4 >> # 3 31 BE T22h slide 31 P163.7 Puninf.1 >> # 4 32 BE T22h slide 32 M86.13 Muninf.10 >> # 5 33 BE T22h slide 33 Muninf.10 M163.16 >> # 6 34 BE T22h slide 34 P86.4 M86.13 >> # 7 35 BE T22h slide 35 M163.16 P163.7 >> # 8 36 BE T22h slide 36 Muninf.11 Puninf.2 >> # 9 37 BE T22h slide 37 P86.5 Puninf.2 >> # 10 38 BE T22h slide 38 Puninf.2 P163.8 >> # 11 39 BE T22h slide 39 Muninf.11 M86.14 >> # 12 40 BE T22h slide 40 M163.17 Muninf.11 >> # 13 41 BE T22h slide 41 M86.14 P86.5 >> # 14 42 BE T22h slide 42 P163.8 M163.17 >> # 15 43 BE T22h slide 43 Puninf.3 Muninf.12 >> # 16 44 BE T22h slide 44 Puninf.3 P86.6 >> # 17 51 BE T22h slide 51 P163.9 Puninf.3 >> # 18 46 BE T22h slide 46 M86.15 Muninf.12 >> # 19 47 BE T22h slide 47 Muninf.12 M163.18 >> # 20 48 BE T22h slide 48 P86.6 M86.15 >> # 21 50 BE T22h slide 50 M163.18 P163.9 >> >> # The design matrix is computed like this design<- modelMatrix(targets, >> ref = "Puninf.1") >> # but here is the structure if it is helpful >> design<- structure(c(0, 0, 0, 0, 1, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 1, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 1, 0, -1, 0, 0, 0, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, -1, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, -1, 0, 1, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, -1, >> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, >> 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 1, 0, 0, 0, -1, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> -1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, >> 0, 1, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, 1, 0, 0, 0, 0), .Dim = c(21L, >> 17L), .Dimnames = list(c("slide 29", "slide 30", "slide 31", >> "slide 32", "slide 33", "slide 34", "slide 35", "slide 36", "slide 37", >> "slide 38", "slide 39", "slide 40", "slide 41", "slide 42", "slide 43", >> "slide 44", "slide 51", "slide 46", "slide 47", "slide 48", "slide 50" >> ), c("M163.16", "M163.17", "M163.18", "M86.13", "M86.14", "M86.15", >> "Muninf.10", "Muninf.11", "Muninf.12", "P163.7", "P163.8", "P163.9", >> "P86.4", "P86.5", "P86.6", "Puninf.2", "Puninf.3"))) >> >> >> # Fit linear model - commented out becuase we can't really send all the >> data in MA >> # fit<- lmFit(MA, design = design) >> # Coefficients not estimable: Puninf.2 Puninf.3 >> >> # Make a contrast matrix: >> A<- >> makeContrasts(PuninfvsMuninf=(**Muninf.10+Muninf.11+Muninf.12-** >> Puninf.2-Puninf.3)/3, >> levels = design) >> >> # and now fit the contrast >> con.fit<- contrasts.fit(fit, contrasts = A) >> Error in contrasts.fit(fit, contrasts = A) : trying to take contrast of >> non-estimable coefficient >> >> # now change the column order in the design matrix and row order of >> contrast matrix >> # we create a new sort index - arbitrarily placing the last two columns in >> the original >> # design matrix in the middle of the new one. Ditto for the rows of the >> contrast matrix. >> n<- nrow(A) >> newIndex<- c(1:5, n-1, n, 6:(n-2)) >> designB<- design[,newIndex] >> B<- A[newIndex,] >> attributes(B)<- attributes(A) >> >> # create a new fit, but we get the same knd of error, the last two columns >> are >> # identified as non-estimable >> fitB<- lmFit(X$MA, design = designB) >> #Coefficients not estimable: P86.5 P86.6 >> #Warning message: >> #Partial NA coefficients for 16278 probe(s) >> >> >> As far as we can tell, it's the trailing columns that get singled out, but >> we don't know why. Hopefully the information about is detailed enough, but >> we can provide more information if necessary to help troubleshooting. >> >> Thanks in advance, >> >> Joaquin >> >> sessionInfo() >>> >> R version 2.14.0 (2011-10-31) >> Platform: i386-apple-darwin9.8.0/i386 (32-bit) >> >> locale: >> [1] en_US.UTF-8/en_US.UTF-8/en_US.**UTF-8/C/en_US.UTF-8/en_US.UTF-**8 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] limma_3.10.0 >> >> loaded via a namespace (and not attached): >> [1] tools_2.14.0 >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.e="" thz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: http://news.gmane.org/gmane.** >> science.biology.informatics.**conductor<http: news.gmane.org="" gmane="" .science.biology.informatics.conductor=""> >> > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
Hi Joaquin, On 3/7/2012 2:12 PM, Joaquin Martinez wrote: > Hi Jim, > > Thank you for your quick reply and advice. Actually we had considered > doing exactly what you suggest and remove the numbers from the sample > names. My concern then is what is then the reference or baseline?, > i.e. Puninf.1 or an average value of Puninf.1, Puninf.2 and Puninf.3. > Also, how does removing numbers take into consideration slight value > differences among biological replicate samples? The baseline is the average of the Puninf samples. Which is, I believe, what you want. To make the comparisons you request, limma is going to compute a t-statistic, which simply put is difference in means between two samples/ measure of the variability of the means So the numerator of the statistic tells you how different the two samples are. The denominator (which is a measure of the slight differences among biological replicates) is used to determine if the differences between samples is 'large' or not. If there is a lot of variability within sample types, the numerator has to be pretty big. Conversely, if there isn't much variability within sample types then a smaller numerator may be considered significantly large. Note that you aren't forced to use Puninf as the reference. That's just the easiest way to use modelMatrix(). You can also pass in a matrix for the 'param' argument, selecting direct comparisons to make. Or you can just leave Puninf as the reference, and make further comparisons with a contrast matrix, which is probably simpler. Note here that the design matrix with Puninf as reference is computing implicit comparisons (e.g., the M163 coefficient is really log2(M163) - log2(Puninf)). So if you want to know if there is a significant difference between M163 and M86, you can just add a contrast matrix that looks like 1 -1 0 0 Which simply means that we will be computing log2(M163) - log2(Puninf) - (log2(M86) - log2(Puninf)). Algebraically, the two log2(Puninf) will cancel out, and you end up with log2(M163) - log2(M86). You can create any contrast matrix you want using makeContrasts(). > > In the example we are following in the limma user's guide there are 3 > wild-type mice (wt1,wt2,wt3) and 3 mutant mice (mu1, mu2, mu3), where > wt1 is used as the reference. I think we have the same type of study > but with 6 "mouse types". Unless I am mistaken, you are looking at the wrong example. The example you reference involves technical replication, where RNA from each mouse was hybridized to more than one slide. From what you said originally, each of the samples in your experiment is a biological replicate. Your experiment is more similar to the common reference design experiment on p34 of the limma User's Guide. Best, Jim > > Thanks, > Joaquin > > > > > On Wed, Mar 7, 2012 at 12:01 PM, James W. MacDonald <jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">> wrote: > > Hi Joaquin, > > > On 3/7/2012 11:08 AM, Joaquin Martinez wrote: > > Dear All, > > My colleague Ben and I have fitted a linear model to > microarray data using > the limma package but we get "coefficients not estimable" for > the last two > columns in the design matrix during the fit. As a consequence > when trying > to fit our contrast matrix we get the following error message, > "Error in > contrasts.fit(fit, contrasts = A) : trying to take contrast of > non-estimable coefficient." We have discovered that if we > simply reorder > the design matrix columns and contrasts matrix rows, we > encounter the same > error. We would very much appreciate if someone could explain > to us why the > not-estimable coefficients arise, and how to correct the problem. > > We are following the guide (section 8.2) for limma 3.10.0 > using this > version of R 2.14.0 (sessionInfo at end of email). We have 6 > different > types of samples (Puninf, P86, P163, Muninf, M86, M163), and 3 > biological > replicates of each (1,2,3; 4,5,6; 7,8,9; 10,11,12; 13,14,15; > 16,17,18). > > > Yes, but your targets file has numbers appended to the sample > types, which makes R think they are different. Hypothetically you > would have noticed this when using the modelMatrix() function: > > > > modelMatrix(targets, ref = "Puninf.1") > Found unique target names: > M163.16 M163.17 M163.18 M86.13 M86.14 M86.15 Muninf.10 Muninf.11 > Muninf.12 P163.7 P163.8 P163.9 P86.4 P86.5 P86.6 Puninf.1 Puninf.2 > Puninf.3 > > This indicates that limma thinks all of your replicates are > different sample types. This won't work out (as you have already > found) because the model will be over specified, which simply > means that you cannot estimate all the parameters you have in the > model. This is the same as saying you can't determine the values > of x and y in the formula x + y = 3 (but if you also know that x > - y = -1, then you can solve for both x and y). > > Now back to the problem at hand. If you remove the numbers from > your sample types, > > targets$Cy3 <- sapply(strsplit(targets$Cy3, "\\."), function(x) x[1]) > targets$Cy5 <- sapply(strsplit(targets$Cy5, "\\."), function(x) x[1]) > > then you will create the correct model matrix > > > modelMatrix(targets, ref="Puninf") > Found unique target names: > M163 M86 Muninf P163 P86 Puninf > M163 M86 Muninf P163 P86 > [1,] 0 0 1 0 0 > [2,] 0 0 0 0 1 > [3,] 0 0 0 -1 0 > [4,] 0 -1 1 0 0 > [5,] 1 0 -1 0 0 > [6,] 0 1 0 0 -1 > [7,] -1 0 0 1 0 > [8,] 0 0 -1 0 0 > [9,] 0 0 0 0 -1 > [10,] 0 0 0 1 0 > [11,] 0 1 -1 0 0 > [12,] -1 0 1 0 0 > [13,] 0 -1 0 0 1 > [14,] 1 0 0 -1 0 > [15,] 0 0 1 0 0 > [16,] 0 0 0 0 1 > [17,] 0 0 0 -1 0 > [18,] 0 -1 1 0 0 > [19,] 1 0 -1 0 0 > [20,] 0 1 0 0 -1 > [21,] -1 0 0 1 0 > > Which will compare all samples to Puninf. > > Best, > > Jim > > > > > > > # The targets file looks like this > > targets<- structure(list(SlideNumber = c(29L, 30L, 31L, 32L, > 33L, 34L, > 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 51L, 46L, > 47L, > 48L, 50L), FileName = c("BE T22h slide 29", "BE T22h slide 30", > "BE T22h slide 31", "BE T22h slide 32", "BE T22h slide 33", > "BE T22h > slide 34", > "BE T22h slide 35", "BE T22h slide 36", "BE T22h slide 37", > "BE T22h > slide 38", > "BE T22h slide 39", "BE T22h slide 40", "BE T22h slide 41", > "BE T22h > slide 42", > "BE T22h slide 43", "BE T22h slide 44", "BE T22h slide 51", > "BE T22h > slide 46", > "BE T22h slide 47", "BE T22h slide 48", "BE T22h slide 50"), > Cy3 = c("Puninf.1", "Puninf.1", "P163.7", "M86.13", > "Muninf.10", > "P86.4", "M163.16", "Muninf.11", "P86.5", "Puninf.2", > "Muninf.11", > "M163.17", "M86.14", "P163.8", "Puninf.3", "Puninf.3", > "P163.9", > "M86.15", "Muninf.12", "P86.6", "M163.18"), Cy5 = > c("Muninf.10", > "P86.4", "Puninf.1", "Muninf.10", "M163.16", "M86.13", > "P163.7", > "Puninf.2", "Puninf.2", "P163.8", "M86.14", "Muninf.11", > "P86.5", "M163.17", "Muninf.12", "P86.6", "Puninf.3", > "Muninf.12", > "M163.18", "M86.15", "P163.9")), .Names = c("SlideNumber", > "FileName", "Cy3", "Cy5"), class = "data.frame", row.names > = c(NA, > -21L)) > > # for readability we also paste it in here... > #> targets > # SlideNumber FileName Cy3 Cy5 > # 1 29 BE T22h slide 29 Puninf.1 Muninf.10 > # 2 30 BE T22h slide 30 Puninf.1 P86.4 > # 3 31 BE T22h slide 31 P163.7 Puninf.1 > # 4 32 BE T22h slide 32 M86.13 Muninf.10 > # 5 33 BE T22h slide 33 Muninf.10 M163.16 > # 6 34 BE T22h slide 34 P86.4 M86.13 > # 7 35 BE T22h slide 35 M163.16 P163.7 > # 8 36 BE T22h slide 36 Muninf.11 Puninf.2 > # 9 37 BE T22h slide 37 P86.5 Puninf.2 > # 10 38 BE T22h slide 38 Puninf.2 P163.8 > # 11 39 BE T22h slide 39 Muninf.11 M86.14 > # 12 40 BE T22h slide 40 M163.17 Muninf.11 > # 13 41 BE T22h slide 41 M86.14 P86.5 > # 14 42 BE T22h slide 42 P163.8 M163.17 > # 15 43 BE T22h slide 43 Puninf.3 Muninf.12 > # 16 44 BE T22h slide 44 Puninf.3 P86.6 > # 17 51 BE T22h slide 51 P163.9 Puninf.3 > # 18 46 BE T22h slide 46 M86.15 Muninf.12 > # 19 47 BE T22h slide 47 Muninf.12 M163.18 > # 20 48 BE T22h slide 48 P86.6 M86.15 > # 21 50 BE T22h slide 50 M163.18 P163.9 > > # The design matrix is computed like this design<- > modelMatrix(targets, > ref = "Puninf.1") > # but here is the structure if it is helpful > design<- structure(c(0, 0, 0, 0, 1, 0, -1, 0, 0, 0, 0, 0, 0, > 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 1, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 1, 0, -1, 0, 0, 0, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, -1, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, -1, 0, 1, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, -1, > 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, > 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 1, 0, 0, 0, -1, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > -1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, > 0, 1, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, 1, 0, 0, 0, 0), .Dim = > c(21L, > 17L), .Dimnames = list(c("slide 29", "slide 30", "slide 31", > "slide 32", "slide 33", "slide 34", "slide 35", "slide 36", > "slide 37", > "slide 38", "slide 39", "slide 40", "slide 41", "slide 42", > "slide 43", > "slide 44", "slide 51", "slide 46", "slide 47", "slide 48", > "slide 50" > ), c("M163.16", "M163.17", "M163.18", "M86.13", "M86.14", > "M86.15", > "Muninf.10", "Muninf.11", "Muninf.12", "P163.7", "P163.8", > "P163.9", > "P86.4", "P86.5", "P86.6", "Puninf.2", "Puninf.3"))) > > > # Fit linear model - commented out becuase we can't really > send all the > data in MA > # fit<- lmFit(MA, design = design) > # Coefficients not estimable: Puninf.2 Puninf.3 > > # Make a contrast matrix: > A<- > makeContrasts(PuninfvsMuninf=(Muninf.10+Muninf.11+Muninf.12- Puninf.2-Puninf.3)/3, > levels = design) > > # and now fit the contrast > con.fit<- contrasts.fit(fit, contrasts = A) > Error in contrasts.fit(fit, contrasts = A) : trying to take > contrast of > non-estimable coefficient > > # now change the column order in the design matrix and row > order of > contrast matrix > # we create a new sort index - arbitrarily placing the last > two columns in > the original > # design matrix in the middle of the new one. Ditto for the > rows of the > contrast matrix. > n<- nrow(A) > newIndex<- c(1:5, n-1, n, 6:(n-2)) > designB<- design[,newIndex] > B<- A[newIndex,] > attributes(B)<- attributes(A) > > # create a new fit, but we get the same knd of error, the last > two columns > are > # identified as non-estimable > fitB<- lmFit(X$MA, design = designB) > #Coefficients not estimable: P86.5 P86.6 > #Warning message: > #Partial NA coefficients for 16278 probe(s) > > > As far as we can tell, it's the trailing columns that get > singled out, but > we don't know why. Hopefully the information about is detailed > enough, but > we can provide more information if necessary to help > troubleshooting. > > Thanks in advance, > > Joaquin > > sessionInfo() > > R version 2.14.0 (2011-10-31) > Platform: i386-apple-darwin9.8.0/i386 (32-bit) > > locale: > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods > base > > other attached packages: > [1] limma_3.10.0 > > loaded via a namespace (and not attached): > [1] tools_2.14.0 > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD REPLY
0
Entering edit mode
Hi Jim, Once again, thank you for your helpful advice. Maybe I was not very clear in my original email. In the targets file that I sent, we have both biological replication (e.g. Puninf.1, Puninf.2 and Puninf.3) and technical replication (e.g. Puninf.1 present in slides29, 30 and 31) instead of including a common reference sample in all hybridizations. Ours is more like the "Direct Two-Color Designs" in section 7.4 including "Technical Replication" as in section 8.2. Based on this, does your previous advice still hold? Should we remove the numbers from the sample names? Or are there any further considerations we should take into account in our analysis? I truly appreciate your help. Best, Joaquin On Wed, Mar 7, 2012 at 3:07 PM, James W. MacDonald <jmacdon@uw.edu> wrote: > Hi Joaquin, > > > On 3/7/2012 2:12 PM, Joaquin Martinez wrote: > >> Hi Jim, >> >> Thank you for your quick reply and advice. Actually we had considered >> doing exactly what you suggest and remove the numbers from the sample >> names. My concern then is what is then the reference or baseline?, i.e. >> Puninf.1 or an average value of Puninf.1, Puninf.2 and Puninf.3. Also, how >> does removing numbers take into consideration slight value differences >> among biological replicate samples? >> > > The baseline is the average of the Puninf samples. Which is, I believe, > what you want. To make the comparisons you request, limma is going to > compute a t-statistic, which simply put is > > difference in means between two samples/ measure of the variability of the > means > > So the numerator of the statistic tells you how different the two samples > are. The denominator (which is a measure of the slight differences among > biological replicates) is used to determine if the differences between > samples is 'large' or not. If there is a lot of variability within sample > types, the numerator has to be pretty big. Conversely, if there isn't much > variability within sample types then a smaller numerator may be considered > significantly large. > > Note that you aren't forced to use Puninf as the reference. That's just > the easiest way to use modelMatrix(). You can also pass in a matrix for the > 'param' argument, selecting direct comparisons to make. Or you can just > leave Puninf as the reference, and make further comparisons with a contrast > matrix, which is probably simpler. > > Note here that the design matrix with Puninf as reference is computing > implicit comparisons (e.g., the M163 coefficient is really log2(M163) - > log2(Puninf)). So if you want to know if there is a significant difference > between M163 and M86, you can just add a contrast matrix that looks like > > 1 > -1 > 0 > 0 > > Which simply means that we will be computing log2(M163) - log2(Puninf) - > (log2(M86) - log2(Puninf)). Algebraically, the two log2(Puninf) will cancel > out, and you end up with log2(M163) - log2(M86). You can create any > contrast matrix you want using makeContrasts(). > > > >> In the example we are following in the limma user's guide there are 3 >> wild-type mice (wt1,wt2,wt3) and 3 mutant mice (mu1, mu2, mu3), where wt1 >> is used as the reference. I think we have the same type of study but with 6 >> "mouse types". >> > > Unless I am mistaken, you are looking at the wrong example. The example > you reference involves technical replication, where RNA from each mouse was > hybridized to more than one slide. From what you said originally, each of > the samples in your experiment is a biological replicate. Your experiment > is more similar to the common reference design experiment on p34 of the > limma User's Guide. > > Best, > > Jim > > > >> Thanks, >> Joaquin >> >> >> >> >> >> On Wed, Mar 7, 2012 at 12:01 PM, James W. MacDonald <jmacdon@uw.edu<mailto:>> jmacdon@uw.edu>> wrote: >> >> Hi Joaquin, >> >> >> On 3/7/2012 11:08 AM, Joaquin Martinez wrote: >> >> Dear All, >> >> My colleague Ben and I have fitted a linear model to >> microarray data using >> the limma package but we get "coefficients not estimable" for >> the last two >> columns in the design matrix during the fit. As a consequence >> when trying >> to fit our contrast matrix we get the following error message, >> "Error in >> contrasts.fit(fit, contrasts = A) : trying to take contrast of >> non-estimable coefficient." We have discovered that if we >> simply reorder >> the design matrix columns and contrasts matrix rows, we >> encounter the same >> error. We would very much appreciate if someone could explain >> to us why the >> not-estimable coefficients arise, and how to correct the problem. >> >> We are following the guide (section 8.2) for limma 3.10.0 >> using this >> version of R 2.14.0 (sessionInfo at end of email). We have 6 >> different >> types of samples (Puninf, P86, P163, Muninf, M86, M163), and 3 >> biological >> replicates of each (1,2,3; 4,5,6; 7,8,9; 10,11,12; 13,14,15; >> 16,17,18). >> >> >> Yes, but your targets file has numbers appended to the sample >> types, which makes R think they are different. Hypothetically you >> would have noticed this when using the modelMatrix() function: >> >> >> > modelMatrix(targets, ref = "Puninf.1") >> Found unique target names: >> M163.16 M163.17 M163.18 M86.13 M86.14 M86.15 Muninf.10 Muninf.11 >> Muninf.12 P163.7 P163.8 P163.9 P86.4 P86.5 P86.6 Puninf.1 Puninf.2 >> Puninf.3 >> >> This indicates that limma thinks all of your replicates are >> different sample types. This won't work out (as you have already >> found) because the model will be over specified, which simply >> means that you cannot estimate all the parameters you have in the >> model. This is the same as saying you can't determine the values >> of x and y in the formula x + y = 3 (but if you also know that x >> - y = -1, then you can solve for both x and y). >> >> Now back to the problem at hand. If you remove the numbers from >> your sample types, >> >> targets$Cy3 <- sapply(strsplit(targets$Cy3, "\\."), function(x) x[1]) >> targets$Cy5 <- sapply(strsplit(targets$Cy5, "\\."), function(x) x[1]) >> >> then you will create the correct model matrix >> >> > modelMatrix(targets, ref="Puninf") >> Found unique target names: >> M163 M86 Muninf P163 P86 Puninf >> M163 M86 Muninf P163 P86 >> [1,] 0 0 1 0 0 >> [2,] 0 0 0 0 1 >> [3,] 0 0 0 -1 0 >> [4,] 0 -1 1 0 0 >> [5,] 1 0 -1 0 0 >> [6,] 0 1 0 0 -1 >> [7,] -1 0 0 1 0 >> [8,] 0 0 -1 0 0 >> [9,] 0 0 0 0 -1 >> [10,] 0 0 0 1 0 >> [11,] 0 1 -1 0 0 >> [12,] -1 0 1 0 0 >> [13,] 0 -1 0 0 1 >> [14,] 1 0 0 -1 0 >> [15,] 0 0 1 0 0 >> [16,] 0 0 0 0 1 >> [17,] 0 0 0 -1 0 >> [18,] 0 -1 1 0 0 >> [19,] 1 0 -1 0 0 >> [20,] 0 1 0 0 -1 >> [21,] -1 0 0 1 0 >> >> Which will compare all samples to Puninf. >> >> Best, >> >> Jim >> >> >> >> >> >> >> # The targets file looks like this >> >> targets<- structure(list(SlideNumber = c(29L, 30L, 31L, 32L, >> 33L, 34L, >> 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 51L, 46L, >> 47L, >> 48L, 50L), FileName = c("BE T22h slide 29", "BE T22h slide 30", >> "BE T22h slide 31", "BE T22h slide 32", "BE T22h slide 33", >> "BE T22h >> slide 34", >> "BE T22h slide 35", "BE T22h slide 36", "BE T22h slide 37", >> "BE T22h >> slide 38", >> "BE T22h slide 39", "BE T22h slide 40", "BE T22h slide 41", >> "BE T22h >> slide 42", >> "BE T22h slide 43", "BE T22h slide 44", "BE T22h slide 51", >> "BE T22h >> slide 46", >> "BE T22h slide 47", "BE T22h slide 48", "BE T22h slide 50"), >> Cy3 = c("Puninf.1", "Puninf.1", "P163.7", "M86.13", >> "Muninf.10", >> "P86.4", "M163.16", "Muninf.11", "P86.5", "Puninf.2", >> "Muninf.11", >> "M163.17", "M86.14", "P163.8", "Puninf.3", "Puninf.3", >> "P163.9", >> "M86.15", "Muninf.12", "P86.6", "M163.18"), Cy5 = >> c("Muninf.10", >> "P86.4", "Puninf.1", "Muninf.10", "M163.16", "M86.13", >> "P163.7", >> "Puninf.2", "Puninf.2", "P163.8", "M86.14", "Muninf.11", >> "P86.5", "M163.17", "Muninf.12", "P86.6", "Puninf.3", >> "Muninf.12", >> "M163.18", "M86.15", "P163.9")), .Names = c("SlideNumber", >> "FileName", "Cy3", "Cy5"), class = "data.frame", row.names >> = c(NA, >> -21L)) >> >> # for readability we also paste it in here... >> #> targets >> # SlideNumber FileName Cy3 Cy5 >> # 1 29 BE T22h slide 29 Puninf.1 Muninf.10 >> # 2 30 BE T22h slide 30 Puninf.1 P86.4 >> # 3 31 BE T22h slide 31 P163.7 Puninf.1 >> # 4 32 BE T22h slide 32 M86.13 Muninf.10 >> # 5 33 BE T22h slide 33 Muninf.10 M163.16 >> # 6 34 BE T22h slide 34 P86.4 M86.13 >> # 7 35 BE T22h slide 35 M163.16 P163.7 >> # 8 36 BE T22h slide 36 Muninf.11 Puninf.2 >> # 9 37 BE T22h slide 37 P86.5 Puninf.2 >> # 10 38 BE T22h slide 38 Puninf.2 P163.8 >> # 11 39 BE T22h slide 39 Muninf.11 M86.14 >> # 12 40 BE T22h slide 40 M163.17 Muninf.11 >> # 13 41 BE T22h slide 41 M86.14 P86.5 >> # 14 42 BE T22h slide 42 P163.8 M163.17 >> # 15 43 BE T22h slide 43 Puninf.3 Muninf.12 >> # 16 44 BE T22h slide 44 Puninf.3 P86.6 >> # 17 51 BE T22h slide 51 P163.9 Puninf.3 >> # 18 46 BE T22h slide 46 M86.15 Muninf.12 >> # 19 47 BE T22h slide 47 Muninf.12 M163.18 >> # 20 48 BE T22h slide 48 P86.6 M86.15 >> # 21 50 BE T22h slide 50 M163.18 P163.9 >> >> # The design matrix is computed like this design<- >> modelMatrix(targets, >> ref = "Puninf.1") >> # but here is the structure if it is helpful >> design<- structure(c(0, 0, 0, 0, 1, 0, -1, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 1, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 1, 0, -1, 0, 0, 0, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, -1, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, -1, 0, 1, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, -1, >> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, >> 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 1, 0, 0, 0, -1, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> -1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, >> 0, 1, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, 1, 0, 0, 0, 0), .Dim = >> c(21L, >> 17L), .Dimnames = list(c("slide 29", "slide 30", "slide 31", >> "slide 32", "slide 33", "slide 34", "slide 35", "slide 36", >> "slide 37", >> "slide 38", "slide 39", "slide 40", "slide 41", "slide 42", >> "slide 43", >> "slide 44", "slide 51", "slide 46", "slide 47", "slide 48", >> "slide 50" >> ), c("M163.16", "M163.17", "M163.18", "M86.13", "M86.14", >> "M86.15", >> "Muninf.10", "Muninf.11", "Muninf.12", "P163.7", "P163.8", >> "P163.9", >> "P86.4", "P86.5", "P86.6", "Puninf.2", "Puninf.3"))) >> >> >> # Fit linear model - commented out becuase we can't really >> send all the >> data in MA >> # fit<- lmFit(MA, design = design) >> # Coefficients not estimable: Puninf.2 Puninf.3 >> >> # Make a contrast matrix: >> A<- >> makeContrasts(PuninfvsMuninf=(**Muninf.10+Muninf.11+Muninf.12-** >> Puninf.2-Puninf.3)/3, >> levels = design) >> >> # and now fit the contrast >> con.fit<- contrasts.fit(fit, contrasts = A) >> Error in contrasts.fit(fit, contrasts = A) : trying to take >> contrast of >> non-estimable coefficient >> >> # now change the column order in the design matrix and row >> order of >> contrast matrix >> # we create a new sort index - arbitrarily placing the last >> two columns in >> the original >> # design matrix in the middle of the new one. Ditto for the >> rows of the >> contrast matrix. >> n<- nrow(A) >> newIndex<- c(1:5, n-1, n, 6:(n-2)) >> designB<- design[,newIndex] >> B<- A[newIndex,] >> attributes(B)<- attributes(A) >> >> # create a new fit, but we get the same knd of error, the last >> two columns >> are >> # identified as non-estimable >> fitB<- lmFit(X$MA, design = designB) >> #Coefficients not estimable: P86.5 P86.6 >> #Warning message: >> #Partial NA coefficients for 16278 probe(s) >> >> >> As far as we can tell, it's the trailing columns that get >> singled out, but >> we don't know why. Hopefully the information about is detailed >> enough, but >> we can provide more information if necessary to help >> troubleshooting. >> >> Thanks in advance, >> >> Joaquin >> >> sessionInfo() >> >> R version 2.14.0 (2011-10-31) >> Platform: i386-apple-darwin9.8.0/i386 (32-bit) >> >> locale: >> [1] en_US.UTF-8/en_US.UTF-8/en_US.**UTF-8/C/en_US.UTF-8/en_US.UTF- >> **8 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods >> base >> >> other attached packages: >> [1] limma_3.10.0 >> >> loaded via a namespace (and not attached): >> [1] tools_2.14.0 >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org <mailto:bioconductor@r-**project.org<bioconductor@r-project.org> >> > >> >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: >> http://news.gmane.org/gmane.**science.biology.informatics.** >> conductor<http: news.gmane.org="" gmane.science.biology.informatics.c="" onductor=""> >> >> >> -- James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> >> > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
Hi Joaquin, I think you may have painted yourself into a corner with this experiment. You have done something similar to the technical replication examples in section 8.2, but the problem is that you haven't done enough replicates to analyze that way. Note that in the examples in section 8.2, there are at least two identical (or dye swap) slides for each comparison, whereas you only have one. The experiment at the bottom of p.40 only has mu and wt samples, with three technical replicates each, and they used 18 slides! To do something similar, you would need 54 slides, not 21. If this were my experiment to analyze, I would probably go with a single channel analysis with either a blocking variable or duplicateCorrelation() to account for the technical replicates. This is a non-trivial analysis and I would highly recommend getting some help from a local statistician, preferably one with microarray experience. Best, Jim On 3/7/2012 6:26 PM, Joaquin Martinez wrote: > Hi Jim, > > Once again, thank you for your helpful advice. Maybe I was not very > clear in my original email. In the targets file that I sent, we have > both biological replication (e.g. Puninf.1, Puninf.2 and Puninf.3) and > technical replication (e.g. Puninf.1 present in slides29, 30 and 31) > instead of including a common reference sample in all hybridizations. > Ours is more like the "Direct Two-Color Designs" in section 7.4 > including "Technical Replication" as in section 8.2. Based on this, > does your previous advice still hold? Should we remove the numbers > from the sample names? Or are there any further considerations we > should take into account in our analysis? > > I truly appreciate your help. > > Best, > Joaquin > > On Wed, Mar 7, 2012 at 3:07 PM, James W. MacDonald <jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">> wrote: > > Hi Joaquin, > > > On 3/7/2012 2:12 PM, Joaquin Martinez wrote: > > Hi Jim, > > Thank you for your quick reply and advice. Actually we had > considered doing exactly what you suggest and remove the > numbers from the sample names. My concern then is what is then > the reference or baseline?, i.e. Puninf.1 or an average value > of Puninf.1, Puninf.2 and Puninf.3. Also, how does removing > numbers take into consideration slight value differences among > biological replicate samples? > > > The baseline is the average of the Puninf samples. Which is, I > believe, what you want. To make the comparisons you request, limma > is going to compute a t-statistic, which simply put is > > difference in means between two samples/ measure of the > variability of the means > > So the numerator of the statistic tells you how different the two > samples are. The denominator (which is a measure of the slight > differences among biological replicates) is used to determine if > the differences between samples is 'large' or not. If there is a > lot of variability within sample types, the numerator has to be > pretty big. Conversely, if there isn't much variability within > sample types then a smaller numerator may be considered > significantly large. > > Note that you aren't forced to use Puninf as the reference. That's > just the easiest way to use modelMatrix(). You can also pass in a > matrix for the 'param' argument, selecting direct comparisons to > make. Or you can just leave Puninf as the reference, and make > further comparisons with a contrast matrix, which is probably simpler. > > Note here that the design matrix with Puninf as reference is > computing implicit comparisons (e.g., the M163 coefficient is > really log2(M163) - log2(Puninf)). So if you want to know if there > is a significant difference between M163 and M86, you can just add > a contrast matrix that looks like > > 1 > -1 > 0 > 0 > > Which simply means that we will be computing log2(M163) - > log2(Puninf) - (log2(M86) - log2(Puninf)). Algebraically, the two > log2(Puninf) will cancel out, and you end up with log2(M163) - > log2(M86). You can create any contrast matrix you want using > makeContrasts(). > > > > In the example we are following in the limma user's guide > there are 3 wild-type mice (wt1,wt2,wt3) and 3 mutant mice > (mu1, mu2, mu3), where wt1 is used as the reference. I think > we have the same type of study but with 6 "mouse types". > > > Unless I am mistaken, you are looking at the wrong example. The > example you reference involves technical replication, where RNA > from each mouse was hybridized to more than one slide. From what > you said originally, each of the samples in your experiment is a > biological replicate. Your experiment is more similar to the > common reference design experiment on p34 of the limma User's Guide. > > Best, > > Jim > > > > Thanks, > Joaquin > > > > > > On Wed, Mar 7, 2012 at 12:01 PM, James W. MacDonald > <jmacdon at="" uw.edu="" <mailto:jmacdon="" at="" uw.edu=""> <mailto:jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">>> wrote: > > Hi Joaquin, > > > On 3/7/2012 11:08 AM, Joaquin Martinez wrote: > > Dear All, > > My colleague Ben and I have fitted a linear model to > microarray data using > the limma package but we get "coefficients not > estimable" for > the last two > columns in the design matrix during the fit. As a > consequence > when trying > to fit our contrast matrix we get the following error > message, > "Error in > contrasts.fit(fit, contrasts = A) : trying to take > contrast of > non-estimable coefficient." We have discovered that if we > simply reorder > the design matrix columns and contrasts matrix rows, we > encounter the same > error. We would very much appreciate if someone could > explain > to us why the > not-estimable coefficients arise, and how to correct > the problem. > > We are following the guide (section 8.2) for limma 3.10.0 > using this > version of R 2.14.0 (sessionInfo at end of email). We > have 6 > different > types of samples (Puninf, P86, P163, Muninf, M86, > M163), and 3 > biological > replicates of each (1,2,3; 4,5,6; 7,8,9; 10,11,12; > 13,14,15; > 16,17,18). > > > Yes, but your targets file has numbers appended to the sample > types, which makes R think they are different. > Hypothetically you > would have noticed this when using the modelMatrix() function: > > > > modelMatrix(targets, ref = "Puninf.1") > Found unique target names: > M163.16 M163.17 M163.18 M86.13 M86.14 M86.15 Muninf.10 > Muninf.11 > Muninf.12 P163.7 P163.8 P163.9 P86.4 P86.5 P86.6 Puninf.1 > Puninf.2 > Puninf.3 > > This indicates that limma thinks all of your replicates are > different sample types. This won't work out (as you have > already > found) because the model will be over specified, which simply > means that you cannot estimate all the parameters you have > in the > model. This is the same as saying you can't determine the > values > of x and y in the formula x + y = 3 (but if you also know > that x > - y = -1, then you can solve for both x and y). > > Now back to the problem at hand. If you remove the numbers from > your sample types, > > targets$Cy3 <- sapply(strsplit(targets$Cy3, "\\."), > function(x) x[1]) > targets$Cy5 <- sapply(strsplit(targets$Cy5, "\\."), > function(x) x[1]) > > then you will create the correct model matrix > > > modelMatrix(targets, ref="Puninf") > Found unique target names: > M163 M86 Muninf P163 P86 Puninf > M163 M86 Muninf P163 P86 > [1,] 0 0 1 0 0 > [2,] 0 0 0 0 1 > [3,] 0 0 0 -1 0 > [4,] 0 -1 1 0 0 > [5,] 1 0 -1 0 0 > [6,] 0 1 0 0 -1 > [7,] -1 0 0 1 0 > [8,] 0 0 -1 0 0 > [9,] 0 0 0 0 -1 > [10,] 0 0 0 1 0 > [11,] 0 1 -1 0 0 > [12,] -1 0 1 0 0 > [13,] 0 -1 0 0 1 > [14,] 1 0 0 -1 0 > [15,] 0 0 1 0 0 > [16,] 0 0 0 0 1 > [17,] 0 0 0 -1 0 > [18,] 0 -1 1 0 0 > [19,] 1 0 -1 0 0 > [20,] 0 1 0 0 -1 > [21,] -1 0 0 1 0 > > Which will compare all samples to Puninf. > > Best, > > Jim > > > > > > > # The targets file looks like this > > targets<- structure(list(SlideNumber = c(29L, 30L, 31L, > 32L, > 33L, 34L, > 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, > 51L, 46L, > 47L, > 48L, 50L), FileName = c("BE T22h slide 29", "BE T22h > slide 30", > "BE T22h slide 31", "BE T22h slide 32", "BE T22h > slide 33", > "BE T22h > slide 34", > "BE T22h slide 35", "BE T22h slide 36", "BE T22h > slide 37", > "BE T22h > slide 38", > "BE T22h slide 39", "BE T22h slide 40", "BE T22h > slide 41", > "BE T22h > slide 42", > "BE T22h slide 43", "BE T22h slide 44", "BE T22h > slide 51", > "BE T22h > slide 46", > "BE T22h slide 47", "BE T22h slide 48", "BE T22h > slide 50"), > Cy3 = c("Puninf.1", "Puninf.1", "P163.7", "M86.13", > "Muninf.10", > "P86.4", "M163.16", "Muninf.11", "P86.5", "Puninf.2", > "Muninf.11", > "M163.17", "M86.14", "P163.8", "Puninf.3", "Puninf.3", > "P163.9", > "M86.15", "Muninf.12", "P86.6", "M163.18"), Cy5 = > c("Muninf.10", > "P86.4", "Puninf.1", "Muninf.10", "M163.16", "M86.13", > "P163.7", > "Puninf.2", "Puninf.2", "P163.8", "M86.14", > "Muninf.11", > "P86.5", "M163.17", "Muninf.12", "P86.6", "Puninf.3", > "Muninf.12", > "M163.18", "M86.15", "P163.9")), .Names = > c("SlideNumber", > "FileName", "Cy3", "Cy5"), class = "data.frame", > row.names > = c(NA, > -21L)) > > # for readability we also paste it in here... > #> targets > # SlideNumber FileName Cy3 Cy5 > # 1 29 BE T22h slide 29 Puninf.1 Muninf.10 > # 2 30 BE T22h slide 30 Puninf.1 P86.4 > # 3 31 BE T22h slide 31 P163.7 Puninf.1 > # 4 32 BE T22h slide 32 M86.13 Muninf.10 > # 5 33 BE T22h slide 33 Muninf.10 M163.16 > # 6 34 BE T22h slide 34 P86.4 M86.13 > # 7 35 BE T22h slide 35 M163.16 P163.7 > # 8 36 BE T22h slide 36 Muninf.11 Puninf.2 > # 9 37 BE T22h slide 37 P86.5 Puninf.2 > # 10 38 BE T22h slide 38 Puninf.2 P163.8 > # 11 39 BE T22h slide 39 Muninf.11 M86.14 > # 12 40 BE T22h slide 40 M163.17 Muninf.11 > # 13 41 BE T22h slide 41 M86.14 P86.5 > # 14 42 BE T22h slide 42 P163.8 M163.17 > # 15 43 BE T22h slide 43 Puninf.3 Muninf.12 > # 16 44 BE T22h slide 44 Puninf.3 P86.6 > # 17 51 BE T22h slide 51 P163.9 Puninf.3 > # 18 46 BE T22h slide 46 M86.15 Muninf.12 > # 19 47 BE T22h slide 47 Muninf.12 M163.18 > # 20 48 BE T22h slide 48 P86.6 M86.15 > # 21 50 BE T22h slide 50 M163.18 P163.9 > > # The design matrix is computed like this design<- > modelMatrix(targets, > ref = "Puninf.1") > # but here is the structure if it is helpful > design<- structure(c(0, 0, 0, 0, 1, 0, -1, 0, 0, 0, 0, > 0, 0, > 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, > 0, 1, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, > 0, 0, 0, 1, 0, -1, 0, 0, 0, -1, 0, 1, 0, 0, 0, 0, 0, > 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 1, 0, -1, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, > 0, 0, 0, 0, -1, 0, 1, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, > 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, > 0, 0, -1, > 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, > 0, 0, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, -1, 0, 0, 0, 1, > 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 1, > 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 1, 0, 0, 0, > -1, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, > -1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 0, > 0, 0, 0, > 0, 1, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, 1, 0, 0, 0, 0), > .Dim = > c(21L, > 17L), .Dimnames = list(c("slide 29", "slide 30", > "slide 31", > "slide 32", "slide 33", "slide 34", "slide 35", > "slide 36", > "slide 37", > "slide 38", "slide 39", "slide 40", "slide 41", > "slide 42", > "slide 43", > "slide 44", "slide 51", "slide 46", "slide 47", > "slide 48", > "slide 50" > ), c("M163.16", "M163.17", "M163.18", "M86.13", > "M86.14", > "M86.15", > "Muninf.10", "Muninf.11", "Muninf.12", "P163.7", > "P163.8", > "P163.9", > "P86.4", "P86.5", "P86.6", "Puninf.2", "Puninf.3"))) > > > # Fit linear model - commented out becuase we can't really > send all the > data in MA > # fit<- lmFit(MA, design = design) > # Coefficients not estimable: Puninf.2 Puninf.3 > > # Make a contrast matrix: > A<- > > makeContrasts(PuninfvsMuninf=(Muninf.10+Muninf.11+Muninf.12 -Puninf.2-Puninf.3)/3, > levels = design) > > # and now fit the contrast > con.fit<- contrasts.fit(fit, contrasts = A) > Error in contrasts.fit(fit, contrasts = A) : trying to take > contrast of > non-estimable coefficient > > # now change the column order in the design matrix and row > order of > contrast matrix > # we create a new sort index - arbitrarily placing the last > two columns in > the original > # design matrix in the middle of the new one. Ditto > for the > rows of the > contrast matrix. > n<- nrow(A) > newIndex<- c(1:5, n-1, n, 6:(n-2)) > designB<- design[,newIndex] > B<- A[newIndex,] > attributes(B)<- attributes(A) > > # create a new fit, but we get the same knd of error, > the last > two columns > are > # identified as non-estimable > fitB<- lmFit(X$MA, design = designB) > #Coefficients not estimable: P86.5 P86.6 > #Warning message: > #Partial NA coefficients for 16278 probe(s) > > > As far as we can tell, it's the trailing columns that get > singled out, but > we don't know why. Hopefully the information about is > detailed > enough, but > we can provide more information if necessary to help > troubleshooting. > > Thanks in advance, > > Joaquin > > sessionInfo() > > R version 2.14.0 (2011-10-31) > Platform: i386-apple-darwin9.8.0/i386 (32-bit) > > locale: > [1] > en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets > methods > base > > other attached packages: > [1] limma_3.10.0 > > loaded via a namespace (and not attached): > [1] tools_2.14.0 > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > <mailto:bioconductor at="" r-project.org=""> <mailto:bioconductor at="" r-project.org="">> > > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD REPLY
0
Entering edit mode
Hi Jim, Thank you so much for your time and input. I reckon we should try going with a single channel analysis... Hopefully we can still get out of the "corner". Best, Joaquin On Thu, Mar 8, 2012 at 11:25 AM, James W. MacDonald <jmacdon@uw.edu> wrote: > Hi Joaquin, > > I think you may have painted yourself into a corner with this experiment. > You have done something similar to the technical replication examples in > section 8.2, but the problem is that you haven't done enough replicates to > analyze that way. Note that in the examples in section 8.2, there are at > least two identical (or dye swap) slides for each comparison, whereas you > only have one. > > The experiment at the bottom of p.40 only has mu and wt samples, with > three technical replicates each, and they used 18 slides! To do something > similar, you would need 54 slides, not 21. > > If this were my experiment to analyze, I would probably go with a single > channel analysis with either a blocking variable or duplicateCorrelation() > to account for the technical replicates. This is a non-trivial analysis and > I would highly recommend getting some help from a local statistician, > preferably one with microarray experience. > > Best, > > Jim > > > > > On 3/7/2012 6:26 PM, Joaquin Martinez wrote: > >> Hi Jim, >> >> Once again, thank you for your helpful advice. Maybe I was not very >> clear in my original email. In the targets file that I sent, we have both >> biological replication (e.g. Puninf.1, Puninf.2 and Puninf.3) and technical >> replication (e.g. Puninf.1 present in slides29, 30 and 31) instead of >> including a common reference sample in all hybridizations. Ours is more >> like the "Direct Two-Color Designs" in section 7.4 including "Technical >> Replication" as in section 8.2. Based on this, does your previous advice >> still hold? Should we remove the numbers from the sample names? Or are >> there any further considerations we should take into account in our >> analysis? >> >> I truly appreciate your help. >> >> Best, >> Joaquin >> >> On Wed, Mar 7, 2012 at 3:07 PM, James W. MacDonald <jmacdon@uw.edu<mailto:>> jmacdon@uw.edu>> wrote: >> >> Hi Joaquin, >> >> >> On 3/7/2012 2:12 PM, Joaquin Martinez wrote: >> >> Hi Jim, >> >> Thank you for your quick reply and advice. Actually we had >> considered doing exactly what you suggest and remove the >> numbers from the sample names. My concern then is what is then >> the reference or baseline?, i.e. Puninf.1 or an average value >> of Puninf.1, Puninf.2 and Puninf.3. Also, how does removing >> numbers take into consideration slight value differences among >> biological replicate samples? >> >> >> The baseline is the average of the Puninf samples. Which is, I >> believe, what you want. To make the comparisons you request, limma >> is going to compute a t-statistic, which simply put is >> >> difference in means between two samples/ measure of the >> variability of the means >> >> So the numerator of the statistic tells you how different the two >> samples are. The denominator (which is a measure of the slight >> differences among biological replicates) is used to determine if >> the differences between samples is 'large' or not. If there is a >> lot of variability within sample types, the numerator has to be >> pretty big. Conversely, if there isn't much variability within >> sample types then a smaller numerator may be considered >> significantly large. >> >> Note that you aren't forced to use Puninf as the reference. That's >> just the easiest way to use modelMatrix(). You can also pass in a >> matrix for the 'param' argument, selecting direct comparisons to >> make. Or you can just leave Puninf as the reference, and make >> further comparisons with a contrast matrix, which is probably simpler. >> >> Note here that the design matrix with Puninf as reference is >> computing implicit comparisons (e.g., the M163 coefficient is >> really log2(M163) - log2(Puninf)). So if you want to know if there >> is a significant difference between M163 and M86, you can just add >> a contrast matrix that looks like >> >> 1 >> -1 >> 0 >> 0 >> >> Which simply means that we will be computing log2(M163) - >> log2(Puninf) - (log2(M86) - log2(Puninf)). Algebraically, the two >> log2(Puninf) will cancel out, and you end up with log2(M163) - >> log2(M86). You can create any contrast matrix you want using >> makeContrasts(). >> >> >> >> In the example we are following in the limma user's guide >> there are 3 wild-type mice (wt1,wt2,wt3) and 3 mutant mice >> (mu1, mu2, mu3), where wt1 is used as the reference. I think >> we have the same type of study but with 6 "mouse types". >> >> >> Unless I am mistaken, you are looking at the wrong example. The >> example you reference involves technical replication, where RNA >> from each mouse was hybridized to more than one slide. From what >> you said originally, each of the samples in your experiment is a >> biological replicate. Your experiment is more similar to the >> common reference design experiment on p34 of the limma User's Guide. >> >> Best, >> >> Jim >> >> >> >> Thanks, >> Joaquin >> >> >> >> >> >> On Wed, Mar 7, 2012 at 12:01 PM, James W. MacDonald >> <jmacdon@uw.edu <mailto:jmacdon@uw.edu=""> <mailto:jmacdon@uw.edu>> >> <mailto:jmacdon@uw.edu>>> wrote: >> >> Hi Joaquin, >> >> >> On 3/7/2012 11:08 AM, Joaquin Martinez wrote: >> >> Dear All, >> >> My colleague Ben and I have fitted a linear model to >> microarray data using >> the limma package but we get "coefficients not >> estimable" for >> the last two >> columns in the design matrix during the fit. As a >> consequence >> when trying >> to fit our contrast matrix we get the following error >> message, >> "Error in >> contrasts.fit(fit, contrasts = A) : trying to take >> contrast of >> non-estimable coefficient." We have discovered that if we >> simply reorder >> the design matrix columns and contrasts matrix rows, we >> encounter the same >> error. We would very much appreciate if someone could >> explain >> to us why the >> not-estimable coefficients arise, and how to correct >> the problem. >> >> We are following the guide (section 8.2) for limma 3.10.0 >> using this >> version of R 2.14.0 (sessionInfo at end of email). We >> have 6 >> different >> types of samples (Puninf, P86, P163, Muninf, M86, >> M163), and 3 >> biological >> replicates of each (1,2,3; 4,5,6; 7,8,9; 10,11,12; >> 13,14,15; >> 16,17,18). >> >> >> Yes, but your targets file has numbers appended to the sample >> types, which makes R think they are different. >> Hypothetically you >> would have noticed this when using the modelMatrix() function: >> >> >> > modelMatrix(targets, ref = "Puninf.1") >> Found unique target names: >> M163.16 M163.17 M163.18 M86.13 M86.14 M86.15 Muninf.10 >> Muninf.11 >> Muninf.12 P163.7 P163.8 P163.9 P86.4 P86.5 P86.6 Puninf.1 >> Puninf.2 >> Puninf.3 >> >> This indicates that limma thinks all of your replicates are >> different sample types. This won't work out (as you have >> already >> found) because the model will be over specified, which simply >> means that you cannot estimate all the parameters you have >> in the >> model. This is the same as saying you can't determine the >> values >> of x and y in the formula x + y = 3 (but if you also know >> that x >> - y = -1, then you can solve for both x and y). >> >> Now back to the problem at hand. If you remove the numbers from >> your sample types, >> >> targets$Cy3 <- sapply(strsplit(targets$Cy3, "\\."), >> function(x) x[1]) >> targets$Cy5 <- sapply(strsplit(targets$Cy5, "\\."), >> function(x) x[1]) >> >> then you will create the correct model matrix >> >> > modelMatrix(targets, ref="Puninf") >> Found unique target names: >> M163 M86 Muninf P163 P86 Puninf >> M163 M86 Muninf P163 P86 >> [1,] 0 0 1 0 0 >> [2,] 0 0 0 0 1 >> [3,] 0 0 0 -1 0 >> [4,] 0 -1 1 0 0 >> [5,] 1 0 -1 0 0 >> [6,] 0 1 0 0 -1 >> [7,] -1 0 0 1 0 >> [8,] 0 0 -1 0 0 >> [9,] 0 0 0 0 -1 >> [10,] 0 0 0 1 0 >> [11,] 0 1 -1 0 0 >> [12,] -1 0 1 0 0 >> [13,] 0 -1 0 0 1 >> [14,] 1 0 0 -1 0 >> [15,] 0 0 1 0 0 >> [16,] 0 0 0 0 1 >> [17,] 0 0 0 -1 0 >> [18,] 0 -1 1 0 0 >> [19,] 1 0 -1 0 0 >> [20,] 0 1 0 0 -1 >> [21,] -1 0 0 1 0 >> >> Which will compare all samples to Puninf. >> >> Best, >> >> Jim >> >> >> >> >> >> >> # The targets file looks like this >> >> targets<- structure(list(SlideNumber = c(29L, 30L, 31L, >> 32L, >> 33L, 34L, >> 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, >> 51L, 46L, >> 47L, >> 48L, 50L), FileName = c("BE T22h slide 29", "BE T22h >> slide 30", >> "BE T22h slide 31", "BE T22h slide 32", "BE T22h >> slide 33", >> "BE T22h >> slide 34", >> "BE T22h slide 35", "BE T22h slide 36", "BE T22h >> slide 37", >> "BE T22h >> slide 38", >> "BE T22h slide 39", "BE T22h slide 40", "BE T22h >> slide 41", >> "BE T22h >> slide 42", >> "BE T22h slide 43", "BE T22h slide 44", "BE T22h >> slide 51", >> "BE T22h >> slide 46", >> "BE T22h slide 47", "BE T22h slide 48", "BE T22h >> slide 50"), >> Cy3 = c("Puninf.1", "Puninf.1", "P163.7", "M86.13", >> "Muninf.10", >> "P86.4", "M163.16", "Muninf.11", "P86.5", "Puninf.2", >> "Muninf.11", >> "M163.17", "M86.14", "P163.8", "Puninf.3", "Puninf.3", >> "P163.9", >> "M86.15", "Muninf.12", "P86.6", "M163.18"), Cy5 = >> c("Muninf.10", >> "P86.4", "Puninf.1", "Muninf.10", "M163.16", "M86.13", >> "P163.7", >> "Puninf.2", "Puninf.2", "P163.8", "M86.14", >> "Muninf.11", >> "P86.5", "M163.17", "Muninf.12", "P86.6", "Puninf.3", >> "Muninf.12", >> "M163.18", "M86.15", "P163.9")), .Names = >> c("SlideNumber", >> "FileName", "Cy3", "Cy5"), class = "data.frame", >> row.names >> = c(NA, >> -21L)) >> >> # for readability we also paste it in here... >> #> targets >> # SlideNumber FileName Cy3 Cy5 >> # 1 29 BE T22h slide 29 Puninf.1 Muninf.10 >> # 2 30 BE T22h slide 30 Puninf.1 P86.4 >> # 3 31 BE T22h slide 31 P163.7 Puninf.1 >> # 4 32 BE T22h slide 32 M86.13 Muninf.10 >> # 5 33 BE T22h slide 33 Muninf.10 M163.16 >> # 6 34 BE T22h slide 34 P86.4 M86.13 >> # 7 35 BE T22h slide 35 M163.16 P163.7 >> # 8 36 BE T22h slide 36 Muninf.11 Puninf.2 >> # 9 37 BE T22h slide 37 P86.5 Puninf.2 >> # 10 38 BE T22h slide 38 Puninf.2 P163.8 >> # 11 39 BE T22h slide 39 Muninf.11 M86.14 >> # 12 40 BE T22h slide 40 M163.17 Muninf.11 >> # 13 41 BE T22h slide 41 M86.14 P86.5 >> # 14 42 BE T22h slide 42 P163.8 M163.17 >> # 15 43 BE T22h slide 43 Puninf.3 Muninf.12 >> # 16 44 BE T22h slide 44 Puninf.3 P86.6 >> # 17 51 BE T22h slide 51 P163.9 Puninf.3 >> # 18 46 BE T22h slide 46 M86.15 Muninf.12 >> # 19 47 BE T22h slide 47 Muninf.12 M163.18 >> # 20 48 BE T22h slide 48 P86.6 M86.15 >> # 21 50 BE T22h slide 50 M163.18 P163.9 >> >> # The design matrix is computed like this design<- >> modelMatrix(targets, >> ref = "Puninf.1") >> # but here is the structure if it is helpful >> design<- structure(c(0, 0, 0, 0, 1, 0, -1, 0, 0, 0, 0, >> 0, 0, >> 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, >> 0, 1, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, >> 0, 0, 0, 1, 0, -1, 0, 0, 0, -1, 0, 1, 0, 0, 0, 0, 0, >> 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 1, 0, -1, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, >> 0, 0, 0, 0, -1, 0, 1, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, >> 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, >> 0, 0, -1, >> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, >> 0, 0, 0, 1, 0, 0, 1, -1, 0, 0, 0, 0, -1, 0, 0, 0, 1, >> 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 1, >> 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 1, 0, 0, 0, >> -1, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, 0, >> -1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, -1, 0, 0, 0, 0, >> 0, 0, 0, >> 0, 1, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >> 0, 0, 0, >> 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, 1, 0, 0, 0, 0), >> .Dim = >> c(21L, >> 17L), .Dimnames = list(c("slide 29", "slide 30", >> "slide 31", >> "slide 32", "slide 33", "slide 34", "slide 35", >> "slide 36", >> "slide 37", >> "slide 38", "slide 39", "slide 40", "slide 41", >> "slide 42", >> "slide 43", >> "slide 44", "slide 51", "slide 46", "slide 47", >> "slide 48", >> "slide 50" >> ), c("M163.16", "M163.17", "M163.18", "M86.13", >> "M86.14", >> "M86.15", >> "Muninf.10", "Muninf.11", "Muninf.12", "P163.7", >> "P163.8", >> "P163.9", >> "P86.4", "P86.5", "P86.6", "Puninf.2", "Puninf.3"))) >> >> >> # Fit linear model - commented out becuase we can't really >> send all the >> data in MA >> # fit<- lmFit(MA, design = design) >> # Coefficients not estimable: Puninf.2 Puninf.3 >> >> # Make a contrast matrix: >> A<- >> makeContrasts(PuninfvsMuninf=(** >> Muninf.10+Muninf.11+Muninf.12-**Puninf.2-Puninf.3)/3, >> levels = design) >> >> # and now fit the contrast >> con.fit<- contrasts.fit(fit, contrasts = A) >> Error in contrasts.fit(fit, contrasts = A) : trying to take >> contrast of >> non-estimable coefficient >> >> # now change the column order in the design matrix and row >> order of >> contrast matrix >> # we create a new sort index - arbitrarily placing the last >> two columns in >> the original >> # design matrix in the middle of the new one. Ditto >> for the >> rows of the >> contrast matrix. >> n<- nrow(A) >> newIndex<- c(1:5, n-1, n, 6:(n-2)) >> designB<- design[,newIndex] >> B<- A[newIndex,] >> attributes(B)<- attributes(A) >> >> # create a new fit, but we get the same knd of error, >> the last >> two columns >> are >> # identified as non-estimable >> fitB<- lmFit(X$MA, design = designB) >> #Coefficients not estimable: P86.5 P86.6 >> #Warning message: >> #Partial NA coefficients for 16278 probe(s) >> >> >> As far as we can tell, it's the trailing columns that get >> singled out, but >> we don't know why. Hopefully the information about is >> detailed >> enough, but >> we can provide more information if necessary to help >> troubleshooting. >> >> Thanks in advance, >> >> Joaquin >> >> sessionInfo() >> >> R version 2.14.0 (2011-10-31) >> Platform: i386-apple-darwin9.8.0/i386 (32-bit) >> >> locale: >> [1] >> en_US.UTF-8/en_US.UTF-8/en_US.**UTF-8/C/en_US.UTF-8/en_US.UTF-**8 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets >> methods >> base >> >> other attached packages: >> [1] limma_3.10.0 >> >> loaded via a namespace (and not attached): >> [1] tools_2.14.0 >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org <mailto:bioconductor@r-**project.org<bioconductor@r-project.org> >> > >> <mailto:bioconductor@r-**project.org <bioconductor@r-project.org=""> >> >> <mailto:bioconductor@r-**project.org <bioconductor@r-project.org=""> >> >> >> >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: >> http://news.gmane.org/gmane.**science.biology.informatics.** >> conductor<http: news.gmane.org="" gmane.science.biology.informatics.c="" onductor=""> >> >> >> -- James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> >> >> -- James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> >> > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > -- Joaquin Martinez Martinez, Ph.D Post Doctoral Research Scientist Bigelow Laboratory for Ocean Sciences 180 McKown Point P.O. Box 475 West Boothbay Harbor, Maine 04575-0475 Office phone: 202-747-3255 EXT-403 e-mail:jmartinez@bigelow.org [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 619 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6