I am attempting to analyze single-channel Agilent micro array data from two different groups. I have numerous covariates for each individual within these groups and have specified categorical covariates using the factor() method.
Sex <- targets[,"Sex"] #Note: targets is the target file containing array filenames and covariates. Levels <-c("1", "2") Sex <- factor(Sex, levels=Levels)
In order to build my model matrix with Class (the thing I am ultimately interested in comparing), batch, sex, and pH... I use the following command.
design <-model.matrix(~Class + Batch + Sex + pH)
And if I view this model matrix...
(Intercept) ClassControl Batch2 Sex2 pH 1 1 0 1 0 6.29 2 1 0 1 1 6.44 3 1 1 1 1 6.20
Question 1: Why does the model matrix statement add "control" to the first variable and "2" to the two... but not the last?
To fit the model, I use the following code...
fit <- lmFit(y0, design) #Note: y0 has been normalized and background corrected fit <-eBayes(fit, trend=TRUE) results <- topTable(fit, n=Inf, coef=2) write.table(results, 'results.txt', sep='\t', col.names=TRUE, row.names=FALSE)
Question 2: From what I read, the contrast statement is made by topTable() and not the model matrix. Thus, by specifying coef=2, I am telling limma that I'm interested in comparing the 2nd column, which corresponds to "Class." But, because the other covariates were including in the model statement, the model is adjusting for those covariates. Is that correct?
Question 3: Is there any way to view the fit (AIC/BIC) for the topTable() method. Essentially, I have other covariates and would like some objective measure on how the model is effected when they are added to the model.
I appreciate your help/feedback.
Thank you for the feedback.
Follow up Question: I want topTable() to show the regression slopes. Seeing that this is not an option in the documentation, I need to extract slopes from the MArrayLM object and pertinent information from topTable.
Does this look correct? It is a bit difficult for me to follow the information in the MAarrayLM object, so I am a bit unsure about my approach.
If there is a more efficient way to do this, I would love to hear it (I am new to R).
The regression slope is already in the topTable output. It is the column called "logFC".
So... yea... that's probably an indication that I should get some sleep.
Thanks again =)