Question

How to determine if clinical variables are responsible for gene expression with limma

Jon Manning ▴ 40

@jon-manning-3420

Last seen 8.8 years ago

United Kingdom

Hello,

I'm new to the Bioconductor list, and fairly new to Bioconductor itself, so excuse me if the following is a stupid question; I've been looking around the list and documentation for a while without finding my answer.

The short version of my question is: "What is the most appropriate way to determine if microarray-derived gene expression is associated with any of a number of continuous and discrete clinical variables, independent of patient group/treatment type?"

The long version, with my attempt at this analysis, is as follows.

I'm currently analysing a single-channel Agilent microarray data set involving 29 patients in three clinical groups. I've been using limma, and think I've got the methods right for comparing those groups, like:

    library(limma)

    clinical_group <- c(3,2,1,1,2,3,3,1,2,1,2,3,1,3,1,2,2,1,1,3,2,3,2,1,1,2,3,2,3)
    design <- model.matrix(~ 0+factor(clinical_group))
    colnames(design) <- c("one", "two", "three")
    fit <- lmFit(esetPROC, design)
    comparisons <- c("one-three", "one-two", "three-two")
    contrast.matrix <- makeContrasts(contrasts=comparisons, levels=design)
    fit2 <- contrasts.fit(fit, contrast.matrix)
    fit2 <- eBayes(fit2)
    .......

where esetPROC is an expression set object containing normalised and corrected expression values.

However, I also have a number of continuous and discrete clinical variables associated with these patients. I'm interested in seeing if any of these variables are associated with high or low gene expression. Referring to this thread:

http://thread.gmane.org/gmane.science.biology.informatics.conductor/11402/focus=11409

I attempted to do this with a design in limma in the following manner:

    design <- model.matrix(~ 0+var1+var2+var3)
    fit <- lmFit(esetPROC, design)
    fit2 <- eBayes(fit)

where var1 etc. are continuous clinical variables.

When using all the variables together, I get very few probes significantly associated with any of them. However, if I use only one variable at a time, every variable (even nonsensical ones such as the day of the month a patient was born) seems to produce hundreds or thousands of probes with significant adjusted p-values.

I assume this is because I'm fundamentally misunderstanding something that's going on here (I'm not a mathematician) and misapplying the method. I'd appreciate any pointers as to where I'm going wrong and where my misconceptions may lie.

Regards,
Jon Manning

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
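The single-variable symptom described above can be reproduced in a small simulation. The following is a minimal sketch, not the poster's analysis: it uses simulated data and base R's lm() instead of limma (an assumption made to keep the example self-contained), and the names expr and birth_day are hypothetical. It suggests one possible contributor: with `~ 0 + variable` there is no intercept, so the variable's coefficient also has to account for the average expression level and can look significant even when the variable is meaningless.

```r
set.seed(1)
n_patients <- 29
n_probes <- 1000

# Simulated log2 expression: each probe has a mean near 8 and no real
# relationship to any clinical variable (hypothetical data).
expr <- matrix(rnorm(n_probes * n_patients, mean = 8, sd = 1),
               nrow = n_probes)

# A meaningless covariate: day of the month each patient was born.
birth_day <- sample(1:28, n_patients, replace = TRUE)

# Model without an intercept, mirroring model.matrix(~ 0 + var1).
p_no_intercept <- apply(expr, 1, function(y) {
  coef(summary(lm(y ~ 0 + birth_day)))["birth_day", "Pr(>|t|)"]
})

# Model with an intercept: the slope is tested separately from the mean.
p_with_intercept <- apply(expr, 1, function(y) {
  coef(summary(lm(y ~ birth_day)))["birth_day", "Pr(>|t|)"]
})

# Count probes "significant" at 5% after Benjamini-Hochberg adjustment.
n_sig_no_intercept <- sum(p.adjust(p_no_intercept, method = "BH") < 0.05)
n_sig_with_intercept <- sum(p.adjust(p_with_intercept, method = "BH") < 0.05)

print(c(no_intercept = n_sig_no_intercept,
        with_intercept = n_sig_with_intercept))
```

In this sketch the no-intercept model should flag most probes while the model with an intercept should flag few or none. In limma terms, the analogous change would presumably be to keep the intercept in the design (for example model.matrix(~ var1)) and to test only that variable's coefficient, for example with topTable(fit2, coef = "var1"); whether this fully explains the behaviour seen with the real data is not something the simulation can confirm.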

Microarray limma • 1.2k views

15.6 years ago Jon Manning ▴ 40