Question

questions of using Limma: should I include all the samples?

Wu, Xiwei ▴ 350

@wu-xiwei-1102

Dear all,

I am analyzing a data set designed as follows.

Biological question: what does an inhibitor (B) do to the hormone stimulation effect (A) in a cell line?

Samples:
1) untreated (C)
2) treated with hormone alone (C + A)
3) treated with inhibitor alone (C + B)
4) treated with hormone + inhibitor (C + A + B + AB)

There are three replicates per group (12 CEL files in total).
Platform: Affymetrix GeneChip

I am assuming this is a 2x2 factorial design. My interpretation of the biological question is that the genes of interest are the intersection of the genes showing a hormone-inhibitor interaction effect (AB) and the genes responding to hormone alone (A). I am not trained as a statistician, so please correct me if I am wrong.

I am trying to use limma with the design matrix

1 0 0 0
1 0 0 0
1 0 0 0
0 1 0 0
0 1 0 0
0 1 0 0
0 0 1 0
0 0 1 0
0 0 1 0
0 0 0 1
0 0 0 1
0 0 0 1

to estimate the four coefficients C, C + A, C + B and C + A + B + AB (of course, I could estimate A, B, and AB directly using a different design matrix).

Since the contrasts of interest are A and AB, the contrast matrix should be:

-1  1  0  0
-1 -1 -1  1

My questions are:

1) Are the design and contrast matrices correct?

2) I know this is a very naive question, but if I am only interested in the hormone-only effect, can I just use the untreated and hormone-alone samples as input (that is, only the first 6 CEL files instead of all 12)? Setting aside differences introduced by normalization, will the analysis results be the same or different? If they differ, is that due to the difference in degrees of freedom?

Thanks in advance.

Xiwei Wu
Assistant Research Scientist
Beckman Research Institute
City of Hope National Medical Center
Duarte, CA 91010

score 0 · Answer 1 · 2005-02-07

> I am trying to use Limma with design matrix of
>
> 1 0 0 0
> 1 0 0 0
> 1 0 0 0
> 0 1 0 0
> 0 1 0 0
> 0 1 0 0
> 0 0 1 0
> 0 0 1 0
> 0 0 1 0
> 0 0 0 1
> 0 0 0 1
> 0 0 0 1
>
> to estimate the four coefficients of C, C+A, C+B and C+A+B+AB (of course,
> I can estimate A, B, and AB directly using a different design matrix).
>
> Since the contrast of interest is A and AB, so the contrast matrix should
> be:
> -1  1  0  0
> -1 -1 -1  1
>
> My question is:
> 1) Are the design and contrast matrix correct?

If your design matrix is right, then your contrast matrix is not: (-1, -1, -1, 1) gives you an estimate of AB - 2C, not AB. I would suggest estimating C, A, B, and AB directly, using the design matrix

1 0 0 0   (C only)
1 1 0 0   (C + A)
1 0 1 0   (C + B)
1 1 1 1   (C + A + B + AB)

with each row repeated for the three replicates of that group, and constructing your contrasts as

0 1 0 0   (test A)
0 0 0 1   (test AB)

> 2) I know this is a very naive question, but if I am only interested in
> hormone only effect, can I just use the untreated and hormone alone treated
> samples as the input (so instead of the 12 CEL files, only use the first 6
> CEL files)? Will the analysis result be the same or different if not
> counting the normalization-produced difference? If there is difference, is
> that due to the difference of df?

This only affects the estimation of the error variance, which is where you lose power. A subset of the data will usually identify fewer genes, provided you can indeed assume one model for all 12 arrays.

Hope this helps.

Fangxin

--
Fangxin Hong, Ph.D.
Plant Biology Laboratory
The Salk Institute
10010 N. Torrey Pines Rd.
La Jolla, CA 92037
E-mail: fhong@salk.edu
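The contrast arithmetic in the answer above can be checked by hand. The following is a minimal sketch in plain Python, not limma code: it writes each group mean in terms of C, A, B and AB, then expands a contrast on the four group means into those terms. The names `GROUP_MEANS`, `expand` and `residual_df` are made up for this illustration, and the contrast (1, -1, -1, 1) does not appear in the thread; it is included as the standard interaction contrast for a group-means design. The last part counts per-gene residual degrees of freedom, assuming an ordinary linear model with one coefficient per group (any moderation of the variances by limma is ignored here).

```python
# Hypothetical helper names; this only illustrates the algebra, it is not limma.
EFFECTS = ("C", "A", "B", "AB")

# Each group mean expressed as weights on (C, A, B, AB).
GROUP_MEANS = {
    "untreated":         (1, 0, 0, 0),  # C
    "hormone":           (1, 1, 0, 0),  # C + A
    "inhibitor":         (1, 0, 1, 0),  # C + B
    "hormone+inhibitor": (1, 1, 1, 1),  # C + A + B + AB
}


def expand(contrast):
    """Rewrite a contrast on the four group means in terms of C, A, B, AB."""
    rows = list(GROUP_MEANS.values())
    return tuple(
        sum(weight * row[j] for weight, row in zip(contrast, rows))
        for j in range(len(EFFECTS))
    )


def residual_df(n_arrays, n_coefficients):
    """Residual degrees of freedom of a per-gene linear model."""
    return n_arrays - n_coefficients


hormone_effect = expand((-1, 1, 0, 0))          # (0, 1, 0, 0), i.e. A
posted_interaction = expand((-1, -1, -1, 1))    # (-2, 0, 0, 1), i.e. AB - 2C
standard_interaction = expand((1, -1, -1, 1))   # (0, 0, 0, 1), i.e. AB

df_all_arrays = residual_df(12, 4)  # all four groups, three replicates each
df_subset = residual_df(6, 2)       # untreated and hormone-alone only

print(hormone_effect, posted_interaction, standard_interaction)
print(df_all_arrays, df_subset)
```

Under these assumptions the full data set leaves 8 residual degrees of freedom per gene and the six-array subset leaves 4, which is one way to see the loss of power mentioned in the answer.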