Question

edgeR ANOVA testing

0

Entering edit mode

Michael Breen ▴ 370

@michael-breen-5999

Last seen 10.6 years ago

Hi all, I am looking for a bit of enlightenment regarding edgeR. We are applying edgeR using ANOVA-like testing. We have 2 conditions (Cases and Controls) sampled at 2 treatments (untreated and treated) with 64 biological replicates per condition (64 X 2 = 128 samples). We want to test 3 ways: 1. DE between untreated and treated for cases. 2. DE between untreated and treated for controls. 3. DE between untreated and treated responding differently between Cases and Controls. I am curious if my design matrix is set up properly and if we are testing with the right coefficients and contrasts as seen below to address these three approaches. Any insight is helpful. Yours, Michael rawdata <- read.delim("Matrix.txt", check.names=FALSE, stringsAsFactors=FALSE) targets <- read.delim("Described.txt", check.names=FALSE, stringsAsFactors=FALSE) head(targets) Condition Patient Treatments 1 Case 4060 untreated 2 Case 4060 treated 3 Case 4096 untreated 4 Case 4096 treated 5 Case 4099 untreated 6 Case 4099 treated y <- DGEList(counts=rawdata[,2:129], genes=rawdata[,1:1]) keep <- rowSums(cpm(y)>1) >= 64 y <- y[keep,] dim (y) y <- calcNormFactors(y) y$samples design <- model.matrix(~Condition+Condition:Patient+Condition: Treatments, data=targets) colnames(design) [1] "(Intercept)" "ConditionControl" [3] "ConditionCase:Patient" "ConditionControl:Patient" [5] "ConditionCase:Treatments " "ConditionControl:Treatments " y <- estimateGLMCommonDisp(y, design, verbose=TRUE) y <- estimateGLMTrendedDisp(y, design) y <- estimateGLMTagwiseDisp(y, design) fit <- glmFit(y, design) colnames(fit) # DE between untreated and treated for cases lrt <- glmLRT(fit, coef="ConditionCase: Treatments ") # DE between untreated and treated for cases lrt2 <- glmLRT(fit, coef="ConditionControl:Treatments ") # DE between untreated and treated responding differently between cases and controls lrt3 <- glmLRT(fit, contrast=c(0,0,0,0,1,-1) [[alternative HTML version deleted]]

edgeR edgeR • 1.5k views

ADD COMMENT • link 11.8 years ago Michael Breen ▴ 370

0

Entering edit mode

Gordon, Any insight on the below? I am looking for a bit of enlightenment regarding edgeR. We are applying edgeR using ANOVA-like testing. We have 2 conditions (Cases and Controls) sampled at 2 treatments (untreated and treated) with 64 biological replicates per condition (64 X 2 = 128 samples). We want to test 3 ways: 1. DE between untreated and treated for cases. 2. DE between untreated and treated for controls. 3. DE between untreated and treated responding differently between Cases and Controls. I am curious if my design matrix is set up properly and if we are testing with the right coefficients and contrasts as seen below to address these three approaches. Any insight is helpful. Yours, Michael rawdata <- read.delim("Matrix.txt", check.names=FALSE, stringsAsFactors=FALSE) targets <- read.delim("Described.txt", check.names=FALSE, stringsAsFactors=FALSE) head(targets) Condition Patient Treatments 1 Case 4060 untreated 2 Case 4060 treated 3 Case 4096 untreated 4 Case 4096 treated 5 Case 4099 untreated 6 Case 4099 treated y <- DGEList(counts=rawdata[,2:129], genes=rawdata[,1:1]) keep <- rowSums(cpm(y)>1) >= 64 y <- y[keep,] dim (y) y <- calcNormFactors(y) y$samples design <- model.matrix(~Condition+Condition:Patient+Condition: Treatments, data=targets) colnames(design) [1] "(Intercept)" "ConditionControl" [3] "ConditionCase:Patient" "ConditionControl:Patient" [5] "ConditionCase:Treatments " "ConditionControl:Treatments " y <- estimateGLMCommonDisp(y, design, verbose=TRUE) y <- estimateGLMTrendedDisp(y, design) y <- estimateGLMTagwiseDisp(y, design) fit <- glmFit(y, design) colnames(fit) # DE between untreated and treated for cases lrt <- glmLRT(fit, coef="ConditionCase: Treatments ") # DE between untreated and treated for cases lrt2 <- glmLRT(fit, coef="ConditionControl:Treatments ") # DE between untreated and treated responding differently between cases and controls lrt3 <- glmLRT(fit, contrast=c(0,0,0,0,1,-1) [[alternative HTML version deleted]]

ADD REPLY • link 11.8 years ago Michael Breen ▴ 370

score 0 · Answer 1 · 2013-06-25

Dear Michael,

The analysis not correct at the moment because you have not declared Patient to be a factor in R. Because the patient identifiers are numbers ("4060" etc), R by default thinks that Patient is to be treated as a numerical covariate. You need to use the factor() function to tell R that the entries for Patient are just identifiers, not numerical
quantities.

Otherwise the analysis might be fine, but I can't tell for sure without seeing all 128 rows of the targets frame. I think that you may be following Section 3.5 of the edgeR User's Guide. Look carefully at how Patient is defined in that example.

There needs to be a column in the design matrix for each Patient, except for the first in each condition group.

Best wishes
Gordon