edgeR ANOVA testing
1
0
Entering edit mode
Michael Breen ▴ 370
@michael-breen-5999
Last seen 10.2 years ago
Hi all, I am looking for a bit of enlightenment regarding edgeR. We are applying edgeR using ANOVA-like testing. We have 2 conditions (Cases and Controls) sampled at 2 treatments (untreated and treated) with 64 biological replicates per condition (64 X 2 = 128 samples). We want to test 3 ways: 1. DE between untreated and treated for cases. 2. DE between untreated and treated for controls. 3. DE between untreated and treated responding differently between Cases and Controls. I am curious if my design matrix is set up properly and if we are testing with the right coefficients and contrasts as seen below to address these three approaches. Any insight is helpful. Yours, Michael rawdata <- read.delim("Matrix.txt", check.names=FALSE, stringsAsFactors=FALSE) targets <- read.delim("Described.txt", check.names=FALSE, stringsAsFactors=FALSE) head(targets) Condition Patient Treatments 1 Case 4060 untreated 2 Case 4060 treated 3 Case 4096 untreated 4 Case 4096 treated 5 Case 4099 untreated 6 Case 4099 treated y <- DGEList(counts=rawdata[,2:129], genes=rawdata[,1:1]) keep <- rowSums(cpm(y)>1) >= 64 y <- y[keep,] dim (y) y <- calcNormFactors(y) y$samples design <- model.matrix(~Condition+Condition:Patient+Condition: Treatments, data=targets) colnames(design) [1] "(Intercept)" "ConditionControl" [3] "ConditionCase:Patient" "ConditionControl:Patient" [5] "ConditionCase:Treatments " "ConditionControl:Treatments " y <- estimateGLMCommonDisp(y, design, verbose=TRUE) y <- estimateGLMTrendedDisp(y, design) y <- estimateGLMTagwiseDisp(y, design) fit <- glmFit(y, design) colnames(fit) # DE between untreated and treated for cases lrt <- glmLRT(fit, coef="ConditionCase: Treatments ") # DE between untreated and treated for cases lrt2 <- glmLRT(fit, coef="ConditionControl:Treatments ") # DE between untreated and treated responding differently between cases and controls lrt3 <- glmLRT(fit, contrast=c(0,0,0,0,1,-1) [[alternative HTML version deleted]]
edgeR edgeR • 1.4k views
ADD COMMENT
0
Entering edit mode
Gordon, Any insight on the below? I am looking for a bit of enlightenment regarding edgeR. We are applying edgeR using ANOVA-like testing. We have 2 conditions (Cases and Controls) sampled at 2 treatments (untreated and treated) with 64 biological replicates per condition (64 X 2 = 128 samples). We want to test 3 ways: 1. DE between untreated and treated for cases. 2. DE between untreated and treated for controls. 3. DE between untreated and treated responding differently between Cases and Controls. I am curious if my design matrix is set up properly and if we are testing with the right coefficients and contrasts as seen below to address these three approaches. Any insight is helpful. Yours, Michael rawdata <- read.delim("Matrix.txt", check.names=FALSE, stringsAsFactors=FALSE) targets <- read.delim("Described.txt", check.names=FALSE, stringsAsFactors=FALSE) head(targets) Condition Patient Treatments 1 Case 4060 untreated 2 Case 4060 treated 3 Case 4096 untreated 4 Case 4096 treated 5 Case 4099 untreated 6 Case 4099 treated y <- DGEList(counts=rawdata[,2:129], genes=rawdata[,1:1]) keep <- rowSums(cpm(y)>1) >= 64 y <- y[keep,] dim (y) y <- calcNormFactors(y) y$samples design <- model.matrix(~Condition+Condition:Patient+Condition: Treatments, data=targets) colnames(design) [1] "(Intercept)" "ConditionControl" [3] "ConditionCase:Patient" "ConditionControl:Patient" [5] "ConditionCase:Treatments " "ConditionControl:Treatments " y <- estimateGLMCommonDisp(y, design, verbose=TRUE) y <- estimateGLMTrendedDisp(y, design) y <- estimateGLMTagwiseDisp(y, design) fit <- glmFit(y, design) colnames(fit) # DE between untreated and treated for cases lrt <- glmLRT(fit, coef="ConditionCase: Treatments ") # DE between untreated and treated for cases lrt2 <- glmLRT(fit, coef="ConditionControl:Treatments ") # DE between untreated and treated responding differently between cases and controls lrt3 <- glmLRT(fit, contrast=c(0,0,0,0,1,-1) [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
@gordon-smyth
Last seen 9 minutes ago
WEHI, Melbourne, Australia

Dear Michael,

The analysis not correct at the moment because you have not declared Patient to be a factor in R.  Because the patient identifiers are numbers ("4060" etc), R by default thinks that Patient is to be treated as a numerical covariate.  You need to use the factor() function to tell R that the entries for Patient are just identifiers, not numerical
quantities.

Otherwise the analysis might be fine, but I can't tell for sure without seeing all 128 rows of the targets frame.  I think that you may be following Section 3.5 of the edgeR User's Guide.  Look carefully at how Patient is defined in that example.

There needs to be a column in the design matrix for each Patient, except for the first in each condition group.

Best wishes
Gordon

ADD COMMENT

Login before adding your answer.

Traffic: 752 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6