Question

EdgeR experimental design matrix

bpshiel • 0

Hi,

I'm looking for some guidance on experimental design in edgeR. The manual and the examples I can find don't cover what I'm looking for, and I'm really stuck.

My experimental design is set up like this.

Group	Tank	survival
C	1	2
C	1	2
C	2	5
C	2	5
C	3	6
C	3	6
G	4	7
G	4	7
G	5	4
G	5	4
G	6	5
G	6	5
U	7	8
U	7	8
U	8	4
U	8	4
U	9	2
U	9	2

I want to test for differences between groups (C-G, U-C, U-G) while taking Tank and Survival into account. Each sample is a separate animal, Tank refers to the tank in which the animals were kept, and Survival refers to how many animals out of 10 were left in that tank at the end of the experiment.

My problem in edgeR is that I can't estimate the dispersion no matter which design I try. I believe it's because my factors are all nested within the groups I want to compare (tanks are specific to groups, and the survival value is the same for every animal in a tank).

For example:

design = Group:Tank + Survival

Example warning message when trying to estimate the dispersion:

" Warning message:
In estimateGLMCommonDisp.default(y = y$counts, design = design, :
No residual df: setting dispersion to NA"

Can anyone help me out?

edger differential gene expression experimental design • 1.5k views

updated 8.8 years ago by Aaron Lun • written 8.8 years ago by bpshiel

score 2 · Answer 1 · 2016-06-01

Let's tackle the overall problem first. As it is, you can't do your comparisons. This is because your tanks are nested within your groups, so any attempt to block on the tanks will absorb any genuine differential expression (DE) between groups. On the other hand, ignoring the tank effect is not viable either, as you'll end up with dependencies between samples from the same tank that will distort your dispersion estimates and likelihood ratio tests.
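To make the nesting concrete, here is a minimal base R sketch that rebuilds the sample annotation from the table in the question and inspects the design matrices. The variable names (design_q, design_tank, design_both) are made up for illustration. As far as I can tell, edgeR reports "No residual df" when the design has at least as many columns as there are samples, which is what the first part checks; treat that as my reading of the warning rather than a documented rule.

```r
# Sample annotation copied from the table in the question (18 animals)
Group    <- factor(rep(c("C", "G", "U"), each = 6))
Tank     <- factor(rep(1:9, each = 2))
Survival <- c(2, 2, 5, 5, 6, 6, 7, 7, 4, 4, 5, 5, 8, 8, 4, 4, 2, 2)

# Design from the question: compare the number of coefficients to the
# number of samples
design_q  <- model.matrix(~ Group:Tank + Survival)
n_samples <- nrow(design_q)
n_coef    <- ncol(design_q)
cat("samples:", n_samples, " coefficients:", n_coef, "\n")

# Blocking on tank: adding Group on top of Tank does not increase the rank,
# so the group effects cannot be separated from the tank effects
design_tank <- model.matrix(~ Tank)
design_both <- model.matrix(~ Group + Tank)
rank_tank   <- qr(design_tank)$rank
rank_both   <- qr(design_both)$rank
cat("rank with Tank only:", rank_tank,
    " rank with Group + Tank:", rank_both, "\n")
```

Because each tank belongs to exactly one group, the Group columns are linear combinations of the Tank columns, so the two ranks agree. The same holds for Survival, which is constant within a tank.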

Now, if you were using voom and limma, I would suggest blocking on the tanks with duplicateCorrelation. This will account for the dependencies while still allowing you to do comparisons between groups. However, this functionality is not available in edgeR, so the only other option is to use sumTechReps to sum counts within each tank. This will give 9 tank-level samples that can be analyzed with ~ Group + Survival in the design matrix (using the group and survival values of each tank). There are no dependencies between samples, because you've added all dependent samples from each tank into a single sample.
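A rough sketch of the summation strategy is given below. The count matrix is simulated purely as a placeholder (you would use your own counts), and names such as y_tank, tank_group and tank_survival are invented for this example. The filtering, normalization and quasi-likelihood testing steps are one common edgeR workflow, not something your design specifically requires.

```r
library(edgeR)

set.seed(1)

# Sample annotation copied from the table in the question
Group    <- factor(rep(c("C", "G", "U"), each = 6))
Tank     <- factor(rep(1:9, each = 2))
Survival <- c(2, 2, 5, 5, 6, 6, 7, 7, 4, 4, 5, 5, 8, 8, 4, 4, 2, 2)

# Placeholder counts (simulated); replace with the real count matrix
counts <- matrix(rnbinom(2000 * 18, mu = 100, size = 5), ncol = 18,
                 dimnames = list(paste0("gene", 1:2000),
                                 paste0("animal", 1:18)))
y <- DGEList(counts)

# Sum the two animals from each tank into one tank-level sample
y_tank <- sumTechReps(y, ID = Tank)

# Tank-level covariates: take the value from the first animal in each tank
first         <- !duplicated(Tank)
tank_group    <- Group[first]
tank_survival <- Survival[first]

# 9 samples and 4 coefficients leave 5 residual df
design <- model.matrix(~ tank_group + tank_survival)

keep   <- filterByExpr(y_tank, design)
y_tank <- y_tank[keep, , keep.lib.sizes = FALSE]
y_tank <- calcNormFactors(y_tank)
y_tank <- estimateDisp(y_tank, design)
fit    <- glmQLFit(y_tank, design)

# With C as the reference level: coef 2 is G-C, coef 3 is U-C
g_vs_c <- glmQLFTest(fit, coef = 2)
u_vs_c <- glmQLFTest(fit, coef = 3)
u_vs_g <- glmQLFTest(fit, contrast = c(0, -1, 1, 0))
topTags(u_vs_g)
```

With only 5 residual degrees of freedom the dispersion estimates will be fairly noisy, but they can at least be estimated.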

Obviously, you lose a bit of information with this summation strategy, because you can't take advantage of the variability within each tank. This is not ideal, but it's the best that can be done with edgeR for your current experimental design.
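For comparison, the voom and duplicateCorrelation route mentioned above keeps all 18 animals and so retains the within-tank variability. The sketch below uses limma rather than edgeR's GLM framework, again with simulated placeholder counts and made-up contrast names (GvsC, UvsC, UvsG). Running voom and duplicateCorrelation twice is a commonly used pattern, not a requirement.

```r
library(edgeR)
library(limma)

set.seed(1)

# Sample annotation copied from the table in the question
Group    <- factor(rep(c("C", "G", "U"), each = 6))
Tank     <- factor(rep(1:9, each = 2))
Survival <- c(2, 2, 5, 5, 6, 6, 7, 7, 4, 4, 5, 5, 8, 8, 4, 4, 2, 2)

# Placeholder counts (simulated); replace with the real count matrix
counts <- matrix(rnbinom(2000 * 18, mu = 100, size = 5), ncol = 18,
                 dimnames = list(paste0("gene", 1:2000),
                                 paste0("animal", 1:18)))
y <- calcNormFactors(DGEList(counts))

# Group means plus a survival covariate; Tank is NOT in the design
design <- model.matrix(~ 0 + Group + Survival)

# Estimate the within-tank correlation, then redo voom using it
v      <- voom(y, design)
corfit <- duplicateCorrelation(v, design, block = Tank)
v      <- voom(y, design, block = Tank,
               correlation = corfit$consensus.correlation)
corfit <- duplicateCorrelation(v, design, block = Tank)

fit <- lmFit(v, design, block = Tank,
             correlation = corfit$consensus.correlation)

# Pairwise group comparisons
contr <- makeContrasts(GvsC = GroupG - GroupC,
                       UvsC = GroupU - GroupC,
                       UvsG = GroupU - GroupG,
                       levels = design)
fit2 <- eBayes(contrasts.fit(fit, contr))
topTable(fit2, coef = "UvsG")
```

Here the tank effect is treated as a random effect through the consensus correlation, which is why Group remains estimable even though tanks are nested within groups.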