LIMMA : group specific effect of a treatment while controlling for individual effect
3
0
Entering edit mode
SPbeginner • 0
@spbeginner-15170
Last seen 5.7 years ago

Dear Bioconductor community,

I have timecourse RNA-seq data.
Animals received 4 differents treatments (group = GpA, GpB, GpCD, GpE).
There are 6 animals in each group.
For each animal,blood were collacted before any treatment (D0) and at 3 time points after treatment: D3, D7 and D28.
So, the experiment could be described as follows :

Animal    Time    Group Rep
M657    0    GpA 1
M657    3    GpA 1
M657    7    GpA 1
M657    28    GpA 1
M658    0    GpA 2
M658    3    GpA 2
M658    7    GpA 2
M658    28    GpA 2
.
.
.
.
D627    0    GpB 1
D627    3    GpB 1
D627    7    GpB 1
D627    28    GpB 1
D628    0    GpB 2
D628    3    GpB 2
D628    7    GpB 2
D628    28    GpB 2
.
.
.
.

With rep being a factor value for Animal-Group which distinguishes the individual nested within a group.


I want to determine :
1/ Which genes respond at either the Day3, Day7 or day28 in the different group
for exemple :
GpA : Day3vsDay0 Day7vsDay0, Day28vsDay0
do I have to analyse each Group separatly ? Or I can analyse them all together ?
Because each subject yields a value for each time point, can I take "Animal" effect into my model ? For this, I use block option in duplicateCorrelation().

Here is what I did :

TS <- paste(Design_clean$Group, Design_clean$Time, sep=".")
Design_clean_simple <-data.frame(Group=TS,Replicate=factor(Design_clean$Rep),row.names=row.names(Design_clean))
Blocked_Design_model<-model.matrix(~0+Group,data=Design_clean_simple)
dge <- DGEList(counts=Count)
dge.norm <- calcNormFactors(dge)

v.wts.bl<- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE)
corfit.wts.bl <- duplicateCorrelation(v.wts.bl,Blocked_Design_model,block=Design_clean_simple$Replicate)
v.wts.bl.corfit <- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE, correlation=corfit.wts.bl$consensus)
fit.wts.bl.corfit <- lmFit(v.wts.bl.corfit, Blocked_Design_model,block=Design_clean_simple$Replicate,correlation=corfit.wts.bl$consensus)

To identify DEG at Day3, Day7 and Day28 when compared to baseline (Day0) :

contrast.matrix.blocked<-makeContrasts(GpA.D3=GroupGpA.3-GroupGpA.0, GpA.D7=GroupGpA.7-GroupGpA.0, GpA.D28=GroupGpA.28-GroupGpA.0
 levels=colnames(Blocked_Design_model))

fit2.wts.bl.corfit <- contrasts.fit(fit.wts.bl.corfit, contrast.matrix.blocked)
fit2.wts.bl.corfit <- eBayes(fit2.wts.bl.corfit)

2/ Which genes respond differently over time in the different groups ie :
Which genes respond differently over time in the GpA relative to the GpB?

So, to answer this question, here is the contrast matrix, as an exemple
Which genes respond differently at Day3 in the GpA relative to the GpB :

contrast.matrix.blocked<-makeContrasts(GpAvsGpB.D3 = (GroupA.3-GroupA.0)-(GroupB.3-GroupB.0),levels=colnames(Blocked_Design_model))
it2.wts.bl.corfit <- contrasts.fit(fit.wts.bl.corfit, contrast.matrix.blocked)
fit2.wts.bl.corfit <- eBayes(fit2.wts.bl.corfit)

Am I doing right ?

 

 

limma-voom design matrix limma time-course limma design matrix limma • 2.4k views
ADD COMMENT
0
Entering edit mode

None of your tags are limma; the maintainers don't get notified if the post is not tagged with the package name.

ADD REPLY
0
Entering edit mode

Ok, thanks for your comment, I admit that's was not clear enought; I edit my post to make it more understandable.

So with your recommandations :

TS <- paste(Design_clean$Group, Design_clean$Time, sep=".")
Design_clean_simple <-data.frame(Group=TS,Animal=factor(Design_clean$Animal),row.names=row.names(Design_clean))

head(Design_clean_simple)
Animal Group
M657 GpA.0
M657 GpA.3
M657 GpA.7
M657 GpA.28
M658 GpA.0
M658 GpA.3
M658 GpA.7
M658 GpA.28
.
.
.
.
D627 GpB.0
D627 GpB.3
D627 GpB.7
D627 GpB.28
D628 GpB.0
D628 GpB.3
D628 GpB.7
D628 GpB.28
Blocked_Design_model<-model.matrix(~0+Group,data=Design_clean_simple)
dge <- DGEList(counts=Count)
dge.norm <- calcNormFactors(dge)

v.wts.bl<- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE)
corfit.wts.bl <- duplicateCorrelation(v.wts.bl,Blocked_Design_model,block=Design_clean_simple$Animal)
v.wts.bl.corfit <- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE, correlation=corfit.wts.bl$consensus)
fit.wts.bl.corfit <- lmFit(v.wts.bl.corfit, Blocked_Design_model,block=Design_clean_simple$Replicate,correlation=corfit.wts.bl$consensus)

contrast.matrix.blocked<-makeContrasts(GpA.D3=GroupGpA.3-GroupGpA.0, GpA.D7=GroupGpA.7-GroupGpA.0, GpA.D28=GroupGpA.28-GroupGpA.0
 levels=colnames(Blocked_Design_model))

fit2.wts.bl.corfit <- contrasts.fit(fit.wts.bl.corfit, contrast.matrix.blocked)
fit2.wts.bl.corfit <- eBayes(fit2.wts.bl.corfit,robust=TRUE)
ADD REPLY
0
Entering edit mode

I have another question :

contrast.matrix.blocked<-makeContrasts(GpAvsGpB.D3 = (GroupA.3-GroupA.0)-(GroupB.3-GroupB.0),levels=colnames(Blocked_Design_model))
it2.wts.bl.corfit <- contrasts.fit(fit.wts.bl.corfit, contrast.matrix.blocked)
fit2.wts.bl.corfit <- eBayes(fit2.wts.bl.corfit)

 

how to interpret the logFC in this example ? 

Am I right if i say : genes with logFC >0 are genes that are up-regulated in GroupA at day3 vs day0  but not affected (or down-regulated) in GroupeB at day3 vs day0 ?

ADD REPLY
0
Entering edit mode

No. Think of the possible options.

GroupA.3-GroupA.0 > 0 > GroupB.3-GroupB.0 => logFC > 0
GroupA.3-GroupA.0 > GroupB.3-GroupB.0 > 0 => logFC > 0
0 > GroupA.3-GroupA.0 > GroupB.3-GroupB.0 => logFC > 0

So it could be anything, as long as the 3 vs 0 log-fold change in group A is greater than that in group B.

ADD REPLY
1
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 9 hours ago
The city by the bay

This looks fine to me. A few suggestions:

  • Make sure you've filtered out low-abundance before running calcNormFactors, etc.
  • I usually like running eBayes with robust=TRUE to protect against outliers.
  • I would put Animal in the design matrix (and resolve any linear dependencies) rather than blocking on it with duplicateCorrelation. The latter is slightly anticonservative, and you don't need it as you are not comparing directly between different animals. Your comparisons are either within animals (GpA.D3) or between log-fold changes (GpAvsGpB.D3).
ADD COMMENT
1
Entering edit mode

Actually I don't think the code is right because OP has computed duplicateCorrelation for 'Rep' instead of for 'Animal'. If this was corrected, I think the use of duplicateCorrelation would be fine here.

ADD REPLY
0
Entering edit mode

'Rep' is the same as 'animal' : it's a factor value for Animal-Group which distinguishes the individual nested within a group.

Each Group will have animal numbered between 1-6. Is it better to let Animal ID in my model ?

ADD REPLY
0
Entering edit mode

Is my code correct like this : 

TS <- paste(Design_clean$Group, Design_clean$Time, sep=".")
Design_clean_simple <-data.frame(Group=TS,Animal=factor(Design_clean$Animal),row.names=row.names(Design_clean))
Blocked_Design_model<-model.matrix(~0+Group,data=Design_clean_simple)
dge <- DGEList(counts=Count)
dge.norm <- calcNormFactors(dge)

v.wts.bl<- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE)
corfit.wts.bl <- duplicateCorrelation(v.wts.bl,Blocked_Design_model,block=Design_clean_simple$Animal)
v.wts.bl.corfit <- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE, correlation=corfit.wts.bl$consensus)
fit.wts.bl.corfit <- lmFit(v.wts.bl.corfit, Blocked_Design_model,block=Design_clean_simple$Replicate,correlation=corfit.wts.bl$consensus)

contrast.matrix.blocked<-makeContrasts(GpA.D3=GroupGpA.3-GroupGpA.0, GpA.D7=GroupGpA.7-GroupGpA.0, GpA.D28=GroupGpA.28-GroupGpA.0
 levels=colnames(Blocked_Design_model))

fit2.wts.bl.corfit <- contrasts.fit(fit.wts.bl.corfit, contrast.matrix.blocked)
fit2.wts.bl.corfit <- eBayes(fit2.wts.bl.corfit,robust=TRUE)
ADD REPLY
0
Entering edit mode

'Rep' is not the same as 'Animal'. In the table that you show, 'Animal' takes on 12 values while 'Rep' takes on only 6 values. The vector that you input to limma as the block to duplicateCorrelation must take on 12 values.

It is hard to check your code with certainty, considering that all your code works from a data frame called 'Design_clean' that has different columns from the table that you actually show us. The table you show has a column called 'Treatment', but no 'Group'. The code works from 'Group but not 'Treatment'. Why show us one table but work from another?

And, to be pedantic, the table you show is not a "design matrix". In limma, it is called the "targets frame".

 

ADD REPLY
0
Entering edit mode
@gordon-smyth
Last seen 3 hours ago
WEHI, Melbourne, Australia

Your new code (posted 9 March 2018) looks correct, except that I would insert the line

keep <- filterByExpr(dge, Blocked_Design_model)
dge <- dge[keep,,keep.lib.sizes=FALSE]

before calcNormFactors. That was also Aaron's no. 1 suggestion.

The function filterByExpr was added to edgeR recently, on 26 Dec 2017, so you may need to re-install edgeR to get it.

With the approach taken, your experiment is just a one-way layout, so forming any contrasts between the groups or times is very straightforward, as I think you already appreciate.

Of course, as always, it is good practice to examine plots at each stage of your analysis, and check that the consensus correlation is positive, rather than just running all the code through as a block.

ADD COMMENT
0
Entering edit mode

Could you specify what version of edgeR the function filterByExpr can be found in? I am working in version 3.20.9 and it isn't found. Reinstalling with biocLite doesn't seem to resolve the issue. 

ADD REPLY
0
Entering edit mode

3.20.9 is ancient, we're on 3.34.9 now. Make sure you're running the latest R (3.4.*) and read:

https://www.bioconductor.org/install/#troubleshoot-bioconductor-packages

ADD REPLY

Login before adding your answer.

Traffic: 916 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6