Hi All,
I have 12 human donors, and each donor has one treated (inf) and one untreated (Un) sample. The donors have been classified into 3 statuses (H, M and L). The final goal is to find the differences among the 3 statuses. The sample list is:
FileName | Donor | Treatment | Status
Donor1_Inf | Donor1 | inf | H
Donor1_Un | Donor1 | Un | H
Donor2_Inf | Donor2 | inf | L
Donor2_Un | Donor2 | Un | L
Donor3_Inf | Donor3 | inf | H
Donor3_Un | Donor3 | Un | H
Donor4_Inf | Donor4 | inf | L
Donor4_Un | Donor4 | Un | L
Donor5_Inf | Donor5 | inf | L
Donor5_Un | Donor5 | Un | L
Donor6_Inf | Donor6 | inf | L
Donor6_Un | Donor6 | Un | L
Donor7_Inf | Donor7 | inf | M
Donor7_Un | Donor7 | Un | M
Donor8_Inf | Donor8 | inf | H
Donor8_Un | Donor8 | Un | H
Donor9_Inf | Donor9 | inf | H
Donor9_Un | Donor9 | Un | H
Donor10_Inf | Donor10 | inf | M
Donor10_Un | Donor10 | Un | M
Donor11_Inf | Donor11 | inf | M
Donor11_Un | Donor11 | Un | M
Donor12_Inf | Donor12 | inf | M
Donor12_Un | Donor12 | Un | M
Can I set up limma to perform the analysis like this?
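# One group per Status/Treatment combination, e.g. H.inf, H.Un
Treat1 <- factor(paste(target_hr96_new$Status, target_hr96_new$Treatment, sep = "."))
design1 <- model.matrix(~ 0 + Treat1)
colnames(design1) <- levels(Treat1)
# Treat the two samples from each donor as correlated
corfit1 <- duplicateCorrelation(selNormEset_hr96, design1, block = YWJ_NK_hr96_lumi_new$Donor)
# consensus.correlation is the full element name ($consensus relies on partial matching)
fit1 <- lmFit(selNormEset_hr96, design1, block = YWJ_NK_hr96_lumi_new$Donor, correlation = corfit1$consensus.correlation)
# Differences in the treatment response (inf - Un) between statuses
cm1 <- makeContrasts(demo1 = (H.inf - H.Un) - (L.inf - L.Un),
                     demo2 = (H.inf - H.Un) - (M.inf - M.Un),
                     demo3 = (M.inf - M.Un) - (L.inf - L.Un),
                     levels = design1)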
I don't feel confident about the contrasts above. Can I subtract Un (control) from inf to reduce the variation? If I look directly at H.inf - L.inf, I worry that the background variation among the donors will cause false positives.
Can anyone give me suggestions or comments? Thanks.
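To make the question concrete, here is a small base-R sketch of what a contrast such as demo1 estimates. It uses one simulated gene laid out like the sample table, not the poster's data, and every object name in it is hypothetical. It also leaves out duplicateCorrelation, so it only speaks to what the contrast estimates, not to its standard error. Under those assumptions, the ordinary least-squares estimate of (H.inf - H.Un) - (L.inf - L.Un) equals the difference between the mean within-donor change (inf minus Un) in the H donors and in the L donors, so each donor's baseline level cancels out.

```r
set.seed(42)

# Layout copied from the sample table: 12 donors, two samples each (inf, Un)
status    <- rep(c("H", "L", "H", "L", "L", "L", "M", "H", "H", "M", "M", "M"), each = 2)
treatment <- rep(c("inf", "Un"), times = 12)

# One simulated gene (assumed values): a large donor-specific baseline,
# a status-specific treatment effect, and a little noise
baseline <- rep(rnorm(12, sd = 2), each = 2)
effect   <- c(H = 1.5, M = 0.5, L = 0)
y <- baseline + ifelse(treatment == "inf", unname(effect[status]), 0) + rnorm(24, sd = 0.3)

# Cell-means design, as in the post
group  <- factor(paste(status, treatment, sep = "."))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Least-squares cell means and the interaction contrast for H versus L
cell_means  <- lm.fit(design, y)$coefficients
contrast_HL <- unname((cell_means["H.inf"] - cell_means["H.Un"]) -
                      (cell_means["L.inf"] - cell_means["L.Un"]))

# The same quantity from within-donor differences (inf minus Un)
is_inf       <- treatment == "inf"
within_diff  <- y[is_inf] - y[!is_inf]   # one value per donor, in donor order
donor_status <- status[is_inf]
paired_HL <- mean(within_diff[donor_status == "H"]) - mean(within_diff[donor_status == "L"])

all.equal(contrast_HL, paired_HL)
```

By comparison, a direct H.inf - L.inf contrast would not cancel the donor baselines, which is the concern raised in the question.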
Thanks again, Aaron. One more question: I may not really understand the duplicateCorrelation function. The "consensus" value in the duplicateCorrelation result is the correlation, so does a higher value mean the samples are more closely related? Is there a threshold for deciding how close these samples are?
The correlation represents the dependencies between samples from the same donor. (The "consensus" terminology comes from the fact that the correlation estimate is stabilized by sharing information across many genes.) It quantifies the impact of a donor-specific effect that makes samples from the same donor more similar than expected under independence. Use it as it is; you're not meant to apply a cut-off of any kind to this value.
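As an illustration of that interpretation, the following sketch simulates data with a known donor effect and checks what duplicateCorrelation reports. The data, dimensions, and object names are all assumptions made for the example. Each simulated gene gets a donor effect with standard deviation 1 plus independent noise with standard deviation 1, so the true correlation between two samples from the same donor is 1 / (1 + 1) = 0.5, and the consensus estimate should land near that value.

```r
library(limma)  # duplicateCorrelation() also needs the statmod package to be installed

set.seed(1)
n_genes  <- 500
n_donors <- 12

# Two samples per donor, as in the thread (simulated layout)
donor     <- factor(rep(seq_len(n_donors), each = 2))
treatment <- factor(rep(c("inf", "Un"), times = n_donors), levels = c("Un", "inf"))
design    <- model.matrix(~ treatment)

# Donor effect shared by both samples of a donor, plus independent noise
donor_effect <- matrix(rnorm(n_genes * n_donors, sd = 1), nrow = n_genes)
y <- donor_effect[, rep(seq_len(n_donors), each = 2)] +
  matrix(rnorm(n_genes * 2 * n_donors, sd = 1), nrow = n_genes)

# Consensus within-donor correlation, pooled across all genes
corfit <- duplicateCorrelation(y, design, block = donor)
corfit$consensus.correlation
```

A larger donor effect relative to the noise would push this value towards 1, and a negligible donor effect would push it towards 0. Either way the value is passed to lmFit unchanged.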
Also, when I apply this contrast in my analysis, the adjusted p-values of most genes are > 0.1, and almost all of them are around 0.9. Is that caused by the small sample size or by the inherent differences being slight? I find it hard to conclude either way.
The reasons for loss of power are varied. The changes may be too small; your samples may be too variable; and/or your sample size may be too small. Many of these things are determined by the experimental design, beyond the ability of the analyst to change. The only suggestion I can make is to check that you don't have any outlier samples that might be inflating the variance, and if there are, consider using arrayWeights to downweight the offending samples.
When I did the QC/QA, I already removed two paired outlier samples. I hesitate to use arrayWeights in case the outlier "black sheep" samples cause false-positive DE genes.
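For context on the arrayWeights suggestion, here is a minimal sketch on simulated data. It is not the poster's data: the two-group layout, the choice of sample 5 as the noisy one, and all object names are assumptions. It shows that arrayWeights estimates one relative quality weight per sample from the residuals of the linear model, and that these weights are passed to lmFit so that a noisy sample contributes less to the fit rather than being dropped.

```r
library(limma)

set.seed(2)
n_genes   <- 500
n_samples <- 24

# Hypothetical two-group layout with no true differential expression
treatment <- factor(rep(c("inf", "Un"), times = n_samples / 2), levels = c("Un", "inf"))
design    <- model.matrix(~ treatment)
y <- matrix(rnorm(n_genes * n_samples, sd = 1), nrow = n_genes)

# Make one sample much noisier than the rest
noisy <- 5
y[, noisy] <- y[, noisy] + rnorm(n_genes, sd = 3)

# One weight per sample; noisier samples receive lower weights
aw <- arrayWeights(y, design)
round(aw, 2)

# Use the sample weights in the linear model fit
fit <- lmFit(y, design, weights = aw)
fit <- eBayes(fit)
```

In a blocked analysis like the one in this thread, the same weights can also be supplied through the weights argument of duplicateCorrelation.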