Entering edit mode
Guest User
★
13k
@guest-user-4897
Last seen 10.3 years ago
Hello Bioconductor mailing list,
I'm using edgeR to analyze RNA-seq and RIP-seq data. I'm using the
moderated gene-wise dispersion method where I choose a prior.n based
upon the number of samples I have and how much shrinkage I desire to
reduce variance in my data. I've followed the R user guide and a guide
a professor made at my university to achieve gene-wise (tag-wise)
dispersion as opposed to common dispersion. I can see that the tag-
wise dispersion changes for each gene when I change the prior.n value
(see below). However I notice that that p-values and logFC values
don't change when I change the prior.n (see below).
So I'm wondering if I'm doing something wrong or if the p-values
and/or logFC are not supposed to change? Or is it just some data is
not effected at all by prior.n?
The following is an example of my command line code analyzing some
RNA-seq data. I illustrated an example where I used a prior.n = 1 or 4
to illustrate how I don't see a change in the final results. Is this
correct?
Thanks
J
normFact = calcNormFactors(TotalReads)
treatments = substr(colnames(TotalReads), 1, 3)
d = DGEList(counts = TotalReads, group = treatments, genes =
rownames(TotalReads), lib.size = colSums(TotalReads) * normFact)
d = estimateCommonDisp(d)
d$common.dispersion
[1] 0.1669584
#comparing prior.n 1 v. 4
d1 = estimateTagwiseDisp(d, prior.n = 1)
d1
$tagwise.dispersion
[1] 0.1607020 0.3674590 0.1058665 0.1080470 0.1545314
20684 more elements ???
d4 = estimateTagwiseDisp(d, prior.n = 4)
d4
$tagwise.dispersion
[1] 0.1463446 0.4059551 0.1278219 0.1213169 0.2150826
20684 more elements ...
#prior.n = 1
edgeHSMvHSFd1 = exactTest(d1, pair = c("HSM", "HSF"), (common.disp =
FALSE ))
head(edgeHSMvHSFd1)
An object of class "DGEExact"
$table
logFC logCPM PValue
ENSG00000000003 -0.07130289 6.290327 6.073496e-01
ENSG00000000005 -2.28060195 -4.439294 1.000000e+00
ENSG00000000419 0.25083103 4.119127 8.780027e-02
ENSG00000000457 -0.19453270 4.557886 8.468506e-02
ENSG00000000460 -0.05015655 1.770160 9.075606e-01
ENSG00000000938 0.92221495 4.526491 1.227442e-11
#prior.n = 4
edgeHSMvHSFd4 = exactTest(d4, pair = c("HSM", "HSF"), (common.disp =
FALSE ))
head(edgeHSMvHSFd4)
An object of class "DGEExact"
$table
logFC logCPM PValue
ENSG00000000003 -0.07130289 6.290327 6.073496e-01
ENSG00000000005 -2.28060195 -4.439294 1.000000e+00
ENSG00000000419 0.25083103 4.119127 8.780027e-02
ENSG00000000457 -0.19453270 4.557886 8.468506e-02
ENSG00000000460 -0.05015655 1.770160 9.075606e-01
ENSG00000000938 0.92221495 4.526491 1.227442e-11
sum(edgeHSMvHSFd1$table[,1]>0)
[1] 12770
sum(edgeHSMvHSFd4$table[,1]>0)
[1] 12770
#same results for DE genes for both
sum(edgeHSMvHSFd1$table[,1]>0 & edgeHSMvHSFd1$table[,3]<0.01)
[1] 1763
sum(edgeHSMvHSFd4$table[,1]>0 & edgeHSMvHSFd4$table[,3]<0.01)
[1] 1763
-- output of sessionInfo():
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] edgeR_2.6.2 limma_3.12.0
loaded via a namespace (and not attached):
[1] tools_2.15.0
--
Sent via the guest posting facility at bioconductor.org.
Hi Ming,
This has nothing to do with voom. It is the same message that you will always receive from limma when you try to fit an over-parametrized model. (By comparison, edgeR would have simply told you the design matrix was not of full rank and stopped.)
You can only used a paired analysis (+Patient etc) when the treatments to be compared are applied to the same patient. You can't compare different tumour types (KRas and KRasP) which affect different patients.
Gordon