Entering edit mode
Guest User
★
13k
@guest-user-4897
Last seen 10.4 years ago
Hi,
I am using the package edgeR to determine differentially expressed
genes in RNA-seq data. I have 8 samples, corresponding to 3 patients
(2 conditions per patient and 2 patients duplicated). I performed the
contrast of these two conditions, following the user guide (section 11
Case study: Oral carcinomas vs matched normal tissue). When analyzing
some border line differentially expressed genes (FDR ~ 0.02) I found
that in some cases, the read counts was really low, e.g. only one
sample with 2 reads, and the others (7) with 0 counts.
> countsTable[rownames(countsTable)=="ENSG00000207696",grep("CT",colna
mes(countsTable))]
61_CT.poli 61_CT.tot 67_CT.poli 67_CT.tot 70_CT.poli
70_CT.tot 61m2_CT.tot 67m2_CT.tot
ENSG00000207696 0 0 0 0 2
0 0 0
The design matrix used was:
> design
patientsCT61 patientsCT67 patientsCT70 as.factor(1 -
(as.numeric(groupsCT) - 1))1
1 1 0 0
0
2 0 1 0
0
3 0 0 1
0
4 1 0 0
0
5 0 1 0
0
6 1 0 0
1
7 0 1 0
1
8 0 0 1
1
The result for the gene in question was:
> tab.DGEensCT[rownames(tab.DGEensCT)=="ENSG00000207696",]
table.logConc table.logFC table.PValue table.FDR
adjust.method comparison
ENSG00000207696 -18.34 7.726 0.005222 0.03941
BH as.factor(1 - (as.numeric(groupsCT) - 1))1
When investigating the result of the gene, I found that the glm-fitted
values looked like this:
> format((glmfit.DGEensCT$fitted.values)[rownames(glmfit.DGEensCT$fitt
ed.values)=="ENSG00000207696",],scientific=FALSE,digits=4)
61_CT.tot 67_CT.tot 70_CT.tot 61m2_CT.tot 67m2_CT.tot
61_CT.poli 67_CT.poli 70_CT.poli
"0.00008988" "0.00004048" "0.06229493" "0.00026943" "0.00014959"
"0.03479503" "0.03522840" "1.84247667"
Since I was not able to find the details used in edgeR to calculate
the model easily, I was wondering if from this glm-fitted values is it
reasonable to obtain such a low FDR?
I used the common dispersion for the variance estimate:
DGEensCT.CR<-estimateCRDisp(DGEensCT,design)
glmfit.DGEensCT <- glmFit(DGEensCT, design,dispersion =
DGEensCT.CR$CR.common.dispersion)
lrt.DGEensCT <- glmLRT(DGEensCT, glmfit.DGEensCT)
Could someone help me with this?
Thank you very much in advanced!
Cheers,
Luc??a
-- output of sessionInfo():
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] biomaRt_2.6.0 edgeR_2.0.5
loaded via a namespace (and not attached):
[1] RCurl_1.4-3 XML_3.2-0 limma_3.6.9
--
Sent via the guest posting facility at bioconductor.org.