Dear all,
I am doing a multifactorial DGE analysis with edgeR, but I have a problem:
for some loci I obtain a very large logFC value, perhaps because some of
the counts in the comparison are zero.
Is this possible, or have I made an error?
Thanks for your help; I look forward to your response.
Regards,
This is the script that I am using:
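library(edgeR)
# Read the count files listed in counts.all; label samples from its third column
d.all <- readDGE(counts.all, skip = 5, comment.char = "!")
colnames(d.all$counts) <- counts.all[, 3]
# Keep loci with more than 1 count per million in at least two samples
cpmd.all <- cpm(d.all)
d.all <- d.all[rowSums(cpmd.all > 1) >= 2, ]
d.all <- calcNormFactors(d.all)
# Design: time effect plus time-by-group interaction terms
designall <- model.matrix(~ time + time:d.all$samples$group)
d.all <- estimateGLMCommonDisp(d.all, designall)
glmfit.all <- glmFit(d.all, designall, dispersion = d.all$common.dispersion)
# Likelihood ratio test on the fourth coefficient (call form used with edgeR 2.5.5)
lrt.all <- glmLRT(d.all, glmfit.all, coef = 4)
topTags(lrt.all, sort.by = "logFC")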
logConc logFC LR P.Value FDR
AT1G22220 -12.30768 -144269488 62.70746 2.398056e-15 2.995171e-12
AT1G08430 -13.45102 -144269487 37.11160 1.115581e-09 2.960891e-07
AT4G11180 -13.84678 144269487 38.22337 6.309211e-10 1.835116e-07
AT5G41880 -13.36909 -144269486 33.88640 5.842556e-09 1.255052e-06
AT1G04610 -13.41465 -144269486 33.44700 7.323248e-09 1.480900e-06
AT2G24850 -14.18267 -144269486 25.85859 3.673666e-07 3.979742e-05
AT5G44635 -13.98784 -144269486 24.61663 6.994557e-07 6.736796e-05
AT1G35690 -13.60305 144269486 27.92875 1.258656e-07 1.590776e-05
AT1G67328 -13.27975 -144269486 24.43246 7.696091e-07 7.262716e-05
AT1G51460 -13.77582 -144269486 24.12274 9.038647e-07 8.123359e-05
sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] edgeR_2.5.5 limma_3.10.3
loaded via a namespace (and not attached):
[1] tools_2.14.1
Sandra Isabel Gonzalez Morales
PhD Graduate Student
Centro de Investigacion y de Estudios Avanzados del I.P.N
Laboratorio Nacional de Genomica
para la Biodiversidad
Km. 9.6 Libramiento Norte Carretera Irapuato-Leon
Irapuato, Gto. México
Dear Sandra,
It doesn't seem surprising to get large fold changes if the counts are
all zero in one of the groups (have you checked whether this is so?).
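One way to run the check suggested above is to total the counts within each group and look for loci whose total is zero in at least one group. The following is a minimal base-R sketch on a made-up count matrix: the gene names, sample names, group labels and counts are all hypothetical. In the real analysis, counts and group would presumably be taken from d.all$counts and d.all$samples$group (or from the time-by-group combinations that the design actually compares).

```r
# Toy count matrix: rows are genes, columns are samples (hypothetical values).
counts <- matrix(
  c(25, 31,  0,  0,
    12, 15, 14, 11,
     0,  0, 40, 52),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("s", 1:4))
)
group <- factor(c("ctrl", "ctrl", "trt", "trt"))

# Total counts per gene within each group.
# rowsum() aggregates rows, so transpose before and after.
group_totals <- t(rowsum(t(counts), group))

# TRUE for genes whose counts are all zero in at least one group.
zero_in_a_group <- rowSums(group_totals == 0) > 0

# Show the per-group totals for the flagged genes only.
print(group_totals[zero_in_a_group, , drop = FALSE])
```

If the loci at the top of the topTags output turn up among the flagged rows, the extreme logFC values are most likely a consequence of the zero counts rather than of a mistake in the script.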
However, you are using an old version of Bioconductor, and we only
answer questions on the current version. The current version of edgeR
automatically moderates the logFC towards zero to avoid very large
values, even when all the counts are zero in one of the groups.
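To illustrate why moderation keeps the logFC finite, here is a toy calculation in base R. It only sketches the general idea of adding a small prior count to both groups before taking the log ratio. The prior value of 0.5, the variable names and the assumption of equal library sizes are illustrative choices, not edgeR's actual internals or defaults.

```r
# Hypothetical mean counts for one gene in two groups
# (equal library sizes assumed, so counts can be compared directly).
mean_a <- 0
mean_b <- 48

# Unmoderated log2 fold change: division by zero gives an infinite value.
raw_lfc <- log2(mean_b / mean_a)

# Moderated log2 fold change: a small prior count (assumed value) is added
# to both groups, which pulls the estimate towards zero and keeps it finite.
prior <- 0.5
moderated_lfc <- log2((mean_b + prior) / (mean_a + prior))

print(raw_lfc)
print(moderated_lfc)
```

A larger prior shrinks the estimate more strongly, so the moderated value still indicates a big difference between the groups but is no longer dominated by the zero.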
Best wishes
Gordon
> Date: Wed, 23 May 2012 17:12:48 -0500 (CDT)
> From: sgonzalez at ira.cinvestav.mx
> To: bioconductor at r-project.org
> Subject: [BioC] EdgeR: problem with large logFC value