Hi everyone:
I have a problem with the result of DESeq2 . When I dealt with case-control research, I used following commands . case=8 , control num=8. I just want the DEGs in case and control.
> library("DESeq2")
> countData <- read.table('featureCounts.txt', header=TRUE, row.names=1)
> countData <- countData[ ,7:ncol(countData)];
> countData <- as.matrix(countData);
> condition <- factor(c(rep('C0', 8),rep('C1', 8)));
> (colData <- data.frame(row.names=colnames(countData), condition));
condition
SRR2422919 C0
SRR2422920 C0
SRR2422921 C0
SRR2422922 C0
SRR2422923 C0
SRR2422924 C0
SRR2422925 C0
SRR2422926 C0
SRR2422927 C1
SRR2422928 C1
SRR2422929 C1
SRR2422930 C1
SRR2422931 C1
SRR2422932 C1
SRR2422933 C1
SRR2422934 C1
> dds <- DESeqDataSetFromMatrix(countData=countData, colData=colData, design=~ condition);
> dds <- dds[ rowSums(counts(dds)) >= 2, ];dds$condition <- relevel(dds$condition, ref='C1');
> dds <- DESeq(dds);
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
-- replacing outliers and refitting for 452 genes
-- DESeq argument 'minReplicatesForReplace' = 7
-- original counts are preserved in counts(dds)
estimating dispersions
fitting model and testing
> res <- results(dds,contrast=c("condition","C0","C1"), alpha=0.050000);
> (summary(res));
I got a error result ,but I don't know why. And in the result of res ,all the padj is same value. Is there a limit number of replicate? Or there is something wrong in my commands or input data?
out of 50569 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up) : 0, 0%
LFC < 0 (down) : 0, 0%
outliers [1] : 0, 0%
low counts [2] : 6, 0.012%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
NULL
Thank you for any advice you can provide!
Best wishes!
Wei.
my sessionInfo()
> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
[2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
[3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_People's Republic of China.936
attached base packages:
[1] parallel stats4 stats graphics grDevices
[6] utils datasets methods base
other attached packages:
[1] DESeq2_1.18.1 SummarizedExperiment_1.8.1
[3] DelayedArray_0.4.1 matrixStats_0.53.1
[5] Biobase_2.38.0 GenomicRanges_1.30.3
[7] GenomeInfoDb_1.14.0 IRanges_2.12.0
[9] S4Vectors_0.16.0 BiocGenerics_0.24.0
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-2 checkmate_1.8.5
[3] cluster_2.0.7-1 rstudioapi_0.7
[5] magrittr_1.5 acepack_1.4.1
[7] gtable_0.2.0 zlibbioc_1.24.0
[9] memoise_1.1.0 data.table_1.11.4
[11] RCurl_1.95-4.10 base64enc_0.1-3
[13] XVector_0.18.0 pillar_1.2.3
[15] htmltools_0.3.6 stringr_1.3.1
[17] genefilter_1.60.0 splines_3.4.4
[19] Formula_1.2-3 lattice_0.20-35
[21] bit_1.1-14 survival_2.42-3
[23] annotate_1.56.2 htmlwidgets_1.2
[25] plyr_1.8.4 locfit_1.5-9.1
[27] knitr_1.20 gridExtra_2.3
[29] Matrix_1.2-14 GenomeInfoDbData_1.0.0
[31] digest_0.6.15 colorspace_1.3-2
[33] AnnotationDbi_1.40.0 geneplotter_1.56.0
[35] stringi_1.1.7 RSQLite_2.1.1
[37] lazyeval_0.2.1 Hmisc_4.1-1
[39] tibble_1.4.2 compiler_3.4.4
[41] bit64_0.9-7 rpart_4.1-13
[43] backports_1.1.2 htmlTable_1.12
[45] BiocParallel_1.12.0 xtable_1.8-2
[47] DBI_1.0.0 munsell_0.4.3
[49] Rcpp_0.12.17 XML_3.98-1.11
[51] blob_1.1.1 ggplot2_2.2.1
[53] latticeExtra_0.6-28 tools_3.4.4
[55] foreign_0.8-70 bitops_1.0-6
[57] nnet_7.3-12 scales_0.5.0
[59] rlang_0.2.1 grid_3.4.4
Thanks for your reply!
I plot my data ,and here is my code and picture.
vsd <- vst(dds, blind=FALSE);plotPCA(vsd,intgroup=c("condition"))
I still confuse about the result . If the padj in res is the same value ,it means there is no DEGs in my data. Am I right?
Here is part of my res, and padj is almost the same .
baseMean log2FoldChange lfcSE stat pvalue padj
ENSG00000223972.5_2 6.5451997638089 -0.0189375878914287 0.729114828678411 -0.0259733956114359 0.979278558516615 0.999831031256582
ENSG00000227232.5_2 120.847064381123 -0.186905549119038 0.329696931365903 -0.566901088052916 0.570781384906547 0.999831031256582
ENSG00000243485.5_3 0.808323540891408 -0.074012933209564 1.37926898289051 -0.0536609857306127 0.957205266868574 0.999831031256582
ENSG00000240361.2_3 0.239939440001618 1.14714451288691 3.00810143961501 0.381351671781961 0.702942316641613 0.999831031256582
ENSG00000238009.6_4 7.07309580387112 0.304727068397541 0.527410209655461 0.57777999518934 0.563412662998333 0.999831031256582
ENSG00000233750.3_2 8.51424947357038 0.481305868904192 0.645876990754807 0.74519742271932 0.456152380689177 0.999831031256582
ENSG00000237683.5 44.6016455674963 -0.482214380258619 0.608012199829965 -0.793099843051625 0.42771965427787 0.999831031256582
ENSG00000268903.1_3 0.210424498191674 -0.678764339221412 2.60481045070771 -0.260581087210011 0.794415576134089 0.999831031256582
ENSG00000239906.1_3 0.992353998475917 0.24558917118856 1.09000383248799 0.225310374026841 0.821737829837437 0.999831031256582
ENSG00000241860.6_3 40.9000005523611 0.0323096767435896 0.357865331760903 0.0902844558443465 0.928061172140727 0.999831031256582
It looks like the differences between groups are not larger in general than the differences among biological replicates, which gives you a visual sense of the lack of small p-values.
The padj being the same is a natural consequence of the method of Benjamini and Hochberg.