Hello,
I am working with a dataset composed of three treatment groups. I have paired samples that consist of a baseline sample and the same sample which has been enriched for active transcription. The batch variable accounts for batches during sample preparation. The experimental design looks like this:
SampleName | Treatment | Batch | Pair.nested |
RT1_Input | RT | 2 | 1 |
RT1_IP | RT | 2 | 1 |
RT2_Input | RT | 2 | 2 |
RT2_IP | RT | 2 | 2 |
RT3_Input | RT | 1 | 3 |
RT3_IP | RT | 1 | 3 |
RT4_Input | RT | 1 | 4 |
RT4_IP | RT | 1 | 4 |
C1_Input | C | 2 | 1 |
C1_IP | C | 2 | 1 |
C2_Input | C | 2 | 2 |
C2_IP | C | 2 | 2 |
C3_Input | C | 1 | 3 |
C3_IP | C | 1 | 3 |
C4_Input | C | 1 | 4 |
C4_IP | C | 1 | 4 |
W1_Input | W | 2 | 1 |
W1_IP | W | 2 | 1 |
W2_Input | W | 2 | 2 |
W2_IP | W | 2 | 2 |
W3_Input | W | 1 | 3 |
W3_IP | W | 1 | 3 |
W4_Input | W | 1 | 4 |
W4_IP | W | 1 | 4 |
|
I need to compare C vs RT, W vs RT, and C vs W for differential expression due to treatment while controlling for the enrichment and the batch effects. I've used design = ~Pair.nested + Batch + Treatment. Does this design properly control for the IP/Input enrichment? Should I approach this differently? How should I specify the contrasts?
R version 3.1.1 (2014-07-10)
DESeq2 1.4.5
## Platform: i386-w64-mingw32/i386 (32-bit)
##
## locale:
## [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] RColorBrewer_1.1-2 gplots_2.16.0 DESeq2_1.4.5
## [4] RcppArmadillo_0.4.650.1.1 Rcpp_0.11.4 GenomicRanges_1.16.4
## [7] GenomeInfoDb_1.0.2 IRanges_1.22.10 BiocGenerics_0.10.0
## [10] genefilter_1.44.0 lattice_0.20-29 gtools_3.4.1
##
## loaded via a namespace (and not attached):
## [1] annotate_1.40.1 AnnotationDbi_1.26.1 Biobase_2.22.0 BiocStyle_1.0.0
## [5] bitops_1.0-6 caTools_1.17.1 DBI_0.3.1 digest_0.6.8
## [9] evaluate_0.5.5 formatR_1.0 gdata_2.13.3 geneplotter_1.40.0
## [13] grid_3.1.1 htmltools_0.2.6 KernSmooth_2.23-14 knitr_1.9
## [17] locfit_1.5-9.1 rmarkdown_0.5.1 RSQLite_1.0.0 splines_3.1.1
## [21] stats4_3.1.1 stringr_0.6.2 survival_2.37-7 tools_3.1.1
## [25] XML_3.98-1.1 xtable_1.7-4 XVector_0.2.0
"differential expression due to treatment while controlling for the enrichment and the batch effects"
Just to check: you mean that you want to find genes which have differential expression due to treatment in the enrichment samples relative to baseline?
I am looking for differential expression in the enriched samples due to treatment after normalizing (adjusting?) for the enrichment. I have tried using FC ratios from DESeq2 run for each condition individually using enriched vs unenriched (IP/Input) for the contrast. The resulting fold change would be the normalized enrichment ratio. Then manually, divide the treatment ratios, C by RT etc. for an expression ratio for each contrast. I'm using DESeq2 with the design above and finding very few genes meeting padj<0.1 for any contrast. I have also run a standard gene expression analysis on the input (unenriched) samples to obtain a list of expressed genes as one would normally see.
Many thanks for any assistance you can offer,
Susan