Question

Interaction terms in DESeq2 contrasts

1

Entering edit mode

dwzhang.wisc ▴ 20

@dwzhangwisc-10040

Last seen 9.0 years ago

Hello,

I have a question about setting up interaction terms in DESeq2 following the recent update. My experiment consists of 42 total samples, from 14 donors, across three different cell populations, and two different disease states (healthy control vs RA). My sample block is as follows:

 patient population disease
      x1         CM      HC
      x1         EM      HC
      x1         Na      HC
      x2         CM      HC
      x2         EM      HC
      x2         Na      HC

.....

I want to understand the differences between disease and healthy samples. My design is:

~disease + disease:population + disease:patient

Previously, I used the following contrast but received the following error:

> res<-results(dds, contrast=list(c("diseaseRA","diseaseRA.populationCM"),c("diseaseHC","diseaseHC.populationCM")))
Error in checkContrast(contrast, resNames) :
  all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)'

I saw in a previous thread that I can set up a numerical contrast as well. So I tried this approach:

n=model.matrix(~disease+disease:population+disease:patient,samples)
ctrst = colMeans(n[samples$disease == "RA" & samples$population == "CM",]) -
  colMeans(n[samples$disease == "HC" & samples$population == "CM",])
res<-results(dds, contrast=ctrst)

This new model worked, but the results were slightly off from my run in a previous DESeq2 version:

Current run:

baseMean log2FoldChange     lfcSE      stat       pvalue        padj
                   <numeric>      <numeric> <numeric> <numeric>    <numeric>   <numeric>
chr1_713980_714315  7.711915      0.3961093 0.4481465 0.8838834 3.767592e-01 0.585865655
chr1_840122_840415 17.376991      0.4691477 0.2750717 1.7055470 8.809246e-02 0.241106534
chr1_856580_856838  8.218786      1.0523656 0.4238437 2.4829094 1.303142e-02 0.072174032

Previous run:

 baseMean log2FoldChange     lfcSE      stat       pvalue       padj
                   <numeric>      <numeric> <numeric> <numeric>    <numeric>  <numeric>
chr1_713980_714315  7.711915      0.3549742 0.4386707 0.8092043 4.183976e-01 0.62903224
chr1_840122_840415 17.376991      0.4412814 0.2698125 1.6355112 1.019419e-01 0.27137040
chr1_856580_856838  8.218786      1.0165040 0.4156541 2.4455524 1.446304e-02 0.08100218

Am I doing the contrast correctly? If so, why are the values slightly different from my previous run? Is it because DESeq2 now turns off log2FC shrinkage for all terms if an interaction term is present? Thank you in advanced for the help. My session info is attached below.

Best,

Dave

> sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] grid      parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] cluster_2.0.3              BiocParallel_1.4.3         VennDiagram_1.6.16         futile.logger_1.4.1        ggplot2_2.1.0              plyr_1.8.3                
 [7] RColorBrewer_1.1-2         gplots_2.17.0              DESeq2_1.10.1              RcppArmadillo_0.6.600.4.0  Rcpp_0.12.3                SummarizedExperiment_1.0.2
[13] Biobase_2.30.0             GenomicRanges_1.22.4       GenomeInfoDb_1.6.3         IRanges_2.4.8              S4Vectors_0.8.11           BiocGenerics_0.16.1       

loaded via a namespace (and not attached):
 [1] XVector_0.10.0       bitops_1.0-6         futile.options_1.0.0 tools_3.2.4          zlibbioc_1.16.0      rpart_4.1-10         RSQLite_1.0.0        annotate_1.48.0     
 [9] gtable_0.2.0         lattice_0.20-33      DBI_0.3.1            gridExtra_2.2.1      genefilter_1.52.1    caTools_1.17.1       gtools_3.5.0         locfit_1.5-9.1      
[17] nnet_7.3-12          AnnotationDbi_1.32.3 XML_3.98-1.4         survival_2.38-3      foreign_0.8-66       gdata_2.17.0         latticeExtra_0.6-28  Formula_1.2-1       
[25] geneplotter_1.48.0   lambda.r_1.1.7       Hmisc_3.17-2         scales_0.4.0         splines_3.2.4        xtable_1.8-2         colorspace_1.2-6     labeling_0.3        
[33] KernSmooth_2.23-15   acepack_1.3-3.3      munsell_0.4.3

deseq2 interactions design and contrast matrix contrast • 6.3k views

ADD COMMENT • link 9.1 years ago dwzhang.wisc ▴ 20

score 1 · Answer 1 · 2016-04-06

Thanks for the response Michael. Let me clarify: I have 7 patients who are healthy controls (HC) and 7 patients with rheumatoid arthritis (RA). For each patient, we isolated three different cell populations, labelled Na, CM, and EM. I want to compare whether the Na samples from RA patients are different from the Na samples from HC, or the CM samples from RA patients from the CM samples from HC. I previously used the character/list-style contrasts. It worked with the older versions of DESeq2, but since updating it, I have not been able to get it to work. I keep getting the following error:

> dds<-DESeqDataSetFromMatrix(countData=table,colData=samples,
+                             design=~disease+disease:population+disease:patient)
> dds<-DESeq(dds,parallel=T)
> res<-results(dds, contrast=list(c("diseaseRA","diseaseRA.populationCM"),c("diseaseHC","diseaseHC.populationCM")))
Error in checkContrast(contrast, resNames) :
  all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)'

Here are the output from the results:

> resultsNames(dds)
 [1] "Intercept"              "disease_RA_vs_HC"       "diseaseHC.populationEM" "diseaseRA.populationEM" "diseaseHC.populationNa" "diseaseRA.populationNa"
 [7] "diseaseHC.patientx2"    "diseaseRA.patientx2"    "diseaseHC.patientx3"    "diseaseRA.patientx3"    "diseaseHC.patientx4"    "diseaseRA.patientx4"   
[13] "diseaseHC.patientx5"    "diseaseRA.patientx5"    "diseaseHC.patientx6"    "diseaseRA.patientx6"    "diseaseHC.patientx7"    "diseaseRA.patientx7"

I switched to the numeric contrast after I read one of your older posts.

http://seqanswers.com/forums/showthread.php?p=153077#post153077

I would love to continue using a character or list style contrasts, but unfortunately, I can't get it to work. Thanks again for your help.

Dave

score 0 · Answer 2 · 2016-04-06

Can you say more about patient and disease? Are the patients in one disease state or another? How many in each?

I don't know about that numeric contrast, it's hard to see what's going on and easy to make mistakes. I prefer using character- or list-style contrasts. You should be able to make any comparison with these.

You say "I want to understand the differences between disease and healthy samples", can you be much more specific? What question are you aiming to answer with respect to the three cell populations?