Dear Michael,
I selected an Age cutoff, median of the overlapping ages for the groups male and female. I used cut() successfully but when the DESeq() analysis was running, it halted with this error:
Error in optim(betaRow, objectiveFn, method = "L-BFGS-B", lower = -large, :
L-BFGS-B needs finite values of 'fn'
I checked my DESeq2 version 1.4.5.
R version 3.1.1 (2014-07-10)
Platform: i386-w64-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] DESeq2_1.4.5 RcppArmadillo_0.4.450.1.0 Rcpp_0.11.3
[4] GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7
[7] BiocGenerics_0.8.0 genefilter_1.44.0 lattice_0.20-29
[10] gtools_3.4.1
loaded via a namespace (and not attached):
[1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 BiocStyle_1.0.0
[5] DBI_0.3.1 geneplotter_1.40.0 grid_3.1.1 locfit_1.5-9.1
[9] RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.1.1 stats4_3.1.1
[13] survival_2.37-7 tools_3.1.1 XML_3.98-1.1 xtable_1.7-4
I worked around the error by converting Age to two factors, O and Y and proceeded successfully with DESeq()
My current snag is extracting the correct results from the analyzed dataset. First as per your instruction,
> resultsNames(DE_Analysis)
[1] "Intercept" "Age_Y_vs_O"
[3] "Gender_Male_vs_Female" "Phenotype_flexible_vs_inflexible"
[5] "GenderMale.Phenotypeflexible"
> results(DE_Analysis, contrast=list("Phenotype_flexible_vs_inflexible"))
Error in results(DE_Analysis, contrast = list("Phenotype_flexible_vs_inflexible")) :
'contrast', as a list, should have length 2,
see the manual page of ?results for more information
> results(DE_Analysis, contrast=list(c("Phenotype_flexible_vs_inflexible","GenderMale.Phenotypeflexible")))
Error in results(DE_Analysis, contrast = list(c("Phenotype_flexible_vs_inflexible", :
'contrast', as a list, should have length 2,
see the manual page of ?results for more information
I am successful with:
> results(DE_Analysis, contrast=c("Phenotype","flexible", "inflexible"))
log2 fold change (MAP): Phenotype flexible vs inflexible
Wald test p-value: Phenotype flexible vs inflexible
DataFrame with 30553 rows and 6 columns
baseMean log2FoldChange lfcSE stat pvalue padj
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
A*01:01:01:01 allele 0.0000000 NA NA NA NA NA
A*03:01:0:01 allele 0.0000000 NA NA NA NA NA
A1BG 0.3813450 1.9276013 1.608610 1.1983025 0.2307993 NA
A1BG-AS1 0.5606999 0.1980567 1.235204 0.1603433 0.8726107 NA
A1CF 1.6435664 0.9704531 0.911482 1.0646980 0.2870126 NA
and
> results(DE_Analysis, name="GenderMale.Phenotypeflexible")
log2 fold change (MAP): GenderMale.Phenotypeflexible
Wald test p-value: GenderMale.Phenotypeflexible
DataFrame with 30553 rows and 6 columns
baseMean log2FoldChange lfcSE stat pvalue padj
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
A*01:01:01:01 allele 0.0000000 NA NA NA NA NA
A*03:01:0:01 allele 0.0000000 NA NA NA NA NA
A1BG 0.3813450 0.97909416 0.8398631 1.16577824 0.2437041 0.999993
A1BG-AS1 0.5606999 -0.08720169 0.9308521 -0.09367943 0.9253638 0.999993
A1CF 1.6435664 -1.21334676 0.9052759 -1.34030598 0.1801459 0.999993
and the results when I use results(DE_Analysis)
log2 fold change (MAP): GenderMale.Phenotypeflexible
Wald test p-value: GenderMale.Phenotypeflexible
DataFrame with 30553 rows and 6 columns
baseMean log2FoldChange lfcSE stat pvalue
<numeric> <numeric> <numeric> <numeric> <numeric>
A*01:01:01:01 allele 0.0000000 NA NA NA NA
A*03:01:0:01 allele 0.0000000 NA NA NA NA
A1BG 0.3813450 0.97909416 0.8398631 1.16577824 0.2437041
A1BG-AS1 0.5606999 -0.08720169 0.9308521 -0.09367943 0.9253638
A1CF 1.6435664 -1.21334676 0.9052759 -1.34030598 0.1801459
... ... ... ... ... ...
but not successful for female
> results(DE_Analysis, name="GenderFemale.Phenotypeflexible")
Error in results(DE_Analysis, name = "GenderFemale.Phenotypeflexible") :
cannot find appropriate results in the DESeqDataSet.
possibly nbinomWaldTest or nbinomLRT has not yet been run.
How would I code to obtain this contrast?
I'd aim for a biologically meaningful cut.