Hello,
I am using DESeq2 in phyloseq to analyse my data, following the phyloseq tutorial.
Previously I used the following code and this seemed to work nicely, however when I am trying to repeat this I now get the following message and although the listed bacterial species are the same the log2fold changes are quite different. The difference is at the point when I try to convert the phyloseq format to DESeq2 I get the error message in bold below (previously I was not seeing this error message). I have also loaded the sessionInfo in case this helps shed any light on the problem.
otufile=("otu_table_m1000_minus_cyano_chloro_hdf5_consensuslineage.txt") mapfile=("Mapping_file_LS.txt") trefile=("rep_set_tree.tre") qiimedata = import_qiime(otufile, mapfile, trefile) #remove my unwanted groups qiimedata = subset_samples(qiimedata, LesionScore != "6") qiimedata = subset_samples(qiimedata, LesionScore != "7") qiimedata = subset_samples(qiimedata, LesionScore != "0") qiimedata = subset_samples(qiimedata, LesionScore != "1") qiimedata = subset_samples(qiimedata, LesionScore != "3") qiimedata = subset_samples(qiimedata, LesionScore != "2") #left with the two groups I want to compare head(sample_data(qiimedata)$LesionScore, 25) [1] -5 -5 4 -5 -5 -5 4 -5 4 4 4 4 -5 -5 4 -5 -5 4 diagdds = phyloseq_to_deseq2(qiimedata, ~ LesionScore) converting counts to integer mode the design formula contains a numeric variable with integer values, specifying a model with increasing fold change for higher values. did you mean for this to be a factor? if so, first convert this variable to a factor using the factor() function > sessionInfo() R version 3.3.2 (2016-10-31) Platform: x86_64-apple-darwin13.4.0 (64-bit) Running under: macOS Sierra 10.12 locale: [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets [8] methods base other attached packages: [1] BiocInstaller_1.24.0 DESeq2_1.14.1 [3] SummarizedExperiment_1.4.0 Biobase_2.34.0 [5] GenomicRanges_1.26.3 GenomeInfoDb_1.10.3 [7] IRanges_2.8.1 S4Vectors_0.12.1 [9] BiocGenerics_0.20.0 phyloseq_1.19.1 loaded via a namespace (and not attached): [1] locfit_1.5-9.1 Rcpp_0.12.9 ape_4.1 [4] lattice_0.20-34 Biostrings_2.42.1 assertthat_0.1 [7] digest_0.6.12 foreach_1.4.3 plyr_1.8.4 [10] backports_1.0.5 acepack_1.4.1 RSQLite_1.1-2 [13] ggplot2_2.2.1 zlibbioc_1.20.0 lazyeval_0.2.0 [16] data.table_1.10.4 annotate_1.52.1 vegan_2.4-2 [19] rpart_4.1-10 Matrix_1.2-8 checkmate_1.8.2 [22] splines_3.3.2 BiocParallel_1.8.1 geneplotter_1.52.0 [25] stringr_1.2.0 foreign_0.8-67 htmlwidgets_0.8 [28] igraph_1.0.1 RCurl_1.95-4.8 munsell_0.4.3 [31] base64enc_0.1-3 multtest_2.30.0 mgcv_1.8-17 [34] htmltools_0.3.5 nnet_7.3-12 biomformat_1.2.0 [37] tibble_1.2 gridExtra_2.2.1 htmlTable_1.9 [40] Hmisc_4.0-2 codetools_0.2-15 XML_3.98-1.5 [43] permute_0.9-4 MASS_7.3-45 bitops_1.0-6 [46] grid_3.3.2 DBI_0.5-1 nlme_3.1-131 [49] jsonlite_1.3 xtable_1.8-2 gtable_0.2.0 [52] magrittr_1.5 scales_0.4.1 stringi_1.1.2 [55] XVector_0.14.0 reshape2_1.4.2 genefilter_1.56.0 [58] latticeExtra_0.6-28 Formula_1.2-1 RColorBrewer_1.1-2 [61] iterators_1.0.8 tools_3.3.2 ade4_1.7-5 [64] survival_2.40-1 AnnotationDbi_1.36.2 colorspace_1.3-2 [67] rhdf5_2.18.0 cluster_2.0.5 memoise_1.0.0 [70] knitr_1.15.1
Cheers,
Sarah
Hi Michael,
Thanks for the clarification.
Changing from using numerical values for LesionScore in my mapping file fixed the problem (eg from 4 to four etc), meaning I guess this is now coded as a factor.
Cheers,
Sarah