Hi,
I am trying to analyse my proteomics data (intensity based label free quantification) using Limma.
When I look at the MD plot after limma fit, I see a strong correlation between log-fold-change and average-log-expression. What could cause this odd behaviour and how can I fix it?
Thanks,
Gil
The code I used is:
> imat <- read.delim("matrix.tsv", sep = "\t", row.names = 1) %>% + as.matrix() > head(imat) X51 X56 X57 X61 X65 X67 X72 X74 X77 X83 X85 X88 7966 1670300 1057800 1261700 1319300 1273000 1371400 950630 1920300 656870 706280 1157600 860140 7580 2727900 1536000 2675700 2168800 2246800 2530700 7548400 3874900 1761800 2203100 2815500 1867300 5068 27261000 19086000 19027000 16980000 24905000 29381000 27716000 24395000 15292000 16111000 13618000 14989000 4178 124790 372080 0 240420 0 0 492620 495800 2050400 408840 5791100 4327800 1038 1076100 962340 1384800 0 1395300 0 910440 0 714210 0 0 465820 1117 0 154830 159230 0 0 0 221510 0 0 133880 243150 881710 > sample_info <- read.delim("sample_desc.tsv",sep = "\t",stringsAsFactors = T) > head(sample_info) Sample Group 1 X51 A 2 X56 A 3 X57 A 4 X61 A 5 X65 A 6 X67 A > > design <- model.matrix(~0+sample_info$Group) > colnames(design) <- levels(sample_info$Group) > > limma_fit = lmFit(log2(imat+1), design = design) > plotMD(limma_fit)
session info:
> sessionInfo() R version 3.2.1 (2015-06-18) Platform: x86_64-unknown-linux-gnu (64-bit) Running under: Red Hat Enterprise Linux Server release 6.8 (Santiago) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] ecoliLeucine_1.10.0 ecolicdf_2.18.0 affy_1.48.0 Biobase_2.30.0 BiocGenerics_0.16.1 BiocInstaller_1.20.3 [7] plotly_4.5.6 ggdendro_0.1-20 gridExtra_2.2.1 purrr_0.2.2 ggplot2_2.2.1 dplyr_0.5.0 [13] tidyr_0.6.0 tibble_1.2 limma_3.26.9 knitr_1.15.1 loaded via a namespace (and not attached): [1] colorspace_1.3-2 amap_0.8-14 htmltools_0.3.5 stats4_3.2.1 viridisLite_0.1.3 yaml_2.1.14 [7] base64enc_0.1-3 DBI_0.6 bit64_0.9-5 RColorBrewer_1.1-2 affyio_1.40.0 matrixStats_0.51.0 [13] plyr_1.8.4 stringr_1.2.0 zlibbioc_1.16.0 munsell_0.4.3 gtable_0.2.0 htmlwidgets_0.8 [19] memoise_1.0.0 evaluate_0.10 labeling_0.3 IRanges_2.4.8 AnnotationDbi_1.32.3 preprocessCore_1.32.0 [25] highr_0.6 Rcpp_0.12.10 edgeR_3.12.1 scales_0.4.1 S4Vectors_0.8.11 jsonlite_1.0 [31] bit_1.1-12 digest_0.6.12 stringi_1.1.2 grid_3.2.1 tools_3.2.1 magrittr_1.5 [37] lazyeval_0.2.0 RSQLite_1.1-2 MASS_7.3-45 data.table_1.10.4 assertthat_0.1 httr_1.2.1 [43] R6_2.2.0
Thank you Gordon and James for the fast and helpful answers.
I used "1+" in my model and it worked fine.
Al the best ,
Gil