limma / edgeR: setting up a linear model

steven wink ▴ 90

@steven-wink-5440

Dear list,

I could not find a fitting example in the user guides for limma / edgeR; this is probably because of my limited understanding of multivariate statistics.

I have performed an experiment as follows:

    cell_line  treatment  time
    1          1          1
    1          2          1
    1          3          1
    1          4          1
    1          5          1
    1          1          2
    1          2          2
    1          3          2
    1          4          2
    1          5          2
    2          1          1
    2          2          1
    2          3          1
    2          4          1
    2          5          1
    2          1          2
    2          2          2
    2          3          2
    2          4          2
    2          5          2
    3          1          1
    3          2          1
    3          3          1
    3          4          1
    3          5          1
    3          1          2
    3          2          2
    3          3          2
    3          4          2
    3          5          2

Biological info on the experiment:
- 4 replicates for controls (treatment 1).
- 3 replicates for each of the other 4 treatments.
- The cell lines are actually very similar (stable knock-down / overexpression versions of each other), so maybe they could be treated as a random sample when the interest is in treatment effects?
- The treatments include a negative control, but I am also interested in comparisons between treatments (3 vs 4, 2 vs 5, etc.).
- The effect of time is not really of interest to me, so if it makes things easier it would be fine to split the data into 2 sets, 1 for each time point.

Biological questions:
- Baseline differences between cell lines.
- Differences between cell lines in their response to treatments.
- The treatment effects relative to control and to each other.
- All of the above for both time points.

This seems to me to be a factorial design, so the first thing I tried was a 3-factor design, with a design matrix containing all possible combinations:

    # one factor level per cell line / treatment / time combination
    cellLine  <- eSetrmaF$cell_line
    treatment <- eSetrmaF$treatment
    time      <- eSetrmaF$time
    allCombos <- paste( cellLine, treatment, time, sep = "." )
    allCombos <- factor( allCombos )
    # group-means parameterisation: one column (coefficient) per combination
    design <- model.matrix( ~0 + allCombos )
    colnames( design ) <- levels( allCombos )
    fitAll <- lmFit( eSetrmaF, design )

To test whether what I was doing made any sense, I compared treatment glargine at 6h with its vehicle control in the IGF1R cell line. I also included an interaction term to test: "what is the difference between cell lines IRA and IRB in their response to glargine at 6h?"

    cont.matrix1 <- makeContrasts(
        # glargine vs vehicle control in IGF1R at 6h
        IGF1_glarg_6 = IGF1R.glargine.6h - IGF1R.control.6h,
        # difference between IRA and IRB in their glargine response at 6h
        IRA_IRB_glarg_6h = ( IRA.glargine.6h - IRA.control.6h ) - ( IRB.glargine.6h - IRB.control.6h ),
        levels = design )
    fitAll2 <- contrasts.fit( fitAll, cont.matrix1 )
    fitAll3 <- eBayes( fitAll2 )

The results don't seem to make sense: the intersection between the probe IDs from the topTable results (number = 500) and those from a simple t-test between the IGF1R.glargine.6h and IGF1R.control.6h samples (also 500 rows) is very small (it even looks random).

Any pointers to which manual examples I should look at, or to a general strategy, would be greatly appreciated.

Best regards

Steven Wink
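A possible sanity check for the group-means design described above, as a minimal sketch in base R on simulated data (it deliberately avoids limma so that it runs anywhere). With `~0 + allCombos` every coefficient is a group mean, so the estimate for a contrast such as IGF1R.glargine.6h - IGF1R.control.6h should equal the plain difference of the two group means. What differs from a two-group t-test is the variance estimate, which the full model pools over all groups. The treatment names `trtC`, `trtD`, `trtE`, the time label `t2`, and the helper `make_contrast` are hypothetical placeholders, not taken from the post.

```r
# Sketch only: simulated data, base R. Names "trtC", "trtD", "trtE" and "t2"
# are hypothetical placeholders; only IGF1R/IRA/IRB, control, glargine and 6h
# appear in the original post.
set.seed(1)

cell_lines  <- c("IGF1R", "IRA", "IRB")
treatments  <- c("control", "glargine", "trtC", "trtD", "trtE")
time_points <- c("6h", "t2")

# One row per group, expanded to replicates: 4 for control, 3 otherwise
groups  <- expand.grid(cellLine = cell_lines, treatment = treatments,
                       time = time_points, stringsAsFactors = FALSE)
n_rep   <- ifelse(groups$treatment == "control", 4, 3)
targets <- groups[rep(seq_len(nrow(groups)), times = n_rep), ]

# Same construction as in the post: one factor level per combination
allCombos <- factor(paste(targets$cellLine, targets$treatment,
                          targets$time, sep = "."))
design <- model.matrix(~ 0 + allCombos)
colnames(design) <- levels(allCombos)

# Simulated expression values for a single probe
y     <- rnorm(nrow(targets))
coefs <- lm.fit(design, y)$coefficients   # one fitted mean per group

# Hypothetical helper: +1 / -1 contrast vector over the design columns
make_contrast <- function(plus, minus, levels) {
  v <- setNames(numeric(length(levels)), levels)
  v[plus]  <- 1
  v[minus] <- -1
  v
}

c_simple <- make_contrast("IGF1R.glargine.6h", "IGF1R.control.6h",
                          colnames(design))
# (IRA.glargine - IRA.control) - (IRB.glargine - IRB.control) at 6h
c_inter  <- make_contrast(c("IRA.glargine.6h", "IRB.control.6h"),
                          c("IRA.control.6h", "IRB.glargine.6h"),
                          colnames(design))

est_simple <- sum(c_simple * coefs)
est_inter  <- sum(c_inter * coefs)

# The same quantities computed directly from group means
group_mean  <- function(g) mean(y[allCombos == g])
diff_simple <- group_mean("IGF1R.glargine.6h") - group_mean("IGF1R.control.6h")
diff_inter  <- (group_mean("IRA.glargine.6h") - group_mean("IRA.control.6h")) -
               (group_mean("IRB.glargine.6h") - group_mean("IRB.control.6h"))

stopifnot(isTRUE(all.equal(est_simple, diff_simple)),
          isTRUE(all.equal(est_inter, diff_inter)))

# Residual degrees of freedom: full model vs. a two-group t-test
df_model <- nrow(design) - ncol(design)   # 96 samples - 30 groups
df_ttest <- 3 + 4 - 2                     # glargine (3) vs control (4)
```

If the effect estimates from the full model and from the two-group comparison do not agree on real data, the sample-to-column mapping would be one thing to check. If they do agree, differences in ranking would have to come from the variance estimate, since the full model here has 66 residual degrees of freedom against 5 for the two-group test.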

limma limma • 1.3k views

updated 11.8 years ago by Steven ▴ 110 • written 11.8 years ago by steven wink ▴ 90

Steven ▴ 110

@steven-5432

Just adding session info in case it's needed.

    R version 2.15.2 (2012-10-26)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
     [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C                  LC_ADDRESS=C
    [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
     [1] hthgu133pluspmcdf_2.11.0   genefilter_1.40.0          hthgu133pluspm.db_15.1.0
     [4] org.Hs.eg.db_2.8.0         RSQLite_0.11.2             DBI_0.2-5
     [7] AnnotationDbi_1.20.3       gdata_2.12.0               vsn_3.26.0
    [10] affy_1.36.1                arrayQualityMetrics_3.14.0 limma_3.14.4
    [13] Biobase_2.18.0             BiocGenerics_0.4.0

    loaded via a namespace (and not attached):
     [1] affyio_1.26.0         affyPLM_1.34.0        annotate_1.36.0       beadarray_2.8.1
     [5] BeadDataPackR_1.10.0  BiocInstaller_1.8.3   Biostrings_2.26.3     Cairo_1.5-2
     [9] cluster_1.14.3        colorspace_1.2-1      gcrma_2.30.0          grid_2.15.2
    [13] gtools_2.7.0          Hmisc_3.10-1          hwriter_1.3           IRanges_1.16.6
    [17] KernSmooth_2.23-8     lattice_0.20-13       latticeExtra_0.6-24   parallel_2.15.2
    [21] plyr_1.8              preprocessCore_1.20.0 RColorBrewer_1.0-5    reshape2_1.2.2
    [25] setRNG_2011.11-2      splines_2.15.2        stats4_2.15.2         stringr_0.6.2
    [29] survival_2.37-2       SVGAnnotation_0.93-1  tools_2.15.2          XML_3.95-0.1
    [33] xtable_1.7-1          zlibbioc_1.4.0

Kind regards

Steven

--
ir. Steven Wink, PhD student
Division of Toxicology
Leiden/Amsterdam Center for Drug Research (LACDR)
Leiden University
phone: 31-71-5276039

11.8 years ago • Steven ▴ 110
