Unexpected results using limma with numerical factor - regression intercept?
@matthew-hannah-621
Sorry for getting things confused; I'd like to ask a statistician, but we don't really have one :( Hopefully I'm improving, though... My mistakes were:

1. I mixed up lmFit() (limma) with lm.fit() (stats) - sorry, one character isn't a lot when you're in a hurry ;)

2. Being unsure of how lmFit() handles a factor c(1,2,3,...) vs. a numeric c(1,2,3,...), and thinking this example from the limma user's guide meant negative numbers could be confused with dye swaps:

design <- c(-1,1,-1,1)  # "The negative numbers in the design matrix indicate the dye-swaps."
fit <- lmFit(MA, design)

Sorry, no real excuse there; it just seemed logical when I typed it... doh.

Anyway, after much searching I *think* I've found out how to use limma for (standard?) regression, and might even be able to solve the original poster's query. lm() and (I guess) lmFit() have an implied intercept term, so the design in the original poster's fit,

design <- model.matrix(~-1 + TestScore1, pData(eset))

removes the intercept and so forces the fit to go through zero (see the first sketch below). For linear regression against a quantitative variable you need to allow an intercept to be fitted, unless you have a reason to think the line goes through zero.

Maybe I'm wrong, but here's my logic:

num <- c(1,2,3,4,5,6,7)  # measured growth, etc.

# Pearson correlation and its p-value
cor(exprs(esetgcrma)[1,], num)
cor.test(exprs(esetgcrma)[1,], num)

# lm() - summary() shows R-squared (= Pearson squared) and the same
# p-value as above, plus the coefficients
summary(lm(exprs(esetgcrma)[1,] ~ num))

# limma - produces the same coefficients as lm() above
design <- model.matrix(~num)
fit <- lmFit(esetgcrma, design)

So limma produces the 'same' results as Pearson and lm(), but you can carry them on for further analysis - see the two sketches below.
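First, to make the intercept point concrete, here's a quick comparison of the two designs on a made-up covariate (the values are invented; TestScore1 just stands in for the original poster's variable):

TestScore1 <- c(0.58, -2.36, -12.24, -14.84)

# with an intercept: a column of 1s plus the covariate
model.matrix(~TestScore1)

# without an intercept (the -1 form): the covariate column alone,
# so the fitted line is forced through the origin
model.matrix(~-1 + TestScore1)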
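Second, by "carry them on" I mean the usual limma steps - I *think* this is right, with esetgcrma being my gcrma-normalised ExpressionSet from above:

library(limma)

num <- c(1,2,3,4,5,6,7)       # measured growth, etc.
design <- model.matrix(~num)  # intercept plus slope on num
fit <- lmFit(esetgcrma, design)
fit <- eBayes(fit)

# test the slope coefficient (the column named "num"); genes with
# small p-values here vary linearly with num
topTable(fit, coef = "num")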
So is this correct, or should I go back to wearing the dunce hat in the corner?

Cheers,
Matt

> -----Original Message-----
> From: Gordon Smyth [mailto:smyth@wehi.edu.au]
> Sent: Donnerstag, 26. August 2004 09:38
> To: Matthew Hannah
> Cc: bioconductor@stat.math.ethz.ch
> Subject: Re: [BioC] Unexpected results using limma with numerical factor
>
> At 05:40 PM 25/08/2004, Matthew Hannah wrote:
> >I've recently asked a similar question but have got no feedback; see
> >point 2 onwards in
> >https://www.stat.math.ethz.ch/pipermail/bioconductor/2004-August/005802.html
> >
> >As I now understand it (perhaps/probably wrongly), the standard
> >approach (equivalent to ANOVA?) is to use a non-numeric factor for the
> >fit. However, lm() in R is also capable of regression fits. Looking at
> >(but not understanding) lm.fit in limma, it appears to be an
> >independent function (the lm bit written in Fortran?) and doesn't call
> >lm() from stats. So the question is really whether lm.fit can do
> >numeric regression.
>
> I am afraid that you've got things a bit skew. lm.fit() is not a
> function in limma; rather, it is a low-level function in the stats
> package which is called by lm(). The limma functions use lm.fit() and
> lm.wfit() just like lm() does.
>
> It isn't really helpful to think of ANOVA in the microarray context.
> ANOVA is a special case of regression.
>
> >Another consideration is that you wouldn't use a design (factor) but a
> >numeric vector. My guess is that your design is being taken as a
> >factor, and (if you look in the user's guide) the negative values may
> >indicate dye swaps, which could be interesting! I've just tried lmFit
> >with a numeric vector (but as there are replicates, each number appears
> >three times) and got meaningless results - all lods > 30 and all tiny
> >p-values, 1e-24. So initially it looks like you can't use limma like
> >this, but I'd like to hear an expert verdict on whether the approach is
> >completely wrong.
>
> I don't think I understand your question. Have a talk with a
> statistician at your own institution about factors, covariates and
> design matrices.
>
> Gordon
>
> >Anyway, if you're looking for correlations then you could perhaps try
> >Pearson (see my previous post about p-values and R2 from lm()). Try:
> >
> >Test1 <- c(0.58,-2.36,-12.24,-14.84,0.15,-3.23,-11.66,-12.91)
> >Correl <- esApply(eset, 1, function(x) { cor(x, Test1) })
> >
> >Be aware that you might get a lot of correlations by chance,
> >particularly as your scores seem to fall in two groups: close to zero
> >and below -11. And a straight line between two groups gives a good
> >Pearson correlation, as it's sensitive to outliers.
> >
> >Perhaps it's best to change your test scores to factors anyway -
> >Test1.high, Test1.low, Test2.high, Test2.low - and do a conventional
> >limma analysis. Without a regular distribution of test scores,
> >correlations are going to be largely meaningless.
> >
> >HTH, and I hope someone can answer this more definitely.
> >
> >Cheers,
> >Matt
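P.S. To convince myself of Gordon's point that stats::lm.fit() is just the low-level workhorse behind lm(), a quick check on dummy data (invented here purely for illustration):

x <- 1:7
y <- rnorm(7)

# lm() builds the design matrix from the formula...
coef(lm(y ~ x))

# ...lm.fit() wants it supplied explicitly, intercept column included;
# the coefficients come out the same
lm.fit(cbind(Intercept = 1, x = x), y)$coefficients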
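P.P.S. And a minimal sketch of the "change the scores to factors" suggestion, in case it helps the original poster - the -5 cutoff is made up, based only on the scores falling either near zero or below -11:

library(limma)

Test1 <- c(0.58,-2.36,-12.24,-14.84,0.15,-3.23,-11.66,-12.91)

# dichotomise the scores into the two apparent groups
group <- factor(ifelse(Test1 > -5, "Test1.high", "Test1.low"))

design <- model.matrix(~group)
fit <- eBayes(lmFit(eset, design))

# coefficient 2 is the low-minus-high group difference
topTable(fit, coef = 2)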