Question

Determining direction of logFC in limma

pt2395 • 0

I'm sure this is an easy question, but I'm fairly new to limma and R and would appreciate any help or guidance. I've run a paired limma analysis to look for differentially expressed genes in a pre-post experimental design. I'm trying to understand which direction my logFC values represent, i.e., whether a positive value means the gene is upregulated from pre to post or from post to pre. My code is as follows:

write.exprs(e,file="expressionDataHFNIH5-7-14.txt") 

d<-read.table("expressionDataHFNIH5-7-14.txt", header=T, row.names=1)
names(d)<-gsub(".CEL", "", gsub("X", "", names(d)))
d<-round(d, 5)

write.table(d, file="filepath.txt", quote=F, sep='\t')

# Sample annotation: patient ID and congestion status
# Congestion: 1 = pre, 2 = post

design <- cbind(ID = c(1,1,2,2,7,7,8,8,9,9,10,10,11,11,14,14,16,16,20,20,21,21,22,22,23,23,24,24,25,25,26,26,27,27,28,28,29,29,30,30,33,33,34,34,35,35,36,36,37,37,39,39,40,40,41,41), Congestion=c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2))

ID<-factor(design[,1])
Congestion<-factor(design[,2], levels=c("1","2"))

# Create the design matrix for the paired analysis (ID blocks on patient)
design<-model.matrix(~ID+Congestion)

fit <- lmFit(e, design)
fit <- eBayes(fit)

topTable(fit, coef = "Congestion2", adjust.method = "fdr")

topt2 <- topTable(fit, coef = "Congestion2", adjust.method = "fdr", n = Inf, sort.by = "p")
write.table(topt2, file = "filepath.txt", quote = F, sep = '\t')

Thank you!

logFC limma

score 1 · Answer 1 · 2015-11-20

Aaron Lun ★ 28k

Your Congestion2 coefficient represents the log-fold change of your "2" samples over your "1" samples (i.e., post over pre). When you drop this coefficient during your DE comparison (coef = "Congestion2" in topTable), the reported logFC is the value of this coefficient. Thus, you can interpret positive values as upregulation in the post condition compared to pre, and negative values as downregulation.
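To make the sign convention concrete, here is a minimal sketch in base R for a single simulated gene. The data and the variable names (`baseline`, `coef_post_vs_pre`, `mean_paired_diff`) are made up for illustration and are not from the original analysis; `lm.fit` stands in for the per-gene least-squares fit, on the assumption that the coefficient estimate has the same meaning as the logFC reported for `Congestion2`.

```r
# Simulated data for ONE gene; every value here is illustrative.
set.seed(1)
n_patients <- 5
ID <- factor(rep(seq_len(n_patients), each = 2))
Congestion <- factor(rep(c("1", "2"), times = n_patients),
                     levels = c("1", "2"))  # 1 = pre, 2 = post

# Patient-specific baseline plus a true post-minus-pre shift of +1.5
baseline <- rnorm(n_patients, mean = 8)
y <- baseline[as.integer(ID)] +
  ifelse(Congestion == "2", 1.5, 0) +
  rnorm(2 * n_patients, sd = 0.1)

# Same design formula as in the question
design <- model.matrix(~ID + Congestion)
coef_post_vs_pre <- unname(lm.fit(design, y)$coefficients["Congestion2"])

# Mean within-patient difference, post minus pre
mean_paired_diff <- mean(y[Congestion == "2"] - y[Congestion == "1"])

print(c(coefficient = coef_post_vs_pre, paired_difference = mean_paired_diff))
```

In this balanced paired layout the two numbers agree, and both are positive because the simulated gene goes up from pre to post.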

Incidentally, if you ever get confused, it's always a good idea to look at the expression values to figure out what's going on. In this case, you can pick a couple of the top DE genes and have a look at the direction of change within each patient for those genes. This is also a good way to check that you've set up the design and contrasts correctly for more complicated experiments; it never hurts to be on the safe side.
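A rough sketch of that sanity check, using a small made-up matrix rather than the real data: `expr`, `patients`, `congestion` and the gene names are all hypothetical. With the actual ExpressionSet, the matrix would presumably come from `exprs(e)`, with the gene chosen from the top of the `topTable` output.

```r
# Hypothetical log2 expression matrix: rows = genes, columns = samples,
# ordered pre, post within each patient (same layout as in the question).
set.seed(2)
patients <- rep(c(1, 2, 7, 8), each = 2)
congestion <- rep(c(1, 2), times = 4)  # 1 = pre, 2 = post
expr <- matrix(rnorm(3 * 8, mean = 8, sd = 0.2), nrow = 3,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               paste0("P", patients, "_", congestion)))

# Make geneA clearly higher in the post samples
expr["geneA", congestion == 2] <- expr["geneA", congestion == 2] + 2

# Within-patient change (post minus pre) for one gene of interest
gene <- "geneA"
post_minus_pre <- expr[gene, congestion == 2] - expr[gene, congestion == 1]
names(post_minus_pre) <- unique(patients)

print(post_minus_pre)
print(table(sign(post_minus_pre)))  # how many patients go up vs. down
```

If most patients show a positive difference for a gene with a positive logFC, the coefficient is being read in the right direction.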

Thank you! One more question for my understanding, is it post over pre because of the way in which I've ordered it in my levels and design matrix? So if I decided that I wanted to determine my log-fold change of my "1" samples over my "2" samples then I would need to interchange their order to derive that value?

Ah, never mind, you just answered it. Thank you so much for your help.

pt2395

Yeah, model.matrix will choose the first level as the "reference". You should be able to flip it around by setting levels=c("2","1") in the construction of the Congestion variable, in which case you should end up with Congestion1 as the relevant coefficient to drop for your DE testing.
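A small base R sketch of that flip, on toy values invented for illustration (three patients, one gene; `fit_ref_pre` and `fit_ref_post` are hypothetical names). `relevel(Congestion, ref = "2")` should be an equivalent way to change the reference level.

```r
# Toy paired data for one gene: three patients, pre then post.
ID <- factor(rep(1:3, each = 2))
Congestion <- factor(rep(c("1", "2"), times = 3), levels = c("1", "2"))
y <- c(5.0, 6.0, 7.0, 8.5, 6.0, 6.5)

# Reference level "1" (pre): the coefficient is Congestion2, post over pre
fit_ref_pre <- lm.fit(model.matrix(~ID + Congestion), y)$coefficients

# Flip the level order so that "2" (post) becomes the reference
Congestion <- factor(Congestion, levels = c("2", "1"))
design_flipped <- model.matrix(~ID + Congestion)
print(colnames(design_flipped))  # the last column is now Congestion1

# Reference level "2" (post): the coefficient is Congestion1, pre over post
fit_ref_post <- lm.fit(design_flipped, y)$coefficients

print(fit_ref_pre["Congestion2"])   # positive: post is higher than pre here
print(fit_ref_post["Congestion1"])  # same magnitude, opposite sign
```

Only the sign and the coefficient name change; the magnitude of the estimate is the same either way.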