Log2FC values very small with SCAN
Vani ▴ 20
@vani-8145
Last seen 8.9 years ago
United States

Hi,

I am using the SCAN method to normalize several GEO datasets. The resulting Log2FC values of the normalized eset are very small (between -0.5 and 0.5). Is this normal? I am not sure why the values are so small.

Please advise.

@gordon-smyth
Last seen 2 hours ago
WEHI, Melbourne, Australia

I downloaded the series data matrix for GSE21610 (which is already normalized using MAS5) and did a quick analysis using limma. There are plenty of large fold changes, some with log2FC > 3.

Even using SCAN normalization, there are a number of large fold changes, so your apparent claim that the log2FC are all between -0.5 and 0.5 is not actually true.

Whatever normalization method you use, the data analysis seems to me to require more attention than this. This study has three possible values for the disease status: "none", "dilated cardiomyopathy" and "ischemic cardiomyopathy". There are other variables that should be adjusted for in the limma linear model (particularly gender and age). There are other important analysis steps that should be done to address data quality, especially filtering out unexpressed probes. I would rather see you giving attention to these fundamental analysis issues instead of worrying so much about the size of the fold changes.
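The points above (a three-level disease factor, adjustment for gender and age, and filtering of unexpressed probes) could be sketched roughly as follows. This is only an illustration on simulated data: the `targets` table, its column names, the expression cutoff of 5 and the "at least 4 samples" rule are all assumptions made for the example, not values taken from GSE21610. The limma calls are guarded so that the base-R part still runs if limma is not installed.

```r
# Simulated stand-in data: 200 probes x 12 samples on the log2 scale.
set.seed(1)
n_probes <- 200
targets <- data.frame(
  disease = factor(rep(c("none", "dilated", "ischemic"), each = 4),
                   levels = c("none", "dilated", "ischemic")),
  gender  = factor(rep(c("F", "M"), times = 6)),
  age     = round(runif(12, 30, 70))
)
expr <- matrix(rnorm(n_probes * 12, mean = 7, sd = 1), nrow = n_probes,
               dimnames = list(paste0("probe", seq_len(n_probes)), NULL))
expr[1:100, ] <- expr[1:100, ] - 4  # make the first 100 probes look unexpressed

# Filter: keep probes above a hypothetical cutoff in at least 4 samples
# (4 is the size of the smallest group in this toy design).
keep <- rowSums(expr > 5) >= 4
expr_filt <- expr[keep, , drop = FALSE]

# Three-level disease factor, with gender and age as adjustment covariates.
design <- model.matrix(~ disease + gender + age, data = targets)

if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::eBayes(limma::lmFit(expr_filt, design))
  # Look at one disease contrast at a time, here dilated vs. none.
  tt <- limma::topTable(fit, coef = "diseasedilated", number = Inf)
  print(head(tt))
}
```

With "none" as the reference level, the `diseasedilated` and `diseaseischemic` coefficients are the log2 fold changes of each cardiomyopathy group against the unaffected group, adjusted for gender and age.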

@stephen-piccolo-6761
Last seen 4.2 years ago
United States

Hi Vani,

It's hard to know what could cause this without knowing more about the data set and analysis you are doing. Can you provide a few more details (array type, sample size, method of calculating Log2FC, etc.)? Also, have you tried it with any other normalization methods?

Thanks,

-Steve


I am getting small values for the Affymetrix Human Genome U133 Plus 2.0 Array. The sample size is around 68. I am using limma's lmFit and topTable to generate the log2FC. I tried fRMA and the values ranged from -1.3 to 1.3.

Here is my code:

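# Packages used below
library(inSilicoDb)  # getDataset()
library(limma)       # lmFit(), eBayes(), topTable()
library(Biobase)     # pData()

# Load the SCAN-normalized, gene-level data using inSilicoDb
eset21610 <- getDataset("GSE21610", "GPL570", format = "CURESET", norm = "SCAN", features = "GENE")

# Design matrix: intercept plus heart-failure status
design1 <- model.matrix(~ Heart_Failure, pData(eset21610))

# Fit the linear model and compute moderated statistics
afterLimma <- lmFit(eset21610, design = design1)
e4 <- eBayes(afterLimma)

# Table of all genes, sorted by log2 fold change
impdata <- topTable(e4, number = 19528, sort.by = "logFC")

# Volcano plot (note: xlim restricts the x-axis to +/- 0.6)
plot(impdata$logFC, -log10(impdata$P.Value),
     xlim = c(-.6, .6), ylim = c(-1, 10),
     xlab = "log2 fold change", ylab = "-log10 p-value")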
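Because the plot call fixes the x-axis at -0.6 to 0.6, any fold changes outside that window would simply not be drawn. A minimal sketch of checking the range directly and deriving the axis limits from the data is shown below. The `impdata` data frame here is a made-up stand-in with the same `logFC` and `P.Value` columns as the topTable output, not the real result.

```r
# Made-up stand-in for the topTable() output; values are simulated.
set.seed(7)
impdata <- data.frame(
  logFC   = c(rnorm(995, sd = 0.2), 3.1, -2.4, 1.8, -1.5, 2.2),
  P.Value = runif(1000)
)

# Actual extremes of the fold changes, independent of any plot limits.
range(impdata$logFC)

# Number of points that a fixed xlim = c(-.6, .6) would leave off the plot.
sum(abs(impdata$logFC) > 0.6)

# Symmetric x-axis limits taken from the data instead of hard-coded.
lim <- max(abs(impdata$logFC))
plot(impdata$logFC, -log10(impdata$P.Value),
     xlim = c(-lim, lim),
     xlab = "log2 fold change", ylab = "-log10 p-value")
```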

Vani,

Sorry for the late reply. I looked at the data and did some simple simulations to make sure I understand what is going on. It appears this is because the variance is larger for the fRMA data than for the SCAN data. I don't know enough about how limma works to know how this affects the logFC values. Perhaps the authors of that tool could shed some light on this...


limma just computes fold changes for the data it is given, and variances do not enter into the calculation. If there is a problem here, it is at the normalization stage.
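To illustrate this with a small base-R sketch (simulated values, not the GSE21610 data): for a two-group design, the least-squares coefficient is the difference of the group means on the log2 scale. Adding within-group noise that leaves the means untouched does not change it, whereas compressing the scale of the data, as a normalization method might, shrinks it in proportion. The group labels and the compression factor of 4 are arbitrary choices for the example.

```r
set.seed(42)
group <- factor(rep(c("control", "failure"), each = 5))
is_f  <- group == "failure"

# Helper: difference of group means per gene (failure minus control).
mean_diff_of <- function(m) rowMeans(m[, is_f]) - rowMeans(m[, !is_f])

# Three hypothetical genes on the log2 scale; gene 1 is shifted up by 2.
y <- matrix(rnorm(30, mean = 8, sd = 0.5), nrow = 3)
y[1, is_f] <- y[1, is_f] + 2

# The least-squares group coefficient equals the difference of group means.
mean_diff <- mean_diff_of(y)
coef_ls   <- unname(coef(lm(t(y) ~ group))["groupfailure", ])
all.equal(coef_ls, mean_diff)

# Extra within-group variance (noise centered within each group):
# the difference of means is unchanged.
noise <- matrix(rnorm(30, sd = 2), nrow = 3)
noise[, is_f]  <- noise[, is_f]  - rowMeans(noise[, is_f])
noise[, !is_f] <- noise[, !is_f] - rowMeans(noise[, !is_f])
all.equal(mean_diff_of(y + noise), mean_diff)

# Compressing the data scale by a factor of 4 shrinks every logFC by 4.
y_compressed <- 8 + (y - 8) / 4
all.equal(mean_diff_of(y_compressed), mean_diff / 4)
```

The variance does affect the t-statistics and p-values, but not the logFC column itself.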
