Question

VST advice (Illumina microarray)

Pan Du ★ 1.2k

@pan-du-2010

Hi Mark,

I found the reason: you drew the mean-sd plot from all samples. Large variances between different sample types are expected, and they are concentrated in the high-intensity range. If you draw the mean-sd plot from samples of the same type only, the plot should look fine (some variance may remain among replicates for uncontrolled reasons). Please try the following code:

    for(i in 1:4){
        meanSdPlot(normDataList[[i]][,1:5], main=names(normDataList)[i])
    }

Tell me if this is not clear. Thanks!

Have a nice day,
Pan

On 1/28/08 8:02 AM, "Mark Dunning" <md392 at cam.ac.uk> wrote:

> Hi guys,
>
> Hope you're both well. I just read your VST paper in NAR. It seems like a good method and I am keen to use it for my own data.
>
> To begin with I have been looking at data from the MAQC project - a dilution series using Human-6 chips - and have been following the code in your vignette. I have been able to reproduce the results in the vignette for the Barnes data, but not for this data. I am attaching the 'meanSdPlot' I obtain, and as you can see there is an increase of sd after about 30,000, which is not the behaviour I would expect. Would you be able to suggest what the problem might be?
>
> I have a lumi-batch object created using the non-normalised data for 19 of the MAQC arrays, and here is the summary.
>
>     summary(exprs(x.lumi))
>
>         ILM_1_A1          ILM_1_A2          ILM_1_A3          ILM_1_A4
>      Min.   :   30.2   Min.   :   33.3   Min.   :   33.8   Min.   :   33.8
>      1st Qu.:   52.5   1st Qu.:   51.1   1st Qu.:   52.1   1st Qu.:   52.8
>      Median :   59.0   Median :   57.7   Median :   59.0   Median :   59.6
>      Mean   :  271.0   Mean   :  268.3   Mean   :  303.4   Mean   :  280.4
>      3rd Qu.:  103.3   3rd Qu.:  102.2   3rd Qu.:  111.4   3rd Qu.:  105.9
>      Max.   :33398.3   Max.   :30155.3   Max.   :35888.5   Max.   :32372.4
>
>         ILM_1_A5          ILM_1_B1          ILM_1_B2          ILM_1_B3
>      Min.   :   35.9   Min.   :   33.4   Min.   :   32.7   Min.   :   31.7
>      1st Qu.:   60.6   1st Qu.:   50.4   1st Qu.:   50.8   1st Qu.:   47.9
>      Median :   68.2   Median :   56.6   Median :   57.1   Median :   54.4
>      Mean   :  331.1   Mean   :  235.5   Mean   :  245.5   Mean   :  251.0
>      3rd Qu.:  122.8   3rd Qu.:   99.3   3rd Qu.:  102.4   3rd Qu.:  103.4
>      Max.   :38188.5   Max.   :38868.7   Max.   :37145.1   Max.   :35338.0
>
>         ILM_1_B4          ILM_1_B5          ILM_1_C1          ILM_1_C2
>      Min.   :   32.0   Min.   :   32.4   Min.   :   33.8   Min.   :   33.2
>      1st Qu.:   51.5   1st Qu.:   59.7   1st Qu.:   53.3   1st Qu.:   52.6
>      Median :   58.2   Median :   67.1   Median :   60.1   Median :   59.5
>      Mean   :  258.9   Mean   :  285.2   Mean   :  261.9   Mean   :  279.0
>      3rd Qu.:  106.1   3rd Qu.:  120.7   3rd Qu.:  113.2   3rd Qu.:  116.6
>      Max.   :37861.9   Max.   :40534.5   Max.   :27707.1   Max.   :31168.6
>
>         ILM_1_C4          ILM_1_C5          ILM_1_D1          ILM_1_D2
>      Min.   :   32.8   Min.   :   34.1   Min.   :   32.0   Min.   :   30.2
>      1st Qu.:   52.3   1st Qu.:   57.6   1st Qu.:   50.2   1st Qu.:   52.1
>      Median :   59.1   Median :   65.5   Median :   57.1   Median :   58.9
>      Mean   :  270.2   Mean   :  296.7   Mean   :  250.3   Mean   :  258.9
>      3rd Qu.:  114.6   3rd Qu.:  125.7   3rd Qu.:  112.5   3rd Qu.:  116.0
>      Max.   :29069.5   Max.   :33560.1   Max.   :32544.1   Max.   :35139.3
>
>         ILM_1_D3          ILM_1_D4          ILM_1_D5
>      Min.   :   35.8   Min.   :   32.5   Min.   :   37.1
>      1st Qu.:   54.6   1st Qu.:   58.7   1st Qu.:   59.4
>      Median :   61.6   Median :   66.7   Median :   67.6
>      Mean   :  268.1   Mean   :  289.6   Mean   :  300.5
>      3rd Qu.:  121.2   3rd Qu.:  131.0   3rd Qu.:  134.0
>      Max.   :36666.1   Max.   :37621.7   Max.   :41185.4
>
> I then apply the following transforms and create a normalised data object as in the vignette.
>
>     ## VST transform and quantile normalization
>     x.lumi.vst <- lumiT(x.lumi)
>     x.lumi.vst.quantile <- lumiN(x.lumi.vst, method='quantile')
>
>     ## log2 transform and quantile normalization
>     x.lumi.log <- lumiT(x.lumi, method='log2')
>     x.lumi.log.quantile <- lumiN(x.lumi.log, method='quantile')
>
>     ## VSN normalization
>     x.lumi.vsn <- lumiN(x.lumi, method='vsn', lts.quantile=0.5)
>
>     normDataList <- list('Raw.Log2'=exprs(x.lumi.log),
>                          'VST.Quantile'=exprs(x.lumi.vst.quantile),
>                          'Log2.Quantile'=exprs(x.lumi.log.quantile),
>                          'VSN'=exprs(x.lumi.vsn))
>
> However, when I run
>
>     for(i in 1:4){
>         meanSdPlot(normDataList[[i]], main=names(normDataList)[i])
>     }
>
> ...I get the attached picture. It does not seem that VST is working, or maybe I have done something wrong. Has VST been used on Human-6 data before, and is there some special trick I need to use?
>
> Any help you could give would be greatly appreciated.
>
> Best wishes,
>
> Mark
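
The effect described in the reply (sd pooled over different sample types rises at high intensity, while sd within one sample type stays flat) can be reproduced on simulated data. The following is only a rough sketch in base R, not the MAQC data: the intensity range, the noise level, and the assumption that the two sample types differ only for probes above an arbitrary cutoff of 11 are all made up for illustration. The column names merely imitate the ILM_1_A1 ... ILM_1_B5 naming pattern from the summary so that the sample type can be read off the name.

```r
# Simulated illustration only (not MAQC data): 2 sample types x 5 replicates.
set.seed(1)
n_probes <- 2000
base  <- runif(n_probes, 6, 14)      # hypothetical transformed intensities

# ASSUMPTION: the two sample types differ only for high-intensity probes.
shift <- ifelse(base > 11, rnorm(n_probes, sd = 1.5), 0)

# Small replicate-to-replicate noise, identical at every intensity.
noise <- function() matrix(rnorm(n_probes * 5, sd = 0.1), nrow = n_probes)

sim <- cbind(base + noise(), base + shift + noise())
colnames(sim) <- c(paste0("ILM_1_A", 1:5), paste0("ILM_1_B", 1:5))

# Sample type = the letter in front of the replicate number.
type <- sub("^ILM_1_([A-D])[0-9]$", "\\1", colnames(sim))

row_sd <- function(m) apply(m, 1, sd)
high   <- base > 11

sd_all <- row_sd(sim)                  # all samples pooled
sd_A   <- row_sd(sim[, type == "A"])   # replicates of one type only

# Median per-probe sd in the low and high intensity ranges.
res <- c(all_low  = median(sd_all[!high]),
         all_high = median(sd_all[high]),
         A_low    = median(sd_A[!high]),
         A_high   = median(sd_A[high]))
print(round(res, 2))
```

In this toy setup the pooled sd is clearly larger for the high-intensity probes, while the within-type sd is about the same in both ranges. That matches why restricting meanSdPlot to columns 1:5 (the five A replicates) is suggested above.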

Posted 17.0 years ago by Pan Du