Question

Illumina BeadChips and beadarray

Ina Hoeschele

Hi,

I am working on low-level analysis of bead-level expression data from Illumina BeadChips using the beadarray package of Bioconductor. Each chip has 6 arrays with 2 strips. For each of the 12 strips per chip, there is a _.csv, _.txt, _Grn.tif and _Grn.locs file (and _Grn.xml). There is also a _Grn.idat file for each of the 6 arrays per chip. My .csv files do not contain bead-level information but rather just probe-id, number of beads, mean Grn and dev Grn (I assume Grn is foreground intensity). My .txt files contain bead-level information with probe-id, foreground intensity Grn and bead location GrnX, GrnY (bead center coordinates). Therefore, for beadarray analysis I use the .txt (not .csv) and .tif files.

I read the bead-level data for one chip as follows, for background correction method subtract or normexp:

    BLData.sharpen.subtract.txt.tif.013 = readIllumina(textType=".txt",
      arrayNames = c("1814647013_A_1","1814647013_A_2","1814647013_B_1",
      "1814647013_B_2","1814647013_C_1","1814647013_C_2","1814647013_D_1",
      "1814647013_D_2","1814647013_E_1","1814647013_E_2","1814647013_F_1",
      "1814647013_F_2"), singleChannel=TRUE, useImages=TRUE,
      beadInfo=NULL, targets=NULL, storeXY=TRUE,
      imageManipulation="sharpen", metrics=TRUE, metricsFile="Metrics.txt",
      backgroundMethod="subtract/normexp", normalizeMethod="none")

Then my first problem is that the background intensities, which I look at using

    an <- arrayNames(BLData.sharpen.txt.tif.013)
    BLData.sharpen.subtract/normexp.txt.tif.013@beadData[[an[1]]]$Gb

are ALL zero. If I specify a small value for backgroundSize (e.g. 4), then a few of the Gb values are small but nonzero. I did not expect to find all Gb values to be zero for the default backgroundSize = 17, so what is going on here?

When I read the data without background correction as follows (only changing the backgroundMethod option)

    BLData.sharpen.txt.tif.013 = readIllumina(textType=".txt",
      arrayNames = c("1814647013_A_1","1814647013_A_2","1814647013_B_1",
      "1814647013_B_2","1814647013_C_1","1814647013_C_2","1814647013_D_1",
      "1814647013_D_2","1814647013_E_1","1814647013_E_2","1814647013_F_1",
      "1814647013_F_2"), singleChannel=TRUE, useImages=TRUE,
      beadInfo=NULL, targets=NULL, storeXY=TRUE,
      imageManipulation="sharpen", metrics=TRUE, metricsFile="Metrics.txt",
      backgroundMethod="none", normalizeMethod="none")

then I get the same $G values as for backgroundMethod="subtract", which also does not make sense to me (the values for backgroundMethod="normexp" are different).

Of course I can retrieve the not-background-adjusted bead intensities from the .txt files by setting useImages=FALSE, but the above with useImages=TRUE and backgroundMethod="none" should produce the same G values, right? But it does not. Instead, backgroundMethod="subtract" and backgroundMethod="none" produce the same values!

My last problem is that the bead center coordinates GrnX and GrnY that I find in the .txt files and those from beadarray, BLData.sharpen.subtract/normexp.txt.tif.013@beadData[[an[1]]]$GrnX/Y, are NOT the same, which worries me!

Any comments are very much appreciated.

Thanks,
Ina Hoeschele

ADD COMMENT • link updated 16.3 years ago by Matt Ritchie ▴ 460 • written 16.3 years ago by Ina Hoeschele ▴ 620

score 0 · Answer 1 · 2008-08-19

Matt Ritchie

Dear Ina,

I'm not sure what the cause of this inconsistency could be. When I try the different background correction options available in readIllumina() on other data (I used the example data set at http://www.compbio.group.cam.ac.uk/Resources/illumina/SAMExample.zip - see below for the R code I ran), the background values are nonzero. The local background is calculated as the average of the lowest 5 pixels in 17 x 17 window around each bead center by default, so all zero values would indicate that the images have lots of zero pixels, which is unusual. Are the foreground values nonzero?

As for the difference in coordinates - the GrnX and GrnY values stored in the BLData object have been shifted by an offset in the X and Y directions so that the minimum X and Y coordinates are (0,0). This makes life easier when generating spatial plots of the data in R.

You might also find the getArrayData() function easier to use than the complicated @beadData command in your email. See ?getArrayData for details.

Best wishes,
Matt

******************

    library(beadarray)

    BLData.sharpen.none = readIllumina(textType=".csv", singleChannel=TRUE,
      useImages=TRUE, beadInfo=NULL, targets=NULL, storeXY=TRUE,
      imageManipulation="sharpen", metrics=TRUE,
      backgroundMethod="none", normalizeMethod="none")

    BLData.sharpen.subtract = readIllumina(textType=".csv", singleChannel=TRUE,
      useImages=TRUE, beadInfo=NULL, targets=NULL, storeXY=TRUE,
      imageManipulation="sharpen", metrics=TRUE,
      backgroundMethod="subtract", normalizeMethod="none")

    BLData.sharpen.normexp = readIllumina(textType=".csv", singleChannel=TRUE,
      useImages=TRUE, beadInfo=NULL, targets=NULL, storeXY=TRUE,
      imageManipulation="sharpen", metrics=TRUE,
      backgroundMethod="normexp", normalizeMethod="none")

    summary(getArrayData(BLData.sharpen.none, what="G", array=1, log=FALSE))
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     -661.6   993.6  1074.0  2628.0  1271.0 80740.0

    summary(getArrayData(BLData.sharpen.none, what="Gb", array=1, log=FALSE))
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      279.0   705.0   708.0   706.5   713.0   772.0

    # the above is equivalent to
    an = arrayNames(BLData.sharpen.none)
    summary(BLData.sharpen.none@beadData[[an[1]]]$Gb)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      279.0   705.0   708.0   706.5   713.0   772.0

    summary(getArrayData(BLData.sharpen.subtract, what="G", array=1, log=FALSE))
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    -1306.0   285.2   366.6  1922.0   567.1 80320.0

    summary(getArrayData(BLData.sharpen.subtract, what="Gb", array=1, log=FALSE))
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      279.0   705.0   708.0   706.5   713.0   772.0

    summary(getArrayData(BLData.sharpen.normexp, what="G", array=1, log=FALSE))
         Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
        1.958   156.900   238.000  1795.000   438.600 80190.000

    summary(getArrayData(BLData.sharpen.normexp, what="Gb", array=1, log=FALSE))
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      279.0   705.0   708.0   706.5   713.0   772.0

    sessionInfo()
    R version 2.7.0 (2008-04-22)
    i386-apple-darwin8.10.1

    locale:
    en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8

    attached base packages:
    [1] tools     stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
     [1] beadarray_1.8.0      affy_1.18.0          preprocessCore_1.2.0
     [4] affyio_1.8.0         geneplotter_1.18.0   annotate_1.18.0
     [7] xtable_1.5-2         AnnotationDbi_1.2.0  RSQLite_0.6-8
    [10] DBI_0.2-4            lattice_0.17-6       Biobase_2.0.0
    [13] limma_2.14.0

    loaded via a namespace (and not attached):
    [1] KernSmooth_2.22-22 RColorBrewer_1.0-2 grid_2.7.0

ADD COMMENT • link 16.3 years ago Matt Ritchie ▴ 460
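Two points from Matt's answer, the local background rule and the coordinate offset, can be sketched in base R. This is only an illustration under stated assumptions, not beadarray's actual implementation: the image is taken to be a plain numeric matrix, bead centres are integer row/column indices, windows are clipped at the image edges, and `local_background`, `img` and `grn_x` are hypothetical names with made-up values. It also shows how an image with many zero pixels leads to Gb values of exactly zero.

```r
# Illustrative helper (not beadarray code): mean of the `n_lowest` smallest
# pixels in a size x size window centred on each bead, clipped at the edges.
local_background <- function(img, rows, cols, size = 17, n_lowest = 5) {
  half <- (size - 1) %/% 2
  mapply(function(r, k) {
    r_idx <- max(1, r - half):min(nrow(img), r + half)
    k_idx <- max(1, k - half):min(ncol(img), k + half)
    win <- as.vector(img[r_idx, k_idx])
    mean(sort(win)[seq_len(min(n_lowest, length(win)))])
  }, rows, cols)
}

set.seed(1)
img <- matrix(runif(100 * 100, min = 700, max = 720), nrow = 100)
bg_normal <- local_background(img, rows = c(20, 50), cols = c(30, 60))

# Zero out every odd-numbered row: each window now holds many zero pixels,
# so the five lowest pixels are all zero and the background estimate is 0.
img_zero <- img
img_zero[seq(1, nrow(img), by = 2), ] <- 0
bg_zero <- local_background(img_zero, rows = c(20, 50), cols = c(30, 60))

# Coordinate offset: shift so that the minimum coordinate becomes 0.
grn_x <- c(1520.3, 1498.7, 1611.0)   # made-up bead centre X values
grn_x_shifted <- grn_x - min(grn_x)

print(bg_normal)
print(bg_zero)
print(grn_x_shifted)
```

The shifted coordinates differ from the raw ones only by a constant, so distances between beads are unchanged.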

Matt and others,

one additional question regarding Illumina BeadChip expression data. If one uses a background correction method that produces negative intensities, then what is the best course of action (apart from using normexp!) in your experience? For the joint analysis of multiple beadchips (each with 6 samples), if one would delete all genes that have a negative intensity in some sample, then there would soon be no data left. So set negative values to a small positive value (which)?

Thanks,
Ina

ADD REPLY • link 16.3 years ago Ina Hoeschele ▴ 620

25/08/2008 18:21 Ina Hoeschele wrote:
> Matt and others,
> one additional question regarding Illumina BeadChip expression data.
> If one uses a background correction method that produces negative
> intensities, then what is the best course of action (apart from using
> normexp!) in your experience? For the joint analysis of multiple
> beadchips (each with 6 samples), if one would delete all genes that
> have a negative intensity in some sample, then there would soon be no
> data left. So set negative values to a small positive value (which)?
> Thanks, Ina

Dear Ina,

just replacing negative values by a made-up positive value is a bad idea. For example, it will distort your statistical inference.

There are two main types of background correction methods:

(1) ones that are guaranteed to result in positive intensities; these methods are necessarily biased, since there are some genes which are not expressed. An example is normexp.

(2) ones that try to be unbiased, and then because of noise in the estimator sometimes produce negative, and sometimes positive, values for those genes that are not or only negligibly expressed.

The motivation for (1) is that one can directly proceed with taking the log-transformation. With (2) one needs to spend a bit more thought on what is the appropriate transformation, and one answer is the glog-transformation, as provided in the vsn package; see also its vignette, and

    library("vsn"); citation("vsn")

vsn introduces a bias towards 0 for those (g)log-ratios in which numerator and/or denominator are small.

Interestingly enough, the bottom-line results of the two methods are often surprisingly similar - the biases are introduced in different places, but (when appropriately tuned) have similar regularizing effects.

Best wishes
Wolfgang

------------------------------------------------------------------
Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber
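Wolfgang's point about transformations can be illustrated with a small base-R sketch. It uses one common form of the generalised log, glog(x) = log((x + sqrt(x^2 + c^2)) / 2), with an arbitrary constant chosen purely for illustration. This is an assumption for demonstration, not the calibrated transformation that vsn fits to the data, and `glog` and `c0` are hypothetical names.

```r
# Hypothetical helper: generalised log with a fixed, arbitrarily chosen
# constant c0 (vsn estimates its parameters from the data instead).
glog <- function(x, c0 = 50) log((x + sqrt(x^2 + c0^2)) / 2)

# Background-subtracted intensities, including negative and zero values.
x <- c(-10, -1, 0, 5, 100, 10000)

log_x  <- suppressWarnings(log(x))  # NaN for negatives, -Inf for zero
glog_x <- glog(x)                   # finite and increasing everywhere

print(cbind(x, log_x, glog_x))
```

For large intensities the two transformations nearly coincide; they differ where the intensities are small or negative, which is where the plain log breaks down.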

ADD REPLY • link 16.3 years ago Wolfgang Huber ★ 13k

Hi Matt and others,

I am working with Illumina bead-level expression data. When I read the foreground intensities as provided by Illumina in the .txt files (Grn column), then perform beadsummary and then calculate Pearson correlation coefficients between the six samples on one chip, I get correlations between .68 and .96. However, when I read from the .tif files and use backgroundMethod = normexp or subtract, then perform beadsummary and again calculate Pearson correlation coefficients, then I only get values that are essentially zero!

Here is the code that I use:

For reading from .txt files:

    BLData.Illumina.txt.013 = readIllumina(textType=".txt",
      arrayNames = c("1814647013_A_1","1814647013_A_2","1814647013_B_1",
      "1814647013_B_2","1814647013_C_1","1814647013_C_2","1814647013_D_1",
      "1814647013_D_2","1814647013_E_1","1814647013_E_2","1814647013_F_1",
      "1814647013_F_2"), singleChannel=TRUE, useImages=FALSE,
      normalizeMethod="none", backgroundMethod="none")
    BSData.Illumina.txt.013 <-
      createBeadSummaryData(BLData.Illumina.txt.013,log=FALSE,n=3,imagesPerArray=2,what="G",method="illumina")
    Gvec1.13 <- NULL
    Gvec1.13 <- exprs(BSData.Illumina.txt.013)[,1]
    Gvec2.13 <- NULL
    Gvec2.13 <- exprs(BSData.Illumina.txt.013)[,2]
    Gvec3.13 <- NULL
    Gvec3.13 <- exprs(BSData.Illumina.txt.013)[,3]
    Gvec4.13 <- NULL
    Gvec4.13 <- exprs(BSData.Illumina.txt.013)[,4]
    Gvec5.13 <- NULL
    Gvec5.13 <- exprs(BSData.Illumina.txt.013)[,5]
    Gvec6.13 <- NULL
    Gvec6.13 <- exprs(BSData.Illumina.txt.013)[,6]
    Gvec13 <- cbind(Gvec1.13,Gvec2.13,Gvec3.13,Gvec4.13,Gvec5.13,Gvec6.13)
    cor(Gvec13,method="pearson")

For reading from .tif files:

    BLData.sharpen.normexp.txt.tif.013 = readIllumina(textType=".txt",
      arrayNames = c("1814647013_A_1","1814647013_A_2","1814647013_B_1",
      "1814647013_B_2","1814647013_C_1","1814647013_C_2","1814647013_D_1",
      "1814647013_D_2","1814647013_E_1","1814647013_E_2","1814647013_F_1",
      "1814647013_F_2"), singleChannel=TRUE, useImages=TRUE,
      beadInfo=NULL, targets=NULL, storeXY=TRUE,
      imageManipulation="sharpen", metrics=TRUE, metricsFile="Metrics.txt",
      backgroundMethod="normexp", normalizeMethod="none")
    BSData.sharpen.normexp.txt.tif.013 <-
      createBeadSummaryData(BLData.sharpen.normexp.txt.tif.013,log=FALSE,n=3,imagesPerArray=2,what="G",method="illumina")
    ... as above

What am I doing wrong with the reading/processing of the .tif files?

Thanks again,
Ina

ADD REPLY • link 16.3 years ago Ina Hoeschele ▴ 620

Dear Ina,

I can't see anything wrong with your commands (although you might find

    cor(exprs(BSData.Illumina.txt.013), method="pearson")

is an easier way of getting the correlation between pairs of arrays).

Are there any quality issues with this data set that could be driving this unusual result? Try plotting the raw data from BLData.Illumina.txt.013 and BLData.sharpen.normexp.txt.tif.013.

You could also try running

    summary(getArrayData(BLData.Illumina.txt.013, what="G", array=1, log=FALSE))
    summary(exprs(BSData.Illumina.txt.013))
    summary(getArrayData(BLData.sharpen.normexp.txt.tif.013, what="G", array=1, log=FALSE))
    summary(exprs(BSData.sharpen.normexp.txt.tif.013))

to check that both the raw and summarised intensities are sensible (they should be mostly positive and in the range 0 - 80000 (or so) if you have run getArrayData() and createBeadSummaryData() with log=FALSE).

If this doesn't turn up anything, perhaps you can send me a few .txt and .tif files from this experiment off list so that I can take a closer look at what is going on.

Best wishes,
Matt

ADD REPLY • link 16.3 years ago Matt Ritchie ▴ 460
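Matt's shortcut works because cor() accepts a matrix and returns all pairwise correlations between its columns, so the columns need not be extracted one at a time. Below is a minimal sketch with a simulated matrix standing in for exprs(BSData.Illumina.txt.013); the matrix `E` and all of its values are made up for illustration.

```r
set.seed(42)
# Stand-in for the expression matrix: 1000 probes x 6 samples (simulated).
signal <- rexp(1000, rate = 1 / 300)
E <- sapply(1:6, function(j) signal + rnorm(1000, sd = 50))
colnames(E) <- paste0("sample", 1:6)

# Column-by-column construction, as in the question ...
Gvec <- cbind(E[, 1], E[, 2], E[, 3], E[, 4], E[, 5], E[, 6])
cor_long <- cor(Gvec, method = "pearson")

# ... and the direct call on the whole matrix.
cor_short <- cor(E, method = "pearson")

# Quick sanity check on the range of the intensities.
print(summary(as.vector(E)))
print(round(cor_short, 3))
```

Both routes give the same 6 x 6 correlation matrix; the direct call also keeps the sample names as row and column labels.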

score 0 · Answer 2 · 2008-08-26

Matt,

when I work with the Grn values from the .txt files (i.e. I read the data into beadarray with useImages=FALSE), everything looks good. Only when I read from the .tif files (useImages=TRUE) do I have a problem. What I am unclear about is whether the Grn values in the .txt files are foreground intensities or already background corrected. I posed this question to our Illumina representative but his answer is not clear to me (see below).

"The value "Grn" is in fact the foreground signal intensity of that particular bead. There has not been a global background subtraction applied at this step. There is a local pixel level correction that is done to generate this bead level intensity value, but this image pre-processing cannot be avoided."

So I am wondering whether I can simply work with the Grn values from the .txt files and still perform background subtraction, using useImages=FALSE and backgroundMethod=normexp? I.e., should this produce the same result as setting useImages=TRUE and backgroundMethod=normexp?

Thanks,
Ina

score 0 · Answer 3 · 2008-08-26

Dear Ina,

Thanks for the further detail. The intensities do look too low, especially when you use the tifs to reconstruct the bead-level intensities. If you put the tif and text files for these arrays online somewhere, I'll get someone to look into this in more detail.

In the meantime, try analysing the data you get from running readIllumina() with useImages=FALSE, as it looks more reasonable than the data you are getting with useImages=TRUE.

Best wishes,
Matt

> Hi Matt,
> I've done all of the below and much more. I have also had our Illumina
> representative look at the data and he claims that everything is fine. Below
> are selected quantiles of the bead-level intensities (0.05, 0.25, 0.5, 0.75,
> 0.95) for one chip and its 12 strips.
>
> (1) using Illumina Grn values from txt files:
> > quantiles
>       [,1] [,2] [,3] [,4]    [,5]
>  [1,]   -1    0    0   11  418.00
>  [2,]   -1    0    0   12  447.00
>  [3,]   -2    0    6   80 1364.00
>  [4,]   -2    0    7   90 1409.45
>  [5,]    0    0    0    6  297.00
>  [6,]    0    0    0    5  286.00
>  [7,]   -1    0    4   69 1195.00
>  [8,]   -2    0    5   78 1301.00
>  [9,]    0    0    0    4  251.00
> [10,]    0    0    0    6  298.00
> [11,]   -1    0    4   62 1158.00
> [12,]   -2    0    5   73 1210.00
>
> (2) using beadarray G values with backgroundMethod=subtract (same as none):
> > quantiles
>             [05]       [25] [50]         [75]      [95]
>  [1,]  -5.954797 -0.5736889    0  0.001486963  76.50805
>  [2,]  -6.716048 -0.6553347    0  0.000000000  74.87552
>  [3,]  -8.874994 -1.0853280    0 12.898865185 257.36723
>  [4,]  -9.878324 -1.1983333    0 12.881387407 248.49933
>  [5,]  -5.474805 -0.4840000    0  0.000000000  44.99409
>  [6,]  -5.847878 -0.4500096    0  0.000000000  36.67622
>  [7,] -10.023832 -1.1582759    0  7.222904311 198.11332
>  [8,]  -9.920325 -1.1513581    0  9.187925926 218.21951
>  [9,]  -5.433433 -0.3801279    0  0.000000000  32.71187
> [10,]  -5.631454 -0.4804800    0  0.000000000  43.97248
> [11,]  -9.375240 -1.0264076    0  6.900505556 200.94659
> [12,]  -9.692831 -1.1042407    0  8.113128889 205.05368
>
> (3) using beadarray Grn values with backgroundMethod=normexp:
> > quantiles
>            [,1]      [,2]      [,3]      [,4]      [,5]
>  [1,]  6.157799  9.222023  9.625428  9.626493  84.89528
>  [2,]  7.833909 10.908199 11.311446 11.311446  83.60846
>  [3,]  9.575426 14.956604 15.860443 28.092912 272.54870
>  [4,]  8.655667 14.778997 15.819756 28.264072 263.87804
>  [5,] 11.174887 13.368365 13.607286 13.607286  52.19423
>  [6,]  5.395829  8.226291  8.523353  8.523353  43.72268
>  [7,]  8.578752 14.192817 15.117620 21.596692 212.34225
>  [8,]  8.297437 14.184411 15.152457 23.788444 232.78697
>  [9,]  6.201761  8.595434  8.813951  8.813951  38.92746
> [10,]  4.972100  7.794981  8.128802  8.128802  50.99212
> [11,]  8.195206 13.696098 14.540742 20.851138 214.80211
> [12,] 13.007751 17.648962 18.359116 24.356513 219.90004
>
> Here are summaries for the summarized data:
>
> (1) Illumina Grn bead values from txt files, summary-method = Illumina:
> > summary
>      Min. 1st Qu. Median     Mean 3rd Qu.       Max.
>  -26.7600  0.6667 1.6800 104.9000  6.7400 32190.0000
>   -2.4750  0.7083 3.1140 297.7000 56.6200 29640.0000
>  -20.5900  0.4651 1.5860 155.8000  7.5650 31930.0000
>  -11.0800  0.6562 2.9700 298.3000 56.6000 28770.0000
>  -54.7200  0.4667 1.3850 112.1000  5.3750 31020.0000
>   -5.5780  0.6875 3.0310 320.9000 63.5900 28990.0000
>
> (2) backgroundMethod=subtract, summary-method = Illumina:
> > summary
>        Min.   1st Qu.    Median      Mean   3rd Qu.       Max.
>    -2.75800  -0.29720  -0.15130  -0.07024  -0.06051  167.00000
>     -4.9880   -0.9490   -0.5366   -0.4486   -0.1624    46.3600
>    -4.54900  -0.13050  -0.05114   0.95690  -0.01015  188.30000
>     -5.2030   -0.9115   -0.5194   -0.5233   -0.2052    56.3800
>  -13.060000 -0.111400 -0.038540  1.166000 -0.004997 172.300000
>     -5.3580   -0.8505   -0.4745   -0.4903   -0.1856    39.3200
>
> (3) backgroundMethod=normexp, summary-method = Illumina:
> > summary
>   8.652  9.946 10.160 10.160 10.370 12.450
>   12.19  14.93  15.30  15.37  15.62  61.94
>    7.72  10.49  10.97  11.03  11.49  15.02
>   11.66  14.27  14.62  14.61  14.90  40.31
>   7.201  8.174  8.291  8.282  8.392 26.040
>   12.61  15.60  16.16  16.24  16.80  44.74
>
> Where I really see a problem is with the MA plots of the unnormalized
> beadsummary data (see attached plots):
> (1) For the summaries created from the Illumina Grn values from txt files, MA
> plots look normal to me.
> (2) For the summaries created from the beadarray G values using
> backgroundMethod=normexp, the MA plots do NOT look as expected to me.
>
> Can I upload some of the .txt and .tif files for you somewhere, if you are
> still willing to look at them?
>
> MANY thanks, Ina
> You could also try running
>
> summary(getArrayData(BLData.Illumina.txt.013, what="G", array=1, log=FALSE))
> summary(exprs(BSData.Illumina.txt.013))
> summary(getArrayData(BLData.sharpen.normexp.txt.tif.013, what="G", array=1,
> log=FALSE))
> summary(exprs(BSData.sharpen.normexp.txt.tif.013))
>
> to check that both the raw and summarised intensities are sensible (they
> should be mostly positive and in the range 0 - 80000 (or so) if you have
> run getArrayData() and createBeadSummaryData() with log=FALSE). If this
> doesn't turn up anything, perhaps you can send me a few .txt and .tif files
> from this experiment off list so that I can take a closer look at what is
> going on.
>
> Best wishes,
>
> Matt
>
>> Hi Matt and others,
>> I am working with Illumina beadlevel expression data. When I read the
>> foreground intensities as provided by Illumina in the .txt files (Grn column),
>> then perform beadsummary and then calculate Pearson correlation coefficients
>> between the six samples on one chip, I get correlations between .68 and .96.
>> However, when I read from the .tif files and use backgroundMethod = normexp or
>> subtract, then perform beadsummary and again calculate Pearson correlation
>> coefficients, then I only get values that are essentially zero!
>> Here is the code that I use:
>>
>> For reading from .txt files:
>>
>> BLData.Illumina.txt.013 = readIllumina(textType=".txt",
>> arrayNames = c("1814647013_A_1","1814647013_A_2","1814647013_B_1",
>> "1814647013_B_2","1814647013_C_1","1814647013_C_2","1814647013_D_1",
>> "1814647013_D_2","1814647013_E_1","1814647013_E_2","1814647013_F_1",
>> "1814647013_F_2"), singleChannel=TRUE, useImages=FALSE,
>> normalizeMethod="none", backgroundMethod="none")
>> BSData.Illumina.txt.013 <-
>> createBeadSummaryData(BLData.Illumina.txt.013,log=FALSE,n=3,imagesPerArray=2,
>> what="G",method="illumina")
>> Gvec1.13 <- NULL
>> Gvec1.13 <- exprs(BSData.Illumina.txt.013)[,1]
>> Gvec2.13 <- NULL
>> Gvec2.13 <- exprs(BSData.Illumina.txt.013)[,2]
>> Gvec3.13 <- NULL
>> Gvec3.13 <- exprs(BSData.Illumina.txt.013)[,3]
>> Gvec4.13 <- NULL
>> Gvec4.13 <- exprs(BSData.Illumina.txt.013)[,4]
>> Gvec5.13 <- NULL
>> Gvec5.13 <- exprs(BSData.Illumina.txt.013)[,5]
>> Gvec6.13 <- NULL
>> Gvec6.13 <- exprs(BSData.Illumina.txt.013)[,6]
>> Gvec13 <- cbind(Gvec1.13,Gvec2.13,Gvec3.13,Gvec4.13,Gvec5.13,Gvec6.13)
>> cor(Gvec13,method="pearson")
>>
>> For reading from .tif files:
>>
>> BLData.sharpen.normexp.txt.tif.013 = readIllumina(textType=".txt",
>> arrayNames = c("1814647013_A_1","1814647013_A_2","1814647013_B_1",
>> "1814647013_B_2","1814647013_C_1","1814647013_C_2","1814647013_D_1",
>> "1814647013_D_2","1814647013_E_1","1814647013_E_2","1814647013_F_1",
>> "1814647013_F_2"), singleChannel=TRUE, useImages=TRUE,
>> beadInfo=NULL, targets=NULL, storeXY=TRUE,
>> imageManipulation="sharpen", metrics=TRUE, metricsFile="Metrics.txt",
>> backgroundMethod="normexp", normalizeMethod="none")
>> BSData.sharpen.normexp.txt.tif.013 <-
>> createBeadSummaryData(BLData.sharpen.normexp.txt.tif.013,log=FALSE,n=3,imagesPerArray=2,
>> what="G",method="illumina")
>> ... as above
>>
>> What am I doing wrong with the reading/processing of the .tif files?
>>
>> Thanks again, Ina
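
A footnote on the two checks discussed in this thread: the column-by-column extraction in the original post and the one-line cor() call Matt suggests should give the same correlation matrix, and the per-strip quantile tables can be built in a single call. Below is a minimal sketch in base R. It does not touch beadarray or the real files: `summ` is a simulated stand-in for exprs(BSData.Illumina.txt.013) and `beads` is a simulated stand-in for the bead-level intensities of 12 strips. Both objects, and all numbers in them, are made up for illustration.

```r
set.seed(1)

# Hypothetical stand-in for exprs(BSData.Illumina.txt.013):
# 1000 probes (rows) x 6 arrays (columns) of unlogged intensities.
signal <- rexp(1000, rate = 1 / 300)
summ <- sapply(1:6, function(i) signal * runif(1, 0.8, 1.2) + rnorm(1000, sd = 20))
colnames(summ) <- paste0("array", 1:6)

# Column-by-column approach, as in the original post.
Gvec <- cbind(summ[, 1], summ[, 2], summ[, 3], summ[, 4], summ[, 5], summ[, 6])
cor_manual <- cor(Gvec, method = "pearson")

# Equivalent one-liner on the whole matrix.
cor_direct <- cor(summ, method = "pearson")

# Hypothetical stand-in for bead-level intensities on 12 strips.
beads <- lapply(1:12, function(i) rexp(5000, rate = 1 / 100))

# One row per strip, one column per quantile (same layout as the tables above).
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
quantiles <- t(sapply(beads, quantile, probs = probs))

print(round(cor_direct, 2))
print(round(quantiles, 1))
```

With real data, `summ` would be replaced by the exprs() matrix and each element of `beads` by the unlogged "G" values for one strip, for example as returned by the getArrayData() calls quoted above.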