Low-level analysis of custom microarrays


Martin Morgan 25k

@martin-morgan-1513

Last seen 10 weeks ago

United States

Hi Teresa --

"Teresa Colombo" <teresa.colombo at gmail.com> writes:

> Hi Dear BioC list,
>
> I am a newbie and apologize in advance if this is just a stupid
> question. But I've been trapped by this problem since 2 weeks now and
> dunno how to move on from here without a little help...
>
> my TASK: Perform background correction on (miRNA) microarray data from
> a custom chip, taking into account slide spatial info (no simple
> subtraction of background intensities).

Others on the list can speak more knowledgeably than me about these things, so please take my input lightly.

I guess these are two-color Agilent arrays. The data you present below is not from the 'gpr' files required to perform 'background correction', but from some later point in the analysis that you must determine (because the data has likely already had some kinds of data transformation applied).

Background correction usually involves transformations of individual spot foreground and background intensities, perhaps taking into account some properties of all spots in the array but not usually spatial location. Your data do not include foreground and background information for each channel (probably some background correction method has already been applied), so background correction cannot be performed.

Spatial effects might typically be accommodated by within-array normalization. These methods attempt to make the difference in (background-corrected) channel intensities ('M' values) statistically independent of the average intensities ('A' values) of each spot. The usual methods implicitly incorporate spatial variation (as a factor contributing to variation in 'A').

An important assumption is that expression of the majority of spots does not differ between channels. This may not be the case for your miRNA arrays. miRNAs also likely exhibit significant dye effects, and these need to be accommodated.

A starting point for two-color analyses is the limma package and its comprehensive user guide. limma would take you from gpr files through background correction, normalization, and assessment of differential expression. Though again miRNAs require special consideration.

Hope that helps.

Martin

> my INPUT DATA FORMAT: For each slide, a tab delimited text file
> carrying the following info:
> "Probe_ID" "Row" "Column" "Density_mean_{A}" "Density_st.dev._ {A}"
>
> For example, the following are the first 5 lines for one of the slides:
> "empty" 1 1 174,2 8,57
> "hsa-let-7a" 1 2 49522,89 343,1
> "hsa-miR-150" 1 3 40738,46 677,54
> "hsa-miR-204" 1 4 209,61 15,48
> "hsa-miR-32" 1 5 223,07 15,24
>
> There are 7 replicates for each experimental probe + many internal
> control probes (row.names are not unique).
>
> my QUESTION:
> Is there any R package/function available to perform background
> correction taking into account the slide design/spatial info (amenable
> to be used with this kind of raw input data - e.g., neither .CEL nor
> Illumina input data)?
>
> my R version - attached packages:
> > sessionInfo()
> R version 2.4.0 Patched (2006-11-25 r39997)
> i486-pc-linux-gnu
>
> locale:
> LC_CTYPE=it_IT@euro;LC_NUMERIC=C;LC_TIME=it_IT@euro;LC_COLLATE=it_IT@euro;LC_MONETARY=it_IT@euro;LC_MESSAGES=it_IT@euro;LC_PAPER=it_IT@euro;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=it_IT@euro;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] "tools" "stats" "graphics" "grDevices" "utils" "datasets"
> [7] "methods" "base"
>
> other attached packages:
> affy affyio Biobase
> "1.12.2" "1.2.0" "1.12.2"
>
> Thank you in advance for your help and time!
>
> Best,
> teresa

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M2 B169
Phone: (206) 667-2793
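To make the 'M' and 'A' values described above concrete, here is a minimal sketch in base R. It assumes simulated two-channel intensities (the vectors `R` and `G`, the dye-bias strength, and the `span` value are all hypothetical, not taken from the poster's data) and uses a plain `loess` fit of M on A as a stand-in for the within-array normalization that limma offers through `normalizeWithinArrays`. It also relies on the assumption stated above, that most spots do not differ between channels.

```r
# Simulated two-channel data; every number here is hypothetical.
set.seed(1)
n <- 500
truth <- 2^runif(n, 6, 14)                 # underlying abundance per spot

# Red channel gets an intensity-dependent dye bias plus noise.
R <- truth * 2^(0.1 * (log2(truth) - 10)) * 2^rnorm(n, 0, 0.2)
G <- truth * 2^rnorm(n, 0, 0.2)

# M = log-ratio between channels, A = average log-intensity.
M <- log2(R) - log2(G)
A <- (log2(R) + log2(G)) / 2

# Fit the intensity-dependent trend and subtract it from M.
fit <- loess(M ~ A, span = 0.4, degree = 1)
M_norm <- M - fitted(fit)

# After normalization, M should show little remaining dependence on A.
cor(M, A)
cor(M_norm, A)
```

With real data one would plot `M` against `A` before and after the fit (an "MA plot") rather than rely on a single correlation. If most probes really do differ between channels, as may happen on a small miRNA array, the fitted trend absorbs biology and this approach is not appropriate as it stands.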

Microarray Normalization Cancer probe affy limma affyio • 1.3k views

updated 17.3 years ago by Wolfgang Huber ★ 13k • written 17.3 years ago by Martin Morgan 25k


Wolfgang Huber ★ 13k

@wolfgang-huber-3550

Last seen 6 weeks ago

EMBL European Molecular Biology Laborat…

Dear Teresa,

if I understand your question correctly (please correct me if not), you want to estimate and adjust for a spatially dependent background signal (e.g. a "gradient"), and that estimate is not provided by the image analysis software.

Doing this well is hard, "well" meaning that you don't just remove apparent nuisance trends, but also keep the real, biology-related changes in the intensities.

Print-tip normalisation (e.g. in limma, also via the "strata" argument of vsn) is often a good proxy for adjusting spatial trends. You can also try 2D local regression (the loess function, the locfit package, or the OLIN package in Bioconductor).

References:
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1523216
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=126873
http://bioinformatics.oxfordjournals.org/cgi/content/full/21/8/1724

Best wishes
Wolfgang
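The 2D local regression suggested above can be sketched with base R's `loess`. This is only an illustration under stated assumptions: the five example rows come from the original question (read with `dec = ","` because the file uses decimal commas), the shortened column names are my own, and the 20 x 20 grid with its smooth gradient is simulated, since five spots are too few to fit a surface.

```r
# First five spots from the original question, tab-delimited with decimal commas.
lines <- c('"empty"\t1\t1\t174,2\t8,57',
           '"hsa-let-7a"\t1\t2\t49522,89\t343,1',
           '"hsa-miR-150"\t1\t3\t40738,46\t677,54',
           '"hsa-miR-204"\t1\t4\t209,61\t15,48',
           '"hsa-miR-32"\t1\t5\t223,07\t15,24')
spots <- read.delim(text = lines, header = FALSE, dec = ",",
                    col.names = c("Probe_ID", "Row", "Column",
                                  "Density_mean", "Density_sd"))

# Hypothetical full slide: constant signal plus a smooth spatial gradient.
set.seed(2)
grid <- expand.grid(Row = 1:20, Column = 1:20)
gradient <- 0.05 * grid$Row + 0.03 * grid$Column      # log2 units, made up
grid$logI <- 8 + gradient + rnorm(nrow(grid), sd = 0.3)

# Fit a smooth surface over slide position and remove it,
# adding back the overall mean so the intensity scale is preserved.
fit <- loess(logI ~ Row + Column, data = grid, span = 0.5, degree = 1)
grid$adjusted <- grid$logI - fitted(fit) + mean(grid$logI)

# The row-wise trend should be much weaker after adjustment.
cor(grid$logI, grid$Row)
cor(grid$adjusted, grid$Row)
```

On a real slide the spots differ greatly in true signal (compare "empty" with "hsa-let-7a" above), so fitting the surface to all spots risks removing biology along with the gradient. One option, offered as an assumption rather than a recommendation from this thread, is to fit the surface only on "empty" or other control spots and then predict it at every position with `predict(fit, newdata = ...)`.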

17.3 years ago • Wolfgang Huber ★ 13k
