Question

Maybe a usefull for limma users

0

Entering edit mode

Breuk102 ▴ 50

@breuk102-836

Last seen 10.6 years ago

Dear Limma users, In our lab we print our own microarray slides. We print both dedicated and whole genome arrays. With dedicated arrays there is the possibility to find large subsets of data which is up or down regulated, rendering it difficult to normalize this data with loess or printtip loess. Recently we have a slide containing tags for plant as well as pathogen together with spikes and other control spots. It now happens that if you have a time course experiment with the pathogen on the plants you get a very large subset of genes that seem to be upregulated however this is is not true. This is because in our control (not infected plants) there is almost no RNA present of our pathogen. In time this pathogen grows so relatively more pathogen RNA is produced. To follow these gene products trough time I needed a way to normalize on a subset of genes (pathogen) only. If have written a "little" function and extension for the normalization function that can be used in limma to render it possible to normalize on a subset of genes only. The first function is called flags() RG <- flags(RG, "regex") where regex is eg. "genesX|genesY|genesZ" The regex uses the RG$genes$Name column and searches for names similar to the ones in your regex list. e.g conrol1, control2 can be regex as "con" After this I added to the normalizeWithinArrays() function an extra option: MA<-normalizeWithinArrays(RG, method="subsetloess") Basically the method flags gives all selected genes a 1 and the others a 0 than it uses this as weights in the normalization (loess). If you woul like to also use the spotweights themselves than you can multiply the RG$flags column with the weights column. After normalization you'll see that only your subset of genes have been used to calculate the loess fit. In our case this works very well. I have validated my data against other experiments in which we compensate for the mRNA level difference (which is time consuming) and the results are similar. I fit the data lmFit only on the genes of interest and voila great results. If you would like to know more or see some scatterplots I'm happy to e-mail them. I'm well aware that other possibilities for normalization exists however this function is simple and fast (at least for us) and therefore it might be usefull for others too. Yours, Bas van Breukelen === functions === normalizeWithinArrays // the whole function is long... only the last part containing the addition is shown here. }, robustspline = { if (is.null(layout)) stop("Layout argument not specified") for (j in 1:narrays) object$M[, j] <- normalizeRobustSpline(object$M[, j], object$A[, j], layout, df = df, method = robust) }, subsetloess = { controlspots = TRUE for (j in 1:narrays){ y <- object$M[,j] x <- object$A[,j] w <- object$flags object$M[,j] <- loessFit(y,x,w, span=span, iterations = iterations)$residuals } }) object } flags <- function (object, regex) { object$flags <- regexpr(regex, as.vector(object$genes[,6])) rows <- NROW(object$flags) for (j in 1:rows){ if (object$flags[j] < 0){ object$flags[j] <- 0 }else{ object$flags[j] <- 1 } } return(object) } =========== ================================= Dr. Ir. B. van Breukelen PostDoc, Bioinformatics, Molecular genetics Dept. of Biology. Room N407: H.R. Kruytgebouw Padualaan 8 3574 CH Utrecht Tel: +31(0)30 253 3355 Mobile: +31(0)6 24 996046 e-mail: b.vanbreukelen@bio.uu.nl website: http://genomics.bio.uu.nl/

Microarray Normalization limma Microarray Normalization limma • 990 views

ADD COMMENT • link updated 20.4 years ago by Gordon Smyth 52k • written 20.4 years ago by Breuk102 ▴ 50

score 0 · Answer 1 · 2004-12-17

> Date: Fri, 17 Dec 2004 10:35:24 +0100 > From: "Dr. Ir. B. van Breukelen" <b.vanbreukelen@bio.uu.nl> > Subject: [BioC] Maybe a usefull for limma users > To: <bioconductor@stat.math.ethz.ch> > Message-ID: <solis1023z4d5htayd70000053a@solis102.soliscom.uu.nl> > Content-Type: text/plain; charset="us-ascii" > > Dear Limma users, > > In our lab we print our own microarray slides. We print both dedicated and > whole genome arrays. With dedicated arrays there is the possibility to find > large subsets of data which is up or down regulated, rendering it difficult > to normalize this data with loess or printtip loess. Recently we have a > slide containing tags for plant as well as pathogen together with spikes and > other control spots. It now happens that if you have a time course > experiment with the pathogen on the plants you get a very large subset of > genes that seem to be upregulated however this is is not true. This is > because in our control (not infected plants) there is almost no RNA present > of our pathogen. In time this pathogen grows so relatively more pathogen RNA > is produced. To follow these gene products trough time I needed a way to > normalize on a subset of genes (pathogen) only. If have written a "little" > function and extension for the normalization function that can be used in > limma to render it possible to normalize on a subset of genes only. The > first function is called flags() > RG <- flags(RG, "regex") where regex is eg. "genesX|genesY|genesZ" The > regex uses the RG$genes$Name column and searches for names similar to the > ones in your regex list. e.g conrol1, control2 can be regex as "con" > After this I added to the normalizeWithinArrays() function an extra option: > MA<-normalizeWithinArrays(RG, method="subsetloess") > Basically the method flags gives all selected genes a 1 and the others a 0 > than it uses this as weights in the normalization (loess). If you woul like > to also use the spotweights themselves than you can multiply the RG$flags > column with the weights column. After normalization you'll see that only > your subset of genes have been used to calculate the loess fit. In our case > this works very well. I have validated my data against other experiments in > which we compensate for the mRNA level difference (which is time consuming) > and the results are similar. I fit the data lmFit only on the genes of > interest and voila great results. > > If you would like to know more or see some scatterplots I'm happy to e-mail > them. > I'm well aware that other possibilities for normalization exists however > this function is simple and fast (at least for us) and therefore it might be > usefull for others too. > > Yours, > > Bas van Breukelen Thanks for your suggestion. I definitely agree with you that normalizing on a subset of control genes can be a very useful technique. For some reason it an idea which seems to have been slow to catch on. Another way to do what you want using existing limma functions would be (assuming RG to be an RGList and 'flags' to be a vector of 0s and 1s) w <- array(flags, dim(RG)) MA <- normalizeWithinArrays(RG, method="loess", weights=w) I think this is equivalent to your code below. Alternatively one could use modifyWeights() to alter existing quality weights in RG rather than overwrite them, as on page 15 of the Limma User's Guide. Gordon > === functions === > > normalizeWithinArrays > > // the whole function is long... only the last part containing the addition > is shown here. > }, robustspline = { > if (is.null(layout)) > stop("Layout argument not specified") > for (j in 1:narrays) object$M[, j] <- > normalizeRobustSpline(object$M[, > j], object$A[, j], layout, df = df, method = robust) > }, subsetloess = { > controlspots = TRUE > for (j in 1:narrays){ > y <- object$M[,j] > x <- object$A[,j] > w <- object$flags > object$M[,j] <- loessFit(y,x,w, span=span, iterations = > iterations)$residuals > } > }) > object > } > > flags <- function (object, regex) > { > object$flags <- regexpr(regex, as.vector(object$genes[,6])) > rows <- NROW(object$flags) > > for (j in 1:rows){ > if (object$flags[j] < 0){ > object$flags[j] <- 0 > }else{ > object$flags[j] <- 1 > } > } > return(object) > } > > =========== > > ================================= > Dr. Ir. B. van Breukelen > PostDoc, Bioinformatics, Molecular genetics > Dept. of Biology. > Room N407: H.R. Kruytgebouw > Padualaan 8 > 3574 CH Utrecht > > Tel: +31(0)30 253 3355 > Mobile: +31(0)6 24 996046 > e-mail: b.vanbreukelen@bio.uu.nl > website: http://genomics.bio.uu.nl/