affy chips' quality control question

Ren Na ▴ 250

@ren-na-870

Last seen 10.6 years ago

Hi,

First, I am very sorry that my attachments are still over the size limit. I couldn't find a way to make them smaller and keep them readable at the same time.

I have 30 Affymetrix arrays (hgu133a) that were run over a period of two and a half years. There are six types of samples: a, b, c, d, e and f. Each type has a different number of biological replicates, except type f (no technical replicates). I am going to do pairwise comparisons among these types.

To check array quality, I read previous posts on this subject and compared my plots with the PLM Image Gallery. But how to decide whether an array is acceptable is still puzzling me. How much difference in these plots across arrays counts as a serious difference, and should the corresponding arrays be excluded from downstream analysis?

Sample c2 is obviously different from the other arrays in the density plot and the NUSE plot, and it has large residuals in the residual plot and a large average background. Maybe it is better to remove this array. But what about the arrays whose differences are not as big as c2's? Please see the attached files. In the file "residual.png" I included residual images for the four arrays that have bigger artifacts than the other arrays.

Any comments and suggestions would be greatly appreciated.

Thanks!

Ren

Attachments (scrubbed by the mailing list archive):

Name: hist_NUSE_RLE.png, Type: image/png, Size: 7358 bytes, Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060203/27e1eed9/hist_NUSE_RLE.png

Name: residual.png, Type: image/png, Size: 42943 bytes, Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060203/27e1eed9/residual.png


ADD COMMENT • link 19.2 years ago Ren Na ▴ 250

Francois Collin ▴ 130

@francois-collin-470

Last seen 10.6 years ago

Hello Ren,

Being puzzled in this case is a good thing, as there are no hard and fast rules as to which outlier chips should be excluded from an analysis. In a sense, the answer really depends on the analysis in question. One way to decide which chips should be excluded, or treated differently, is to see the impact that each chip has on the results.

You mentioned planning to do pairwise comparisons. Why not initially include all chips and see how aberrant the results are for the pairs that include the outlier chips? If your pairwise comparison entails looking for differentially expressed genes, I suspect that b6 will be the worst offender, in spite of having normal-looking NUSE values.

Hope this helps.

-francois

ADD COMMENT • link 19.2 years ago Francois Collin ▴ 130
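The suggestion above, to judge a chip by its impact on the results, can be sketched as a leave-one-array-out comparison. This is only an illustration under stated assumptions: the data are simulated (not the poster's arrays), the array names, the helper `log_fc` and the RMS "impact" score are hypothetical choices, and a real analysis would rerun the full differential-expression pipeline instead of a plain difference of group means.

```r
# Simulated log2 expression: 200 genes x 8 arrays (NOT the poster's data).
set.seed(1)
expr <- matrix(rnorm(200 * 8, mean = 8, sd = 0.3), nrow = 200,
               dimnames = list(paste0("g", 1:200),
                               c(paste0("a", 1:4), paste0("b", 1:4))))
group <- factor(rep(c("a", "b"), each = 4))

# Assumption for illustration: make array b4 an artificial outlier
# by adding extra noise to it.
expr[, "b4"] <- expr[, "b4"] + rnorm(200, sd = 1)

# Hypothetical helper: mean log2 difference (b minus a) per gene.
log_fc <- function(x, g) {
  rowMeans(x[, g == "b", drop = FALSE]) - rowMeans(x[, g == "a", drop = FALSE])
}

# Estimates using every array.
full <- log_fc(expr, group)

# Drop each array in turn and record how far the estimates move
# (root mean square change across genes).
impact <- sapply(colnames(expr), function(chip) {
  keep <- colnames(expr) != chip
  sqrt(mean((log_fc(expr[, keep], group[keep]) - full)^2))
})

print(round(sort(impact, decreasing = TRUE), 3))
```

Arrays whose removal shifts the estimates much more than the others are candidates for closer inspection; the score does not by itself say that an array must be excluded.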

Dear users,

I am trying to use limma to analyse Q-PCR data. I have a matrix of log2 intensities for 96 genes x 20 samples that I imported using read.table. A few of the cells have missing values because the PCR did not work for that gene in that sample, and I left them blank.

> myEset
Expression Set (exprSet) with
    96 genes
    19 samples
    phenoData object with 1 variables and 19 cases
    varLabels
        cov1: read from file

I read in the help vignette that lmFit can handle missing values, but when I try it I get the following error:

> treatments = factor(c(0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1),labels = c("BD","C"))
> design = model.matrix(~0+treatments)
> fit <- lmFit(exprs(myEset),design)
Error in lm.fit(design, t(M)) : NA/NaN/Inf in foreign function call (arg 4)
In addition: Warning message:
NAs introduced by coercion

I checked the mailing list archives and found a few emails saying that limma can handle missing values. I also tried setting na.rm=TRUE.

Any ideas as to how to proceed?

I also learned elsewhere that it is better to use a non-parametric test for PCR data. Any opinions about this?

Regards,
David

ADD REPLY • link 19.2 years ago kfbargad@ehu.es ▴ 270
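The warning "NAs introduced by coercion" in the post above is the message R gives when non-numeric values are converted to numbers, so one thing worth checking (an assumption, since the thread does not show the input file) is whether the imported matrix is truly numeric with blanks stored as NA. The sketch below uses a made-up tab-delimited table; the gene and sample names are hypothetical.

```r
# Hypothetical tab-delimited file contents; blank cells mark failed PCRs.
txt <- "gene\ts1\ts2\ts3\ngeneA\t5.1\t\t6.3\ngeneB\t4.8\t5.0\t\n"

# Read with an explicit separator and declare blanks as missing values.
dat <- read.table(text = txt, header = TRUE, sep = "\t", row.names = 1,
                  na.strings = c("", "NA"), stringsAsFactors = FALSE)

# Convert to a matrix and confirm that it is numeric, with NA for the blanks.
M <- as.matrix(dat)
print(storage.mode(M))   # should report a numeric type, not "character"
print(sum(is.na(M)))     # number of missing cells

# A numeric matrix like M, NA entries included, is the kind of object
# the post passes to lmFit(); that call is not run here.
```

If `storage.mode()` reports "character" for the real data, some cell contains text that R cannot parse as a number, and that would need fixing before any model is fitted.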

Dear List,

I have come across an article (Choudary et al. 2005, PNAS 102, 15653-15658) where they state that the FC values after RMA preprocessing "always" remain below a maximum of 2.0. Is this right?

I have performed some analyses using RMA, quantile normalisation and limma, and I am getting M values higher than 2. If M = log2(FC), then the FC values are higher than 4. What am I missing? Any comments on this?

Thanks in advance,

David

ADD REPLY • link 19.1 years ago kfbargad@ehu.es ▴ 270

Two points:

1. In general, estimates of FC from microarrays tend to be smaller than the truth, irrespective of the processing algorithm.

2. There is no specific reason why RMA should limit FC values to 2.0 (and it does not do this in general, as you have observed with your own dataset).

In the case of Choudary et al., they are studying gene expression changes in the brain, and my understanding is that these fold changes are typically small, perhaps explaining the comment.

Ben

On Tue, 2006-02-14 at 10:30 +0100, kfbargad at ehu.es wrote:
> Dear List,
>
> I have come across an article (Choudary et al. 2005, PNAS 102, 15653-15658)
> where they state that the FC values after RMA preprocessing "always"
> remain below a maximum of 2.0 Is this right?
>
> I have performed some analyses using RMA, quantile normalisation and
> limma and am getting M values higher than 2, and if M = log2(FC), then
> FC values are higher than 4. What am I missing? Any comments on this?
>
> Thanks in advance,
>
> David

ADD REPLY • link 19.1 years ago Ben Bolstad ★ 1.2k
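The arithmetic behind the question is that M is a base-2 logarithm, so a fold change is recovered as FC = 2^M and M = 2 corresponds to a 4-fold change. The short sketch below uses made-up M values purely for illustration.

```r
# Illustrative M values (log2 fold changes); these numbers are made up.
M <- c(-2, -1, 0, 1, 2, 3)

# Back-transform to fold changes: FC = 2^M.
FC <- 2^M
print(data.frame(M = M, FC = FC))

# Flag values with |M| > 2, i.e. more than a 4-fold change in either direction.
big <- abs(M) > 2
print(M[big])
```

Negative M values give fold changes below 1, so a threshold is usually applied to the absolute value of M, as in the last step.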

Hi David,

kfbargad at ehu.es wrote:
> Dear List,
>
> I have come across an article (Choudary et al. 2005, PNAS 102, 15653-15658)
> where they state that the FC values after RMA preprocessing "always"
> remain below a maximum of 2.0 Is this right?

No, this is not correct. There are other factual errors in that paper as well, which makes me wonder if there was a breakdown in communication between the statisticians and those who wrote the paper.

That said, it is my understanding that fold change values in brain are often very small, so they may simply be trying to indicate that using a fold change of two is not reasonable in that context.

Best,

Jim

--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

ADD REPLY • link 19.1 years ago James W. MacDonald 68k

Ren Na ▴ 250

@ren-na-870

Last seen 10.6 years ago

An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060206/1dea0b94/attachment.pl

ADD COMMENT • link 19.2 years ago Ren Na ▴ 250
