Limma analysis of focused arrays vs. whole genome arrays


Mike Schaffer ▴ 90

@mike-schaffer-424


Hi,

The lab I work with has used "whole genome" human arrays (~18,000 genes) for a couple of years, and I have helped with the analysis using Limma. Now, due to costs, they are considering switching from whole genome arrays to focused arrays with ~400 genes of interest (selected from the whole-genome array results).

The obvious analysis problems with a focused array where most genes are changing are:

1. LOESS normalization assumes most genes are not changing. If most of the genes are expected to change, there is no basis to recenter the data around zero. The response from the lab was that they would be willing to include 100-150 genes that are not expected to change.

2. The B-statistic in Limma requires a parameter indicating that a certain fraction of genes are changing. The corresponding moderated t-statistic uses the data from all genes to moderate the standard error in the t calculation. Both of these could change dramatically if most of the genes on the array are changing.

My questions are:

1. Are my concerns valid, and are there ways around them? Are there other analysis pitfalls with this scenario?

2. Can Limma handle situations where most of an array is expected to change? What modifications, if any, need to be made to the Limma analysis to account for this?

3. Alternatively, is there a more appropriate statistical package to use in this case?

Thanks.

--
Mike
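For readers unfamiliar with the moderation step mentioned in point 2, here is a minimal sketch in plain Python (not limma code). It assumes the usual empirical Bayes form, in which each gene's sample variance is shrunk toward a prior variance that limma estimates from the variances of all genes on the array. The function names and all numbers below are hypothetical and chosen only for illustration.

```python
import math


def posterior_variance(s2, df, s2_prior, df_prior):
    """Shrink a gene's sample variance s2 (on df degrees of freedom)
    toward the prior variance s2_prior (worth df_prior degrees of freedom)."""
    return (df_prior * s2_prior + df * s2) / (df_prior + df)


def ordinary_t(coef, unscaled_sd, s2):
    """Ordinary t-statistic: uses only this gene's own variance."""
    return coef / (unscaled_sd * math.sqrt(s2))


def moderated_t(coef, unscaled_sd, s2, df, s2_prior, df_prior):
    """Moderated t-statistic: same form, but with the shrunken variance."""
    s2_post = posterior_variance(s2, df, s2_prior, df_prior)
    return coef / (unscaled_sd * math.sqrt(s2_post))


# Hypothetical gene whose sample variance is very small by chance.
coef, unscaled_sd = 1.0, 0.5      # log2 fold change and unscaled standard error
s2, df = 0.01, 3                  # this gene's residual variance and its df
s2_prior, df_prior = 0.09, 4.0    # assumed prior values (estimated across genes in practice)

t_ord = ordinary_t(coef, unscaled_sd, s2)
t_mod = moderated_t(coef, unscaled_sd, s2, df, s2_prior, df_prior)
print("ordinary t:", round(t_ord, 2))
print("moderated t:", round(t_mod, 2))
```

The sketch shows why the set of genes on the array matters: the prior values come from the whole array, so on a ~400-gene focused array fewer genes inform them than on an ~18,000-gene array.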

Normalization limma • 1.1k views

updated 19.5 years ago by Gordon Smyth 51k • written 19.5 years ago by Mike Schaffer ▴ 90


A.J. Rossini ▴ 210

@aj-rossini-973


I would've rephrased the problem differently: given that you can't depend on the "typical" assumption of "zero-expression", what features should you design in for comparability?

The idea of housekeeping genes seems sensible in theory -- in practice, I'm not sure how to protect from inadvertent "discovery".

best,
-tony

19.5 years ago • A.J. Rossini ▴ 210


Gordon Smyth 51k

@gordon-smyth


WEHI, Melbourne, Australia

>Date: Tue, 7 Jun 2005 09:33:51 -0400
>From: Mike Schaffer <mschaff@bu.edu>
>Subject: [BioC] Limma analysis of focused arrays vs. whole genome arrays
>To: bioconductor@stat.math.ethz.ch
>
>My questions are:
>
>1. Are my concerns valid, and are there ways around them? Are
>there other analysis pitfalls with this scenario?
>
>2. Can Limma handle situations where most of an array is expected to
>change? What modifications, if any, need to be made to the Limma
>analysis to account for this?

To quote from the Limma User's Guide (page 15):

"In such a situation, the best strategy is to include on the arrays a series of non-differentially expressed control spots, such as a titration series of whole-library-pool spots, and to use the up-weighting method discussed below. In the absence of such control spots, normalization of boutique arrays requires specialist advice."

>3. Alternatively, is there a more appropriate statistical package to
>use in this case?

I don't know of any other available methods. In my opinion, you have to put down control spots: "house-keeping" genes if that is all you can get, but preferably constructed spots as described above.

Gordon
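The effect of up-weighting control spots during normalization can be illustrated with a small, self-contained sketch in plain Python. This is not limma's implementation: a weighted straight-line fit stands in for the loess curve, the toy data are invented, and `normalize_with_controls` and its `control_weight` parameter are hypothetical names used only here.

```python
def weighted_line_fit(a, m, w):
    """Weighted least-squares fit of m = intercept + slope * a."""
    sw = sum(w)
    a_bar = sum(wi * ai for wi, ai in zip(w, a)) / sw
    m_bar = sum(wi * mi for wi, mi in zip(w, m)) / sw
    sxx = sum(wi * (ai - a_bar) ** 2 for wi, ai in zip(w, a))
    sxy = sum(wi * (ai - a_bar) * (mi - m_bar) for wi, ai, mi in zip(w, a, m))
    slope = sxy / sxx
    return m_bar - slope * a_bar, slope


def normalize_with_controls(a, m, is_control, control_weight=10.0):
    """Subtract an intensity-dependent trend in which control spots
    count control_weight times as much as the other spots."""
    w = [control_weight if c else 1.0 for c in is_control]
    intercept, slope = weighted_line_fit(a, m, w)
    return [mi - (intercept + slope * ai) for ai, mi in zip(a, m)]


# Toy data (invented): the dye bias is 0.1 * A - 1.0 for every spot.
control_a = [6.0, 8.0, 10.0, 12.0, 14.0]   # control spots, no true change
gene_a = [7.0, 9.0, 11.0, 13.0]            # genes of interest, all up by 1 log2 unit
a = control_a + gene_a
m = [0.1 * x - 1.0 for x in control_a] + [0.1 * x - 1.0 + 1.0 for x in gene_a]
is_control = [True] * len(control_a) + [False] * len(gene_a)

equal = normalize_with_controls(a, m, is_control, control_weight=1.0)
upweighted = normalize_with_controls(a, m, is_control, control_weight=10.0)

n_ctrl = len(control_a)
print("genes, equal weights:", [round(x, 3) for x in equal[n_ctrl:]])
print("genes, controls x10: ", [round(x, 3) for x in upweighted[n_ctrl:]])
```

In this toy example every gene of interest truly changes by 1. With equal weights the fitted trend is pulled toward the changing genes and their normalized log-ratios shrink; up-weighting the controls brings them closer to the true value.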

19.5 years ago • Gordon Smyth 51k
