Correcting for batch effects in limma
@khadeeja-ismail-4711
Last seen 8.8 years ago
Hi,

Is it possible to correct for batch effects in limma when doing a paired analysis? I have pairs from two runs, Batch 1 and Batch 2. There are no pairs where one sample is in Batch 1 and the other is in Batch 2. If I enter the batch number into the design matrix, no coefficients are generated, as the run number does not differ within any pair.

Any advice on this would be most appreciated.

Thanks,
Khadeeja
limma limma • 1.2k views
@james-w-macdonald-5106
Last seen 1 day ago
United States
Hi Khadeeja,

On 5/15/2012 6:06 AM, khadeeja ismail wrote:
> If I enter the Batch no. into the design matrix, no coefficients are generated as there is no difference between run no. for any pair.

Without seeing your code it is hard to say much. In addition, it isn't really clear what you mean by 'no coefficients are generated'.

However, you should note that accounting for the pairing structure and accounting for the batches will, to a certain extent, be doing the same thing. As an example, let's consider a single pair from one batch. In the conventional approach for paired data, you would first compute pair1_treated - pair1_control, and then use these differences to compute statistics. By computing the paired differences, we have subtracted out any sample-specific variability, which includes a batch effect (e.g., if batch 1 has higher overall expression for some technical reason, you would expect both members of the pair to reflect this higher expression, and subtracting one from the other would thus eliminate the batch effect, modulo variability).

When you fit a batch effect, you are in essence computing a mean expression value for all samples in a particular batch and then subtracting that from each sample. This is very similar to what you have done by pairing. Doing both is not likely, IMO, to add benefit, and just wastes degrees of freedom.

Best,
Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
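To make the argument above concrete, here is a minimal sketch in plain Python (not limma code, and not taken from the original posts). It assumes a hypothetical layout of four pairs, two per batch, with made-up expression values for a single gene. It shows two things: the within-pair differences are unaffected by a batch shift, and a batch indicator is a linear combination of the pair indicators, so adding it to a paired design does not increase the rank of the design matrix. The latter is one plausible reading of 'no coefficients are generated', but that is an assumption, since the original code was not posted.

```python
from fractions import Fraction

# Hypothetical layout (an assumption): pairs 1-2 were run in batch 1,
# pairs 3-4 in batch 2; each pair has one control and one treated sample.
pairs = [1, 1, 2, 2, 3, 3, 4, 4]
treated = [0, 1, 0, 1, 0, 1, 0, 1]
batch = [1, 1, 1, 1, 2, 2, 2, 2]

# Made-up log-expression values for one gene:
# pair baseline + batch shift + treatment effect.
pair_baseline = {1: 5.0, 2: 6.0, 3: 5.5, 4: 7.0}
batch_shift = {1: 0.0, 2: 2.0}
effect = 1.5
y = [pair_baseline[p] + batch_shift[b] + effect * t
     for p, t, b in zip(pairs, treated, batch)]


def paired_differences(values, pair_ids, is_treated):
    """Return treated minus control for each pair, ordered by pair id."""
    diffs = {}
    for v, p, t in zip(values, pair_ids, is_treated):
        diffs[p] = diffs.get(p, 0.0) + (v if t else -v)
    return [diffs[p] for p in sorted(diffs)]


def matrix_rank(rows):
    """Rank of a small matrix via exact Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it depends on earlier columns
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Paired design: one indicator column per pair plus a treatment column.
design_paired = [[int(p == k) for k in (1, 2, 3, 4)] + [t]
                 for p, t in zip(pairs, treated)]
# Same design with an extra indicator column for batch 2.
design_with_batch = [row + [int(b == 2)]
                     for row, b in zip(design_paired, batch)]

diffs = paired_differences(y, pairs, treated)
print("paired differences:", diffs)
print("rank, paired design:", matrix_rank(design_paired))
print("rank, paired + batch:", matrix_rank(design_with_batch),
      "with", len(design_with_batch[0]), "columns")
```

In this toy layout every paired difference equals the treatment effect regardless of batch, and the six-column design with the batch indicator has the same rank (5) as the five-column paired design, so the batch column carries no information beyond what the pair columns already absorb.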
Oh good! That's a very useful explanation. Thanks, Jim :)

Best,
Khadeeja