full confounding between batch and treatment effects
@bhaktidwivedi-8895
United States

I have RNA-seq data from three treatments (A, B, C), each with biological replicates, run in two batches (batch 1 with A and B; batch 2 with C). Unfortunately, the batch effect is fully confounded with the condition for some comparisons (e.g., A vs. C). I understand that it is impossible to separate the batch effect from the treatment effect here, regardless of which statistical method I use. I considered methods such as ComBat and removeBatchEffect (limma), but I cannot estimate the covariates to include in the batch correction. I then considered control-gene-based methods such as RUVg; however, I do not have ERCC control genes in the data, nor an independent data set of similar treatments from which to obtain negative (or positive) control genes. I understand that the best option would be to redo the experiment with a good design. That said, I am curious whether anyone has suggestions or options I can explore so that I can still use the data in some way. I appreciate your help. Thank you.
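
To make the confounding concrete, here is a minimal Python sketch of the design described above. The three replicates per treatment, the column coding (A and batch 1 as reference levels), and the helper names `matrix_rank` and `design_row` are assumptions for illustration only. It shows that a treatment-plus-batch design matrix has fewer independent columns than coefficients, because the batch 2 indicator is identical to the treatment C indicator.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination on exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a non-zero pivot in this column, at or below the current rank row.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical layout: 3 replicates per treatment (the replicate count is an assumption).
samples = [("A", 1)] * 3 + [("B", 1)] * 3 + [("C", 2)] * 3

def design_row(treatment, batch):
    # Columns: intercept, treatment B, treatment C, batch 2.
    return [1, int(treatment == "B"), int(treatment == "C"), int(batch == 2)]

design = [design_row(t, b) for t, b in samples]
n_columns = len(design[0])
rank = matrix_rank(design)
treatment_c_column = [row[2] for row in design]
batch2_column = [row[3] for row in design]

print("columns:", n_columns, "rank:", rank)
print("batch 2 column identical to treatment C column:",
      treatment_c_column == batch2_column)
```

With four coefficients but only three independent columns, the treatment C and batch 2 coefficients cannot both be estimated, which is why a batch term cannot simply be added to the model for this design.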

batch effects limma batch-correction RNAseq RUVseq

Perform the analysis as usual, then validate the key findings with independent experiments to be sure that the major conclusions you draw are biology-driven rather than batch-driven.


Thank you for your response.

@james-w-macdonald-5106
United States

If you want to analyze the data, you will have to assume that the biological differences you want to detect are 'larger' than the technical batch effects. Whether that assumption is reasonable usually has more to do with how the batches came about than with the underlying biology. For example, if you prepared all the total RNA with the same batch of reagents, just days apart, and then sent it off to be sequenced, then there is a good chance that any technical batch effects will be pretty minor, in which case you could probably make a compelling argument that it's fine.

However, if these batches were processed at different times, by different people, using different reagents, and then sequenced at really different times as well, I would be shocked if the batch effects didn't dominate. Not that there is any way to determine whether it's batch or biology. If you are feeling lucky, and have the ability to get new data to validate, then doing what ATpoint says is an option. But that depends on how easy/cheap it is to get new data. If that's not really an option, then IMO doing the analysis is essentially the same as not doing it: unless you can validate, the results you get are about as informative as not doing the analysis at all (i.e., you don't know anything, really, regardless).
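
As a rough illustration of why batch and biology cannot be told apart here, consider a toy additive model on the log scale. All numbers and names below (`group_mean`, `biology_only`, `batch_only`) are made up for illustration and are not estimates from any data. Two very different explanations produce exactly the same expected group means for the treatment/batch combinations that were actually run:

```python
# Toy additive model: mean = baseline + treatment effect + batch effect.
# Every value here is hypothetical and chosen only to illustrate the point.

# The only treatment/batch combinations present in the design.
OBSERVED_CELLS = [("A", 1), ("B", 1), ("C", 2)]

def group_mean(treatment, batch, params):
    """Expected log-expression of one gene for a given treatment and batch."""
    batch_term = params["batch2"] if batch == 2 else 0.0
    return params["baseline"] + params["treatment"][treatment] + batch_term

# Explanation 1: treatment C truly changes expression, and there is no batch effect.
biology_only = {"baseline": 5.0,
                "treatment": {"A": 0.0, "B": 1.0, "C": 2.0},
                "batch2": 0.0}

# Explanation 2: treatment C does nothing, and the whole shift comes from batch 2.
batch_only = {"baseline": 5.0,
              "treatment": {"A": 0.0, "B": 1.0, "C": 0.0},
              "batch2": 2.0}

means_biology = [group_mean(t, b, biology_only) for t, b in OBSERVED_CELLS]
means_batch = [group_mean(t, b, batch_only) for t, b in OBSERVED_CELLS]

# Observed A vs. C difference under each explanation.
a_vs_c_biology = means_biology[2] - means_biology[0]
a_vs_c_batch = means_batch[2] - means_batch[0]

print("group means, biology only:", means_biology)
print("group means, batch only:  ", means_batch)
print("A vs. C difference:", a_vs_c_biology, "vs.", a_vs_c_batch)
```

Because both explanations predict the same data, no method applied to this data set alone can choose between them; only the external argument about how the batches were produced, or independent validation, can.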


Thank you for your response and suggestions. The batches were processed at different times, but followed the same library prep, reagents, and sequencing protocol. It looks like I don't have much choice other than generating new data and validating the findings.
