My mouse and human array data were derived from the same two conditions, condition A and condition B. I fitted the two datasets separately using limma and obtained the fitted expression values using as.matrix(y). Now my question is: may I directly compare the mouse expression profile at condition A with the human expression profile at condition A?
At the very least, you will have technical batch effects, e.g., differences in probe design/affinity between arrays, differences in RNA extraction/RT efficiency, variation in non-specific binding from the rest of the transcriptome. So, even if you had a homologous gene with the same abundance of transcript molecules in mouse and human, I would be amazed if you managed to obtain the same signal on the associated arrays. There is almost certainly a strong batch effect that will confound any comparison between species.
The other major problem is the biological interpretation of differences. Let's say that you did manage to somehow resolve the batch effects and get comparable intensities between species. Then what? If you say that gene X has higher intensity (i.e., more transcript molecules) in mouse compared to human, how would I even attempt to interpret that? Do 10 transcript molecules of gene X in mouse cells have the same effect as 10 molecules of gene X in human cells? Perhaps mice need 20 molecules of X to achieve the same biological activity as that in humans. Or 5. Or 100.
A more rigorous approach would be to compare within each species, then do a meta-analysis across species. You could ask things like "which genes are DE between A and B in mice, but not in humans" and vice versa. This also seems to answer a more relevant scientific question than a direct comparison between species.
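To make the "compare within each species, then combine" idea concrete, here is a minimal sketch of the set logic in plain Python. Everything in it is invented for illustration: the gene symbols, the one-to-one homolog table and the DE calls are hypothetical, and `classify` is just a helper name I made up. In practice the per-species calls would come from each species' own limma fit (e.g., the output of decideTests), and the homolog mapping from whatever orthology resource you trust.

```python
# Hypothetical within-species results: gene -> direction of change for B vs A
# (+1 up, -1 down, 0 not DE). All names and values are invented for illustration.
mouse_de = {"Trp53": 1, "Myc": -1, "Gapdh": 0, "Il6": 1, "Egfr": 0}
human_de = {"TP53": 1, "MYC": 1, "GAPDH": 0, "IL6": 0, "EGFR": -1}

# Hypothetical one-to-one homolog table (mouse symbol -> human symbol).
homologs = {
    "Trp53": "TP53",
    "Myc": "MYC",
    "Gapdh": "GAPDH",
    "Il6": "IL6",
    "Egfr": "EGFR",
}


def classify(mouse_de, human_de, homologs):
    """Sort homologous gene pairs into categories based on within-species DE calls."""
    categories = {
        "concordant": [],   # DE in both species, same direction
        "discordant": [],   # DE in both species, opposite directions
        "mouse_only": [],   # DE in mouse, not called DE in human
        "human_only": [],   # DE in human, not called DE in mouse
        "neither": [],      # not called DE in either species
    }
    for m_gene, h_gene in homologs.items():
        # Skip pairs that were not tested in both species.
        if m_gene not in mouse_de or h_gene not in human_de:
            continue
        m, h = mouse_de[m_gene], human_de[h_gene]
        if m != 0 and h != 0:
            key = "concordant" if m == h else "discordant"
        elif m != 0:
            key = "mouse_only"
        elif h != 0:
            key = "human_only"
        else:
            key = "neither"
        categories[key].append((m_gene, h_gene))
    return categories


result = classify(mouse_de, human_de, homologs)
for name, pairs in result.items():
    print(name, pairs)
```

Note that only the A-vs-B contrasts are compared here, never the raw intensities across species, so probe-specific differences between the two platforms cancel out within each species. One caveat with this kind of tally: "not DE" in one species may simply reflect lower power there, so the "mouse_only" and "human_only" groups should be read with some caution.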
We have done the meta-analysis as you suggested. But we are still interested in comparing mouse and human at condition A, since we would actually like to know if mouse is a reasonable model for human...
Regarding the batch effect, may I run normalizeBetweenArrays using only homologous probes? That is to say, I would manually modify the input files so that only data from the background and homologous probes are kept...
But we are still interested in comparing mouse and human at condition A, since we would actually like to know if mouse is a reasonable model for human...
And how, exactly, are you going to answer this question? Even if you were able to compare across species, what would that tell you? If you get 500 DE genes, would that mean that mouse is too different from human to be a reasonable model? What about 1000 genes? Or 200?
If you had an equivalent dataset involving, e.g., fish, you might be able to see that mouse vs human has fewer DE genes than fish vs human, and thus conclude that mouse is a better model for human than fish. But a standalone comparison between two species doesn't really tell you much, even if you were able to overcome the technical and biological problems I mentioned above.
Regarding the batch effect, may I run normalizeBetweenArrays using only homologous probes?
This will not solve the batch effect issues. Batch effects occur on a per-probe basis, and normalization only adjusts for systematic biases that occur across many probes. You won't be able to get rid of probe-specific biases with normalization.
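A toy calculation may help show why. The sketch below assumes a very simple additive model on the log2 scale (observed signal = true abundance + array-wide offset + probe-specific bias); that model, the numbers and the helper names are all assumptions of mine, not a description of any real platform. Median centering is used as a crude stand-in for between-array normalization, not as a reimplementation of what limma does.

```python
import statistics

# Toy model (an assumption for illustration only):
# observed log2 signal = true log2 abundance + array-wide offset + probe-specific bias.
true_log2_abundance = [8.0, 8.0, 8.0, 8.0, 8.0]  # identical in both species

mouse_offset, human_offset = 1.5, 0.0            # systematic, shared by all probes
mouse_probe_bias = [0.4, -0.9, 0.1, 1.2, -0.8]   # invented per-probe affinities
human_probe_bias = [-0.6, 0.7, 0.0, -0.3, 0.2]


def observe(abundance, offset, bias):
    """Apply the array-wide offset and per-probe bias to the true abundances."""
    return [a + offset + b for a, b in zip(abundance, bias)]


def median_center(values):
    """Crude stand-in for between-array normalization: subtract the array median."""
    med = statistics.median(values)
    return [v - med for v in values]


mouse = observe(true_log2_abundance, mouse_offset, mouse_probe_bias)
human = observe(true_log2_abundance, human_offset, human_probe_bias)

# Mouse-minus-human difference for each homologous probe pair.
raw_diff = [m - h for m, h in zip(mouse, human)]
norm_diff = [m - h for m, h in zip(median_center(mouse), median_center(human))]

print("median difference before:", round(statistics.median(raw_diff), 2))
print("median difference after: ", round(statistics.median(norm_diff), 2))
print("per-probe differences after:", [round(d, 2) for d in norm_diff])
```

In this toy setup the array-wide shift disappears after centering, but individual probe pairs still differ by more than one log2 unit even though the true abundances are identical. Restricting the normalization to homologous probes does not change this, because the residual differences come from the probes themselves rather than from anything shared across the array.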
Thank you very much for your reply!
Thanks a lot!
C.