Hello,
I am after some advice! I have a set of proteomic data run on healthy individuals; there is therefore no "treatment" or "status" outcome and the data will be tested against different continuous outcomes.
Although the data initially looked fine, a closer look showed that the levels of some proteins in some individuals were unnaturally low. An even closer look indicated that this decrease was specific to the lab location and also to the year of extraction. However, these two are not independent, i.e. blood extraction in labs 1 and 2 took place mainly in the first two years, and in labs 3 and 4 in the next two years. Linear regression analyses nevertheless indicate that lab and year each have an independent effect, i.e. even within the same lab there was a difference by year of extraction.
I am a bit confused about how to proceed. My problems/questions are:
a) I have no variable of interest, only covariates such as sex. Is this OK for ComBat?
b) Can I use ComBat with lab as the batch, controlling for year and sex as covariates, and then run a second round of ComBat with year as the batch? Is it advisable to control for year in the first step?
c) When I did b), the results were slightly corrected, but a difference in protein levels still remains. Should I use SVA as well?
Thanks so much in advance, really appreciated
TP
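A quick way to see how strongly lab and year overlap, as described in the question, is to cross-tabulate them before deciding on a batch/covariate setup. The sketch below is only illustrative: the sample counts are invented, and `crosstab` and `empty_cells` are hypothetical helper names, not part of ComBat or sva. The more empty cells the table has, the less information there is to separate a lab effect from a year effect.

```python
from collections import Counter

def crosstab(labs, years):
    """Count samples for every lab/year combination, including empty ones."""
    counts = Counter(zip(labs, years))
    lab_levels = sorted(set(labs))
    year_levels = sorted(set(years))
    return {lab: {yr: counts[(lab, yr)] for yr in year_levels}
            for lab in lab_levels}

def empty_cells(table):
    """List the lab/year combinations that have no samples at all."""
    return [(lab, yr) for lab, row in table.items()
            for yr, n in row.items() if n == 0]

# Invented (lab, year) pairs: labs 1-2 mostly in years 1-2, labs 3-4 in years 3-4.
samples = ([(1, 1)] * 10 + [(1, 2)] * 8 + [(1, 3)] * 2 +
           [(2, 1)] * 9 + [(2, 2)] * 9 +
           [(3, 2)] * 1 + [(3, 3)] * 10 + [(3, 4)] * 8 +
           [(4, 3)] * 7 + [(4, 4)] * 11)
labs = [lab for lab, _ in samples]
years = [yr for _, yr in samples]

table = crosstab(labs, years)
missing = empty_cells(table)

for lab, row in table.items():
    print("lab", lab, row)
print("lab/year combinations with no samples:", missing)
```

With real data, the same table also shows which labs contain more than one year, which is where a within-lab year effect can actually be estimated.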
Thanks so much for your advice. If you have partially confounded variables, do you still do it sequentially? Do you add the first variable to the mod first, or do you just adjust twice, or even fit an interaction between the two? Thanks again for the info.
I'd just do it sequentially, using your covariate strategy as originally described, and check the results with principal component regression or whatever you use to estimate batch effects. However you do it, the point is to remove the batch effect while minimising disruption to your biological effect, so you can simply test whether your approach works. So far I have just been doing a PCA to check for batch effects and haven't seen any in our array data, but we use one lab to do everything and it is done under very similar conditions every time.
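As a rough illustration of the kind of check mentioned above, the sketch below computes the first principal component of a small simulated protein matrix and measures how much of its variance is explained by batch (the R² of a one-way fit on the batch label). Everything here is an assumption made for illustration: the data are simulated, the function names are hypothetical, and per-batch mean-centering is used as a crude stand-in for a real correction. It is not ComBat, and unlike ComBat with covariates it would also remove any biology that differs between batches.

```python
import random

def first_pc_scores(rows, n_iter=200, seed=0):
    """Sample scores on the first principal component (power iteration)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(p)]  # random starting direction
    for _ in range(n_iter):
        scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(p)) for c in centered]

def r2_on_batch(scores, batches):
    """Share of the variance in scores explained by the batch label."""
    grand = sum(scores) / len(scores)
    total = sum((s - grand) ** 2 for s in scores)
    between = 0.0
    for b in set(batches):
        grp = [s for s, lab in zip(scores, batches) if lab == b]
        between += len(grp) * (sum(grp) / len(grp) - grand) ** 2
    return between / total if total > 0 else 0.0

def center_within_batch(rows, batches):
    """Crude stand-in for batch correction: subtract each batch's protein means."""
    p = len(rows[0])
    out = [list(r) for r in rows]
    for b in set(batches):
        idx = [i for i, lab in enumerate(batches) if lab == b]
        for j in range(p):
            m = sum(rows[i][j] for i in idx) / len(idx)
            for i in idx:
                out[i][j] = rows[i][j] - m
    return out

# Simulated data: 40 samples x 5 proteins, batch "B" shifted downwards.
rng = random.Random(1)
batches = ["A" if i % 2 == 0 else "B" for i in range(40)]
rows = [[rng.gauss(0, 1) + (-3.0 if b == "B" else 0.0) for _ in range(5)]
        for b in batches]

r2_before = r2_on_batch(first_pc_scores(rows), batches)
r2_after = r2_on_batch(first_pc_scores(center_within_batch(rows, batches)), batches)
print("R2 of PC1 on batch before correction:", round(r2_before, 3))
print("R2 of PC1 on batch after correction: ", round(r2_after, 3))
```

On real data the same check would be run on the matrix before and after each ComBat round, once with lab and once with year as the grouping, and ideally also against a biological covariate such as sex to confirm that it has not been flattened.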