Question

paired sample - SVA model matrix

flippy23 • 0

Hello,

I am running SVA and had a question regarding the null and full model matrices. I am comparing a before/after effect of a treatment within each patient. How do I account for this in building the model matrices? How do I specify to SVA that the before/after variable is within the same patient?

thanks

limma sva paired sample • 1.6k views

ADD COMMENT • link updated 5.9 years ago by James W. MacDonald 67k • written 5.9 years ago by flippy23 • 0

score 1 · Answer 1 · 2018-12-18

James W. MacDonald 67k

You can find an example of a paired analysis in the limma User's Guide, section 9.4.1.

For sva you will want to define both the full and reduced model by hand though, as the default reduced model has only an intercept, which isn't the right reduced model for your situation.
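As a minimal sketch of what defining both models by hand could look like for a paired design, the following uses base R's `model.matrix`. The data frame `pheno` and its columns `Patient` and `Treatment` are hypothetical names chosen for illustration, and the `sva` call is left as a comment because the expression matrix `expr` is not defined here.

```r
# Hypothetical paired design: 6 patients, each sampled before and after treatment.
pheno <- data.frame(
  Patient   = factor(rep(paste0("P", 1:6), each = 2)),
  Treatment = factor(rep(c("before", "after"), times = 6),
                     levels = c("before", "after"))
)

# Full model: patient blocking factor plus the variable of interest.
mod  <- model.matrix(~ Patient + Treatment, data = pheno)
# Reduced (null) model: everything except the variable of interest.
mod0 <- model.matrix(~ Patient, data = pheno)

dim(mod)   # 12 x 7
dim(mod0)  # 12 x 6

# With the sva package installed and `expr` a genes-by-samples matrix
# (not defined here), the call would look like:
# svobj <- sva::sva(expr, mod, mod0)
```

The reduced model keeps the patient factor, so the only difference between the two matrices is the treatment column.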

ADD COMMENT • link 5.9 years ago James W. MacDonald 67k

So...just to make sure I understand.

null model (only adjustment covariates) would consist of: covariate 1 + covariate 2 + covariate 3

full model: ~Study_ID+Treatment + covariate 1 + covariate 2 + covariate 3

How will it differentiate the treatment effect from the other covariates?

ADD REPLY • link 5.9 years ago flippy23 • 0

It doesn't. A statistician would call the three covariates 'nuisance variables', which are things that you think (know) will have an effect on at least some of the genes, and so you need to account for them in your model.

In your case you said you have paired samples. So if one patient has a much higher level of expression for a given gene than the other patients, you don't necessarily care about that fact. Instead, you want to know how much the treatment affects the gene expression after accounting for any patient-specific differences. And you account for any between-patient differences by adding a patient-level blocking factor to your model, which estimates the mean expression for that patient, and removes that patient-level expression, to give you a cleaner signal.

Put another way, this is just simple algebra. You are saying that, for Gene X, the expression for patient 1 is

Gene_X_expression = patient_effect + treatment_effect + covariate1_effect + covariate2_effect + covariate3_effect

Which is the same as saying that

treatment_effect = Gene_X_expression - patient_effect - covariate1_effect - covariate2_effect - covariate3_effect

So the treatment effect is estimated from the expression data, after subtracting out all these other things that might affect the gene expression, but are not of interest to you. It's a little more complicated than that, but that's the basic idea.
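To make the subtraction concrete, here is a small sketch on simulated data (all values are made up for illustration, and the extra covariates are omitted for brevity). In a balanced paired design with only a patient blocking factor and treatment, the treatment coefficient from the blocked model matches the mean of the within-patient differences, which is the sense in which the patient effect is subtracted out.

```r
set.seed(1)
n_pat <- 5
patient   <- factor(rep(seq_len(n_pat), each = 2))
treatment <- factor(rep(c("before", "after"), times = n_pat),
                    levels = c("before", "after"))

# Simulated log-expression for one gene: large patient-specific baselines,
# a true treatment effect of 1, and a little noise.
baseline <- rep(rnorm(n_pat, mean = 8, sd = 3), each = 2)
y <- baseline + 1 * (treatment == "after") + rnorm(2 * n_pat, sd = 0.2)

# Treatment effect from the model with a patient blocking factor.
fit <- lm(y ~ patient + treatment)
blocked_effect <- unname(coef(fit)["treatmentafter"])

# Mean of the within-patient (after minus before) differences.
paired_diff <- mean(y[treatment == "after"] - y[treatment == "before"])

# Compare the two estimates.
all.equal(blocked_effect, paired_diff)
```

The large between-patient spread in `baseline` does not enter the treatment estimate, because each patient serves as their own control.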

ADD REPLY • link 5.9 years ago James W. MacDonald 67k

Or maybe you are asking a different question? You might be asking how does sva differentiate the treatment covariate from the others? Again, it doesn't.

The idea behind sva is that other nuisance variables may be lurking in your data that you don't really know about. So you give it two models: the reduced model, which adjusts out all the known nuisance variables, and the full model, which adjusts out the known nuisance variables plus the variable you care about. It then looks for extra patterns in the data (after adjusting for all the variables you know about) that you can include as extra nuisance variables, with the caveat that you don't want to erase anything that might pertain to the treatment.
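The "extra patterns" idea can be illustrated roughly as follows. This is a simplified sketch on simulated data, not the actual sva algorithm (which also estimates how many surrogate variables to keep and weights genes iteratively): it removes what the full model explains and takes the leading singular vector of the residuals as a candidate hidden factor. All names and values here are assumptions made for illustration.

```r
set.seed(2)
n_pat <- 6
patient   <- factor(rep(seq_len(n_pat), each = 2))
treatment <- factor(rep(c("before", "after"), times = n_pat),
                    levels = c("before", "after"))
mod <- model.matrix(~ patient + treatment)

# Simulated genes-by-samples matrix with a hidden per-sample factor
# (e.g. an unrecorded batch) affecting the first 50 of 200 genes.
hidden <- rnorm(2 * n_pat)
expr <- matrix(rnorm(200 * 2 * n_pat, sd = 0.3), nrow = 200)
expr[1:50, ] <- expr[1:50, ] + outer(rep(2, 50), hidden)

# Residuals after removing everything the full model explains.
hat <- mod %*% solve(crossprod(mod)) %*% t(mod)
res <- expr %*% (diag(ncol(expr)) - hat)

# Leading right singular vector: the dominant leftover per-sample pattern.
sv1 <- svd(res)$v[, 1]
```

Because the residuals are orthogonal to every column of `mod`, a pattern found this way cannot simply be a copy of the patient or treatment effects.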

ADD REPLY • link 5.9 years ago James W. MacDonald 67k

I have a follow-up question based on this. In downstream analysis, subject ID (two samples per subject, pre/post) is the random-effect group. Because I'm controlling for patient-specific effects in downstream analysis, I guess the within-individual technical variability is now of interest, as that may be correlated with the treatment effect in the pre/post comparison within a patient. In this case, using a patient blocking factor for SVA wouldn't consider the pre/post differences within a patient, and may contribute to "noisier" data downstream?

ADD REPLY • link 5.5 years ago flippy23 • 0

@flippy23

How did you finally design the full and null models? Did you incorporate Study_ID in both the full and null model, or only in the full model?

mod = ~ Subject_ID + Treatment

mod0 = ~ Subject_ID

OR

mod = ~ Subject_ID + Treatment

mod0 = ~1

ADD REPLY • link 5.2 years ago sara.blocquiaux • 0