Design matrix for cohort with samples outside "block"
kentfung ▴ 20
@kentfung-17051
Last seen 4.6 years ago

Hi,

I am using limma-voom for an RNA-seq differential expression analysis. There are 260 samples from leukaemia patients, and all but about 40 of them fall into distinct clusters. Say I have clusters A to H, each containing 5 to 40 samples, plus 40 samples that don't fall into any cluster (I'll call them "others" for convenience). I want to see how expression in each cluster compares with the rest of the cohort, e.g. cluster A vs (clusters B to H + others), repeated for each of A to H. Which of the design matrices below (1, 2 or 3) should I use?

Also, as I need to run sva to adjust for batch effects, should I use 1, 2 or 3 as the "mod1" for svaseq()?

EDITED: I made some mistakes that made the original question confusing, so I have rewritten it.

For each cluster I make a clusterX and a not_clusterX vector. Say there are 45 samples and the 1st to 5th are in A:

    clusterA = rep(c(TRUE,FALSE),times = c(5,40))
    not_clusterA = !rep(c(TRUE,FALSE),times = c(5,40))

The 6th to 10th are in B:

    clusterB = rep(c(FALSE,TRUE,FALSE),times = c(5,5,35))
    not_clusterB = !rep(c(FALSE,TRUE,FALSE),times = c(5,5,35))

... and so on for the remaining clusters.

  1. Only mark which samples belong to which cluster and ignore the "others", which would be something like this (just an example):

         model.matrix(~clusterA + clusterB + clusterC + clusterD + clusterE + clusterF + clusterG + clusterH)

  2. Add an extra "negative" column for each cluster. Again, for convenience, each clusterX is a factor that labels which samples are in the corresponding cluster:

         model.matrix(~clusterA + clusterB + clusterC + clusterD + clusterE + clusterF + clusterG + clusterH + not_clusterA + not_clusterB + not_clusterC + not_clusterD + not_clusterE + not_clusterF + not_clusterG + not_clusterH)

  3. Make one vector only, grouping everything into a single column:

         cluster = rep(c("A","B","C","D","E","F","G","H",NA), times = rep(5,9))
         model.matrix(~cluster)
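To make the three options easier to compare, here is a minimal base-R sketch on the toy layout used above (45 samples, 5 per cluster, the last 5 unclustered). The object names (`labels`, `ind`, `design1`, etc.) are made up for illustration and are not from the original analysis. It only inspects the shape and rank of each design matrix; it does not run voom or sva.

```r
# Toy layout: 5 samples in each of clusters A-H, then 5 unclustered (NA).
labels <- rep(c(LETTERS[1:8], NA), each = 5)

# Option 1: one logical indicator per cluster; "others" are FALSE everywhere,
# so they end up in the intercept.
ind <- sapply(LETTERS[1:8], function(k) !is.na(labels) & labels == k)
colnames(ind) <- paste0("cluster", LETTERS[1:8])
design1 <- model.matrix(~ ., data = as.data.frame(ind))
dim(design1)       # 45 rows, 9 columns
qr(design1)$rank   # 9, i.e. full column rank

# Option 2: add the negated indicators as extra columns.
not_ind <- !ind
colnames(not_ind) <- paste0("not_", colnames(ind))
design2 <- model.matrix(~ ., data = as.data.frame(cbind(ind, not_ind)))
dim(design2)       # 45 rows, 17 columns
qr(design2)$rank   # still 9: each not_clusterX column is intercept minus clusterX

# Option 3: a single vector with NA for the unclustered samples.
# Under R's default na.action (na.omit) the NA rows are dropped.
cluster3 <- labels
design3 <- model.matrix(~ cluster3)
dim(design3)       # 40 rows, 8 columns: the 5 "others" are gone
```

On this toy data, option 2 adds no information beyond option 1 (the extra columns are linearly dependent), and option 3 silently loses the unclustered samples unless they are given their own label instead of NA.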

I actually tried 1 and 3, and they don't seem to work with a contrast matrix, since that needs a -1 to contrast against the 1. But I am not sure whether 2 makes sense, or whether it will affect the way voom or sva estimates the model.
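For reference, one possible way to get the "1 against -1" structure is sketched below in base R. It assumes, as a departure from options 1 to 3, that the unclustered samples are given their own level ("others") and that the design has no intercept, so every group has its own coefficient. The names `groups`, `design` and `contrast_mat` are hypothetical. Note that each contrast compares a cluster with the unweighted average of the other group means, which is not the same as pooling all remaining samples into one group.

```r
# Assumed layout: 9 groups (A-H plus "others"), 5 samples each.
groups  <- c(LETTERS[1:8], "others")
cluster <- factor(rep(groups, each = 5), levels = groups)

# No-intercept design: one coefficient (group mean) per group.
design <- model.matrix(~ 0 + cluster)
colnames(design) <- levels(cluster)

# One-vs-rest contrasts: +1 for the cluster of interest and
# -1/(K-1) for each of the other K-1 groups, so every column sums to 0.
K <- length(groups)
contrast_mat <- matrix(-1 / (K - 1), nrow = K, ncol = K,
                       dimnames = list(groups, paste0(groups, "_vs_rest")))
diag(contrast_mat) <- 1

# Keep only the contrasts for clusters A-H (drop "others_vs_rest").
contrast_mat <- contrast_mat[, paste0(LETTERS[1:8], "_vs_rest")]
```

A matrix of this shape (rows matching the design's column names) is what limma's `contrasts.fit()` takes as its `contrasts` argument; whether this averaging is appropriate for the real cohort is a judgment call.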

Thanks a lot!

limma voom sva • 1.1k views

What do the vectors clusterA, clusterB etc contain?


Sorry, I have reformulated the question. It should make more sense now.

