Question

How to design block vector for duplicateCorrelation()


gulusunshine • 0

@gulusunshine-23261

Last seen 4.5 years ago

Hi, we are trying to analyze a dataset with 3 cell lines, each treated with low pH and normal pH. We have three replicates for each group, which for some reason we have to treat as technical replicates. The goal is to obtain the genes differentially expressed at low pH that are shared by all three cell lines.

  names      cell       pH 
  Sample1     MDA      low     
  Sample2     MDA      low     
  Sample3     MDA      low   
  Sample4     MDA      normal   
  Sample5     MDA      normal   
  Sample6     MDA      normal   
  Sample7     Panc     low   
  Sample8     Panc     low 
  Sample9     Panc     low 
  Sample10    Panc     normal 
  Sample11    Panc     normal
  Sample12    Panc     normal
  Sample13    MCF7     low
  Sample14    MCF7     low
  Sample15    MCF7     low
  Sample16    MCF7     normal
  Sample17    MCF7     normal
  Sample18    MCF7     normal

We want to use the function duplicateCorrelation() to account for the technical replicates, which requires an appropriate block vector. We are wondering what the difference is between the following two ways of modelling:

model.matrix(~pH)

with block vector:

MDA MDA MDA MDA MDA MDA Panc Panc Panc Panc Panc Panc MCF MCF MCF MCF MCF MCF

and

model.matrix(~cell+pH)

with block vector:

 MDA1 MDA1 MDA1 MDA2 MDA2 MDA2
 Panc1 Panc1 Panc1 Panc2 Panc2 Panc2
 MCF1 MCF1 MCF1 MCF2 MCF2 MCF2
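
To make the two options concrete, they could be set up in R roughly as follows. This is only a sketch: the data frame name `targets` and the use of `interaction()` to build the second block vector are illustrative assumptions. The resulting block labels (e.g. `MDA.low`) differ from the ones written above but define the same grouping.

```r
# Sample annotation matching the table above; `targets` is an illustrative name.
targets <- data.frame(
  names = paste0("Sample", 1:18),
  cell  = factor(rep(c("MDA", "Panc", "MCF7"), each = 6),
                 levels = c("MDA", "Panc", "MCF7")),
  pH    = factor(rep(rep(c("low", "normal"), each = 3), times = 3),
                 levels = c("normal", "low"))  # "normal" is the reference level
)

# Option 1: only pH in the design, cell line used as the block
design1 <- model.matrix(~ pH, data = targets)
block1  <- targets$cell

# Option 2: cell line and pH in the design,
# one block per cell line x pH group (three replicates each)
design2 <- model.matrix(~ cell + pH, data = targets)
block2  <- interaction(targets$cell, targets$pH, drop = TRUE)

table(block1)  # 3 blocks of 6 samples
table(block2)  # 6 blocks of 3 samples
```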

Thanks in advance

limma • 376 views

updated 4.4 years ago by Gordon Smyth 51k • written 4.4 years ago by gulusunshine • 0

score 0 · Answer 1 · 2020-05-13


Gordon Smyth 51k

@gordon-smyth

Last seen 9 hours ago

WEHI, Melbourne, Australia

The second model is correct. The first is too simple, as it neither adjusts for baseline differences between the cell lines nor correctly reflects the technical replicate structure.
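
As a rough illustration only (not taken from the answer itself), the second setup might be passed to limma along these lines. The expression matrix `y` here is simulated placeholder data with an artificial per-block effect, and the object names are assumptions; real log-expression values would take its place.

```r
library(limma)

set.seed(1)

# Experimental factors, in the same sample order as the table in the question
cell <- factor(rep(c("MDA", "Panc", "MCF7"), each = 6),
               levels = c("MDA", "Panc", "MCF7"))
pH   <- factor(rep(rep(c("low", "normal"), each = 3), times = 3),
               levels = c("normal", "low"))

# Second model: cell line + pH in the design, one block per cell line x pH group
design <- model.matrix(~ cell + pH)
block  <- interaction(cell, pH, drop = TRUE)

# Placeholder data (assumption): 200 genes of random noise plus a shared
# per-block effect, so that replicates within a block are correlated
n_genes   <- 200
block_eff <- matrix(rnorm(n_genes * nlevels(block), sd = 0.5), nrow = n_genes)
y <- matrix(rnorm(n_genes * 18), nrow = n_genes) + block_eff[, as.integer(block)]
rownames(y) <- paste0("gene", seq_len(n_genes))
colnames(y) <- paste0("Sample", 1:18)

# Estimate the consensus within-block correlation, then fit with it
corfit <- duplicateCorrelation(y, design, block = block)
fit <- lmFit(y, design, block = block,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)

# Low pH vs normal pH, adjusted for cell line
top <- topTable(fit, coef = "pHlow", number = 10)
```

The coefficient `pHlow` exists because `normal` was set as the reference level of `pH`; with a different level order the coefficient name would change.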

4.4 years ago • Gordon Smyth 51k