Hello
I have normalized and log2 transformed RNA microarray data from GEO, and I want to do differential gene expression analysis with the limma package in R.
My dataset contains 25 samples of cancer og 25 samples of stroma. Each sample have 33297 gene expression values.
I've loaded my data as:
> Array_tumorandstroma <- read.table("25sT_og25T-samples.csv", header = T, sep = ";", stringsAsFactors = T, nrow = 100)
> Array_tumorandstroma2 <- Array_tumorandstroma[,-1]
> rownames(Array_tumorandstroma2) <- Array_tumorandstroma[,1]
Then I just started to design a model matrix. I am unsure, whether I can just do this right away?
> design <-model.matrix((~0+factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2))))
> colnames(design) <- c("sT", "T")
> design
sT T
1 1 0
2 1 0
3 1 0
4 1 0
5 1 0
6 1 0
7 1 0
8 1 0
9 1 0
10 1 0
11 1 0
12 1 0
13 1 0
14 1 0
15 1 0
16 1 0
17 1 0
18 1 0
19 1 0
20 1 0
21 1 0
22 1 0
23 1 0
24 1 0
25 1 0
26 0 1
27 0 1
28 0 1
29 0 1
30 0 1
31 0 1
32 0 1
33 0 1
34 0 1
35 0 1
36 0 1
37 0 1
38 0 1
39 0 1
40 0 1
41 0 1
42 0 1
43 0 1
44 0 1
45 0 1
46 0 1
47 0 1
48 0 1
49 0 1
50 0 1
attr(,"assign")
[1] 1 1
attr(,"contrasts")
attr(,"contrasts")$`factor(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2))`
[1] "contr.treatment"
Is my matrix design correct? I am very insure about this, and nobody I my lab has experience in the limma package.
When I try to do the moderated t-test, I run into trouble:
> fit <- lm.fit(Array_tumorandstroma2, design)
Error in lm.fit(Array_tumorandstroma2, design) : incompatible dimensions
I guess, my matrix design does not fit the dimentions of my microarray dataset (Array_tumorandstroma2). And probably this is way the lm.fit function does not work. Can anyone please advise me to better matrix design?
I also tried an other matrix design, but that resulted in the same problem:
> design <- model.matrix((~0+factor(c(1,2))))
> colnames(design) <- c("sT", "T")
> design
sT T
1 1 0
2 0 1
attr(,"assign")
[1] 1 1
attr(,"contrasts")
attr(,"contrasts")$`factor(c(1, 2))`
[1] "contr.treatment"
> fit <- lm.fit(Array_tumorandstroma2, design)
Error in lm.fit(Array_tumorandstroma2, design) : incompatible dimensions
Thank you in advance for any help or suggestions!
/Stine
It looks like you're using the wrong function. Limma provides 'lmFit', not the lm.fit function you're using, which is something entirely different. Also, it's probably worth turning Array_tumorandstroma2 into a numeric matrix, rather than the data.frame that read.table returns. And for safety, I'd find some way of deriving the factor describing T and sT programmatically from the column names of the table, rather than explicitly enumerating them, or at least enumerating them in a more readable (and debug-able) fashion, e.g. rep(c("cancer", "stroma"), times=c(25,25)) - if you're really sure that the samples come in blocks, with cancer first.