Entering edit mode
Shona Wood
▴
70
@shona-wood-4559
Last seen 10.2 years ago
Hi,
I have been trying to compare RNA-seq data 3 groups; young, middle
aged and old.
I have done pairwise comparisons between these groups but i would like
to
compare all three together. I have been trying to use the GLM method
to do this,
following the tutorial, 11 Case study: Oral carcinomas vs matched
normal tissue.
i understand that this only compares two conditions but i tried to
adapt it to
my needs. Being new at this I am unsure if I am doing it right and I
get an
error when trying to estimateCRdisp. Though I think the problem is
actually to
do with the way i have specified the design because "middle" doesnt
seem to be
in the design. Please see below:
R version 2.12.2 (2011-02-25)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-pc-mingw32/i386 (32-bit)
library(edgeR)
setwd ("/Users/swood/Documents/rna-seq analysis")
targets<-read.delim (file="targets.txt",
+ stringsAsFactors = FALSE, row.names = "label")
Error in data[[rowvar]] : attempt to select less than one element
targets<-read.delim (file="targets.txt")
targets$age<-factor(targets$age)
targets$sample<-factor(targets$sample)
targets
X files age sample
1 p1 p1.txt young 1
2 p2 p2.txt young 2
3 p3 p3.txt young 3
4 p4 p4.txt old 4
5 p5 p5.txt old 5
6 p6 p6.txt old 6
7 p7 p7.txt middle 7
8 p8 p8.txt middle 8
9 p9 p9.txt middle 9
d<-readDGE(targets, skip=1, comment.char='#')
colnames(d)<-row.names(targets)
d<-calcNormFactors(d)
d
An object of class "DGEList"
$samples
X files age sample lib.size norm.factors
1 p1 p1.txt young 1 674816.7 0.7598895
2 p2 p2.txt young 2 648856.9 0.9095203
3 p3 p3.txt young 3 449797.3 1.2774867
4 p4 p4.txt old 4 699036.0 0.8532141
5 p5 p5.txt old 5 441649.7 1.1919854
6 p6 p6.txt old 6 670755.9 0.8147330
7 p7 p7.txt middle 7 699630.1 1.0262252
8 p8 p8.txt middle 8 505795.8 1.2979023
9 p9 p9.txt middle 9 478527.1 1.0262468
$counts
1 2 3 4 5 6
7
ENSRNOG00000000007 38.23700 37.89900 34.85820 43.6703 40.81260 43.4009
42.4969
ENSRNOG00000000008 0.00000 0.00000 0.00000 0.0000 0.00000 0.0000
0.0000
ENSRNOG00000000009 0.00000 0.00000 0.00000 0.0000 0.00000 0.0000
0.0000
ENSRNOG00000000010 1.40963 8.28797 2.09491 17.1853 1.76635 14.7567
1.0840
ENSRNOG00000000012 0.00000 0.00000 0.00000 0.0000 0.00000 0.0000
0.0000
8 9
ENSRNOG00000000007 45.81270 48.6937
ENSRNOG00000000008 0.00000 0.0000
ENSRNOG00000000009 0.00000 0.0000
ENSRNOG00000000010 5.44889 17.7295
ENSRNOG00000000012 0.00000 0.0000
29510 more rows ...
d<-d[rowSums(d$counts)>9,]
design <- model.matrix(~ sample + age, data = targets)
design
(Intercept) sample2 sample3 sample4 sample5 sample6 sample7 sample8
sample9
1 1 0 0 0 0 0 0 0
0
2 1 1 0 0 0 0 0 0
0
3 1 0 1 0 0 0 0 0
0
4 1 0 0 1 0 0 0 0
0
5 1 0 0 0 1 0 0 0
0
6 1 0 0 0 0 1 0 0
0
7 1 0 0 0 0 0 1 0
0
8 1 0 0 0 0 0 0 1
0
9 1 0 0 0 0 0 0 0
1
ageold ageyoung
1 0 1
2 0 1
3 0 1
4 1 0
5 1 0
6 1 0
7 0 0
8 0 0
9 0 0
attr(,"assign")
[1] 0 1 1 1 1 1 1 1 1 2 2
attr(,"contrasts")
attr(,"contrasts")$sample
[1] "contr.treatment"
attr(,"contrasts")$age
[1] "contr.treatment"
d<-estimateCRDisp(d, design)
Error in beta[i, ] <- z %*% X :
number of items to replace is not a multiple of replacement length
If you could help me out with the design or suggest a reason for the
error i
would really appreciate it.
Thanks!