Entering edit mode
Hello List
I have a question about the use of DESeq to analyse RNA-Seq data from
paired samples.
Briefly my experimental design involves cells from 6 animals either
infected or uninfected. I would like to use DESeq to examine
differences between these groups taking the biological pairing into
account. The data are very variable (I think - I've no experience with
RNA-Seq data until now) with some animals showing very little response
and others showing large movement for a given gene; some animals have
low expression of a given gene while others have large counts.
I have used DESeq to simply look at the control v infected groups and,
on FC this gives me biologically relevant data but my FDR never drops
below 1. I am assuming that taking the animal pairing into account
will help overcome the inherent variablilty in the dataset
Anyway my question relates to creating the estimateDispersions
function in DESeq. I create my CountDataSet object following the
directions in the DESeq vignette, passing in counts and a dataframe of
factors (group = infection & description = animal).
cds <- newCountDataSet( counts, conds )
> class(cds)
[1] "CountDataSet"
attr(,"package")
[1] "DESeq"
> cds
CountDataSet (storageMode: environment)
assayData: 12063 features, 12 samples
? element names: counts
protocolData: none
phenoData
? sampleNames: 2713:InfAF 2700:InfAF ... 2684:Cont (12 total)
? varLabels: sizeFactor group description
? varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation:?
> pData(cds)
?????????? sizeFactor group description
2713:InfAF? 1.1611902 InfAF??????? 2713
2700:InfAF? 1.0896422 InfAF??????? 2700
2700:Cont?? 1.0799056? Cont??????? 2700
2721:Cont?? 0.9315077? Cont??????? 2721
2721:InfAF? 1.3688457 InfAF??????? 2721
2713:Cont?? 1.6007580? Cont??????? 2713
2714:InfAF? 0.6519720 InfAF??????? 2714
2714:Cont?? 0.9796116? Cont??????? 2714
2689:Cont?? 1.0116211? Cont??????? 2689
2689:InfAF? 0.4251266 InfAF??????? 2689
2684:InfAF? 0.7910000 InfAF??????? 2684
2684:Cont?? 2.2416324? Cont??????? 2684
group = infection and description = sample i.e. animal
So, all this seems to work fine. However when I try to run:
cds <- estimateDispersions( cds )
I get the following error:
Error in rowSums(sapply(tapply((1:ncol(counts))[replicated_sample],
factor(conditions[replicated_sample]),? :
? 'x' must be an array of at least two dimensions
I can't reproduce this error with the 'pasilla' dataset used in the
vignette so it must be something to do with the way I'm creating my
CountDataSet object. To me it looks just like the one in the vignette
though!
Can anyone shed any light on this?
I have successfully used edgeR for paired sample analysis and this
gives biologically relevant results along with good FDR. I was
interested in the comparison with DESeq for my own education as much
as anything else.
Thanks
iain
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
?[1] LC_CTYPE=en_GB.utf8?????? LC_NUMERIC=C????????????
?[3] LC_TIME=en_GB.utf8??????? LC_COLLATE=en_GB.utf8???
?[5] LC_MONETARY=en_GB.utf8??? LC_MESSAGES=en_GB.utf8??
?[7] LC_PAPER=C??????????????? LC_NAME=C???????????????
?[9] LC_ADDRESS=C????????????? LC_TELEPHONE=C??????????
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C?????
attached base packages:
[1] stats???? graphics? grDevices utils???? datasets? methods??
base????
other attached packages:
?[1] annotate_1.32.1??????? GO.db_2.6.1???????????
goseq_1.6.0??????????
?[4] geneLenDataBase_0.99.8 BiasedUrn_1.04????????
org.Bt.eg.db_2.6.4???
?[7] RSQLite_0.11.1???????? DBI_0.2-5?????????????
AnnotationDbi_1.16.11
[10] edgeR_2.4.3??????????? limma_3.10.2??????????
DESeq_1.6.1??????????
[13] locfit_1.5-6?????????? lattice_0.20-0????????
akima_0.5-7??????????
[16] Biobase_2.14.0???????
loaded via a namespace (and not attached):
?[1] biomaRt_2.10.0??????? Biostrings_2.22.0???? BSgenome_1.22.0?????
?[4] genefilter_1.36.0???? geneplotter_1.32.1??? GenomicFeatures_1.6.7
?[7] GenomicRanges_1.6.6?? grid_2.14.1?????????? IRanges_1.12.5??????
[10] Matrix_1.0-3????????? mgcv_1.7-13?????????? nlme_3.1-103????????
[13] RColorBrewer_1.0-5??? RCurl_1.9-5?????????? rtracklayer_1.14.4??
[16] splines_2.14.1??????? survival_2.36-10????? tools_2.14.1????????
[19] XML_3.9-2???????????? xtable_1.6-0????????? zlibbioc_1.0.0??????
>