We are trying to read htseq-count files with the function
DESeqDataSetFromHTSeqCount(), but we are running into a problem
with the design parameter. Please see the code and error below:
> conditions=factor(c("ShhWT", "ShhNULL", "ShhCondMUT"))
> DESeqDataSetFromHTSeqCount(sampleTable,directory=getwd(),design=formula(~ conditions))
Error in DESeqDataSet(se, design = design, ignoreRank) :
all variables in design formula must be columns in colData
Our sample table was read in separately and is a data frame with 6
rows and 3 columns:
> sampleTable
  SampleName FileName                                                               Metadata
1 Shh_het3   Sample_Shh_het3_accepted_hits.RG.rmdup.sam.htseq.gene_wise.readcounts  ShhWT
2 Shh_null2  Sample_Shh_null2_accepted_hits.RG.rmdup.sam.htseq.gene_wise.readcounts ShhNULL
3 Shh_flox1  Sample_Shhflox_1_accepted_hits.RG.rmdup.sam.htseq.gene_wise.readcounts ShhCondMUT
4 Shh_flox2  Sample_Shhflox_2_accepted_hits.RG.rmdup.sam.htseq.gene_wise.readcounts ShhCondMUT
5 Shh_flox3  Sample_Shhflox_3_accepted_hits.RG.rmdup.sam.htseq.gene_wise.readcounts ShhCondMUT
6 Shh_flox4  Sample_Shhflox_4_accepted_hits.RG.rmdup.sam.htseq.gene_wise.readcounts ShhCondMUT
Any insight on how to specify the design will be helpful.
Thanks,
Anand
-- output of sessionInfo():
> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] DESeq2_1.4.5 RcppArmadillo_0.4.400.0 Rcpp_0.11.2
[4] GenomicRanges_1.16.4 GenomeInfoDb_1.0.2 IRanges_1.22.10
[7] BiocGenerics_0.10.0
loaded via a namespace (and not attached):
[1] annotate_1.42.1 AnnotationDbi_1.26.0 Biobase_2.24.0
[4] DBI_0.2-7 genefilter_1.46.1 geneplotter_1.42.0
[7] grid_3.1.0 lattice_0.20-29 locfit_1.5-9.1
[10] RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.1.0
[13] stats4_3.1.0 survival_2.37-7 tools_3.1.0
[16] XML_3.98-1.1 xtable_1.7-3 XVector_0.4.0
--
Sent via the guest posting facility at bioconductor.org.
On Aug 26, 2014 8:42 PM, "Guest [guest]" <guest at bioconductor.org> wrote:
>
> We are trying to read in htseq count files into the function
> DESeqDataSetFromHTSeqCount(), however we are experiencing an issue
> with the design parameter. Please see the code and error below:
>
> > conditions=factor(c("ShhWT", "ShhNULL", "ShhCondMUT"))
> > DESeqDataSetFromHTSeqCount(sampleTable,directory=getwd(),design=formula(~ conditions))
> Error in DESeqDataSet(se, design = design, ignoreRank) :
> all variables in design formula must be columns in colData
>
The error here is informing you that the 'conditions' vector should be a
column of the column data, in this case, the 'sampleTable' data.frame.
DESeq2 doesn't use variables from the global environment for the design,
because we need to be sure that the information is tied to the columns of
the count matrix (order, subsetting, etc.), which is accomplished by only
using the columns of colData for the design.
Mike
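
To make the reply above concrete, here is a minimal sketch (an editorial
illustration, not part of Mike's message). It rebuilds the posted sample
table and checks which design formulas refer only to its columns. The
column name 'Metadata' comes from the posted table; the helper names
'suffix' and 'designUsesColData' are hypothetical. The DESeq2 call is left
commented out because it assumes DESeq2 is installed and the count files
are present in the working directory.

```r
# Rebuild the sample table from the original post.
suffix <- "_accepted_hits.RG.rmdup.sam.htseq.gene_wise.readcounts"
sampleTable <- data.frame(
  SampleName = c("Shh_het3", "Shh_null2", "Shh_flox1",
                 "Shh_flox2", "Shh_flox3", "Shh_flox4"),
  FileName   = paste0("Sample_",
                      c("Shh_het3", "Shh_null2", "Shhflox_1",
                        "Shhflox_2", "Shhflox_3", "Shhflox_4"),
                      suffix),
  Metadata   = factor(c("ShhWT", "ShhNULL", rep("ShhCondMUT", 4))),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if every variable named in the design formula
# is a column of the sample table (the columns after the first two end up
# in colData).
designUsesColData <- function(design, tbl) {
  all(all.vars(design) %in% names(tbl))
}

bad_ok  <- designUsesColData(~ conditions, sampleTable)  # global vector, not a column
good_ok <- designUsesColData(~ Metadata, sampleTable)    # column of sampleTable

# With DESeq2 installed and the count files in the working directory,
# the corresponding call would be:
# library(DESeq2)
# dds <- DESeqDataSetFromHTSeqCount(sampleTable,
#                                   directory = getwd(),
#                                   design = ~ Metadata)
```

Note that the free-standing 'conditions' factor in the original post also
has only 3 elements for 6 samples, whereas the 'Metadata' column has one
entry per sample, so it is already the right shape for the design.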