I was hoping I could get advice in terms of experimental design on a project that I am doing with CellMix.
I am studying WHOLE BLOOD samples from a disease cohort. I would like to estimate the proportions of cell types in each sample and then use those estimates to remove any confounding of the differential expression results that is due to cell-type composition.
The samples were assayed on an Illumina platform and the data were normalised with RSN (robust spline normalisation).
Originally I did this using CellCODE. I produced a heat map of the genes estimated to be differentially expressed with respect to cell type, and then plotted the estimated proportions against the physically measured cell counts to see how they compared (I have both the expression data and the physical cell counts). This seems to be exactly what your CellMix vignette describes, in Figure 1 from the original CellMix paper: http://bioinformatics.oxfordjournals.org/content/29/17/2211.long
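For concreteness, the comparison I have in mind looks roughly like the sketch below. It uses base R only, and the matrices `measured` and `estimated` are hypothetical toy values standing in for my physical cell counts and the estimated proportions (samples in rows, cell types in columns); they are not my real data.

```r
# Hypothetical toy data: rows are samples, columns are cell types.
cell_types <- c("Neutrophil", "Lymphocyte", "Monocyte")
samples <- paste0("s", 1:4)

# Physically measured proportions (one value is missing, as in my real file).
measured <- matrix(c(0.60, 0.30, 0.10,
                     0.50, 0.35, 0.15,
                     0.70, 0.20, 0.10,
                     0.55, NA,   0.12),
                   ncol = 3, byrow = TRUE,
                   dimnames = list(samples, cell_types))

# Proportions estimated from expression data (made-up numbers).
estimated <- matrix(c(0.58, 0.28, 0.12,
                      0.52, 0.37, 0.14,
                      0.66, 0.22, 0.09,
                      0.57, 0.30, 0.13),
                    ncol = 3, byrow = TRUE,
                    dimnames = list(samples, cell_types))

# Pearson correlation per cell type, dropping sample pairs with missing values.
per_type_cor <- sapply(cell_types, function(ct) {
  cor(measured[, ct], estimated[, ct], use = "complete.obs")
})
print(round(per_type_cor, 2))
```

A scatter plot for a single cell type would then be something like `plot(measured[, "Monocyte"], estimated[, "Monocyte"])`.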
However, the CellCODE package is still under development and CellMix seems to have a richer set of commands, so I was hoping to compare the two.
My problem is similar to the one that has already been asked on the Bioconductor Forum: CellMix package use with own data matrix
Using a GEO dataset as a contrast (in CellCODE I used "IRIS"), I would like to use my own expression data to estimate the cell proportions of my tissue samples.
Based on the advice at the following link: CellMix package use with own data matrix, I have tried the following (and not got very far):
"E.rnaMix<-as.matrix(Expression_data)
eset<-ExpressionSet(E.rnaMix)"
Beyond this, passing "eset" to any of the CellMix commands does not work; for example, it does not seem to be compatible with "gedBlood()":
gedBlood(eset, verbose = TRUE)
Error:
Using ged algorithm: “lsfit”
Estimating cell proportions from cell-specific signatures [lsfit: ls]
Error in while (any(a < 0) && length(i) < r) { :
missing value where TRUE/FALSE needed
The "eset" matrix has 47323 rows and 656 columns; the rows are genes and the columns specify disease status.
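Since the error mentions a missing value, one thing I can check is whether the expression matrix itself contains NA or other non-finite entries. I do not know that this is the cause, so the following is only a minimal sketch in base R; `expr` is a hypothetical toy matrix standing in for my real data.

```r
# Toy stand-in for the real expression matrix (genes x samples); values are made up.
expr <- matrix(c(7.1, 8.2, NA,  6.5,
                 9.0, 7.7, 8.1, 6.9,
                 5.5, Inf, 6.2, 7.3),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               c("s1", "s2", "s3", "s4")))

# Count non-finite entries (NA, NaN, Inf, -Inf) per gene.
n_bad <- rowSums(!is.finite(expr))
print(n_bad)

# Keep only the genes whose measurements are all finite.
expr_clean <- expr[n_bad == 0, , drop = FALSE]
print(dim(expr_clean))
```

If rows are dropped at this step, the cleaned matrix rather than the original would be the one to wrap in an ExpressionSet.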
I tried it with gedProportions:
> gedProportions(as.matrix(counts1), ExpressionSet(as.matrix(E.rna)))
Error in gedProportions(as.matrix(counts1), ExpressionSet(as.matrix(E.rna))) :
Empty data after limiting to common features: signatures [0] - target data [0]
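The message suggests that the two inputs share no feature identifiers. A quick way to check this is to compare their row names directly. The sketch below uses base R and hypothetical toy objects (`signatures` keyed by gene symbol, `target` keyed by probe ID) to show what a zero overlap looks like; the identifiers are invented for illustration and are not taken from my data.

```r
# Hypothetical signature matrix: rows are gene symbols, columns are cell types.
signatures <- matrix(c(10, 1, 2,
                        1, 9, 3),
                     nrow = 3,
                     dimnames = list(c("CD3E", "CD19", "CD14"),
                                     c("Tcell", "Bcell")))

# Hypothetical target matrix: rows are probe IDs, columns are samples.
target <- matrix(c(7.2, 6.8, 8.1,
                   7.0, 6.5, 8.3),
                 nrow = 3,
                 dimnames = list(c("probe_1", "probe_2", "probe_3"),
                                 c("s1", "s2")))

# Features present in both inputs; an empty result here would explain
# an "empty data after limiting to common features" type of error.
common <- intersect(rownames(signatures), rownames(target))
cat("signature features:", nrow(signatures), "\n")
cat("target features:   ", nrow(target), "\n")
cat("common features:   ", length(common), "\n")
```

If the overlap is zero, the identifiers in one of the inputs would need to be mapped to the same type as the other (for example probe IDs to gene symbols) before calling the estimation function.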
I think that the file that contains the physical cell-counts data is incompatible with "ged"...
I should note that the file has been rearranged and has a lot of missing values. I plan to clean this data, but for now I am just looking for some preliminary guidance on my experimental design.
Overall, my question is: how do I use CellMix to produce a heat map and plot similar to the ones that I made with CellCODE? Am I using it the right way? Can someone point me in the right direction?