Hi All,
I'm trying to use frozen RMA (the frma package) to normalize Gene 1.0 ST array CEL files against a custom dataset.
For reproducibility, I've demonstrated my issue with the code below, using this public dataset of 15 samples:
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE48134
# load libraries
library(oligo)
library(oligoClasses)
library(frma)
library(frmaTools)
library(pd.hugene.1.0.st.v1)
library(hugene.1.0.st.v1frmavecs)
data(hugene.1.0.st.v1frmavecs)
# read in CEL files
celdir <- "GSE48134_RAW"
celfiles <- oligoClasses::list.celfiles(celdir, listGzipped=TRUE)
# build the full path to each CEL file
celfiles.fp <- file.path(celdir, celfiles)
# there are 15 CEL files, assign them to 3 batches of 5 samples each
batch <- c(rep(1,5), rep(2,5), rep(3,5))
# create custom vector
frozenvector <- makeVectorsFeatureSet(files=celfiles.fp, batch=batch, pkgname="pd.hugene.1.0.st.v1")
# renormalize the sample CEL files using frma and the custom vector
featureset <- oligo::read.celfiles(filenames=celfiles.fp, pkgname="pd.hugene.1.0.st.v1")
newnorm <- frma(featureset, input.vecs=frozenvector, target="core")
I end up getting an error when trying to use a custom input.vecs and target="core":
> newnorm <- frma(featureset, input.vecs=frozenvector, target="core")
Either probeVarWithin or probeVarBetween is 0 for some probes -- setting corresponding weights to 1
Error in split.default(N, pns) : group length is 0 but data length > 0
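As far as I can tell, this is the message base R's split() gives when the grouping vector is empty while the data vector is not. Here is a minimal sketch of that in base R only. The values assigned to N and pns are made-up stand-ins for the objects named in the error, not what frma actually holds internally:

```r
# Stand-ins for the objects named in the error message; the real N and pns
# are created inside frma and are not reproduced here.
N <- c(10.2, 11.5, 9.8)   # hypothetical per-probe values
pns <- character(0)       # hypothetical empty vector of probeset names

# Capture the error raised by split() when the grouping vector has length 0
msg <- tryCatch(
  {
    split(N, pns)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If that reading is right, the grouping vector that frma builds for target="core" is coming back empty with my custom vector.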
I think the issue stems from the fact that the custom vector I create with makeVectorsFeatureSet does not include the "probeVecCore" element (or "mapCore") that the pre-built vector has. As far as I can tell, that element is ultimately accessed by the frmaFeatureSet function in frma when target="core":
> names(hugene.1.0.st.v1frmavecs)
[1] "normVec" "probeVec" "probeVarWithin"
[4] "probeVarBetween" "probesetSD" "medianSE"
[7] "probeVecCore" "mapCore"
> names(frozenvector)
[1] "normVec" "probeVec" "probeVarWithin"
[4] "probeVarBetween" "probesetSD" "medianSE"
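The difference between the two listings can be computed directly with setdiff(). This sketch copies the element names from the output above into plain character vectors, so it runs without the frma packages; the variable names prebuilt_names, custom_names and missing_elements are just for illustration:

```r
# Element names copied from the two names() listings above
prebuilt_names <- c("normVec", "probeVec", "probeVarWithin",
                    "probeVarBetween", "probesetSD", "medianSE",
                    "probeVecCore", "mapCore")
custom_names <- c("normVec", "probeVec", "probeVarWithin",
                  "probeVarBetween", "probesetSD", "medianSE")

# Elements present in the pre-built vector but absent from the custom one
missing_elements <- setdiff(prebuilt_names, custom_names)
print(missing_elements)
```

With the packages loaded, the equivalent call on the real objects would be setdiff(names(hugene.1.0.st.v1frmavecs), names(frozenvector)).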
I took a peek at the code on GitHub, and it doesn't look like there is any way to make makeVectorsFeatureSet() compute "probeVecCore". My end goal is to get an ExpressionSet out of frma with a custom frozen vector that has the same probeset IDs as one would get using the pre-computed frozen vector (hugene.1.0.st.v1frmavecs), or using the standard oligo::rma function with the pd.hugene.1.0.st.v1 annotation package.
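To make "same probeset IDs" concrete, this is roughly the check I would like to pass. It is only a sketch: compare_probesets is a hypothetical helper of my own, not part of frma or oligo, and the IDs below are placeholders rather than real probeset IDs:

```r
# Hypothetical helper: compare two sets of probeset IDs, ignoring order
compare_probesets <- function(ids_a, ids_b) {
  list(
    same      = setequal(ids_a, ids_b),   # TRUE when both contain the same IDs
    only_in_a = setdiff(ids_a, ids_b),    # IDs missing from the second set
    only_in_b = setdiff(ids_b, ids_a)     # IDs missing from the first set
  )
}

# Placeholder IDs for illustration only
res <- compare_probesets(c("id1", "id2", "id3"), c("id3", "id1", "id2"))
print(res$same)
```

On real data I would pass in featureNames() of the frma result and featureNames() of the ExpressionSet returned by oligo::rma(featureset, target="core").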
I'd appreciate any help or insight you could provide!
Regards,
Joe
Session Info:
R version 3.2.3 (2015-12-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils
[7] datasets methods base
other attached packages:
[1] hugene.1.0.st.v1frmavecs_1.1.0
[2] pd.hugene.1.0.st.v1_3.14.1
[3] RSQLite_1.0.0
[4] DBI_0.3.1
[5] frmaTools_1.22.0
[6] affy_1.48.0
[7] frma_1.22.0
[8] oligo_1.34.2
[9] Biostrings_2.38.4
[10] XVector_0.10.0
[11] IRanges_2.4.8
[12] S4Vectors_0.8.11
[13] Biobase_2.30.0
[14] oligoClasses_1.32.0
[15] BiocGenerics_0.16.1
loaded via a namespace (and not attached):
[1] affxparser_1.42.0 MASS_7.3-45
[3] GenomicRanges_1.22.4 splines_3.2.3
[5] zlibbioc_1.16.0 bit_1.1-12
[7] foreach_1.4.3 GenomeInfoDb_1.6.3
[9] tools_3.2.3 SummarizedExperiment_1.0.2
[11] ff_2.2-13 iterators_1.0.8
[13] preprocessCore_1.32.0 affyio_1.40.0
[15] codetools_0.2-14 BiocInstaller_1.20.1