Hello, I am very new to analyzing microarray data and could use some help. I am trying to read in CEL files and draw a boxplot of the probe intensities, but I'm running into a problem. Any help is greatly appreciated! These are the commands I'm running:
library(GEOquery)
library(oligo)
library(limma)
source("https://bioconductor.org/biocLite.R")
# Install annotation
biocLite("pd.hugene.1.0.st.v1")
library(pd.hugene.1.0.st.v1)
getGEOSuppFiles("GSE52882")
# Untar GEO tar file in shell
celFiles = list.celfiles("GSE52882", full.names=TRUE)[1:6]
affyRaw = read.celfiles(celFiles)
boxplot(affyRaw)
The error I'm getting is:
Error in match.arg(target, c("probeset", "core", "full", "extended")) : 'arg' should be one of “probeset”, “core”, “full”, “extended”
Traceback returns:
14: stop(gettextf("'arg' should be one of %s", paste(dQuote(choices), collapse = ", ")), domain = NA)
13: match.arg(target, c("probeset", "core", "full", "extended"))
12: stArrayPmInfo(object, target = target, sortBy = NULL)
11: .local(object, ...)
10: pmindex(getPlatformDesign(object), subset = subset, target = target)
9: pmindex(getPlatformDesign(object), subset = subset, target = target)
8: .local(object, ...)
7: pmindex(x, target = target)
6: pmindex(x, target = target)
5: getProbeIndex(x, type = which, target = target)
4: unique(getProbeIndex(x, type = which, target = target))
3: .local(x, ...)
2: boxplot(affyRaw)
1: boxplot(affyRaw)
affyRaw seems to be correct:
affyRaw
GeneFeatureSet (storageMode: lockedEnvironment)
assayData: 1102500 features, 6 samples
  element names: exprs
protocolData
  rowNames: GSM1277464_bh20130307hg10_07_A375P31_773_HuGene-1_0-st-v1_.CEL GSM1277465_bh20130307hg10_08_A375P32_773_HuGene-1_0-st-v1_.CEL ... GSM1277469_bh20130307hg10_12_A375PLX33_773_HuGene-1_0-st-v1_.CEL (6 total)
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: GSM1277464_bh20130307hg10_07_A375P31_773_HuGene-1_0-st-v1_.CEL GSM1277465_bh20130307hg10_08_A375P32_773_HuGene-1_0-st-v1_.CEL ... GSM1277469_bh20130307hg10_12_A375PLX33_773_HuGene-1_0-st-v1_.CEL (6 total)
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.hugene.1.0.st.v1
And my session info is:
R version 3.2.3 (2015-12-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.12.4 (unknown)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.6-1
 [3] RSQLite_1.1-2              limma_3.26.9
 [5] oligo_1.34.2               Biostrings_2.38.4
 [7] XVector_0.10.0             IRanges_2.4.8
 [9] S4Vectors_0.8.11           oligoClasses_1.32.0
[11] GEOquery_2.36.0            Biobase_2.30.0
[13] BiocGenerics_0.16.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.10               affxparser_1.42.0
 [3] splines_3.2.3              GenomicRanges_1.22.4
 [5] zlibbioc_1.16.0            bit_1.1-12
 [7] foreach_1.4.3              GenomeInfoDb_1.6.3
 [9] tools_3.2.3                SummarizedExperiment_1.0.2
[11] ff_2.2-13                  iterators_1.0.8
[13] digest_0.6.12              preprocessCore_1.32.0
[15] affyio_1.40.0              bitops_1.0-6
[17] codetools_0.2-15           RCurl_1.95-4.8
[19] memoise_1.0.0              BiocInstaller_1.20.3
[21] XML_3.98-1.6
I don't want to say something wrong, as I am a beginner myself, but I think the problem is that, to use the boxplot function of oligo on these arrays, you need data coming from oligo::rma (even if not normalized yet), because you need to select the "Level of summarization (only for Exon/Gene arrays)" (see ?oligo::rma and the oligo user guide). That is what the error message is telling you: target has to be one of "probeset", "core", "full" or "extended".
Here is an example, where I worked at the gene level (see target="core"):
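A minimal sketch along those lines, not tested against your data. It assumes oligo and pd.hugene.1.0.st.v1 are installed and that the CEL files from GSE52882 were extracted into a "GSE52882" directory, as in your commands. The variable name eset is just for illustration. Since your traceback shows boxplot forwarding a target argument, passing target="core" directly to boxplot on the raw object may also work, but I have not checked that on your oligo version.

```r
# Sketch only: assumes the CEL files are already extracted into "GSE52882"
library(oligo)

# Read the first six CEL files, as in the question
celFiles <- list.celfiles("GSE52882", full.names = TRUE)[1:6]
affyRaw  <- read.celfiles(celFiles)

# Option 1 (assumption): give boxplot a summarization level that the
# Gene ST platform design accepts, instead of relying on the default
boxplot(affyRaw, target = "core")

# Option 2: summarize at the gene level with RMA, then plot the result
eset <- rma(affyRaw, target = "core")  # eset is a hypothetical name
boxplot(eset)
```

With target="core" the probes are grouped at the gene (transcript cluster) level; target="probeset" would keep them at the probeset level instead.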
This might help you.