Hello everyone, I need help understanding how to solve a problem that I have already spent a lot of time on without success.
These are the steps I followed (as in the beadarray vignette):
> source("http://www.bioconductor.org/biocLite.R")
Bioconductor version 3.4 (BiocInstaller 1.24.0), ?biocLite for help
> library (GEOquery)
> library ("beadarray")
> gse <- getGEO("GSE16570")
https://ftp.ncbi.nlm.nih.gov/geo/series/GSE16nnn/GSE16570/matrix/
OK
Found 1 file(s)
GSE16570_series_matrix.txt.gz
Using locally cached version: /tmp/RtmpqrUsaF/GSE16570_series_matrix.txt.gz
Using locally cached version of GPL6947 found here:
/tmp/RtmpqrUsaF/GPL6947.soft
Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote, :
not all columns named in 'colClasses' exist
> head(exprs(gse[[1]]))
GSM416669 GSM416670 GSM416671 GSM416672 GSM416673 GSM416674 GSM416675 GSM416676 GSM416677 GSM416678 GSM416679 GSM416680
ILMN_1343291 41536.0 45373.90 44729.9344 45373.9000 37678.2414 45373.9000 41401.46 41710.67 45373.90 43194.8834 45373.90 39210.2438
ILMN_1343295 15916.2 18469.74 14981.4909 17471.4313 15566.8978 14633.9670 17822.06 18731.63 16440.84 14719.4626 20182.79 12236.7710
ILMN_1651199 50.7 48.00 43.1333 51.0000 49.5333 42.3000 48.15 45.90 47.30 46.5000 44.60 46.1500
ILMN_1651209 66.4 61.60 66.9000 80.8750 73.8407 85.5833 100.76 80.50 58.40 82.5733 90.50 96.5059
ILMN_1651210 51.3 45.35 48.5000 45.8000 47.5000 51.2500 48.40 50.95 46.40 47.4000 53.20 49.7500
ILMN_1651221 49.6 58.70 57.8000 67.9846 57.1000 55.5500 67.54 61.10 54.00 47.4000 61.50 63.6200
GSM416681 GSM416682 GSM416683 GSM416684
ILMN_1343291 34557.5028 38430.8676 45373.9000 42919.1497
ILMN_1343295 16962.7363 12016.3114 15753.2659 17366.6322
ILMN_1651199 50.9500 51.8000 41.4600 56.2500
ILMN_1651209 86.9375 92.5625 75.0444 76.8083
ILMN_1651210 48.9500 52.6500 48.9500 42.2250
ILMN_1651221 73.1000 51.9500 55.2333 50.9500
> gse
$GSE16570_series_matrix.txt.gz
ExpressionSet (storageMode: lockedEnvironment)
assayData: 48803 features, 16 samples
element names: exprs
protocolData: none
phenoData
sampleNames: GSM416669 GSM416670 ... GSM416684 (16 total)
varLabels: title geo_accession ... data_row_count (32 total)
varMetadata: labelDescription
featureData
featureNames: ILMN_1343291 ILMN_1343295 ... ILMN_2416019 (48803 total)
fvarLabels: ID nuID ... GB_ACC (30 total)
fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL6947
But when I try to convert the object to beadarray's native class, I get the following error:
> summaryData <- as(gse[[1]], "ExpressionSetIllumina")
Error in object@channelData[[1]] : subscript out of bounds
Note that I get the same error using the code from NCBI:
> gse <- getGEO("GSE16570", GSEMatrix =TRUE, getGPL=FALSE)
https://ftp.ncbi.nlm.nih.gov/geo/series/GSE16nnn/GSE16570/matrix/
OK
Found 1 file(s)
GSE16570_series_matrix.txt.gz
Using locally cached version: /tmp/RtmpqrUsaF/GSE16570_series_matrix.txt.gz
> if (length(gset) > 1) idx <- grep("GPL6947", attr(gse, "names")) else idx <- 1
> gse <- gse[[idx]]
> head(exprs(gse))
GSM416669 GSM416670 GSM416671 GSM416672 GSM416673 GSM416674 GSM416675 GSM416676 GSM416677 GSM416678 GSM416679 GSM416680
ILMN_1343291 41536.0 45373.90 44729.9344 45373.9000 37678.2414 45373.9000 41401.46 41710.67 45373.90 43194.8834 45373.90 39210.2438
ILMN_1343295 15916.2 18469.74 14981.4909 17471.4313 15566.8978 14633.9670 17822.06 18731.63 16440.84 14719.4626 20182.79 12236.7710
ILMN_1651199 50.7 48.00 43.1333 51.0000 49.5333 42.3000 48.15 45.90 47.30 46.5000 44.60 46.1500
ILMN_1651209 66.4 61.60 66.9000 80.8750 73.8407 85.5833 100.76 80.50 58.40 82.5733 90.50 96.5059
ILMN_1651210 51.3 45.35 48.5000 45.8000 47.5000 51.2500 48.40 50.95 46.40 47.4000 53.20 49.7500
ILMN_1651221 49.6 58.70 57.8000 67.9846 57.1000 55.5500 67.54 61.10 54.00 47.4000 61.50 63.6200
GSM416681 GSM416682 GSM416683 GSM416684
ILMN_1343291 34557.5028 38430.8676 45373.9000 42919.1497
ILMN_1343295 16962.7363 12016.3114 15753.2659 17366.6322
ILMN_1651199 50.9500 51.8000 41.4600 56.2500
ILMN_1651209 86.9375 92.5625 75.0444 76.8083
ILMN_1651210 48.9500 52.6500 48.9500 42.2250
ILMN_1651221 73.1000 51.9500 55.2333 50.9500
> summaryData <- as(gse, "ExpressionSetIllumina")
Error in object@channelData[[1]] : subscript out of bounds
Any suggestions are welcome.
Here is my sessionInfo:
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] beadarray_2.24.0 ggplot2_2.2.0 GEOquery_2.40.0 Biobase_2.34.0 BiocGenerics_0.20.0 BiocInstaller_1.24.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.8 plyr_1.8.4 GenomeInfoDb_1.10.1 XVector_0.14.0 bitops_1.0-6 tools_3.3.2
[7] zlibbioc_1.20.0 digest_0.6.10 base64_2.0 RSQLite_1.1 memoise_1.0.0 tibble_1.2
[13] gtable_0.2.0 DBI_0.5-1 httr_1.2.1 stringr_1.1.0 S4Vectors_0.12.1 IRanges_2.8.1
[19] stats4_3.3.2 grid_3.3.2 R6_2.2.0 AnnotationDbi_1.36.0 XML_3.98-1.5 limma_3.30.6
[25] BeadDataPackR_1.26.0 reshape2_1.4.2 magrittr_1.5 scales_0.4.1 GenomicRanges_1.26.1 assertthat_0.1
[31] colorspace_1.3-1 stringi_1.1.2 openssl_0.9.5 RCurl_1.95-4.8 lazyeval_0.2.0 munsell_0.4.3
[37] illuminaio_0.16.0
Best regards
What are you planning to do with the object? Does it actually need to be an ExpressionSetIllumina?
validObject() is failing for very similar reasons to the first error, namely that the dimension checks in Biobase aren't actually appropriate for the ExpressionSetIllumina class.
It's been a long time since I've worked with beadarray data, but I seem to remember that ExpressionSetIllumina mostly extends a regular ExpressionSet in order to store the number of replicate beads for each observation, the standard error for each bead type, and some measure of detection above background. I'm not sure you have any of that information in the GEO data you're downloading, so does it ever need to be converted to ExpressionSetIllumina?
I might suggest continuing without that step and seeing what happens. Either the tools will work or they won't, but it's probably worth proceeding anyway.
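As a rough sketch of what "continuing without that step" could look like: the code below keeps the plain ExpressionSet returned by getGEO(), checks which assay elements it actually carries, and then log-transforms and normalises it with limma. The log2 step and the quantile normalisation are assumptions on my part, based only on the unlogged-looking intensities in the head() output above. The submitter may already have normalised these values, so check the GEO record before applying either step.

```r
library(GEOquery)
library(Biobase)
library(limma)

# getGEO() returns a list of ExpressionSet objects, one per platform.
gse <- getGEO("GSE16570", GSEMatrix = TRUE)
eset <- gse[[1]]

# See which assay elements came with the download. The printout in the
# question lists only "exprs", i.e. no bead counts, standard errors or
# detection values for an ExpressionSetIllumina to hold.
assayDataElementNames(eset)

# Assumption: intensities are on the raw scale (values up to ~45000 above),
# so take log2. Non-positive values, if any, would become NaN or -Inf and
# would need handling first.
exprs(eset) <- log2(exprs(eset))

# Assumption: quantile normalisation between arrays is wanted; this is one
# common choice and may be redundant if the data were already normalised.
exprs(eset) <- normalizeBetweenArrays(exprs(eset), method = "quantile")

# Quick visual check of the per-sample distributions.
boxplot(exprs(eset), las = 2, outline = FALSE)
```

From here the object can be passed to tools that accept a regular ExpressionSet (limma's lmFit(), for example). Whether a particular beadarray function accepts it is something you would have to try.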