I am not familiar with Illumina arrays and need some hints, because I am trying to work with a data set from the HumanHT-12 V4 BeadChip array deposited at GEO: "GSE73255".
I am following two approaches to get the data.
The first is explained in the beadarray package:
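    library(GEOquery)
    url <- "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE33126/"
    # filenm was not defined in the post; the standard series matrix file name is assumed here
    filenm <- "GSE33126_series_matrix.txt.gz"
    download.file(paste(url, filenm, sep=""), destfile=filenm)
    gse <- getGEO(filename=filenm)
    head(exprs(gse))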
The second is explained in GEO:
    gset <- getGEO("GSE73255", GSEMatrix = TRUE, getGPL = FALSE)
    if (length(gset) > 1) idx <- grep("GPL6947", attr(gset, "names")) else idx <- 1
    gset <- gset[[idx]]
    gset <- exprs(gset)
Based on pData(gset)$data_processing, this file has been normalized with the Bioconductor (3.0) lumi pipeline using loess normalization, if I am not mistaken.
When I try to summarize the expression to have one probe per gene using beadarray as follows:

    library("illuminaHumanv4.db")
    library(beadarray)
    summaryData <- as(gse, "ExpressionSetIllumina")

or

    summaryData <- as(gset, "ExpressionSetIllumina")

I get this error in R:

    Error in object@channelData[[1]] : subscript out of bounds

What am I doing wrong at this stage?
I would also like to know whether I can use the raw data and perform RMA normalization on this type of data.
I would appreciate it if anyone could help me with the answer.
I can only say that I use limma code for importing and normalising this type of array, and then use the limma avereps function to average the probes for each gene. It is easy. I never got on well with beadarray for some reason.
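To make the averaging step concrete, here is a minimal base R sketch of what averaging probes per gene amounts to. It is not the limma code itself: in practice the function referred to above is limma's `avereps(x, ID = ...)`. The matrix values, probe names and gene labels below are made up for illustration.

```r
# Toy expression matrix: rows are probes, columns are samples (made-up values).
expr <- matrix(
  c( 1,  3,    # probe_1, maps to GENE_A
     3,  5,    # probe_2, maps to GENE_A
    10, 20),   # probe_3, maps to GENE_B
  nrow = 3, byrow = TRUE,
  dimnames = list(c("probe_1", "probe_2", "probe_3"),
                  c("sample_1", "sample_2"))
)

# Hypothetical probe-to-gene mapping, in the same order as the rows of expr.
gene <- c("GENE_A", "GENE_A", "GENE_B")

# Sum the rows that share a gene, then divide by the number of probes per gene.
sums   <- rowsum(expr, group = gene, reorder = FALSE)
counts <- table(gene)[rownames(sums)]
avg    <- sums / as.vector(counts)

print(avg)
```

With real data the gene labels would come from the array annotation rather than a hand-written vector, and probes without a gene label need to be handled before averaging.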
Thanks for the hints, Chris! I have always worked with Affy arrays, so I do not have much of an idea about bead arrays. But I am going to look into what you have suggested.
What version of Bioconductor are you using? It seems fine for me on Bioconductor 3.6.
I wouldn't recommend averaging the probes for the same gene though. Some of the probes on these arrays can be badly annotated, so by averaging you can dilute the signal for the gene. If you really want one measurement for a gene, what I usually do is pick the probe with the highest variance.
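As a rough illustration of picking the highest-variance probe per gene, here is a small base R sketch on toy data. The matrix, the probe names and the `gene` mapping are all hypothetical. With a real data set the mapping would come from the probe annotation, and the exact code used in practice may differ.

```r
# Toy expression matrix: rows are probes, columns are samples (made-up values).
expr <- matrix(
  c(5.0, 5.1, 4.9, 5.0,    # probe_1, GENE_A, nearly flat
    2.0, 8.0, 3.0, 9.0,    # probe_2, GENE_A, highly variable
    7.0, 7.5, 6.5, 7.0),   # probe_3, GENE_B, only probe for this gene
  nrow = 3, byrow = TRUE,
  dimnames = list(c("probe_1", "probe_2", "probe_3"),
                  paste0("sample_", 1:4))
)

# Hypothetical probe-to-gene mapping, named by probe ID.
gene <- c(probe_1 = "GENE_A", probe_2 = "GENE_A", probe_3 = "GENE_B")

# Variance of each probe across samples.
probe_var <- apply(expr, 1, var)

# Group probe IDs by gene and keep the one with the largest variance.
# Probes whose gene is NA are dropped by split().
probes_by_gene <- split(names(probe_var), gene[names(probe_var)])
best_probe <- vapply(
  probes_by_gene,
  function(p) p[which.max(probe_var[p])],
  character(1)
)

# One row per gene, taken from the selected probe.
expr_by_gene <- expr[best_probe, , drop = FALSE]
rownames(expr_by_gene) <- names(best_probe)

print(best_probe)
print(expr_by_gene)
```

Unlike averaging, this keeps the values of a single real probe for each gene, so a poorly annotated, flat probe does not dilute the signal.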
By converting the GEOquery object to a beadarray one, you get all the information about the probe annotation.
Hi @Mark,
I am using
I have not managed to solve the problem yet. I managed to download the data matrix with
but
gives the previous error!
Would it be possible for you to try with `GSE73255` instead? I would also appreciate it if you could explain how I can pick the probe with the highest variance for each gene.