Question

Probe summarization of Human HT-12 V4 BeadChip arrays


Seymoo • 0

@seymoo-12522


Oslo

I am not familiar with Illumina arrays, so I need some hints. I am trying to work with a data set from the Human HT-12 V4 BeadChip array deposited at GEO as "GSE73255".

I am following two approaches to get the data.

As explained in the beadarray package:

library(GEOquery)

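url <- "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE33126/"

# file name assumed here; it was not defined in the original post
filenm <- "GSE33126_series_matrix.txt.gz"

download.file(paste(url, filenm, sep=""), destfile=filenm)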

gse <- getGEO(filename=filenm)

head(exprs(gse))

As explained in GEO:

gset <- getGEO("GSE73255", GSEMatrix =TRUE, getGPL=FALSE)
if (length(gset) > 1) idx <- grep("GPL6947", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]
gset <- exprs(gset)


Based on pData(gset)$data_processing, this file has been normalized with the Bioconductor (3.0) lumi pipeline using loess normalization, if I am not mistaken.

When I try to summarize the expression to have one probe per gene using beadarray as follows:

library("illuminaHumanv4.db")

library(beadarray)

summaryData <- as(gse, "ExpressionSetIllumina")

or

summaryData <- as(gset, "ExpressionSetIllumina")

I get the following error in R:

Error in object@channelData[[1]] : subscript out of bounds

What am I doing wrong at this stage?

I would also like to know whether I can use the raw data and perform RMA normalization on this type of data.

I would appreciate it if anyone could help me with the answer.

beadarray bead chip illuminahumanv4.db normalization RMA • 2.2k views

ADD COMMENT • link updated 7.4 years ago by James W. MacDonald 68k • written 7.4 years ago by Seymoo • 0


I can only say that I use limma code for importing and normalising this type of array, and then use the limma avereps function to average probes to genes. It is easy. I never got on well with beadarray for some reason.
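For readers new to this step, here is a minimal base R sketch of what averaging replicate probes per gene amounts to. The toy matrix and the `gene` labels are made up for illustration; with limma installed, `avereps(expr, ID = gene)` is the packaged way to do the same kind of averaging.

```r
# Toy log2 expression matrix (made-up values): rows are probes, columns are samples
expr <- matrix(
  c(7.0, 7.1, 6.9, 7.0,
    5.0, 9.0, 5.5, 8.5,
    6.0, 6.2, 6.1, 5.9),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("probe", 1:3), paste0("sample", 1:4))
)

# Hypothetical probe-to-gene mapping: probe1 and probe2 target the same gene
gene <- c("GENE_A", "GENE_A", "GENE_B")

# Group row indices by gene, then average the rows within each group
idx <- split(seq_len(nrow(expr)), gene)
expr_avg <- do.call(
  rbind,
  lapply(idx, function(i) colMeans(expr[i, , drop = FALSE]))
)

print(expr_avg)
```

The result has one row per gene, so it can be passed on to anything that expects a gene-level matrix.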

ADD REPLY • link 7.4 years ago chris86 ▴ 420

Thanks for the hints, Chris! I have always worked with Affy arrays, so I do not have much of an idea about BeadArrays. But I am going to look into what you have suggested.

ADD REPLY • link 7.4 years ago Seymoo • 0


What version of Bioconductor are you using? It seems fine for me on Bioconductor 3.6.

library(GEOquery)
gse <- getGEO("GSE33126")[[1]]
eset <- as(gse, "ExpressionSetIllumina")
sessionInfo()

I wouldn't recommend averaging the probes for the same gene though. Some of the probes on these arrays can be badly annotated, so by averaging you can dilute the signal for the gene. If you really want one measurement for a gene, what I usually do is pick the probe with the highest variance.
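Picking the probe with the highest variance for each gene can be sketched in base R as below. This is only an illustration under stated assumptions: the toy matrix and the `gene` vector are made up, and in a real analysis the gene labels would come from the probe annotation.

```r
# Toy log2 expression matrix (made-up values): rows are probes, columns are samples
expr <- matrix(
  c(7.0, 7.1, 6.9, 7.0,
    5.0, 9.0, 5.5, 8.5,
    6.0, 6.2, 6.1, 5.9,
    8.0, 4.0, 8.2, 4.1,
    3.0, 3.1, 2.9, 3.0),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("probe", 1:5), paste0("sample", 1:4))
)

# Hypothetical probe-to-gene mapping, one entry per row of expr
gene <- c("GENE_A", "GENE_A", "GENE_B", "GENE_B", "GENE_C")

# Variance of each probe across samples
probe_var <- apply(expr, 1, var)

# For each gene, keep the name of the probe with the largest variance
keep <- vapply(
  split(names(probe_var), gene),
  function(p) p[which.max(probe_var[p])],
  character(1)
)

# One row per gene, taken from the selected probe
expr_by_gene <- expr[keep, , drop = FALSE]
rownames(expr_by_gene) <- names(keep)

print(keep)
```

Unlike averaging, this keeps the measured values of a single probe per gene, so a flat or poorly performing probe does not dilute the signal.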

By converting the GEOquery object to a beadarray one, you get all the information about the probe annotation:

table(fData(eset)$PROBEQUALITY)

ADD REPLY • link 7.4 years ago Mark Dunning ★ 1.1k


Hi @Mark,

I am using

BioC_mirror: https://bioconductor.org
Using Bioconductor 3.4 (BiocInstaller 1.24.0), R 3.3.2 (2016-10-31)

I have not managed to solve the problem yet. I managed to download the data with

gse <- getGEO("GSE73255", GSEMatrix = FALSE)

but

eset <- as(gse, "ExpressionSetIllumina")

gives the same error as before!
Would it be possible for you to try with `GSE73255` instead? I would also appreciate it if you could explain how I can pick the probe with the highest variance for each gene.

ADD REPLY • link 7.4 years ago Seymoo • 0

score 0 · Answer 1 · 2017-11-27

These data are from Illumina arrays, so by definition you cannot run RMA! That algorithm is intended for Affymetrix arrays, not Illumina.

You can use getGEOSuppFiles to download the raw data, but those data are simply a file in which the beads have already been summarized to an average detection value, along with the detection p-value. So you don't get the IDAT files, and you will have to figure out how to stuff those data into a useful container. Probably the easiest thing to do would be to extract the AVG_Signal columns, put them into a limma EList object, and then normalize using a loess normalization.

If you don't know what all that means, you would be better off to find somebody local who can help, as this is a non-trivial exercise for a newcomer.
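As a rough illustration of the extraction step described above, here is a minimal sketch on a simulated table. The column names (`ID_REF`, `S1.AVG_Signal`, `S1.Detection Pval`) and the probe IDs are assumptions, so check the actual supplementary file before reusing it. The normalization call only runs if limma is installed, and it uses cyclic loess, which is assumed here to be what "loess normalization" refers to for single-channel data.

```r
set.seed(1)
n_probes <- 50

# Simulated stand-in for a non-normalized supplementary table (all values made up)
raw <- data.frame(
  ID_REF = paste0("ILMN_", seq_len(n_probes)),
  `S1.AVG_Signal` = 2^rnorm(n_probes, mean = 8),
  `S1.Detection Pval` = runif(n_probes),
  `S2.AVG_Signal` = 2^rnorm(n_probes, mean = 8.5),
  `S2.Detection Pval` = runif(n_probes),
  check.names = FALSE
)

# Keep only the AVG_Signal columns and move to the log2 scale
sig_cols <- grep("AVG_Signal", colnames(raw), fixed = TRUE)
E <- log2(as.matrix(raw[, sig_cols]))
rownames(E) <- raw$ID_REF

# Optional: between-array cyclic loess normalization, only if limma is available
if (requireNamespace("limma", quietly = TRUE)) {
  E_norm <- limma::normalizeBetweenArrays(E, method = "cyclicloess")
}

print(dim(E))
```

The matrix `E` is the piece that would go into the expression slot of a container such as a limma EList. The detection p-value columns can be extracted the same way if they are needed for filtering.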