affy::rma results doesn't match with ncbi website
2
0
Entering edit mode
akp9132 • 0
@024693d4
Last seen 12 hours ago
Iran

I'm trying to find out how RMA normalized signal intensity values in this sample have been calculated. I tried to perform RMA in R, but the results are different

library(affy)
cell_file = ReadAffy()
mat = exprs(cell_file)
exprs(affy::rma(cell_file))[1]

This code returns the value 11.79282 for the first probe (1007_s_at), but it should be 12.123249 according to the website.

I also asked the same question here

affy Preprocessing Microarray • 274 views
ADD COMMENT
0
Entering edit mode
ATpoint ★ 4.7k
@atpoint-13662
Last seen 8 hours ago
Germany

Exact OMCIS reproduction down to the values without code and software version is borderline impossible. Either use the values as provided or start with the CEL files and apply the established best practices. The latter is more reproducible, the former is easier. Choice is yours.

ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 2 hours ago
United States

Please note that RMA is a method intended to normalize and summarize a set of samples rather than a single sample (it doesn't make much sense in the context of a single sample). You should instead get all the data and process as a set.

> library(GEOquery)


> options(timeout = 1e6)
> getGEOSuppFiles("GSE2990")
> setwd("GSE2990/")
> untar("GSE2990_RAW.tar")

> dat <- read.celfiles(list.celfiles(listGzipped = TRUE))
Loading required package: pd.hg.u133a

Loading required package: pd.hg.u133a
Loading required package: RSQLite
Loading required package: DBI
Platform design info loaded.
Reading in : gsm65316.cel.gz
Reading in : gsm65317.cel.gz

<snip>

Warning message:
package 'RSQLite' was built under R version 4.4.2 
> eset <- rma(dat)
Background correcting
Normalizing
Calculating Expression
> exprs(eset)[1,1,drop = FALSE]
          gsm65316.cel.gz
1007_s_at        12.11781

That's closer than you got. It's not entirely clear how the submitter processed the data. They just say something like 'normal RMA'. If you really want their data, it's easy enough to do that as well

> z <- getGEO("GSE2990")[[1]]
Found 1 file(s)
GSE2990_series_matrix.txt.gz

> exprs(z)[1,1,drop = FALSE]
          GSM65316
1007_s_at 12.12325

But there's something weird going on with the normalization that the submitter performed.

> plot(density(exprs(eset)[,1]))
> for(i in 2:ncol(eset)) lines(density(exprs(eset)[,i]))

enter image description here

> plot(density(exprs(z)[,1]))
> for(i in 2:ncol(z)) lines(density(exprs(z)[,i]))

enter image description here

So I would probably just re-run the analysis using the raw data.

Login before adding your answer.

Traffic: 574 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6