Affymetrix hgu133plus2.db: How to best derive an expression value for genes that map to multiple probe ids
@matmu

I want to map probe ids of the Affymetrix HG-U133_Plus_2 array to Ensembl gene ids using the package hgu133plus2.db. Many genes have multiple probe ids assigned to them, and some probe ids map to multiple Ensembl gene ids (I am currently removing those). What is the best approach for choosing a single expression value that represents the expression of a gene? Or is aggregating the probes by their mean the better way to go? I guess one could make the selection based on the probe id suffixes.

Suffixes included in hgu133plus2.db: "s_at" "at" "g_at" "i_at" "f_at" "a_at" "x_at" "r_at" "3_at" "5_at" "M_at" "MA_at" "MB_at" "alu_at"

library(hgu133plus2.db)
library(stringi)

# Map every probe id on the array to its Ensembl gene id(s)
anno = AnnotationDbi::select(hgu133plus2.db,
                             keys = keys(hgu133plus2.db, keytype = "PROBEID"),
                             keytype = "PROBEID",
                             columns = c("ENSEMBL"))

# Suffix = everything after the first underscore (stri_split_fixed is vectorized, so no lapply is needed)
suffixes = unique(stringi::stri_split_fixed(str = anno$PROBEID, pattern = "_", n = 2, simplify = TRUE)[, 2])
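
The filtering step mentioned in the question (dropping probe ids that map to more than one Ensembl gene id) could look roughly like this. It is only a minimal sketch on a toy table: the probe and gene ids are made up for illustration, and `anno_toy` stands in for the data frame returned by `select()`.

```r
# Toy annotation table standing in for the select() result (ids are illustrative only)
anno_toy <- data.frame(
  PROBEID = c("p1_at", "p2_at", "p2_at", "p3_s_at", "p4_x_at", "p5_at"),
  ENSEMBL = c("G1", "G1", "G2", "G1", "G3", NA),
  stringsAsFactors = FALSE
)

# Drop probes that have no Ensembl mapping at all
anno_toy <- anno_toy[!is.na(anno_toy$ENSEMBL), ]

# Count distinct genes per probe and keep only probes mapping to exactly one gene
genes_per_probe <- tapply(anno_toy$ENSEMBL, anno_toy$PROBEID,
                          function(g) length(unique(g)))
unique_probes <- names(genes_per_probe)[genes_per_probe == 1]
anno_unique <- unique(anno_toy[anno_toy$PROBEID %in% unique_probes, ])
```

Here `p2_at` is dropped because it maps to two genes, and `p5_at` because it has no mapping; the gene `G1` is still represented by two probes, which is the situation the question is about.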
ATpoint ★ 4.6k
@atpoint-13662

Hello, this has been discussed quite extensively before. There are many links in this thread: "How to combine multiple probes representing a single gene?"
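
For orientation, the two options raised in the question (averaging all probes of a gene versus keeping one representative probe) can be sketched in base R as follows. This is an illustration under assumptions, not a recommendation from the linked thread: the expression matrix, probe ids and gene ids are toy values, each probe is assumed to map to exactly one gene, and "highest mean expression across samples" is just one possible rule for picking a representative probe.

```r
# Toy expression matrix (probes x samples); values and ids are illustrative only
expr <- matrix(
  c(5,  6,  7,
    9, 10, 11,
    2,  2,  2,
    4,  8,  6),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("p1_at", "p2_s_at", "p3_x_at", "p4_at"),
                  c("s1", "s2", "s3"))
)

# Probe-to-gene map, assuming multi-mapping probes were already removed
probe2gene <- c(p1_at = "G1", p2_s_at = "G1", p3_x_at = "G1", p4_at = "G2")
genes <- probe2gene[rownames(expr)]

# Option A: average all probes of a gene, separately for each sample
gene_sums <- rowsum(expr, group = genes)
gene_mean <- gene_sums / as.vector(table(genes)[rownames(gene_sums)])

# Option B: per gene, keep the probe with the highest mean expression across samples
probe_avg <- rowMeans(expr)
best_probe <- vapply(split(names(probe_avg), genes),
                     function(p) p[which.max(probe_avg[p])],
                     character(1))
gene_max <- expr[best_probe, , drop = FALSE]
rownames(gene_max) <- names(best_probe)
```

With real data, `expr` would be the normalized probe-level matrix. For the averaging variant, `limma::avereps()` offers a ready-made equivalent.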
