Dear list,
I have gene expression data keyed by probeset IDs and gene set data keyed
by gene names, so the two do not use the same gene ID system. GAGE
requires that the gene sets and the expression data share one ID system.
The problem is that if I convert the probeset IDs to gene names, several
probesets map to a single gene name, so the expression data ends up with
multiple rows for the same gene. Is this valid in GAGE? How does the GAGE
package handle it? GSEA can address this problem, but how does GAGE?
Conversely, if I convert the gene names in the gene sets to probeset IDs,
which probeset ID should I choose when several are associated with the
same gene?
Thanks,
Jung
Jung,
This issue is covered in Section 7 of the gage vignette, quoted below:
Expression data have multiple probesets (as in Affymetrix GeneChip Data)
for a single gene, but gene set analysis requires one entry per gene. You
may pick up the most differentially expressed probeset for a gene and
discard the rest, or process the raw intensity data using customized
probe set definition (CDF file), where probes are re-mapped on a gene by
gene base. Check the Methods section of GAGE paper for details.
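The first option above (keep the most differentially expressed probeset
per gene and discard the rest) can be sketched as follows. This is only
an illustration in plain Python, not code from the gage package. The
function name collapse_probesets, the probeset and gene IDs, and the
choice of score (absolute difference in mean expression between case and
control samples, assuming log-scale values) are all assumptions made for
the example; other measures of differential expression would work the
same way.

```python
from statistics import mean


def collapse_probesets(expr, probe_to_gene, ctrl_idx, case_idx):
    """Return one expression row per gene.

    For each gene, keep the probeset with the largest absolute difference
    in mean expression between case and control samples (an assumed
    stand-in for "most differentially expressed").
    """
    best = {}  # gene -> (score, probeset)
    for probe, values in expr.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # probesets without a gene mapping are dropped
        diff = mean(values[i] for i in case_idx) - mean(values[i] for i in ctrl_idx)
        score = abs(diff)
        if gene not in best or score > best[gene][0]:
            best[gene] = (score, probe)
    # Re-key the selected rows by gene name
    return {gene: expr[probe] for gene, (_, probe) in best.items()}


# Hypothetical data: two control samples followed by two case samples
expr = {
    "ps_1": [5.0, 5.2, 7.1, 7.3],  # GENE_A, clearly shifted in cases
    "ps_2": [6.0, 6.1, 6.2, 6.0],  # GENE_A, almost unchanged
    "ps_3": [8.0, 8.2, 7.9, 8.1],  # GENE_B, only probeset for this gene
    "ps_4": [3.0, 3.1, 3.0, 3.2],  # no gene mapping
}
probe_to_gene = {"ps_1": "GENE_A", "ps_2": "GENE_A", "ps_3": "GENE_B"}

collapsed = collapse_probesets(expr, probe_to_gene, ctrl_idx=[0, 1], case_idx=[2, 3])
```

Here collapsed has one row per gene name, so its keys can be matched
directly against gene sets that use gene names. The second option
(re-mapping probes with a customized CDF file) happens at the raw data
processing stage and is not covered by this sketch.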
Weijun Luo
--- On Thu, 3/1/12, Javerjung Sandhu <jsandhu@bcgsc.ca> wrote:
From: Javerjung Sandhu <jsandhu@bcgsc.ca>
Subject: GAGE Package: Problem with probeset IDs conversion to Gene
names or vice versa
To: "bioconductor@r-project.org" <bioconductor@r-project.org>
Cc: "luo_weijun@yahoo.com" <luo_weijun@yahoo.com>
Date: Thursday, March 1, 2012, 9:51 PM