Question

GAGE Package: Problem with probeset IDs conversion to Gene names or vice versa

Javerjung Sandhu ▴ 200

@javerjung-sandhu-5043

Last seen 10.6 years ago

Dear list,

I have gene expression data keyed by probeset IDs and gene set data keyed by gene names, so the two do not use the same gene ID system. GAGE analysis requires that the gene sets and the expression data use the same gene ID system.

If I convert the probeset IDs to gene names, several probesets map to a single gene name, so the expression data ends up with multiple rows sharing the same gene name. Is this valid in GAGE? How does the GAGE package handle it? GSEA can address this problem, but how does GAGE do it?

If I instead convert the gene names in the gene sets to probeset IDs, which probeset ID should I choose when multiple probeset IDs are associated with a gene?

Thanks,
Jung

convert gage • 1.2k views

updated 13.1 years ago by Luo Weijun ★ 1.6k • written 13.1 years ago by Javerjung Sandhu ▴ 200

score 0 · Answer 1 · 2012-03-20

Jung,

This issue is covered in Section 7 of the gage vignette, quoted below:

Expression data have multiple probesets (as in Affymetrix GeneChip data) for a single gene, but gene set analysis requires one entry per gene. You may pick the most differentially expressed probeset for a gene and discard the rest, or process the raw intensity data using a customized probe set definition (CDF file), where probes are re-mapped on a gene-by-gene basis. Check the Methods section of the GAGE paper for details.

Weijun Luo

(In reply to Javerjung Sandhu's message to the Bioconductor list of Thursday, March 1, 2012, 9:51 PM.)
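The first option from the vignette, keeping the most differentially expressed probeset per gene and discarding the rest, might be sketched in base R as follows. Everything here is an illustrative assumption rather than code from the gage package: the toy matrix `expr`, the probeset-to-gene lookup `probe2gene`, the column indices `ctrl` and `trt`, and the choice of "absolute difference in group means" as the measure of differential expression (a t-statistic or fold change would work the same way).

```r
# Toy expression matrix: rows are probesets, columns are samples (made-up values).
expr <- matrix(
  c(5.0, 5.2, 7.9, 8.1,   # ps_1 -> GENE_A
    6.0, 6.1, 6.2, 6.0,   # ps_2 -> GENE_A
    4.0, 4.1, 3.0, 3.2,   # ps_3 -> GENE_B
    9.0, 9.1, 9.0, 9.2),  # ps_4 -> GENE_C
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ps_", 1:4), c("ctrl1", "ctrl2", "trt1", "trt2"))
)

# Hypothetical probeset-to-gene lookup; in practice this would come from
# the chip's annotation package.
probe2gene <- c(ps_1 = "GENE_A", ps_2 = "GENE_A", ps_3 = "GENE_B", ps_4 = "GENE_C")

ctrl <- 1:2  # control sample columns (assumed layout)
trt  <- 3:4  # treated sample columns (assumed layout)

# Per-probeset score: absolute difference between group means.
score <- abs(rowMeans(expr[, trt, drop = FALSE]) -
             rowMeans(expr[, ctrl, drop = FALSE]))

# Drop probesets with no gene annotation, then group the rest by gene.
genes  <- probe2gene[rownames(expr)]
mapped <- !is.na(genes)
by_gene <- split(names(score)[mapped], genes[mapped])

# For each gene, keep the name of its highest-scoring probeset.
best <- vapply(by_gene, function(ps) ps[which.max(score[ps])], character(1))

# One row per gene, labelled with gene names instead of probeset IDs.
expr_gene <- expr[best, , drop = FALSE]
rownames(expr_gene) <- names(best)
```

The resulting `expr_gene` has one row per gene, so it uses the same ID system as gene sets defined by gene name. Note that selecting probesets by differential expression uses the same sample grouping as the downstream test, which is worth keeping in mind when interpreting the results.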