Converting probeset IDs to gene IDs: fRMA

Abhishek Pratap ▴ 190

@abhishek-pratap-4927

Last seen 8.5 years ago

United States

Hey guys,

I am using fRMA to normalize and summarize some Affy data. After normalization I get 54k unique probe IDs in the eSet.

I am wondering what the best way is to convert this ExpressionSet into gene-level data. I thought fRMA's documentation said it would produce gene-level summarized values, but I don't see that in the result. Can fRMA do that, or is there another standard way of achieving this in BioC?

    affyBatch_object <- ReadAffy(celfile.path=data_dir)
    normData1 <- frma(affyBatch_object, summarize="random_effect")

I'd appreciate any pointers.

Thanks!
-Abhi

probe affy convert frma • 3.6k views

Updated 10.4 years ago by Peter Langfelder ★ 3.0k • written 10.4 years ago by Abhishek Pratap ▴ 190

Bernd Klaus ▴ 610

@bernd-klaus-6281

Last seen 6.1 years ago

Germany

Hi Abhi,

I am not familiar with the fRMA package, but the ExpressionSet you have after normalization should contain probeset IDs of some sort in the featureData slot. Try

    fData(normData1)

These IDs can then be easily mapped to e.g. ENSEMBL IDs using an appropriate chip-annotation database. See

http://www.bioconductor.org/help/workflows/annotation/annotation/#sample-workflow-ChipDb

for an example, which uses the Human Genome U133 2.0 chip. Similar databases are available on Bioconductor for many microarrays.

Note that some probesets might map to multiple IDs. The easiest thing is to discard them, although you can of course resolve these multiple mappings in a more sophisticated manner if you wish.

Hope that helps,

Bernd
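A minimal sketch of the "discard multi-mapping probesets" step described above; it is not part of the original reply. It assumes the probeset-to-gene table has the two columns `PROBEID` and `SYMBOL`, which is the shape `AnnotationDbi::select()` returns for a chip annotation package. The toy table, the probeset IDs in it, and the helper name `drop_multi_mapped` are made up for illustration, and the chip package named in the comment is only a guess at the platform.

```r
# In a real analysis the mapping table would come from a chip annotation
# package, e.g. (assuming an HG-U133 Plus 2.0 array, which is a guess here):
#   library(hgu133plus2.db)
#   map <- AnnotationDbi::select(hgu133plus2.db,
#                                keys = featureNames(normData1),
#                                columns = "SYMBOL", keytype = "PROBEID")
# A toy table with the same two columns stands in for it below.
map <- data.frame(
  PROBEID = c("ps_1", "ps_2", "ps_2", "ps_3", "ps_4"),
  SYMBOL  = c("GENE_A", "GENE_B", "GENE_C", "GENE_A", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only probesets that map to exactly one gene.
drop_multi_mapped <- function(map) {
  map <- unique(map[!is.na(map$SYMBOL), ])   # drop unannotated rows and duplicates
  n_genes <- table(map$PROBEID)              # number of genes per probeset
  keep <- names(n_genes)[n_genes == 1]
  map[map$PROBEID %in% keep, , drop = FALSE]
}

clean_map <- drop_multi_mapped(map)
print(clean_map)
```

In this toy table `ps_2` is dropped because it maps to two genes and `ps_4` because it has no annotation. Two probesets still point at the same gene afterwards, so a separate choice is needed to get one value per gene.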

10.4 years ago • Bernd Klaus ▴ 610

Abhi,

Sorry for the slow response. fRMA (like RMA) summarizes to the probeset level, which is a step closer to the gene level than the original probe-level data. As Bernd mentioned, you can go from probesets to genes via the appropriate annotation package.

Best,
Matt

10.4 years ago • Matthew McCall ▴ 830

Peter Langfelder ★ 3.0k

@peter-langfelder-4469

Last seen 4 weeks ago

United States

One way to get from probe-level data to gene-level data is the function collapseRows from the CRAN package WGCNA. The approach is described in

Miller JA, Cai C, Langfelder P, Geschwind DH, Kurian SM, Salomon DR, Horvath S (2011) Strategies for aggregating gene expression data: The collapseRows R function. BMC Bioinformatics 12:322, http://www.biomedcentral.com/1471-2105/12/322

Some more information, including an example, is provided at

http://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/collapseRows/

You do need to supply the probe-to-gene mapping, which is available either from the chip manufacturer (presumably Affymetrix) or in an appropriate Bioconductor package.

HTH,

Peter
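To make the idea of collapsing several probesets into one row per gene concrete, here is a small base-R sketch of one such strategy: for each gene, keep the probeset with the highest mean expression. It is a stand-in for illustration only, not the WGCNA implementation; collapseRows itself takes the expression matrix together with gene and probeset identifiers (`datET`, `rowGroup` and `rowID` in its documentation) and offers several methods. The matrix values, the IDs, and the helper name `collapse_max_mean` are all hypothetical.

```r
# Toy probeset-level expression matrix (rows = probesets, columns = samples).
# In practice this would be exprs(normData1); all values here are made up.
expr <- matrix(
  c(5.0, 5.2, 4.8,
    7.1, 6.9, 7.3,
    3.0, 3.1, 2.9,
    8.0, 8.2, 7.8),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("ps_1", "ps_2", "ps_3", "ps_4"), c("s1", "s2", "s3"))
)

# Probeset-to-gene mapping, one gene per probeset (hypothetical IDs).
gene <- c(ps_1 = "GENE_A", ps_2 = "GENE_A", ps_3 = "GENE_B", ps_4 = "GENE_C")

# Hypothetical helper: for each gene, keep the probeset with the highest mean.
collapse_max_mean <- function(expr, gene) {
  gene <- gene[rownames(expr)]               # align mapping to matrix rows
  row_means <- rowMeans(expr)
  best <- vapply(split(names(row_means), gene),
                 function(ids) ids[which.max(row_means[ids])],
                 character(1))               # one probeset ID per gene
  out <- expr[best, , drop = FALSE]
  rownames(out) <- names(best)               # label rows by gene
  out
}

gene_expr <- collapse_max_mean(expr, gene)
print(gene_expr)
```

Here `GENE_A` is represented by `ps_2`, the more highly expressed of its two probesets. Other choices, such as averaging the probesets or picking the most variable one, drop into the same structure by changing the function applied to each group.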

10.4 years ago • Peter Langfelder ★ 3.0k

Another thing to consider is the platform you are using. It might be relevant, since some Affy chips (e.g. HuGene 1.0 ST) can be summarized in different ways (probeset or core, if I recall correctly).

10.4 years ago • Federico Lasa ▴ 80

Thanks everyone, especially for replying on the weekend.

Cheers!
-Abhi

10.4 years ago • Abhishek Pratap ▴ 190

Just for completeness, I found the following work by Li et al. to be useful. They looked at probeset specificity, coverage, etc. to select an optimal probeset for each gene, thereby enabling a simple one-to-one mapping.

Jetset: selecting an optimal microarray probe set to represent a gene. Qiyuan Li, Nicolai J. Birkbak, Balazs Gyorffy, Zoltan Szallasi, and Aron C. Eklund. BMC Bioinformatics 2011, 12:474. http://www.cbs.dtu.dk/biotools/jetset/

Cheers!
-Abhi
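As a rough illustration of score-based selection, the sketch below picks one probeset per gene from a table of per-probeset quality scores. It does not use the Jetset package or its actual scores: the score values, the IDs, and the choice to combine specificity and coverage as a simple product are all assumptions made for the example.

```r
# Hypothetical per-probeset scores in [0, 1]; real values would come from Jetset.
scores <- data.frame(
  probeset    = c("ps_1", "ps_2", "ps_3", "ps_4"),
  gene        = c("GENE_A", "GENE_A", "GENE_B", "GENE_B"),
  specificity = c(0.90, 0.60, 0.80, 0.95),
  coverage    = c(0.50, 1.00, 1.00, 0.70),
  stringsAsFactors = FALSE
)

# Combine the criteria into one score (a simple product, assumed here).
scores$overall <- scores$specificity * scores$coverage

# Keep the single best-scoring probeset for each gene.
best_per_gene <- do.call(rbind, lapply(split(scores, scores$gene), function(d) {
  d[which.max(d$overall), c("gene", "probeset", "overall")]
}))
rownames(best_per_gene) <- NULL
print(best_per_gene)
```

Because the choice depends only on the annotation-derived scores and not on the expression values, the same probeset represents a gene in every dataset from that platform, which is what gives the one-to-one mapping.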

10.4 years ago • Abhishek Pratap ▴ 190
