Normalization, multiple gene symbols
lily ▴ 20
@lily-11438
Last seen 3.5 years ago
India

Hi all

I normalized my Affymetrix (HGU133plus2) datasets with RMA in R. My problem is that I am getting duplicate gene symbols with different expression values. What can I do to get unique gene symbols? Please help me out.

@james-w-macdonald-5106
Last seen just now
United States

There are a couple of possibilities. You could use one of the MBNI re-mapped CDF files (google the last four words to find their site), which collate all the probes into a single probeset per gene, so you will have just one value for each gene. An alternative is to use findLargest in the genefilter package, which will automagically select, for each gene, the probeset with the largest test statistic in a given comparison.


Thanks for the response. But I am not able to run findLargest. Could you suggest any other options?


I recommended a function called findLargest, which you can find in the genefilter package.


Thank you, Sir, for your response. I tried it, and I can run findLargest from the genefilter package. One more doubt: suppose my dataset contains more than 50000 genes. What maximum value do I need to give to rnorm?

 stats <- rnorm(50000)


You are taking the example too literally. That's a fake example that is used simply to show how the function works, and isn't intended to be how you use it in the real world. Instead, look at the Arguments section of the help:

Arguments:

      gN: A vector of probe identifiers for the chip.

testStat: A vector of test statistics, of the same length as 'gN' with
          the per probe test statistics.

    data: The character string identifying the chip.

The testStat argument isn't a randomly generated set of numbers - it is a set of statistics that you generated when testing for differences between your groups, and the findLargest function finds the largest of these statistics, for each Entrez Gene ID.
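
To make the idea concrete, here is a minimal base-R sketch of the same kind of selection: split the per-probe statistics by gene and keep the probe with the largest one. It is not the genefilter implementation, and every probe ID, gene ID and statistic below is made up for illustration. In a real analysis the statistics would come from your own test (a t-test between your groups, say), and the probe-to-gene mapping from the chip's annotation package. Taking the absolute value first is an assumption about what you want (largest effect in either direction); drop abs() if you only care about the largest positive statistic.

```r
# Toy data: all IDs and values are hypothetical, for illustration only.
probe_ids <- c("p1_at", "p2_at", "p3_at", "p4_at", "p5_at")
gene_ids  <- c("G1", "G1", "G2", "G3", "G3")   # gene each probe maps to
test_stat <- c(1.2, -3.4, 0.5, 2.0, 1.1)       # one statistic per probe
names(test_stat) <- probe_ids

# Group the absolute statistics by gene; split() keeps the probe names.
by_gene <- split(abs(test_stat), gene_ids)

# For each gene, keep the name of the probe with the largest statistic.
best_probe <- vapply(by_gene,
                     function(x) names(x)[which.max(x)],
                     character(1))
best_probe
```

The result is a character vector of probe IDs named by gene, which you could then use to subset the rows of your expression matrix so that each gene appears once.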


Thank you for your valuable suggestion. I got it :-)
