probe annotation in hugene20sttranscriptcluster.db
@sylvia-5630

Hello,

I'm currently working with hugene20sttranscriptcluster.db and noticed that some probes map to multiple genes, for example:

> select(hugene20sttranscriptcluster.db, '17080408', c("SYMBOL","ENTREZID"))
   PROBEID  SYMBOL  ENTREZID
1 17080408   RAD21      5885
2 17080408 MIR3610 100500914

Is there a specific factor that determines the order of the gene symbols, or is it just random? If I want to annotate the mRNA profile, would you recommend collapsing all the possible gene symbols for each probe, or just using the first entry for each probe?

Best,

Sylvia

 

@james-w-macdonald-5106

The ChipDb packages are really just a SQLite database with an API wrapper that allows you to make queries without having to know SQL. But since the underlying functions are generating SQL queries, and in general the returned values from a DB are unordered, I don't think there is any particular order to the returned data. However, the order of the input values is guaranteed to be the same in the output (e.g., the probeset IDs will be in the same order).
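If the row order within a probe cannot be relied on, one option is to make any collapsed annotation independent of it. The following is a minimal base R sketch, not anything the package does for you: it rebuilds the two rows from the select() output in the question as a plain data.frame, and collapse_sorted() is a hypothetical helper name chosen for illustration.

```r
# Rows copied from the select() output shown in the question.
ann <- data.frame(
  PROBEID  = c("17080408", "17080408"),
  SYMBOL   = c("RAD21", "MIR3610"),
  ENTREZID = c("5885", "100500914"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sort and de-duplicate before pasting, so the
# result is the same whatever order the database returned the rows in.
collapse_sorted <- function(x) paste(sort(unique(x)), collapse = ";")

# One collapsed string per probe, as a character vector named by probe ID.
symbols_by_probe <- vapply(split(ann$SYMBOL, ann$PROBEID),
                           collapse_sorted, character(1))

# split() orders groups by factor level, so index by the original
# probe order to get back to the order of the input.
symbols_by_probe <- symbols_by_probe[unique(ann$PROBEID)]

print(symbols_by_probe)
```

Sorting is just one way to get a deterministic string; it deliberately throws away whatever order the query happened to return.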

How to deal with the multiple mapping probes is up to you and whomever you are working with, and how you plan to present the data. I tend to use mapIds(), which by default (multiVals = "first") will just take the first entry, because I am simple like that. You can however return a list (multiVals = "list"), or possibly better, a CharacterList (multiVals = "CharacterList"). But if you are using things like limma and/or ReportingTools for analysis and presentation, the list structure isn't as pleasant to deal with.
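To make the two choices concrete, here is a rough sketch that puts the first-entry and collapsed annotations side by side. annotate_probes() is a hypothetical wrapper written for this post, not a function from AnnotationDbi; it only assumes mapIds() with its multiVals argument, and the usage part is skipped if the annotation package is not installed.

```r
# Hypothetical helper, for illustration only.
annotate_probes <- function(db, probes) {
  # multiVals = "first" (the default): one symbol per probe.
  first <- AnnotationDbi::mapIds(db, keys = probes, column = "SYMBOL",
                                 keytype = "PROBEID", multiVals = "first")
  # multiVals = "list": every symbol mapped to each probe.
  all_syms <- AnnotationDbi::mapIds(db, keys = probes, column = "SYMBOL",
                                    keytype = "PROBEID", multiVals = "list")
  # Collapse each list element to one string; keep NA for unmapped probes.
  collapse <- function(s) {
    if (all(is.na(s))) NA_character_ else paste(s, collapse = ";")
  }
  collapsed <- vapply(all_syms[probes], collapse, character(1))
  # Index by name so the rows follow the order of the input probes.
  data.frame(PROBEID = probes,
             FIRST_SYMBOL = unname(first[probes]),
             ALL_SYMBOLS = unname(collapsed),
             stringsAsFactors = FALSE)
}

# Usage, only if the annotation package is available.
if (requireNamespace("hugene20sttranscriptcluster.db", quietly = TRUE)) {
  db <- hugene20sttranscriptcluster.db::hugene20sttranscriptcluster.db
  print(annotate_probes(db, "17080408"))
}
```

A flat data.frame like this tends to be easier to attach to limma results than a list column, at the cost of having to split the collapsed string again if you need the individual symbols later.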


Hi James,

Got it. Thanks for the quick response!

Best,

Sylvia
