Question

Questions regarding convertIDs in the rnaseqGene Package

Cheng-Yuan Kao ▴ 100

@cheng-yuan-kao-3472

Last seen 7.6 years ago

Taiwan

Hey, there,

I was following the instructions at http://www.bioconductor.org/help/workflows/rnaseqGene/#annotate to update the annotation in my DEG files. I used the convertIDs function defined there:

convertIDs <- function( ids, from, to, db, ifMultiple=c("putNA", "useFirst")) {
  stopifnot( inherits( db, "AnnotationDb" ) )
  ifMultiple <- match.arg( ifMultiple )
  suppressWarnings( selRes <- AnnotationDbi::select(
    db, keys=ids, keytype=from, columns=c(from,to) ) )
  if ( ifMultiple == "putNA" ) {
    duplicatedIds <- selRes[ duplicated( selRes[,1] ), 1 ]
    selRes <- selRes[ ! selRes[,1] %in% duplicatedIds, ]
  }
  return( selRes[ match( ids, selRes[,1] ), 2 ] )
}
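
For readers unsure what the ifMultiple argument does, the lookup step can be replayed in base R without an annotation database. This is only a sketch: the selRes table and the id_A ... id_X identifiers below are invented stand-ins for what AnnotationDbi::select() would return, not real RefSeq data.

```r
# Invented stand-in for the data frame returned by AnnotationDbi::select();
# "id_B" maps to two symbols, "id_X" is not in the table at all.
selRes <- data.frame(
  REFSEQ = c("id_A", "id_B", "id_B", "id_C"),
  SYMBOL = c("GeneA", "GeneB1", "GeneB2", "GeneC"),
  stringsAsFactors = FALSE
)
ids <- c("id_C", "id_B", "id_A", "id_X")

# ifMultiple = "useFirst": match() picks the first row for each ID
use_first <- selRes[match(ids, selRes[, 1]), 2]

# ifMultiple = "putNA": drop every ID that occurs more than once, then match
duplicatedIds <- selRes[duplicated(selRes[, 1]), 1]
unique_only <- selRes[!selRes[, 1] %in% duplicatedIds, ]
put_na <- unique_only[match(ids, unique_only[, 1]), 2]

print(use_first)  # "GeneC" "GeneB1" "GeneA" NA
print(put_na)     # "GeneC" NA       "GeneA" NA
```

In both modes the result has one entry per input ID, in input order, which is why it can be assigned straight into a new data frame column.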

I first loaded my DEG CSV file into deg_01. Its id column contains all the RefSeq IDs.

Then I ran:

> deg_01$newMGI_symbol <- convertIDs("deg_01$id", "REFSEQ", "SYMBOL", org.Mm.eg.db)

Error in .testForValidKeys(x, keys, keytype) :
None of the keys entered are valid keys for 'REFSEQ'. Please use the keys method to see a listing of valid arguments.

I checked, and the values in deg_01$id are all RefSeq IDs.

> deg_01$id
[1] NM_026268 NM_025777 NM_013478 NR_015524 NM_001099297 NM_177610 NM_001122954
[8] NM_144834 NM_011110 NM_016668 NM_013820 NM_001145874 NM_001081060 NM_080457
[15] NM_007646 NM_020279 NM_172415 NM_007686 NM_001112723 NM_001173505 NM_001290565

Any suggestions? Thanks.

rnaseqGene convertIDs • 2.0k views

10.1 years ago Cheng-Yuan Kao ▴ 100

It looks like your IDs are a factor, not a character vector. However, that should result in a different error:

> d.f[,1]
 [1] NM_026268              NM_025777              NM_013478             
 [4] NR_015524              NM_001099297 NM_177610 NM_001122954 NM_144834
 [7] NM_011110              NM_016668              NM_013820             
[10] NM_001145874           NM_001081060           NM_080457             
[13] NM_007646              NM_020279              NM_172415             
[16] NM_007686              NM_001112723           NM_001173505          
[19] NM_001290565          
19 Levels: NM_001081060 NM_001099297 NM_177610 ... NR_015524

> convertIDs(d.f[,1], "REFSEQ","SYMBOL", org.Mm.eg.db)
Error in .testForValidKeys(x, keys, keytype) :
  'keys' must be a character vector

> convertIDs(as.character(d.f[,1]), "REFSEQ","SYMBOL", org.Mm.eg.db)
 [1] "Dusp6"     "Duoxa2"    "Azgp1"     "Cep83os"   NA          NA         
 [7] "Pla2g5"    "Bhmt"      "Hk2"       "Muc20"     "Slc9a3"    "Muc4"     
[13] "Cd38"      "Ccl28"     "Arhgef10l" "Cfi"       "Arhgef10l" "Thpo"     
[19] "Syne4"
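
The three forms of input seen in this thread can be compared in base R. This is a minimal sketch: the data frame below is a small made-up stand-in for deg_01 (assumed to have been read with strings converted to factors), not the poster's actual file. It suggests that the quoted call in the question passes one literal string rather than the column, which would be consistent with the "None of the keys entered are valid keys" error. That reading is an inference from the messages shown, not something confirmed in the thread.

```r
# Made-up stand-in for the poster's data frame; stringsAsFactors = TRUE
# mimics the read.csv() default in R versions before 4.0.
deg_01 <- data.frame(
  id = c("NM_026268", "NM_025777", "NM_013478"),
  stringsAsFactors = TRUE
)

# 1. Quoted: a single literal string, not the column contents
quoted <- "deg_01$id"
length(quoted)                       # 1
quoted %in% as.character(deg_01$id)  # FALSE

# 2. Unquoted: the column itself, which is a factor here
is.factor(deg_01$id)                 # TRUE
is.character(deg_01$id)              # FALSE

# 3. Converted: a plain character vector, as the 'keys' error message asks for
ids_chr <- as.character(deg_01$id)
is.character(ids_chr)                # TRUE
print(ids_chr)                       # "NM_026268" "NM_025777" "NM_013478"
```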

10.1 years ago James W. MacDonald 68k

score 0 · Answer 1 · 2015-04-07

Great. Thanks for your help.

I followed your suggestion and did:

> deg_01$newMGI_symbol <- convertIDs(as.character(deg_01$id), "REFSEQ", "SYMBOL", org.Mm.eg.db)

It worked. I did not know that factor versus character makes such a difference, because the same input works fine with biomaRt.

10.1 years ago Cheng-Yuan Kao ▴ 100

Indeed. Sometimes a programmer will include error checking to enforce the correct form for input values, and sometimes a programmer will include explicit conversion from, e.g., factor to character. Doing the conversion yourself is the safest way to go, IMO.
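
One way to keep that conversion in a single place is a small helper that normalises the IDs before they are passed on. This is a minimal sketch: as_keys is a hypothetical name introduced here, not part of the rnaseqGene workflow or of AnnotationDbi.

```r
# Hypothetical helper: return IDs as a character vector, converting factors
# explicitly and rejecting anything else with a clear message.
as_keys <- function(ids) {
  if (is.factor(ids)) {
    ids <- as.character(ids)  # uses the factor labels, not the integer codes
  }
  if (!is.character(ids)) {
    stop("'ids' must be a character vector or a factor, not ", class(ids)[1])
  }
  ids
}

# Works the same for factor and character input
from_factor <- as_keys(factor(c("NM_026268", "NM_025777")))
from_chr <- as_keys(c("NM_026268", "NM_025777"))

# Intended use with the function from the question (needs org.Mm.eg.db loaded):
# convertIDs(as_keys(deg_01$id), "REFSEQ", "SYMBOL", org.Mm.eg.db)
```

Failing early on other types (numeric, list) keeps the error close to the cause, rather than surfacing later inside select().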

10.1 years ago James W. MacDonald 68k