Question

No gene can be mapped error with clusterprofiler

prp291 • 0

@prp291-9622

I am trying to use clusterProfiler to test for pathway enrichment in my datasets, but I am getting the error "No gene can be mapped". Any insight would be really helpful. Thanks.

    > require(clusterProfiler)
    Loading required package: clusterProfiler
    Loading required package: DOSE

    > rawdata<- read.delim("tobacco",header=TRUE,check.names=FALSE)
    > head(rawdata)
       Pathway   Gene
    1 map00010 K00001
    2 map00010 K00016
    3 map00010 K00121
    4 map00010 K00128
    5 map00010 K00131
    6 map00010 K00134
    > deg<- read.delim("deg",header=FALSE)
    > head(deg)
          V1
    1 K00001
    2 K00016
    3 K00121
    4 K00128
    5 K00131
    6 K00134
    > disease2gene=rawdata[, c("Pathway","Gene")]
    > x = enricher(deg, TERM2GENE=disease2gene)
    No gene can be mapped....
    --> return NULL...

clusterprofiler r • 9.5k views

ADD COMMENT • link updated 15 months ago by Guido Hooiveld ★ 4.1k • written 8.3 years ago by prp291 • 0

score 0 · Answer 1 · 2017-01-17

Guangchuang Yu ★ 1.2k

@guangchuang-yu-5419

China/Guangzhou/Southern Medical Univer…

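    x = enricher(as.character(deg[,1]), TERM2GENE=disease2gene)

The input gene list should be a character vector. In your code deg is a data frame, so extract its first column and convert it to character.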
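To illustrate why the original call fails, here is a minimal base-R sketch. It uses toy copies of a few rows shown in the question and does not call clusterProfiler itself. The `n_mapped` overlap check is an illustrative addition (a hypothetical helper variable), not something `enricher()` requires.

```r
# Toy stand-ins for the objects in the question (values copied from the post).
# stringsAsFactors = TRUE mimics the read.delim() default in R versions before 4.0.
deg <- data.frame(V1 = c("K00001", "K00016", "K00121"),
                  stringsAsFactors = TRUE)
disease2gene <- data.frame(
  Pathway = rep("map00010", 4),
  Gene    = c("K00001", "K00016", "K00121", "K00128"),
  stringsAsFactors = TRUE
)

# A data frame is a list of columns, not a character vector.
is.character(deg)     # FALSE

# Extract the first column and coerce it (it may be a factor).
genes <- as.character(deg[, 1])
is.character(genes)   # TRUE

# Sanity check before calling enricher(): how many IDs occur in the annotation?
n_mapped <- sum(genes %in% as.character(disease2gene$Gene))
n_mapped              # 3
```

If `n_mapped` is zero, the gene IDs and the annotation table use different identifiers, and an enrichment call on them has nothing to map.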

ADD COMMENT • link 8.3 years ago Guangchuang Yu ★ 1.2k

Thanks for your answer. I had already tracked down the problem and fixed it using the varhandle package, but your solution is more elegant. :)

ADD REPLY • link 8.3 years ago prp291 • 0

Hello,

I am also getting the same error.

Can you please explain how to resolve it?

> kegg_enrich <- enrichKEGG(gene = names(log_counts),
+                           organism = 'uma')

No gene can be mapped....
--> Expected input gene ID: UMAG_02508,UMAG_04159,UMAG_03306,UMAG_05130,UMAG_04797,UMAG_01604
--> return NULL..

> head(log_counts)

axenic_1 axenic_2 axenic_3 hpi12_1 hpi12_2 hpi12_3 hpi24_1 hpi24_2 hpi24_3 dpi2_1
UMAG_00001 11.00352 9.459432 9.350939 4.247928 4.321928 4.857981 4.459432 4.584963 4.643856 4.087463
UMAG_00002 11.65553 11.663558 11.904258 4.807355 5.044394 5.857981 4.700440 3.700440 4.247928 6.491853
UMAG_00003 11.15102 10.675957 10.954196 5.700440 5.781360 5.832890 5.426265 5.000000 5.321928 6.584963
UMAG_00005 11.91064 11.043711 12.130249 8.164907 7.800900 9.169925 7.459432 7.434628 7.982994 9.027906
UMAG_00006 11.41574 11.131857 10.406205 4.954196 5.357552 5.727920 5.209453 4.000000 4.754888 5.459432
UMAG_00007 12.07180 11.971184 12.109831 5.700440 5.129283 5.727920 5.044394 4.754888 4.906891 6.491853
dpi2_2 dpi2_3 dpi4_1 dpi4_2 dpi4_3 dpi6_1 dpi6_2 dpi6_3 dpi8_1
UMAG_00001 4.247928 3.459432 4.087463 4.247928 4.754888 5.169925 5.554589 5.392317 5.169925
UMAG_00002 6.794416 6.209453 5.882643 5.832890 5.807355 7.076816 7.066089 7.129283 6.727920
UMAG_00003 6.870365 6.700440 6.357552 6.426265 6.357552 7.257388 7.321928 7.022368 7.044394
UMAG_00005 8.744834 8.353147 10.250298 9.917372 9.965784 11.469133 11.467606 11.639793 11.651500
UMAG_00006 5.554589 5.491853 5.523562 5.584963 5.491853 6.857981 6.459432 6.658211 6.357552
UMAG_00007 6.491853 6.247928 5.169925 5.285402 5.169925 6.442943 6.189825 6.228819 5.523562
dpi8_2 dpi8_3 dpi12_1 dpi12_2 dpi12_3
UMAG_00001 5.832890 5.930737 6.672425 6.228819 7.139551
UMAG_00002 7.139551 6.554589 7.409391 6.870365 8.471675
UMAG_00003 7.826548 6.965784 7.876517 7.665336 8.682995
UMAG_00005 12.395802 11.747773 12.177420 12.038576 12.241089
UMAG_00006 7.285402 6.442943 7.787903 7.475733 8.262095
UMAG_00007 6.285402 5.614710 7.303781 7.348728 8.280771
description symbol entrez
UMAG_00001 hypothetical protein UMAG_00001 23561423
UMAG_00002 hypothetical protein UMAG_00002 23561424
UMAG_00003 hypothetical protein UMAG_00003 23561425
UMAG_00005 putative Benzoate 4-monooxygenase cytochrome P450 UMAG_00005 23561426
UMAG_00006 hypothetical protein UMAG_00006 23561427
UMAG_00007 hypothetical protein UMAG_00007 23561428

ADD REPLY • link 6.7 years ago sbbinfo90 • 0

Your code is working for me. I used as input the 12 UMA IDs above.

Although not directly related to the problem you reported, please note that the function enrichKEGG() is used to check which KEGG pathways are over-represented in a list of selected genes, e.g. those identified by a differential gene expression analysis. In other words, you should NOT 'feed' it your whole data set as you do in your code, but only a subset (e.g. the significant genes). If needed, you can use all genes in your data set to define the background genes (using the argument universe).
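As a rough sketch of the selected-versus-background idea, the base-R example below builds both ID vectors from a toy stand-in for log_counts. It also shows one possible cause of the error that is not confirmed in this thread: if log_counts is a data frame, `names()` returns the column (sample) names, while `rownames()` returns the UMAG gene IDs. The cutoff, the `delta`, `selected` and `background` variables, and the selection rule are all hypothetical; a real analysis would take the selected genes from a differential expression test.

```r
# Toy stand-in for log_counts: gene IDs are row names, samples are columns.
log_counts <- data.frame(
  axenic_1  = c(11.00, 11.66, 11.15),
  hpi12_1   = c(4.25, 4.81, 5.70),
  row.names = c("UMAG_00001", "UMAG_00002", "UMAG_00003")
)

names(log_counts)      # sample names: "axenic_1" "hpi12_1"
rownames(log_counts)   # gene IDs: "UMAG_00001" "UMAG_00002" "UMAG_00003"

# Hypothetical selection rule: keep genes whose difference between the two
# samples exceeds an arbitrary cutoff of 6 (illustration only).
delta      <- log_counts$axenic_1 - log_counts$hpi12_1
selected   <- rownames(log_counts)[abs(delta) > 6]
background <- rownames(log_counts)

# The call would then look like this (needs clusterProfiler and network access):
# kegg_enrich <- enrichKEGG(gene = selected, organism = "uma",
#                           universe = background)
```

With real data, `selected` should be much shorter than `background`; passing every gene as `gene` leaves nothing to be over-represented against.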

> library(clusterProfiler)
>
> # use all 12 UMA IDs above as input (i.e. being 'selected' genes).
> ids <- c("UMAG_02508","UMAG_04159","UMAG_03306","UMAG_05130","UMAG_04797",
+ "UMAG_01604", "UMAG_00001", "UMAG_00002", "UMAG_00003", "UMAG_00005", "UMAG_00006", "UMAG_00007")
>
> # perform over-representation analysis
> kegg_enrich <- enrichKEGG(gene = ids, organism = 'uma')
>
> #view results
> head(kegg_enrich)
               ID                           Description GeneRatio  BgRatio
uma00030 uma00030             Pentose phosphate pathway       3/8  20/1641
uma01110 uma01110 Biosynthesis of secondary metabolites       6/8 257/1641
uma01200 uma01200                     Carbon metabolism       4/8  84/1641
uma01130 uma01130           Biosynthesis of antibiotics       5/8 195/1641
uma00620 uma00620                   Pyruvate metabolism       2/8  31/1641
uma01230 uma01230           Biosynthesis of amino acids       3/8  99/1641
               pvalue    p.adjust      qvalue
uma00030 8.351133e-05 0.001753738 0.001142787
uma01110 2.966033e-04 0.002678273 0.001745240
uma01200 3.826104e-04 0.002678273 0.001745240
uma01130 9.348908e-04 0.004908177 0.003198311
uma00620 9.012598e-03 0.028769629 0.018747127
uma01230 9.554376e-03 0.028769629 0.018747127
                                                                    geneID
uma00030                                  UMAG_04159/UMAG_03306/UMAG_04797
uma01110 UMAG_02508/UMAG_04159/UMAG_03306/UMAG_05130/UMAG_04797/UMAG_00007
uma01200                       UMAG_04159/UMAG_03306/UMAG_05130/UMAG_04797
uma01130            UMAG_02508/UMAG_04159/UMAG_03306/UMAG_05130/UMAG_04797
uma00620                                             UMAG_02508/UMAG_05130
uma01230                                  UMAG_04159/UMAG_03306/UMAG_04797
         Count
uma00030     3
uma01110     6
uma01200     4
uma01130     5
uma00620     2
uma01230     3
>
>

> sessionInfo()
R version 3.5.1 Patched (2018-08-13 r75130)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] clusterProfiler_3.8.1

loaded via a namespace (and not attached):
 [1] ggrepel_0.8.0        Rcpp_0.12.18         lattice_0.20-35     
 [4] tidyr_0.8.1          GO.db_3.6.0          assertthat_0.2.0    
 [7] digest_0.6.16        ggforce_0.1.3        R6_2.2.2            
[10] plyr_1.8.4           ggridges_0.5.0       stats4_3.5.1        
[13] RSQLite_2.1.1        ggplot2_3.0.0        pillar_1.3.0        
[16] rlang_0.2.2          lazyeval_0.2.1       data.table_1.11.4   
[19] blob_1.1.1           S4Vectors_0.18.3     Matrix_1.2-14       
[22] qvalue_2.12.0        splines_3.5.1        BiocParallel_1.14.2
[25] stringr_1.3.1        igraph_1.2.2         bit_1.1-14          
[28] munsell_0.5.0        fgsea_1.6.0          compiler_3.5.1      
[31] pkgconfig_2.0.2      BiocGenerics_0.26.0  tidyselect_0.2.4    
[34] tibble_1.4.2         gridExtra_2.3        IRanges_2.14.11     
[37] enrichplot_1.0.2     viridisLite_0.3.0    crayon_1.3.4        
[40] dplyr_0.7.6          MASS_7.3-50          grid_3.5.1          
[43] gtable_0.2.0         DBI_1.0.0            magrittr_1.5        
[46] units_0.6-0          scales_1.0.0         stringi_1.1.7       
[49] GOSemSim_2.6.2       reshape2_1.4.3       viridis_0.5.1       
[52] bindrcpp_0.2.2       DO.db_2.9            rvcheck_0.1.0       
[55] cowplot_0.9.3        fastmatch_1.1-0      tools_3.5.1         
[58] bit64_0.9-7          Biobase_2.40.0       glue_1.3.0          
[61] tweenr_0.1.5         purrr_0.2.5          ggraph_1.0.2        
[64] parallel_3.5.1       AnnotationDbi_1.42.1 colorspace_1.3-2    
[67] UpSetR_1.3.3         DOSE_3.6.1           memoise_1.1.0       
[70] bindr_0.1.1         
>