I am trying to run the clusterProfiler examples from "http://www.bioconductor.org/packages/devel/bioc/vignettes/clusterProfiler/inst/doc/clusterProfiler.html#abstract" and "https://guangchuangyu.github.io/2016/05/convert-biological-id-with-kegg-api-using-clusterprofiler/". I have a question about the BgRatio column. Why do these two examples differ so much in the number of background genes? I checked the latest human KEGG pathway database, and it contains only 7192 genes. Am I making a mistake?
In the Bioconductor example, the background contains 7164 genes.

    kk <- enrichKEGG(gene = gene, organism = 'hsa', pvalueCutoff = 0.05)
    head(kk)
...
    ##           BgRatio       pvalue     p.adjust       qvalue
    ## hsa04110 124/7164 1.706341e-07 2.951969e-05 0.0000290976
    ## hsa04114 124/7164 1.569415e-06 1.357544e-04 0.0001338133
    ## hsa03320  72/7164 1.884398e-05 1.086670e-03 0.0010711317
    ## hsa04914  98/7164 9.664771e-04 4.180013e-02 0.0412024432
    ## hsa04115  69/7164 1.226862e-03 4.244943e-02 0.0418424543
    ## hsa04062 187/7164 1.485638e-03 4.283588e-02 0.0422233843
...
In the GitHub example, the background contains 9275 genes.
    x <- enrichKEGG(np2up[,2], organism='hsa', keyType='uniprot')
    ...
    ##           BgRatio       pvalue   p.adjust     qvalue
    ## hsa04072 216/9275 0.0002654190 0.03901659 0.03240905
    ## hsa04060 354/9275 0.0005349245 0.03931695 0.03265855
    ## hsa04390 213/9275 0.0009536247 0.04199404 0.03488227
    ## hsa04975  58/9275 0.0014014886 0.04199404 0.03488227
    ## hsa05221  86/9275 0.0014283687 0.04199404 0.03488227
    ...
Thanks a lot! I am trying to apply this to a list of differentially expressed genes from my project, and I have tried both methods. The number of genes in the same enriched KEGG pathway differs between them, and consequently the KEGG pathways are ranked differently. Does that mean I can only use my gene list as in the Bioconductor example? Can't I convert my gene list to UniProt IDs for the analysis?
It depends on whether your input list is at the gene level or the protein level.
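To illustrate why a different background changes the ranking, here is a minimal base-R sketch of a hypergeometric over-representation test driven by the BgRatio denominator. The counts `k` and `n`, and the helper name `enrich_p`, are hypothetical and chosen only for illustration; the pathway size is held fixed at 124 as a simplifying assumption, whereas in practice it also changes with the ID type. Only the two background sizes (7164 and 9275) come from the outputs above.

```r
# Hypothetical input: 10 of 200 annotated input IDs fall in the pathway.
k <- 10    # input IDs annotated to the pathway (assumed)
n <- 200   # input IDs annotated to any pathway (assumed)
M <- 124   # pathway size, held fixed here for illustration (assumed)

# P(X >= k) when drawing n IDs from a background of N, of which M are in the pathway.
# enrich_p is an illustrative helper, not a clusterProfiler function.
enrich_p <- function(k, n, M, N) {
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)
}

p_small_bg <- enrich_p(k, n, M, N = 7164)  # background size from the first output
p_large_bg <- enrich_p(k, n, M, N = 9275)  # background size from the second output

print(c(small_background = p_small_bg, large_background = p_large_bg))
```

With everything else fixed, the larger background gives a smaller p-value, so the same input list can rank pathways differently depending on which ID type defines the background.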
Thanks. I have another question. Why does the setReadable function not support enrichKEGG output with Entrez gene IDs, while it works with UniProt IDs? I found by googling that previous versions had a readable parameter, but it has no effect now.
The `setReadable` function has always existed and works with enrichKEGG output.
I guess you mean the `readable` parameter.
enrichKEGG now works with online data and supports more than 4000 species. For most of these species, there is no data to support ID conversion, so the `readable` parameter was removed.
For those species that have an OrgDb object/package available, you can still convert IDs using the `setReadable` function.
If some ID types work for you and some do not, follow the guide, https://guangchuangyu.github.io/2016/07/how-to-bug-author/, and post a reproducible example.
Thank you very much! The cause is that I am not using the latest version of clusterProfiler, probably because my Bioconductor (v3.3) is not the latest version. So when I follow the installation instructions as:
It downloads clusterProfiler 3.0.5 automatically, but the up-to-date version is 3.2.8.
The release version of Bioconductor is 3.4.
You should use the latest clusterProfiler.
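A quick way to confirm which version is in use is to compare version objects in base R. This is only a sketch: the two version strings are the ones quoted in this thread, and the `requireNamespace` guard is there so the snippet also runs where clusterProfiler is not installed.

```r
# Versions quoted in this thread
installed <- package_version("3.0.5")
latest    <- package_version("3.2.8")

# package_version objects compare component by component
needs_update <- installed < latest
print(needs_update)

# Report the version actually installed, if the package is available
if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  print(utils::packageVersion("clusterProfiler"))
}
```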
See the `setReadable` section in https://guangchuangyu.github.io/2016/05/convert-biological-id-with-kegg-api-using-clusterprofiler/
If your input gene list consists of Entrez gene IDs, you can use something like:
y <- setReadable(x, 'org.Hs.eg.db', keytype="ENTREZID")
Thanks! It works fine now!