I have a clarifying question about clusterProfiler's enrichKEGG function. I am setting my own universe from KEGG gene IDs that I generated with BlastKOALA. When I run enrichKEGG, the BgRatio denominator, N, is smaller than the number of unique KEGG gene IDs in the background universe I supplied (1511 vs. 2863). My understanding is that these should be equal. Has anyone else seen this, or does anyone know what is going on? I've included my code and the output below; see the BgRatio column. Thanks!
library(dplyr)
library(clusterProfiler)
# Background universe: unique KEGG gene IDs from the KOALA annotation
koalaList %>% group_by(K_geneID) %>% summarize(Nobserved = n()) %>% arrange(desc(Nobserved)) # 2863 features
# Significant DE features that have a KEGG gene ID
test <- mergeList.wtvfnr.dmso.midexp %>%
  filter(padj < 0.1 & abs(log2FoldChange) > 1) %>%
  filter(!is.na(K_geneID)) %>%
  group_by(K_geneID) %>% summarize(Nobserved = n()) %>% arrange(desc(Nobserved)) # 623 sig de features
# Enrichment of the significant features against the custom universe
enriched <- mergeList.wtvfnr.dmso.midexp %>%
  filter(padj < 0.1 & abs(log2FoldChange) > 1) %>%
  filter(!is.na(K_geneID)) %>%
  pull(K_geneID) %>%
  as.character() %>%
  enrichKEGG(gene = ., organism = "ko", universe = koalaList$K_geneID)
enriched@result %>% as_tibble() %>% dplyr::select(-geneID) %>% arrange(desc(Count))
  ID      Description                                  GeneRatio BgRatio       pvalue   p.adjust  qvalue Count
  <chr>   <chr>                                        <chr>     <chr>          <dbl>      <dbl>   <dbl> <int>
1 ko01120 Microbial metabolism in diverse environments 89/369    235/1511 0.000000366 0.00000376 3.08e-6    89
2 ko02010 ABC transporters                             64/369    176/1511 0.000108    0.000781   6.39e-4    64
3 ko01200 Carbon metabolism                            43/369    94/1511  0.00000252  0.0000227  1.86e-5    43
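Many software packages do the wrong thing here and give you smaller p-values, but clusterProfiler does not. It counts only the genes in your universe that are annotated to at least one KEGG pathway, so N (1511) is smaller than the full universe (2863); for the same reason the GeneRatio denominator is 369 rather than 623.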
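Here is a minimal sketch of why N can come out smaller than the universe you pass in, assuming (as I understand clusterProfiler's behaviour) that the background is restricted to universe IDs that appear in at least one pathway. The KO IDs and the `term2gene` table are made up for illustration; they are not real KEGG data.

```r
# Toy example with hypothetical IDs (not real KEGG annotations).
# User-supplied universe: five KO IDs.
universe <- c("K00001", "K00002", "K00003", "K00004", "K00005")

# Hypothetical pathway-to-gene table. Only three universe IDs are
# annotated to any pathway; K99999 is annotated but not in the universe.
term2gene <- data.frame(
  term = c("ko00010", "ko00010", "ko00020", "ko00020", "ko00020"),
  gene = c("K00001", "K00002", "K00002", "K00003", "K99999"),
  stringsAsFactors = FALSE
)

# All genes that carry at least one pathway annotation.
annotated <- unique(term2gene$gene)

# Effective background: universe IDs that are also annotated.
background <- intersect(universe, annotated)

# N is the size of the effective background, not of the raw universe.
N <- length(background)
cat("universe size:", length(universe), "\n")
cat("effective N:  ", N, "\n")
```

On real data the same check would be to count how many of the IDs in `koalaList$K_geneID` occur in the pathway annotation that enrichKEGG uses; that count, rather than 2863, is what I would expect to match the BgRatio denominator.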
Thank you for the clarification.
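For anyone who wants to see how much the choice of N matters, here is a rough sketch using the ko01120 row above (89/369 and 235/1511). It assumes the p-value is a one-sided hypergeometric test, and the "full universe" variant simply swaps in 2863 and 623 while keeping 235 pathway genes, which is my assumption about what a tool that ignores annotation status would do.

```r
# Numbers from the ko01120 row of the output above.
k <- 89    # significant genes in the pathway
M <- 235   # background genes in the pathway
n <- 369   # significant genes with any pathway annotation
N <- 1511  # background genes with any pathway annotation

# One-sided hypergeometric test, P(X >= k), on the annotated background.
p_annotated <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# Same pathway counts, but with unannotated genes left in both totals
# (assumed alternative: 623 significant genes, 2863 universe genes).
n_full <- 623
N_full <- 2863
p_full <- phyper(k - 1, M, N_full - M, n_full, lower.tail = FALSE)

print(c(annotated = p_annotated, full_universe = p_full))
```

The first value should land close to the pvalue reported for ko01120, and the second is smaller, because padding the background with genes that can never belong to any pathway makes the pathway look rarer than it is.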