fgsea stuck in an infinite loop in the middle of the run
Assa Yeroslaviz

I'm running fgsea on a single-cell data set with seven clusters.

markers_export0.2.res_0.2$cluster |> table()
    0    1    2    3    4    5    6 
 2318  713 1380 1228 2469 1938 3785 

In a for loop, I'm trying to run GSEA for each of them separately. It runs perfectly for clusters 0 and 1 but gets stuck on cluster 2 for many hours. It can't be the number of cells, as cluster 0 is much bigger and finishes in seconds. It also doesn't matter which database I use: I tried MSigDB C5 and H, and both behave the same way.

Any ideas what my problem is?


library(tidyverse)
library(fgsea)

for (clust in 0:6) {
  # Build the named statistic vector (gene -> p_val_adj) for this cluster
  genes <- markers_export0.2.res_0.2 %>%
    dplyr::filter(cluster == clust) %>%
    dplyr::arrange(desc(p_val_adj)) %>%
    dplyr::select(gene, p_val_adj)

  ranks <- deframe(genes)

  fgseaRes <- fgsea(fgsea_sets, stats = ranks, maxSize = 200, nPermSimple = 10000, nproc = 1)

  fgseaResTidy <- fgseaRes %>%
    as_tibble()

  # only plot the top 20 pathways
  p <- ggplot(fgseaResTidy %>% dplyr::filter(padj < 0.008) %>% head(n = 20), aes(reorder(pathway, NES), NES)) +
    geom_col(aes(fill = NES < 7.5)) +
    coord_flip() +
    labs(x = "Pathway", y = "Normalized Enrichment Score",
         title = "GO pathways NES from GSEA")
  print(p)  # ggplot objects are not auto-printed inside a for loop
}

> sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: x86_64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] msigdbr_7.5.1      fgsea_1.30.0       patchwork_1.2.0    UCell_2.8.0        corrplot_0.92      RColorBrewer_1.1-3
 [7] cowplot_1.1.3      clustree_0.5.1     ggraph_2.2.1       ggpubr_0.6.0       lubridate_1.9.3    forcats_1.0.0     
[13] stringr_1.5.1      dplyr_1.1.4        purrr_1.0.2        readr_2.1.5        tidyr_1.3.1        tibble_3.2.1      
[19] ggplot2_3.5.1      tidyverse_2.0.0    Seurat_5.1.0       SeuratObject_5.0.2 sp_2.1-4          

loaded via a namespace (and not attached):
  [1] RcppAnnoy_0.0.22            splines_4.4.0               later_1.3.2                 polyclip_1.10-6            
  [5] fastDummies_1.7.3           lifecycle_1.0.4             rstatix_0.7.2               globals_0.16.3             
  [9] lattice_0.22-6              MASS_7.3-60.2               backports_1.4.1             magrittr_2.0.3             
 [13] plotly_4.10.4               httpuv_1.6.15               sctransform_0.4.1           spam_2.10-0                
 [17] spatstat.sparse_3.0-3       reticulate_1.36.1           pbapply_1.7-2               zlibbioc_1.50.0            
 [21] abind_1.4-5                 Rtsne_0.17                  GenomicRanges_1.56.0        BiocGenerics_0.50.0        
 [25] tweenr_2.0.3                GenomeInfoDbData_1.2.12     IRanges_2.38.0              S4Vectors_0.42.0           
 [29] ggrepel_0.9.5               irlba_2.3.5.1               listenv_0.9.1               spatstat.utils_3.1-0       
 [33] goftest_1.2-3               RSpectra_0.16-1             spatstat.random_3.2-3       fitdistrplus_1.1-11        
 [37] parallelly_1.37.1           DelayedArray_0.30.1         leiden_0.4.3.1              codetools_0.2-20           
 [41] ggforce_0.4.2               tidyselect_1.2.1            UCSC.utils_1.0.0            farver_2.1.2               
 [45] viridis_0.6.5               matrixStats_1.3.0           stats4_4.4.0                spatstat.explore_3.2-7     
 [49] jsonlite_1.8.8              BiocNeighbors_1.22.0        tidygraph_1.3.1             progressr_0.14.0           
 [53] ggridges_0.5.6              survival_3.6-4              tools_4.4.0                 ica_1.0-3                  
 [57] Rcpp_1.0.12                 glue_1.7.0                  SparseArray_1.4.3           gridExtra_2.3              
 [61] xfun_0.44                   MatrixGenerics_1.16.0       GenomeInfoDb_1.40.0         withr_3.0.0                
 [65] fastmap_1.2.0               fansi_1.0.6                 digest_0.6.35               timechange_0.3.0           
 [69] R6_2.5.1                    mime_0.12                   colorspace_2.1-0            scattermore_1.2            
 [73] tensor_1.5                  spatstat.data_3.0-4         utf8_1.2.4                  generics_0.1.3             
 [77] data.table_1.15.4           S4Arrays_1.4.0              graphlayouts_1.1.1          httr_1.4.7                 
 [81] htmlwidgets_1.6.4           uwot_0.2.2                  pkgconfig_2.0.3             gtable_0.3.5               
 [85] lmtest_0.9-40               SingleCellExperiment_1.26.0 XVector_0.44.0              htmltools_0.5.8.1          
 [89] carData_3.0-5               dotCall64_1.1-1             Biobase_2.64.0              scales_1.3.0               
 [93] png_0.1-8                   knitr_1.46                  rstudioapi_0.16.0           tzdb_0.4.0                 
 [97] reshape2_1.4.4              nlme_3.1-164                zoo_1.8-12                  cachem_1.0.8               
[101] KernSmooth_2.23-22          parallel_4.4.0              miniUI_0.1.1.1              pillar_1.9.0               
[105] grid_4.4.0                  vctrs_0.6.5                 RANN_2.6.1                  promises_1.3.0             
[109] car_3.1-2                   xtable_1.8-4                cluster_2.1.6               cli_3.6.3                  
[113] compiler_4.4.0              crayon_1.5.2                rlang_1.1.4                 future.apply_1.11.2        
[117] ggsignif_0.6.4              plyr_1.8.9                  stringi_1.8.4               viridisLite_0.4.2          
[121] deldir_2.0-4                BiocParallel_1.38.0         babelgene_22.9              munsell_0.5.1              
[125] lazyeval_0.2.2              spatstat.geom_3.2-9         Matrix_1.7-0                RcppHNSW_0.6.0             
[129] hms_1.1.3                   future_1.33.2               shiny_1.8.1.1               SummarizedExperiment_1.34.0
[133] ROCR_1.0-11                 igraph_2.0.3                broom_1.0.5                 memoise_2.0.1              
[137] fastmatch_1.1-4
Try running it on cluster 2 without the for loop. It will tell you if the loop is the problem or not.

Thanks for the suggestion. As I said, the loop runs for clusters 0 and 1 before it starts with 2. I have also tried cluster 2 without the loop, and it still got stuck.
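
Since cluster 2 also hangs outside the loop, the input for that cluster is a natural next thing to look at. The following is only a sketch of a generic sanity check, not a confirmed diagnosis of this case: it uses base R and a hypothetical helper, `check_ranks()`, to summarise a named statistic vector before it is passed to `fgsea()`. Adjusted p-values often contain many identical values, so ties, non-finite entries and duplicated gene names are the properties it reports. The toy vector is made up for illustration.

```r
# Hypothetical helper (not part of fgsea): summarise a named rank vector.
check_ranks <- function(ranks) {
  stopifnot(is.numeric(ranks))
  list(
    n            = length(ranks),                    # number of genes
    n_non_finite = sum(!is.finite(ranks)),           # NA, NaN, Inf, -Inf
    n_dup_names  = sum(duplicated(names(ranks))),    # repeated gene names
    n_unique     = length(unique(ranks)),            # distinct statistic values
    # share of genes whose statistic is shared with at least one other gene
    frac_tied    = mean(duplicated(ranks) | duplicated(ranks, fromLast = TRUE))
  )
}

# Made-up example vector; in the post this would be `ranks` for cluster 2.
toy_ranks <- c(geneA = 1, geneB = 1, geneC = 0.04, geneD = 0, geneE = 0, geneF = NA)
toy_summary <- check_ranks(toy_ranks)
str(toy_summary)
```

Comparing the summary for a cluster that finishes (0 or 1) with the one for cluster 2 would show whether the inputs differ in any of these respects.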

I have solved the problem using the suggestion from the maintainer on GitHub.


