I'm running fgsea on a single-cell data set with seven clusters.
markers_export0.2.res_0.2$cluster |> table()
   0    1    2    3    4    5    6
2318  713 1380 1228 2469 1938 3785
In a for loop I'm trying to run GSEA for each of them separately. It runs perfectly for clusters 0 and 1 but gets stuck on cluster 2 for many hours. It can't be the number of cells, as cluster 0 is much bigger and finishes in seconds. It also doesn't matter which database I'm using: I tried MSigDB C5 and H, and both behave similarly.
Any ideas what my problem is?
Thanks
for (clust in 0:6) {
  genes <- markers_export0.2.res_0.2 %>%
    dplyr::filter(cluster == clust) %>%
    arrange(desc(p_val_adj)) %>%
    dplyr::select(gene, p_val_adj)
  ranks <- deframe(genes)
  fgseaRes <- fgsea(fgsea_sets, stats = ranks, maxSize = 200, nPermSimple = 10000, nproc = 1)
  fgseaResTidy <- fgseaRes %>%
    as_tibble() %>%
    arrange(desc(NES))
  # only plot the top 20 pathways
  ggplot(fgseaResTidy %>% filter(padj < 0.008) %>% head(n = 20), aes(reorder(pathway, NES), NES)) +
    geom_col(aes(fill = NES < 7.5)) +
    coord_flip() +
    labs(x = "Pathway", y = "Normalized Enrichment Score",
         title = "GO pathways NES from GSEA") +
    theme_minimal()
}
> sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: x86_64-apple-darwin20
Running under: macOS Sonoma 14.6.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Berlin
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] msigdbr_7.5.1 fgsea_1.30.0 patchwork_1.2.0 UCell_2.8.0 corrplot_0.92 RColorBrewer_1.1-3
[7] cowplot_1.1.3 clustree_0.5.1 ggraph_2.2.1 ggpubr_0.6.0 lubridate_1.9.3 forcats_1.0.0
[13] stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
[19] ggplot2_3.5.1 tidyverse_2.0.0 Seurat_5.1.0 SeuratObject_5.0.2 sp_2.1-4
loaded via a namespace (and not attached):
[1] RcppAnnoy_0.0.22 splines_4.4.0 later_1.3.2 polyclip_1.10-6
[5] fastDummies_1.7.3 lifecycle_1.0.4 rstatix_0.7.2 globals_0.16.3
[9] lattice_0.22-6 MASS_7.3-60.2 backports_1.4.1 magrittr_2.0.3
[13] plotly_4.10.4 httpuv_1.6.15 sctransform_0.4.1 spam_2.10-0
[17] spatstat.sparse_3.0-3 reticulate_1.36.1 pbapply_1.7-2 zlibbioc_1.50.0
[21] abind_1.4-5 Rtsne_0.17 GenomicRanges_1.56.0 BiocGenerics_0.50.0
[25] tweenr_2.0.3 GenomeInfoDbData_1.2.12 IRanges_2.38.0 S4Vectors_0.42.0
[29] ggrepel_0.9.5 irlba_2.3.5.1 listenv_0.9.1 spatstat.utils_3.1-0
[33] goftest_1.2-3 RSpectra_0.16-1 spatstat.random_3.2-3 fitdistrplus_1.1-11
[37] parallelly_1.37.1 DelayedArray_0.30.1 leiden_0.4.3.1 codetools_0.2-20
[41] ggforce_0.4.2 tidyselect_1.2.1 UCSC.utils_1.0.0 farver_2.1.2
[45] viridis_0.6.5 matrixStats_1.3.0 stats4_4.4.0 spatstat.explore_3.2-7
[49] jsonlite_1.8.8 BiocNeighbors_1.22.0 tidygraph_1.3.1 progressr_0.14.0
[53] ggridges_0.5.6 survival_3.6-4 tools_4.4.0 ica_1.0-3
[57] Rcpp_1.0.12 glue_1.7.0 SparseArray_1.4.3 gridExtra_2.3
[61] xfun_0.44 MatrixGenerics_1.16.0 GenomeInfoDb_1.40.0 withr_3.0.0
[65] fastmap_1.2.0 fansi_1.0.6 digest_0.6.35 timechange_0.3.0
[69] R6_2.5.1 mime_0.12 colorspace_2.1-0 scattermore_1.2
[73] tensor_1.5 spatstat.data_3.0-4 utf8_1.2.4 generics_0.1.3
[77] data.table_1.15.4 S4Arrays_1.4.0 graphlayouts_1.1.1 httr_1.4.7
[81] htmlwidgets_1.6.4 uwot_0.2.2 pkgconfig_2.0.3 gtable_0.3.5
[85] lmtest_0.9-40 SingleCellExperiment_1.26.0 XVector_0.44.0 htmltools_0.5.8.1
[89] carData_3.0-5 dotCall64_1.1-1 Biobase_2.64.0 scales_1.3.0
[93] png_0.1-8 knitr_1.46 rstudioapi_0.16.0 tzdb_0.4.0
[97] reshape2_1.4.4 nlme_3.1-164 zoo_1.8-12 cachem_1.0.8
[101] KernSmooth_2.23-22 parallel_4.4.0 miniUI_0.1.1.1 pillar_1.9.0
[105] grid_4.4.0 vctrs_0.6.5 RANN_2.6.1 promises_1.3.0
[109] car_3.1-2 xtable_1.8-4 cluster_2.1.6 cli_3.6.3
[113] compiler_4.4.0 crayon_1.5.2 rlang_1.1.4 future.apply_1.11.2
[117] ggsignif_0.6.4 plyr_1.8.9 stringi_1.8.4 viridisLite_0.4.2
[121] deldir_2.0-4 BiocParallel_1.38.0 babelgene_22.9 munsell_0.5.1
[125] lazyeval_0.2.2 spatstat.geom_3.2-9 Matrix_1.7-0 RcppHNSW_0.6.0
[129] hms_1.1.3 future_1.33.2 shiny_1.8.1.1 SummarizedExperiment_1.34.0
[133] ROCR_1.0-11 igraph_2.0.3 broom_1.0.5 memoise_2.0.1
[137] fastmatch_1.1-4
Try running it on cluster 2 without the for loop. It will tell you if the loop is the problem or not.
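When isolating a single cluster like this, it can also help to look at the ranking vector itself before it goes into fgsea(). The thread does not say what the actual cause was, so the following is only a generic sanity check, not a diagnosis: a minimal base-R sketch that counts non-finite values, tied values, and duplicated gene names in a named ranking vector. The helper name check_ranks and the toy values are hypothetical stand-ins for the poster's data; with the real table, ranks would be the vector built by deframe(genes) for cluster 2.

```r
# Hypothetical helper (not part of fgsea): summarise a named numeric
# ranking vector before handing it to fgsea().
check_ranks <- function(ranks) {
  finite <- is.finite(ranks)
  vals <- ranks[finite]
  list(
    n_genes      = length(ranks),
    # NA, NaN, Inf and -Inf all count as non-finite
    n_non_finite = sum(!finite),
    # number of finite entries that share their value with another entry
    n_tied       = sum(duplicated(vals) | duplicated(vals, fromLast = TRUE)),
    # gene names that appear more than once
    n_dup_names  = sum(duplicated(names(ranks)))
  )
}

# Toy stand-in for one cluster's marker table (made-up values).
toy <- data.frame(
  gene      = c("A", "B", "C", "D", "E"),
  p_val_adj = c(1, 1, 0.04, 0, NA),
  stringsAsFactors = FALSE
)

# Same shape as the output of deframe(): values named by gene.
ranks <- setNames(toy$p_val_adj, toy$gene)

report <- check_ranks(ranks)
str(report)
```

If the counts for cluster 2 look very different from those for clusters 0 and 1, that difference is a reasonable thing to include in a bug report.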
Thanks for the suggestion. As I said, the loop runs for clusters 0 and 1 before it starts on cluster 2. I have also tried cluster 2 without the loop, and it still got stuck.
I have solved the problem using the suggestion from the maintainer on GitHub.