Hello,
I am attempting to use the "preprocessing with CATALYST" workflow for CyTOF data, presented here: https://bioconductor.org/packages/release/bioc/vignettes/CATALYST/inst/doc/preprocessing.html#compcytof-compensation-of-mass-cytometry-data
It seems that I have successfully read in the raw FCS files and performed bead normalization and debarcoding. When I attempt to convert the data back into a flowSet using sce2fcs(), I get warnings stating that some data values in various channels exceed their $PnR value and will be truncated. I'm not sure whether this matters, but I would like to know if I've done something incorrectly to trigger these warnings. My code, output, and session info are below.
Thank you.
###Preprocessing
#Bead Normalization (Fluidigm 140, 151, 153, 165, 175)
set.seed(1234)
raw_data <- read.flowSet(path = "/Users/SP/Library/Mobile Documents/com~apple~CloudDocs/R/IRIS/raw_fcs_rebecca",
transformation = FALSE, truncate_max_range = FALSE, which.lines = 5000)
sce <- prepData(raw_data)
table(sce$sample_id)
names(int_colData(sce))
res <- normCytof(sce, beads = "dvs", k = 50, assays = c("counts", "exprs"), overwrite = FALSE)
n <- ncol(sce); ns <- c(ncol(res$beads), ncol(res$removed))
data.frame(check.names = FALSE, "#" = c(ns[1], ns[2]), "%" = 100*c(ns[1]/n, ns[2]/n), row.names = c("beads", "removed"))
sce <- res$data
assayNames(sce)
res$scatter
res$lines
#Debarcoding
samp_key <- read_csv(file = "/Users/SP/Library/Mobile Documents/com~apple~CloudDocs/R/rebecca_barcoding_key_v2.csv") %>% as.data.frame()
str(samp_key)
sce <- assignPrelim(sce, samp_key)
rownames(sce)[rowData(sce)$is_bc]
table(sce$bc_id)
sce <- estCutoffs(sce)
metadata(sce)$sep_cutoffs
plotYields(sce, which = c(0, "1"))
sce2 <- applyCutoffs(sce)
sce3 <- applyCutoffs(sce, mhl_cutoff = 20, sep_cutoffs = 0.3)
c(specific = mean(sce2$bc_id != 0), global = mean(sce3$bc_id != 0))
sce <- sce2
plotEvents(sce, which = c(0, "1"), n = 25)
plotEvents(sce, which = "all", n = 25)
#Convert back to flowSet
(fs <- sce2fcs(sce, split_by = "sample_id", truncate_max_range = FALSE))
all(c(fsApply(fs, nrow)) == table(sce$sample_id))
ids <- fsApply(fs, identifier)
for (id in ids) {
  ff <- fs[[id]] # subset 'flowFrame'
  fn <- sprintf("sample_%s.fcs", id) # specify output name that includes ID
  fn <- file.path("/Users/SP/Library/Mobile Documents/com~apple~CloudDocs/R/IRIS/normalized_debarcoded_fs", fn) # construct output path
  write.FCS(ff, fn) # write frame to FCS
}
When I convert back to flowSet, I get the following output:
(fs <- sce2fcs(sce, split_by = "sample_id"))
      orig_channel_name new_channel_name
$P8N          live-dead      live-dead-1
$P10N         live-dead      live-dead-2
$P11N         live-dead      live-dead-3
$P12N         live-dead      live-dead-4
$P13N         live-dead      live-dead-5
$P14N         live-dead      live-dead-6
etcetera
A flowSet with 27 experiments.
column names(69): Time length ... Pb208Di Bi209Di
There were 50 or more warnings (use warnings() to see the first 50)
warnings()
Warning messages:
1: In update_channel_by_alias(cn, channel_alias) :
  channel_alias: Multiple channels from one FCS are matched to the same alias! Integer suffixes added to disambiguate channels. It is also recommended to verify correct mapping of spillover matrix columns.
2: In readFCSdata(con, offsets, txt, transformation, which.lines, ... :
  Some data values of 'live-dead-2' channel exceed its $PnR value 14 and will be truncated! To avoid truncation, either fix $PnR before generating FCS or set 'truncate_max_range = FALSE'
3: In readFCSdata(con, offsets, txt, transformation, which.lines, ... :
  Some data values of 'live-dead-3' channel exceed its $PnR value 14 and will be truncated! To avoid truncation, either fix $PnR before generating FCS or set 'truncate_max_range = FALSE'
4: In readFCSdata(con, offsets, txt, transformation, which.lines, ... :
  Some data values of 'live-dead-4' channel exceed its $PnR value 14 and will be truncated! To avoid truncation, either fix $PnR before generating FCS or set 'truncate_max_range = FALSE'
5: In readFCSdata(con, offsets, txt, transformation, which.lines, ... :
etcetera
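For context, what the truncation warning reports can be sketched in base R as a comparison of each channel's maximum against its declared range. This is only an illustration on made-up numbers, not flowCore's actual implementation: the matrix `vals` and the vector `pnr` are hypothetical stand-ins, and the value 14 is copied from the warnings above.

```r
# Toy stand-in for one sample's expression matrix (events x channels).
# With a real flowFrame `ff`, exprs(ff) would supply this matrix and
# pData(parameters(ff))$range the declared ranges (assumed here, not run).
vals <- cbind(
  "live-dead-1" = c(0.5, 3.2, 9.8),
  "live-dead-2" = c(1.1, 20.4, 7.0),
  "live-dead-3" = c(0.0, 2.5, 15.3)
)

# Hypothetical declared range per channel; 14 mirrors the $PnR in the warnings.
pnr <- c("live-dead-1" = 14, "live-dead-2" = 14, "live-dead-3" = 14)

# Per-channel maximum, compared against the declared range of the same channel.
col_max <- apply(vals, 2, max)
exceeds <- col_max > pnr[colnames(vals)]

# Channels whose observed maximum is above the declared range.
names(exceeds)[exceeds]
```

Channels listed by the last line are the ones a reader with truncation enabled would clip to the declared range.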
sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Ventura 13.2
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] shiny_1.7.4 premessa_0.3.4 mbkmeans_1.12.0 bluster_1.6.0 scran_1.24.1 scuttle_1.8.4 forcats_0.5.1
[8] stringr_1.5.0 dplyr_1.1.0 purrr_1.0.1 readr_2.1.2 tidyr_1.3.0 tibble_3.1.8 tidyverse_1.3.2
[15] plotly_4.10.1 ggplot2_3.4.1 readxl_1.4.1 diffcyt_1.16.0 ConsensusClusterPlus_1.60.0 FlowSOM_2.4.0 igraph_1.4.0
[22] CATALYST_1.20.1 HDCytoData_1.16.0 ExperimentHub_2.4.0 AnnotationHub_3.4.0 BiocFileCache_2.4.0 dbplyr_2.2.1 SingleCellExperiment_1.20.0
[29] SummarizedExperiment_1.28.0 Biobase_2.58.0 GenomicRanges_1.50.2 GenomeInfoDb_1.34.6 IRanges_2.32.0 S4Vectors_0.36.1 BiocGenerics_0.44.0
[36] MatrixGenerics_1.10.0 matrixStats_0.63.0 flowCore_2.8.0
loaded via a namespace (and not attached):
[1] rappdirs_0.3.3 ClusterR_1.3.0 scattermore_0.8 flowWorkspace_4.8.0 bit64_4.0.5 irlba_2.3.5.1
[7] multcomp_1.4-22 DelayedArray_0.24.0 data.table_1.14.8 KEGGREST_1.36.3 RCurl_1.98-1.10 doParallel_1.0.17
[13] generics_0.1.3 ScaledMatrix_1.6.0 cowplot_1.1.1 TH.data_1.1-1 usethis_2.1.6 RSQLite_2.3.0
[19] ggpointdensity_0.1.0 tzdb_0.3.0 bit_4.0.5 lubridate_1.8.0 xml2_1.3.3 httpuv_1.6.9
[25] assertthat_0.2.1 viridis_0.6.2 gargle_1.2.0 jquerylib_0.1.4 hms_1.1.1 promises_1.2.0.1
[31] fansi_1.0.4 Rgraphviz_2.40.0 DBI_1.1.3 htmlwidgets_1.6.1 googledrive_2.0.0 benchmarkmeData_1.0.4
[37] ellipsis_0.3.2 ggcyto_1.24.1 ggnewscale_0.4.8 ggpubr_0.6.0 backports_1.4.1 cytolib_2.8.0
[43] RcppParallel_5.1.6 deldir_1.0-6 sparseMatrixStats_1.10.0 vctrs_0.5.2 remotes_2.4.2 abind_1.4-5
[49] cachem_1.0.6 withr_2.5.0 ggforce_0.4.1 aws.signature_0.6.0 vroom_1.5.7 cluster_2.1.3
[55] lazyeval_0.2.2 crayon_1.5.2 drc_3.0-1 labeling_0.4.2 edgeR_3.38.4 pkgconfig_2.0.3
[61] tweenr_2.0.2 nlme_3.1-157 vipor_0.4.5 rlang_1.0.6 lifecycle_1.0.3 sandwich_3.0-2
[67] filelock_1.0.2 modelr_0.1.8 rsvd_1.0.5 cellranger_1.1.0 polyclip_1.10-4 graph_1.74.0
[73] Matrix_1.5-3 carData_3.0-5 boot_1.3-28 zoo_1.8-11 reprex_2.0.2 base64enc_0.1-3
[79] beeswarm_0.4.0 ggridges_0.5.4 GlobalOptions_0.1.2 googlesheets4_1.0.1 pheatmap_1.0.12 png_0.1-8
[85] viridisLite_0.4.1 rjson_0.2.21 bitops_1.0-7 Biostrings_2.64.1 blob_1.2.3 DelayedMatrixStats_1.20.0
[91] shape_1.4.6 shinyjqui_0.4.1 jpeg_0.1-10 rstatix_0.7.2 ggsignif_0.6.4 aws.s3_0.3.21
[97] beachmat_2.14.0 scales_1.2.1 memoise_2.0.1 magrittr_2.0.3 plyr_1.8.8 hexbin_1.28.2
[103] zlibbioc_1.44.0 compiler_4.2.1 dqrng_0.3.0 RColorBrewer_1.1-3 plotrix_3.8-2 clue_0.3-64
[109] lme4_1.1-31 cli_3.6.0 XVector_0.38.0 ncdfFlow_2.42.1 MASS_7.3-57 tidyselect_1.2.0
[115] stringi_1.7.12 RProtoBufLib_2.8.0 yaml_2.3.7 BiocSingular_1.14.0 locfit_1.5-9.7 latticeExtra_0.6-30
[121] ggrepel_0.9.3 grid_4.2.1 sass_0.4.5 tools_4.2.1 parallel_4.2.1 CytoML_2.8.1
[127] circlize_0.4.15 rstudioapi_0.13 foreach_1.5.2 metapod_1.4.0 gridExtra_2.3 farver_2.1.1
[133] Rtsne_0.16 digest_0.6.31 BiocManager_1.30.19 Rcpp_1.0.10 car_3.1-1 broom_1.0.0
[139] BiocVersion_3.15.2 later_1.3.0 httr_1.4.4 AnnotationDbi_1.58.0 ComplexHeatmap_2.12.1 colorspace_2.1-0
[145] rvest_1.0.2 XML_3.99-0.13 fs_1.6.1 reticulate_1.28 splines_4.2.1 statmod_1.5.0
[151] RBGL_1.72.0 scater_1.24.0 gmp_0.7-1 xtable_1.8-4 jsonlite_1.8.4 nloptr_2.0.3
[157] benchmarkme_1.0.8 R6_2.5.1 pillar_1.8.1 htmltools_0.5.4 mime_0.12 nnls_1.4
[163] glue_1.6.2 fastmap_1.1.0 minqa_1.2.5 BiocParallel_1.32.5 BiocNeighbors_1.16.0 interactiveDisplayBase_1.34.0
[169] codetools_0.2-18 mvtnorm_1.1-3 utf8_1.2.2 bslib_0.4.2 lattice_0.20-45 curl_5.0.0
[175] ggbeeswarm_0.7.1 colorRamps_2.3.1 gtools_3.9.4 interp_1.1-3 survival_3.3-1 limma_3.54.0
[181] munsell_0.5.0 GetoptLong_1.0.5 GenomeInfoDbData_1.2.9 iterators_1.0.14 haven_2.5.0 reshape2_1.4.4
[187] gtable_0.3.1