vignettes/future.Rmd
Parallel computing is supported in Signac through the future package,
making it easy to specify different parallelization options. Here we
demonstrate parallelization of the FeatureMatrix function
and show some benchmark results to get a sense for the amount of speedup
you might expect.
The Seurat package also
uses future for parallelization, and you can see the Seurat
vignette
for more information.
The following functions currently enable parallelization in Signac:
Parallelization can be enabled simply by loading the
future package and setting the plan.
## sequential:
## - args: function (..., envir = parent.frame(), workers = "<NULL>")
## - tweaked: FALSE
## - call: plan(sequential)
## FutureBackend to be launched
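The listing above is the kind of summary that `plan()` prints for the current strategy. As a minimal sketch, assuming the future package is installed, the current plan can be inspected without changing it like this (the variable name `current` is only illustrative):

```r
library(future)

# Called with no arguments, plan() returns the strategy currently in
# effect and leaves it unchanged.
current <- plan()

# Printing the returned object shows a summary of the strategy.
print(current)
```

In a fresh R session with no plan configured, this is expected to report the sequential strategy.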
By default the plan is set to sequential processing (no
parallelization). We can change this to multicore or
multisession to get asynchronous processing, and set the
number of workers to change the number of cores used.
## multicore:
## - args: function (..., workers = 10)
## - tweaked: TRUE
## - call: plan("multicore", workers = 10)
## MulticoreFutureBackend:
## Inherits: MultiprocessFutureBackend, FutureBackend
## UUID: 79059e0c06182cbd4397f0ec85737426
## Number of workers: 10
## Number of free workers: 10
## Available cores: 8
## Automatic garbage collection: FALSE
## Early signaling: FALSE
## Interrupts are enabled: TRUE
## Maximum total size of globals: +Inf
## Maximum total size of value: +Inf
## Argument 'wait.timeout': 86400
## Argument 'wait.interval': 0.01
## Argument 'wait.alpha': 1.01
## Argument 'hooks': FALSE
## Number of active futures: 0
## Number of futures since start: 0 (0 created, 0 launched, 0 finished)
## Total runtime of futures: 0 secs (NaN secs/finished future)
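The output above reports 10 workers on a machine with 8 available cores. If you would rather not request more workers than the machine offers, one option is to cap the request with `availableCores()`. The following is a sketch under that assumption, not a recommendation from the Signac authors; `requested`, `workers`, `old_plan`, and `n_workers` are illustrative names.

```r
library(future)

# Illustrative request; the benchmark in this vignette uses 4 and 10 workers.
requested <- 10

# availableCores() reports how many cores this R session may use.
workers <- min(requested, as.integer(availableCores()))

# Setting a new plan returns the previous one, so it can be restored later.
old_plan <- plan("multisession", workers = workers)

# nbrOfWorkers() reports the number of workers in the current plan.
n_workers <- nbrOfWorkers()
print(n_workers)

# Restore the previous plan when finished.
plan(old_plan)
```

Restoring the previous plan at the end keeps the change local to the code that needs it.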
You might also need to increase the maximum memory usage:
options(future.globals.maxSize = 50 * 1024 ^ 3) # for 50 Gb RAM
Note that as of future version 1.14.0,
forked processing is disabled when running in RStudio. To enable
parallel computing in RStudio, you will need to select the
“multisession” option.
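The two points above (choosing a backend that works in the current environment, and raising the size limit for exported globals) can be combined in a short setup step. This is a sketch, assuming the future and parallelly packages are installed; the worker count of 2 and the variable names are illustrative only.

```r
library(future)

# parallelly::supportsMulticore() reports whether forked processing is
# supported in this session (it is not on Windows, and it is disabled
# by default inside RStudio).
if (parallelly::supportsMulticore()) {
  old_plan <- plan("multicore", workers = 2)
} else {
  old_plan <- plan("multisession", workers = 2)
}

# future.globals.maxSize is given in bytes: 50 * 1024^3 bytes is 50 GiB.
max_bytes <- 50 * 1024^3
old_opts <- options(future.globals.maxSize = max_bytes)

# ... run the parallelized functions here ...

# Restore the previous option and plan when finished.
options(old_opts)
plan(old_plan)
```

Checking for multicore support first means the same script can run unchanged from a terminal and from RStudio.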
Here we demonstrate the runtime of FeatureMatrix run on
144,023 peaks for 9,688 human PBMCs under different parallelization
options:
The following code was run on RHEL with an Intel Platinum 8268 CPU @ 2.00GHz.
# download data
wget https://cf.10xgenomics.com/samples/cell-atac/2.0.0/atac_pbmc_10k_nextgem/atac_pbmc_10k_nextgem_fragments.tsv.gz
wget https://cf.10xgenomics.com/samples/cell-atac/2.0.0/atac_pbmc_10k_nextgem/atac_pbmc_10k_nextgem_fragments.tsv.gz.tbi
wget https://cf.10xgenomics.com/samples/cell-atac/2.0.0/atac_pbmc_10k_nextgem/atac_pbmc_10k_nextgem_peaks.bed
wget https://cf.10xgenomics.com/samples/cell-atac/2.0.0/atac_pbmc_10k_nextgem/atac_pbmc_10k_nextgem_singlecell.csv
library(Signac)
# load data
fragments <- "../vignette_data/atac_pbmc_10k_nextgem_fragments.tsv.gz"
peaks.10k <- read.table(
  file = "../vignette_data/atac_pbmc_10k_nextgem_peaks.bed",
  col.names = c("chr", "start", "end")
)
peaks <- GenomicRanges::makeGRangesFromDataFrame(peaks.10k)
md <- read.csv("../vignette_data/atac_pbmc_10k_nextgem_singlecell.csv", row.names = 1, header = TRUE)[-1, ]
cells <- rownames(md[md[['is__cell_barcode']] == 1, ])
fragments <- CreateFragmentObject(path = fragments, cells = cells, validate.fragments = FALSE)
# set number of replicates
nrep <- 5
results <- data.frame()
process_n <- 2000
# run sequentially
timing.sequential <- c()
for (i in seq_len(nrep)) {
  start <- Sys.time()
  fmat <- FeatureMatrix(fragments = fragments, features = peaks, cells = cells, process_n = process_n)
  timing.sequential <- c(timing.sequential, as.numeric(Sys.time() - start, units = "secs"))
}
res <- data.frame(
  "setting" = rep("Sequential", nrep),
  "cores" = rep(1, nrep),
  "replicate" = seq_len(nrep),
  "time" = timing.sequential
)
results <- rbind(results, res)
# 4 core
library(future)
plan("multicore", workers = 4)
options(future.globals.maxSize = 100000 * 1024^2)
timing.4core <- c()
for (i in seq_len(nrep)) {
  start <- Sys.time()
  fmat <- FeatureMatrix(fragments = fragments, features = peaks, cells = cells, process_n = process_n)
  timing.4core <- c(timing.4core, as.numeric(Sys.time() - start, units = "secs"))
}
res <- data.frame(
  "setting" = rep("Parallel", nrep),
  "cores" = rep(4, nrep),
  "replicate" = seq_len(nrep),
  "time" = timing.4core
)
results <- rbind(results, res)
# 10 core
plan("multicore", workers = 10)
timing.10core <- c()
for (i in seq_len(nrep)) {
  start <- Sys.time()
  fmat <- FeatureMatrix(fragments = fragments, features = peaks, cells = cells, process_n = process_n)
  timing.10core <- c(timing.10core, as.numeric(Sys.time() - start, units = "secs"))
}
res <- data.frame(
  "setting" = rep("Parallel", nrep),
  "cores" = rep(10, nrep),
  "replicate" = seq_len(nrep),
  "time" = timing.10core
)
results <- rbind(results, res)
# save results
write.table(
  x = results,
  file = paste0("../vignette_data/pbmc10k/timings_", Sys.Date(), ".tsv"),
  quote = FALSE,
  row.names = FALSE
)
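Once the timings are collected, one way to get a sense of the speedup is to average the replicates per core count and divide the single-core mean by each average. The helper below is a hypothetical sketch in base R (`summarize_timings` is not part of Signac); it assumes a data frame with the `cores` and `time` columns built above and exactly one sequential setting with `cores == 1`.

```r
# Hypothetical helper (not part of Signac): summarize benchmark timings.
# Expects a data frame with the columns 'cores' and 'time' (in seconds).
summarize_timings <- function(results) {
  # Mean and standard deviation of the runtime for each core count.
  means <- aggregate(time ~ cores, data = results, FUN = mean)
  sds <- aggregate(time ~ cores, data = results, FUN = sd)
  out <- data.frame(
    cores = means$cores,
    mean_secs = means$time,
    sd_secs = sds$time
  )
  # Speedup relative to the single-core (sequential) mean.
  baseline <- out$mean_secs[out$cores == 1]
  stopifnot(length(baseline) == 1)
  out$speedup <- baseline / out$mean_secs
  out
}
```

Calling `summarize_timings(results)` on the data frame assembled by the benchmark returns one row per core count, with the sequential run at a speedup of 1.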
## R version 4.5.1 (2025-06-13)
## Platform: aarch64-apple-darwin20
## Running under: macOS Tahoe 26.0.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Asia/Singapore
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_4.0.0 future_1.67.0
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6       jsonlite_2.0.0     dplyr_1.1.4        compiler_4.5.1    
##  [5] tidyselect_1.2.1   parallel_4.5.1     dichromat_2.0-0.1  jquerylib_0.1.4   
##  [9] globals_0.18.0     systemfonts_1.3.1  scales_1.4.0       textshaping_1.0.3 
## [13] yaml_2.3.10        fastmap_1.2.0      R6_2.6.1           labeling_0.4.3    
## [17] generics_0.1.4     knitr_1.50         htmlwidgets_1.6.4  tibble_3.3.0      
## [21] desc_1.4.3         pillar_1.11.1      bslib_0.9.0        RColorBrewer_1.1-3
## [25] rlang_1.1.6        cachem_1.1.0       xfun_0.53          fs_1.6.6          
## [29] sass_0.4.10        S7_0.2.0           cli_3.6.5          withr_3.0.2       
## [33] magrittr_2.0.4     pkgdown_2.1.3      digest_0.6.37      grid_4.5.1        
## [37] lifecycle_1.0.4    vctrs_0.6.5        evaluate_1.0.5     glue_1.8.0        
## [41] farver_2.1.2       listenv_0.9.1      codetools_0.2-20   ragg_1.5.0        
## [45] parallelly_1.45.1  rmarkdown_2.30     pkgconfig_2.0.3    tools_4.5.1       
## [49] htmltools_0.5.8.1