vignettes/cicero.Rmd
cicero.Rmd
In this vignette we will demonstrate how to find cis-co-accessible networks with Cicero using single-cell ATAC-seq data. Please see the Cicero website for information about Cicero.
To facilitate conversion between the Seurat (used by Signac) and CellDataSet (used by Cicero) formats, we will use a conversion function in the SeuratWrappers package available on GitHub.
We will use a single-cell ATAC-seq dataset containing human CD34+ hematopoietic stem and progenitor cells published by Satpathy and Granja et al. (2019, Nature Biotechnology). The processed datasets are available on NCBI GEO here: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE129785
This is the same dataset we used in the trajectory vignette, and we’ll start by loading the dataset that was created in that vignette. See the trajectory vignette for the code used to create the object from raw data.
First we will load their dataset and perform some standard preprocessing using Signac.
# load the object created in the Monocle 3 vignette
bone <- readRDS("../vignette_data/cd34.rds")
We can find cis-co-accessible networks (CCANs) using Cicero.
The Cicero developers have developed a separate branch of the package that works with a Monocle 3 CellDataSet
object. We will first make sure this branch is installed, then convert our Seurat
object for the whole bone marrow dataset to CellDataSet
format.
# Install Cicero
if (!requireNamespace("remotes", quietly = TRUE))
install.packages("remotes")
remotes::install_github("cole-trapnell-lab/cicero-release", ref = "monocle3")
library(cicero)
# convert to CellDataSet format and make the cicero object
bone.cds <- as.cell_data_set(x = bone)
bone.cicero <- make_cicero_cds(bone.cds, reduced_coordinates = reducedDims(bone.cds)$UMAP)
We’ll demonstrate running Cicero here using just one chromosome to save some time, but the same workflow can be applied to find CCANs for the whole genome.
Here we demonstrate the most basic workflow for running Cicero. This workflow can be broken down into several steps, each with parameters that can be changed from their defaults to fine-tune the Cicero algorithm depending on your data. We highly recommend that you explore the Cicero website, paper, and documentation for more information.
# get the chromosome sizes from the Seurat object
genome <- seqlengths(bone)
# use chromosome 1 to save some time
# omit this step to run on the whole genome
genome <- genome[1]
# convert chromosome sizes to a dataframe
genome.df <- data.frame("chr" = names(genome), "length" = genome)
# run cicero
conns <- run_cicero(bone.cicero, genomic_coords = genome.df, sample_num = 100)
## [1] "Starting Cicero"
## [1] "Calculating distance_parameter value"
## [1] "Running models"
## [1] "Assembling connections"
## [1] "Successful cicero models: 756"
## [1] "Other models: "
##
## Too many elements in range Zero or one element in range
## 156 86
## [1] "Models with errors: 0"
## [1] "Done"
head(conns)
## Peak1 Peak2 coaccess
## 1 chr1-100003337-100003837 chr1-99791719-99792219 0
## 2 chr1-100003337-100003837 chr1-99828699-99829199 0
## 3 chr1-100003337-100003837 chr1-99835542-99836042 0
## 4 chr1-100003337-100003837 chr1-99836217-99836717 0
## 5 chr1-100003337-100003837 chr1-99839576-99840076 0
## 6 chr1-100003337-100003837 chr1-99840640-99841140 0
Now that we’ve found pairwise co-accessibility scores for each peak, we can now group these pairwise connections into larger co-accessible networks using the generate_ccans()
function from Cicero.
ccans <- generate_ccans(conns)
## [1] "Coaccessibility cutoff used: 0.18"
head(ccans)
## Peak CCAN
## chr1-10009702-10010202 chr1-10009702-10010202 146
## chr1-100151188-100151688 chr1-100151188-100151688 1
## chr1-100164787-100165287 chr1-100164787-100165287 1
## chr1-100165566-100166066 chr1-100165566-100166066 1
## chr1-100191513-100192013 chr1-100191513-100192013 1
## chr1-10021208-10021708 chr1-10021208-10021708 8
We can add the co-accessible links found by Cicero to the ChromatinAssay
object in Seurat. Using the ConnectionsToLinks()
function in Signac we can convert the outputs of Cicero to the format needed to store in the links
slot in the ChromatinAssay
, and add this to the object using the Links<-
assignment function.
links <- ConnectionsToLinks(conns = conns, ccans = ccans)
Links(bone) <- links
We can now visualize these links along with DNA accessibility information by running CoveragePlot()
for a region:
CoveragePlot(bone, region = "chr1-40189344-40252549")
Thanks to the developers of Cicero, especially Cole Trapnell, Hannah Pliner, and members of the Trapnell lab. If you use Cicero please cite the Cicero paper.
## R version 4.0.1 (2020-06-06)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] grid stats4 parallel stats graphics grDevices utils
## [8] datasets methods base
##
## other attached packages:
## [1] cicero_1.3.4.11 Gviz_1.34.1
## [3] monocle3_1.0.0 SingleCellExperiment_1.12.0
## [5] SummarizedExperiment_1.20.0 GenomicRanges_1.42.0
## [7] GenomeInfoDb_1.26.5 IRanges_2.24.1
## [9] S4Vectors_0.28.1 MatrixGenerics_1.2.1
## [11] matrixStats_0.58.0 Biobase_2.50.0
## [13] BiocGenerics_0.36.0 patchwork_1.1.1
## [15] ggplot2_3.3.3 SeuratWrappers_0.3.0
## [17] SeuratObject_4.0.0 Seurat_4.0.1.9005
## [19] Signac_1.2.0
##
## loaded via a namespace (and not attached):
## [1] utf8_1.2.1 reticulate_1.19 tidyselect_1.1.0
## [4] RSQLite_2.2.7 AnnotationDbi_1.52.0 htmlwidgets_1.5.3
## [7] docopt_0.7.1 BiocParallel_1.24.1 Rtsne_0.15
## [10] munsell_0.5.0 codetools_0.2-18 ragg_1.1.2
## [13] ica_1.0-2 future_1.21.0 miniUI_0.1.1.1
## [16] withr_2.4.2 colorspace_2.0-0 highr_0.9
## [19] knitr_1.33 rstudioapi_0.13 ROCR_1.0-11
## [22] tensor_1.5 listenv_0.8.0 labeling_0.4.2
## [25] slam_0.1-48 GenomeInfoDbData_1.2.4 polyclip_1.10-0
## [28] bit64_4.0.5 farver_2.1.0 rprojroot_2.0.2
## [31] parallelly_1.24.0 vctrs_0.3.7 generics_0.1.0
## [34] xfun_0.22 biovizBase_1.38.0 BiocFileCache_1.14.0
## [37] lsa_0.73.2 ggseqlogo_0.1 R6_2.5.0
## [40] rsvd_1.0.5 VGAM_1.1-5 AnnotationFilter_1.14.0
## [43] bitops_1.0-7 spatstat.utils_2.1-0 cachem_1.0.4
## [46] DelayedArray_0.16.3 assertthat_0.2.1 promises_1.2.0.1
## [49] scales_1.1.1 nnet_7.3-15 gtable_0.3.0
## [52] globals_0.14.0 goftest_1.2-2 ensembldb_2.14.0
## [55] rlang_0.4.10 systemfonts_1.0.1 RcppRoll_0.3.0
## [58] splines_4.0.1 rtracklayer_1.50.0 lazyeval_0.2.2
## [61] dichromat_2.0-0 checkmate_2.0.0 spatstat.geom_2.1-0
## [64] BiocManager_1.30.12 yaml_2.2.1 reshape2_1.4.4
## [67] abind_1.4-5 GenomicFeatures_1.42.3 backports_1.2.1
## [70] httpuv_1.6.0 Hmisc_4.5-0 tools_4.0.1
## [73] ellipsis_0.3.1 spatstat.core_2.1-2 jquerylib_0.1.4
## [76] RColorBrewer_1.1-2 ggridges_0.5.3 Rcpp_1.0.6
## [79] plyr_1.8.6 base64enc_0.1-3 progress_1.2.2
## [82] zlibbioc_1.36.0 purrr_0.3.4 RCurl_1.98-1.3
## [85] prettyunits_1.1.1 openssl_1.4.3 rpart_4.1-15
## [88] deldir_0.2-10 pbapply_1.4-3 viridis_0.6.0
## [91] cowplot_1.1.1 zoo_1.8-9 ggrepel_0.9.1
## [94] cluster_2.1.2 fs_1.5.0 magrittr_2.0.1
## [97] data.table_1.14.0 scattermore_0.7 lmtest_0.9-38
## [100] RANN_2.6.1 SnowballC_0.7.0 ProtGenerics_1.22.0
## [103] fitdistrplus_1.1-3 hms_1.0.0 mime_0.10
## [106] evaluate_0.14 xtable_1.8-4 XML_3.99-0.6
## [109] jpeg_0.1-8.1 sparsesvd_0.2 gridExtra_2.3
## [112] compiler_4.0.1 biomaRt_2.46.3 tibble_3.1.1
## [115] KernSmooth_2.23-18 crayon_1.4.1 htmltools_0.5.1.1
## [118] mgcv_1.8-33 later_1.2.0 Formula_1.2-4
## [121] tidyr_1.1.3 DBI_1.1.1 tweenr_1.0.2
## [124] dbplyr_2.1.1 rappdirs_0.3.3 MASS_7.3-53.1
## [127] Matrix_1.3-2 igraph_1.2.6 pkgconfig_2.0.3
## [130] pkgdown_1.6.1 GenomicAlignments_1.26.0 foreign_0.8-81
## [133] plotly_4.9.3 spatstat.sparse_2.0-0 xml2_1.3.2
## [136] bslib_0.2.4 XVector_0.30.0 VariantAnnotation_1.36.0
## [139] stringr_1.4.0 digest_0.6.27 sctransform_0.3.2
## [142] RcppAnnoy_0.0.18 spatstat.data_2.1-0 Biostrings_2.58.0
## [145] rmarkdown_2.7 leiden_0.3.7 fastmatch_1.1-0
## [148] htmlTable_2.1.0 uwot_0.1.10 curl_4.3
## [151] shiny_1.6.0 Rsamtools_2.6.0 lifecycle_1.0.0
## [154] nlme_3.1-152 jsonlite_1.7.2 askpass_1.1
## [157] desc_1.3.0 viridisLite_0.4.0 BSgenome_1.58.0
## [160] fansi_0.4.2 pillar_1.6.0 lattice_0.20-41
## [163] fastmap_1.1.0 httr_1.4.2 survival_3.2-11
## [166] glue_1.4.2 remotes_2.3.0 FNN_1.1.3
## [169] qlcMatrix_0.9.7 png_0.1-7 bit_4.0.4
## [172] ggforce_0.3.3 stringi_1.5.3 sass_0.3.1
## [175] blob_1.2.1 textshaping_0.3.3 latticeExtra_0.6-29
## [178] memoise_2.0.0 dplyr_1.0.5 irlba_2.3.3
## [181] future.apply_1.7.0