Visualization of genomic regions

In this vignette we will demonstrate how to visualize single-cell data in genome-browser-track style plots with Signac.

To demonstrate, we’ll use the human PBMC dataset processed in the PBMC vignette.

library(Signac)

# load PBMC dataset
pbmc <- readRDS("../vignette_data/pbmc.rds")

Several genome-browser-style plot types are available in Signac, including accessibility tracks, gene annotations, peak coordinates, genomic links, and fragment positions.

Plotting aggregated signal

The main plotting function in Signac is CoveragePlot(), which computes the averaged frequency of sequenced DNA fragments for different groups of cells within a given genomic region.

cov_plot <- CoveragePlot(
  object = pbmc,
  region = "chr2-87011729-87035519",
  annotation = FALSE,
  peaks = FALSE
)
cov_plot
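
To make the idea of an aggregated signal concrete, here is a minimal base R sketch that averages fragment coverage per group of cells over a toy region. It is deliberately simplified and is not Signac's implementation: the fragments, cell names, and group labels are made up, and any normalization or smoothing that CoveragePlot() applies is left out.

```r
# Simulated fragments: one row per fragment, with the cell it came from.
fragments <- data.frame(
  cell  = c("c1", "c1", "c2", "c3", "c3", "c4"),
  start = c(1, 4, 2, 6, 7, 3),
  end   = c(5, 8, 6, 9, 10, 4)
)

# Hypothetical group label for each cell.
groups <- c(c1 = "A", c2 = "A", c3 = "B", c4 = "B")

# Toy region spanning positions 1 to 10.
positions <- 1:10

# For each group, count the fragments overlapping each position,
# then divide by the number of cells in that group.
coverage_by_group <- sapply(unique(groups), function(g) {
  cells <- names(groups)[groups == g]
  frags <- fragments[fragments$cell %in% cells, ]
  counts <- vapply(
    positions,
    function(p) sum(frags$start <= p & frags$end >= p),
    numeric(1)
  )
  counts / length(cells)
})
rownames(coverage_by_group) <- positions
coverage_by_group
```

Each column of the resulting matrix corresponds to one track in the coverage plot: a per-position signal averaged over the cells in that group.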

We can also request regions of the genome by gene name. This uses the gene coordinates stored in the Seurat object to determine which genomic region to plot:

CoveragePlot(
  object = pbmc,
  region = "CD8A",
  annotation = FALSE,
  peaks = FALSE
)
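
Both calls above pass the region as a string, either a gene name or coordinates written as chromosome-start-end. As an illustration of the coordinate format only, the base R sketch below splits such a string into its parts. The helper parse_region() is hypothetical and not part of Signac; for real use, Signac provides its own conversion function, StringToGRanges().

```r
# Hypothetical helper (not part of Signac): split "chromosome-start-end"
# into its three components.
parse_region <- function(region, sep = "-") {
  parts <- strsplit(region, split = sep, fixed = TRUE)[[1]]
  if (length(parts) != 3) {
    stop("expected a region of the form chromosome-start-end")
  }
  list(
    chromosome = parts[1],
    start = as.integer(parts[2]),
    end = as.integer(parts[3])
  )
}

region <- parse_region("chr2-87011729-87035519")
region$chromosome
region$end - region$start + 1  # width of the region in base pairs
```

Note that this simple split assumes the chromosome name itself contains no hyphen.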

Plotting gene annotations

Gene annotations within a given genomic region can be plotted using the AnnotationPlot() function.

gene_plot <- AnnotationPlot(
  object = pbmc,
  region = "chr2-87011729-87035519"
)
gene_plot

Plotting peak coordinates

Peak coordinates within a genomic region can be plotted using the PeakPlot() function.

peak_plot <- PeakPlot(
  object = pbmc,
  region = "chr2-87011729-87035519"
)
peak_plot

Plotting genomic links

Relationships between genomic positions can be plotted using the LinkPlot() function. This will display an arc connecting two linked positions, with the transparency of the arc line proportional to a score associated with the link. These links could be used to encode different things, including regulatory relationships (for example, linking enhancers to the genes that they regulate), or experimental data such as Hi-C.

Just to demonstrate how the function works, we’ve created a fake link here and added it to the PBMC dataset.

link_plot <- LinkPlot(
  object = pbmc,
  region = "chr2-87011729-87035519"
)
link_plot

Plotting per-cell fragment abundance

While the CoveragePlot() function computes an aggregated signal within a genomic region for different groups of cells, sometimes it’s also useful to inspect the frequency of sequenced fragments within a genomic region for individual cells, without aggregation. This can be done using the TilePlot() function.

tile_plot <- TilePlot(
  object = pbmc,
  region = "chr2-87011729-87035519",
  idents = c("CD4 Memory", "CD8 Effector")
)
tile_plot

By default, this selects the top 100 cells for each group based on the total number of fragments in the genomic region. The genomic region is then tiled, the fragments in each tile are counted for each cell, and the resulting counts are displayed as a heatmap.
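
The tiling and cell-selection steps can be sketched in base R as follows. This is an illustration under stated assumptions rather than the code TilePlot() runs: the fragment positions are made up, each fragment is represented by a single position, and the tile width and number of cells kept are arbitrary toy values.

```r
# Simulated fragments in a region spanning positions 1 to 100,
# one row per fragment, each reduced to a single position.
fragments <- data.frame(
  cell = c("c1", "c1", "c1", "c2", "c2", "c3"),
  position = c(5, 12, 95, 40, 41, 70)
)

region_start <- 1
region_end <- 100
tile_size <- 25  # assumed tile width for this toy example
top_n <- 2       # keep the 2 cells with the most fragments in the region

# Assign each fragment to a tile, keeping empty tiles as factor levels.
n_tiles <- ceiling((region_end - region_start + 1) / tile_size)
tile <- factor(
  (fragments$position - region_start) %/% tile_size + 1,
  levels = seq_len(n_tiles)
)

# Cell-by-tile count matrix.
counts <- table(cell = fragments$cell, tile = tile)

# Rank cells by total fragments in the region and keep the top ones.
totals <- rowSums(counts)
top_cells <- names(sort(totals, decreasing = TRUE))[seq_len(top_n)]
tile_matrix <- counts[top_cells, , drop = FALSE]
tile_matrix
```

Each row of tile_matrix corresponds to one row of the heatmap, and each column to one tile along the region.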

Plotting additional data alongside genomic tracks

Multimodal single-cell datasets generate multiple experimental measurements for each cell. Several methods can now measure single-cell chromatin data (such as chromatin accessibility) alongside other measurements from the same cell, such as gene expression or mitochondrial genotype. In these cases it’s often informative to visualize the multimodal data together in a single plot. This can be achieved using the ExpressionPlot() function. It is similar to the VlnPlot() function in Seurat, but is designed to be combined with genomic track plots generated by CoveragePlot().

expr_plot <- ExpressionPlot(
  object = pbmc,
  features = "CD8A",
  assay = "RNA"
)
expr_plot

We can create similar plots for multiple genes at once by passing a vector of gene names:

ExpressionPlot(
  object = pbmc,
  features = c("CD8A", "CD4"),
  assay = "RNA"
)

Combining genomic tracks

Above we’ve demonstrated how to generate individual tracks and panels for a single genomic region. These panels can be combined into a single plot using the CombineTracks() function.

CombineTracks(
  plotlist = list(cov_plot, tile_plot, peak_plot, gene_plot, link_plot),
  expression.plot = expr_plot,
  heights = c(10, 6, 1, 2, 3),
  widths = c(10, 1)
)

The heights and widths parameters control the relative heights and widths of the individual tracks, according to the order that the tracks appear in the plotlist. The CombineTracks() function ensures that the tracks are aligned vertically and horizontally, and moves the x-axis labels (describing the genomic position) to the bottom of the combined tracks.
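
Because the heights and widths are relative, only their ratios matter. The short base R sketch below works out the share of the figure each track would receive for the values used above, assuming the numbers are interpreted purely as proportions. The track names are labels added here for readability.

```r
# Relative heights from the CombineTracks() call, in plotlist order.
heights <- c(coverage = 10, tile = 6, peaks = 1, genes = 2, links = 3)

# Share of the total vertical space given to each track.
height_share <- heights / sum(heights)
round(height_share, 3)

# Relative widths: genomic tracks versus the expression panel.
widths <- c(tracks = 10, expression = 1)
width_share <- widths / sum(widths)
round(width_share, 3)
```

Under that assumption, doubling every value in heights would leave the layout unchanged.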

Generating multiple tracks

Above we’ve shown how to create genomic plot panels individually and how to combine them. This allows more control over how each panel is constructed and how they’re combined, but involves multiple steps. For convenience, the CoveragePlot() function can generate and combine different panels automatically through the annotation, peaks, tile, links, and features arguments. We can generate a plot similar to the one shown above in a single function call:

CoveragePlot(
  object = pbmc,
  region = "chr2-87011729-87035519",
  features = "CD8A",
  annotation = TRUE,
  peaks = TRUE,
  tile = TRUE,
  links = TRUE
)

Notice that in this example we create the tile plot for every group of cells that is shown in the coverage track, whereas above we were able to create a plot that showed the aggregated coverage for all groups of cells and the tile plot for only the CD4 memory cells and the CD8 effector cells. A higher degree of customization is possible when creating each track separately.

Interactive visualization

Above we demonstrated the different types of plots that can be constructed using Signac. Often when exploring genomic data it’s useful to browse interactively through different regions of the genome and adjust tracks on the fly. This can be done in Signac using the CoverageBrowser() function. This provides the same functionality as the CoveragePlot() function, except that we can scroll upstream or downstream, zoom in or out of regions, navigate to new regions, and adjust which tracks are shown or how the cells are grouped. When exploring the data interactively, you will often find interesting plots that you’d like to save for viewing later. We’ve included a “Save plot” button that adds the current plot to a list of plots that is returned when the interactive session ends.

Session Info

sessionInfo()

## R version 4.0.1 (2020-06-06)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
## [1] GenomicRanges_1.42.0 GenomeInfoDb_1.26.5  IRanges_2.24.1      
## [4] S4Vectors_0.28.1     BiocGenerics_0.36.0  Signac_1.2.0        
## 
## loaded via a namespace (and not attached):
##   [1] fastmatch_1.1-0        systemfonts_1.0.1      plyr_1.8.6            
##   [4] igraph_1.2.6           lazyeval_0.2.2         splines_4.0.1         
##   [7] BiocParallel_1.24.1    listenv_0.8.0          SnowballC_0.7.0       
##  [10] scattermore_0.7        ggplot2_3.3.3          digest_0.6.27         
##  [13] htmltools_0.5.1.1      fansi_0.4.2            magrittr_2.0.1        
##  [16] memoise_2.0.0          tensor_1.5             cluster_2.1.2         
##  [19] ROCR_1.0-11            globals_0.14.0         Biostrings_2.58.0     
##  [22] matrixStats_0.58.0     docopt_0.7.1           pkgdown_1.6.1         
##  [25] spatstat.sparse_2.0-0  colorspace_2.0-0       ggrepel_0.9.1         
##  [28] textshaping_0.3.3      xfun_0.22              dplyr_1.0.5           
##  [31] sparsesvd_0.2          crayon_1.4.1           RCurl_1.98-1.3        
##  [34] jsonlite_1.7.2         spatstat.data_2.1-0    survival_3.2-11       
##  [37] zoo_1.8-9              glue_1.4.2             polyclip_1.10-0       
##  [40] gtable_0.3.0           zlibbioc_1.36.0        XVector_0.30.0        
##  [43] leiden_0.3.7           future.apply_1.7.0     abind_1.4-5           
##  [46] scales_1.1.1           DBI_1.1.1              miniUI_0.1.1.1        
##  [49] Rcpp_1.0.6             viridisLite_0.4.0      xtable_1.8-4          
##  [52] reticulate_1.19        spatstat.core_2.1-2    htmlwidgets_1.5.3     
##  [55] httr_1.4.2             RColorBrewer_1.1-2     ellipsis_0.3.1        
##  [58] Seurat_4.0.1.9005      ica_1.0-2              pkgconfig_2.0.3       
##  [61] farver_2.1.0           ggseqlogo_0.1          sass_0.3.1            
##  [64] uwot_0.1.10            deldir_0.2-10          utf8_1.2.1            
##  [67] labeling_0.4.2         tidyselect_1.1.0       rlang_0.4.10          
##  [70] reshape2_1.4.4         later_1.2.0            munsell_0.5.0         
##  [73] tools_4.0.1            cachem_1.0.4           generics_0.1.0        
##  [76] ggridges_0.5.3         evaluate_0.14          stringr_1.4.0         
##  [79] fastmap_1.1.0          yaml_2.2.1             ragg_1.1.2            
##  [82] goftest_1.2-2          knitr_1.33             fs_1.5.0              
##  [85] fitdistrplus_1.1-3     purrr_0.3.4            RANN_2.6.1            
##  [88] pbapply_1.4-3          future_1.21.0          nlme_3.1-152          
##  [91] mime_0.10              slam_0.1-48            RcppRoll_0.3.0        
##  [94] compiler_4.0.1         plotly_4.9.3           png_0.1-7             
##  [97] spatstat.utils_2.1-0   tibble_3.1.1           tweenr_1.0.2          
## [100] bslib_0.2.4            stringi_1.5.3          highr_0.9             
## [103] desc_1.3.0             lattice_0.20-41        Matrix_1.3-2          
## [106] vctrs_0.3.7            pillar_1.6.0           lifecycle_1.0.0       
## [109] spatstat.geom_2.1-0    lmtest_0.9-38          jquerylib_0.1.4       
## [112] RcppAnnoy_0.0.18       data.table_1.14.0      cowplot_1.1.1         
## [115] bitops_1.0-7           irlba_2.3.3            httpuv_1.6.0          
## [118] patchwork_1.1.1        R6_2.5.0               promises_1.2.0.1      
## [121] lsa_0.73.2             KernSmooth_2.23-18     gridExtra_2.3         
## [124] parallelly_1.24.0      codetools_0.2-18       MASS_7.3-53.1         
## [127] assertthat_0.2.1       rprojroot_2.0.2        SeuratObject_4.0.0    
## [130] qlcMatrix_0.9.7        sctransform_0.3.2      Rsamtools_2.6.0       
## [133] GenomeInfoDbData_1.2.4 mgcv_1.8-33            grid_4.0.1            
## [136] rpart_4.1-15           tidyr_1.1.3            rmarkdown_2.7         
## [139] Rtsne_0.15             ggforce_0.3.3          shiny_1.6.0

Compiled: April 28, 2021