vignettes/visualization.Rmd
visualization.Rmd
In this vignette we will demonstrate how to visualize single-cell data in genome-browser-track style plots with Signac.
To demonstrate we’ll use the human PBMC dataset processed in this vignette.
There are several different genome browser style plot types available in Signac, including accessibility tracks, gene annotations, peak coordinate, genomic links, and fragment positions.
The main plotting function in Signac is CoveragePlot()
,
and this computes the averaged frequency of sequenced DNA fragments for
different groups of cells within a given genomic region.
cov_plot <- CoveragePlot(
object = pbmc,
region = "chr2-87011729-87035519",
annotation = FALSE,
peaks = FALSE
)
cov_plot
We can also request regions of the genome by gene name. This will use the gene coordinates stored in the Seurat object to determine which genomic region to plot
CoveragePlot(
object = pbmc,
region = "CD8A",
annotation = FALSE,
peaks = FALSE
)
All of the plots returned by Signac functions are
ggplot2
or patchwork
objects, and so you can
use standard functions from ggplot2
or other packages to
further modify or customize them. For example, if we wanted to adjust
the color of the tracks generated in the plot above, we can use the
scale_fill_
functions in ggplot2
, for example
scale_fill_brewer()
:
cov_plot + scale_fill_brewer(type = "seq", palette = 1)
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Warning in RColorBrewer::brewer.pal(n, pal): n too large, allowed maximum for palette Blues is 9
## Returning the palette you asked for with that many colors
cov_plot + scale_fill_brewer(type = "qual", palette = 1)
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Warning in RColorBrewer::brewer.pal(n, pal): n too large, allowed maximum for palette Accent is 8
## Returning the palette you asked for with that many colors
Note that for plots that are combined using Patchwork, the
&
operator can be used to apply the aesthetic
adjustment to all of the plots in the object.
Gene annotations within a given genomic region can be plotted using
the AnnotationPlot()
function.
gene_plot <- AnnotationPlot(
object = pbmc,
region = "chr2-87011729-87035519"
)
gene_plot
Peak coordinates within a genomic region can be plotted using the
PeakPlot()
function.
peak_plot <- PeakPlot(
object = pbmc,
region = "chr2-87011729-87035519"
)
peak_plot
Relationships between genomic positions can be plotted using the
LinkPlot()
function. This will display an arc connecting
two linked positions, with the transparency of the arc line proportional
to a score associated with the link. These links could be used to encode
different things, including regulatory relationships (for example,
linking enhancers to the genes that they regulate), or experimental data
such as Hi-C.
Just to demonstrate how the function works, we’ve created a fake link here and added it to the PBMC dataset.
link_plot <- LinkPlot(
object = pbmc,
region = "chr2-87011729-87035519"
)
link_plot
While the CoveragePlot()
function computes an aggregated
signal within a genomic region for different groups of cells, sometimes
it’s also useful to inspect the frequency of sequenced fragments within
a genomic region for individual cells, without aggregation.
This can be done using the TilePlot()
function.
tile_plot <- TilePlot(
object = pbmc,
region = "chr2-87011729-87035519",
idents = c("CD4 Memory", "CD8 Effector")
)
tile_plot
By default, this selects the top 100 cells for each group based on the total number of fragments in the genomic region. The genomic region is then tiled and the total fragments in each tile counted for each cell, and the resulting counts for each position displayed as a heatmap.
Multimodal single-cell datasets generate multiple experimental
measurements for each cell. Several methods now exist that are capable
of measuring single-cell chromatin data (such as chromatin
accessibility) alongside other measurements from the same cell, such as
gene expression or mitochondrial genotype. In these cases it’s often
informative to visualize the multimodal data together in a single plot.
This can be achieved using the ExpressionPlot()
function.
This is similar to the VlnPlot()
function in Seurat, but is
designed to be incorportated with genomic track plots generated by
CoveragePlot()
.
expr_plot <- ExpressionPlot(
object = pbmc,
features = "CD8A",
assay = "RNA"
)
expr_plot
We can create similar plots for multiple genes at once simply by passing a list of gene names
ExpressionPlot(
object = pbmc,
features = c("CD8A", "CD4"),
assay = "RNA"
)
Above we’ve demonstrated how to generate individual tracks and panels
that can be combined into a single plot for a single genomic region.
These panels can be easily combined using the
CombineTracks()
function.
CombineTracks(
plotlist = list(cov_plot, tile_plot, peak_plot, gene_plot, link_plot),
expression.plot = expr_plot,
heights = c(10, 6, 1, 2, 3),
widths = c(10, 1)
)
The heights
and widths
parameters control
the relative heights and widths of the individual tracks, according to
the order that the tracks appear in the plotlist
. The
CombineTracks()
function ensures that the tracks are
aligned vertically and horizontally, and moves the x-axis labels
(describing the genomic position) to the bottom of the combined
tracks.
Above we’ve shown how to create genomic plot panels individually and
how to combine them. This allows more control over how each panel is
constructed and how they’re combined, but involves multiple steps. For
convenience, we’ve included the ability to generate and combine
different panels automatically in the CoveragePlot()
function, through the annotation
, peaks
,
tile
, and features
arguments. We can generate
a similar plot to that shown above in a single function call:
CoveragePlot(
object = pbmc,
region = "chr2-87011729-87035519",
features = "CD8A",
annotation = TRUE,
peaks = TRUE,
tile = TRUE,
links = TRUE
)
Notice that in this example we create the tile plot for every group of cells that is shown in the coverage track, whereas above we were able to create a plot that showed the aggregated coverage for all groups of cells and the tile plot for only the CD4 memory cells and the CD8 effector cells. A higher degree of customization is possible when creating each track separately.
Above we demonstrated the different types of plots that can be
constructed using Signac. Often when exploring genomic data it’s useful
to be able to interactively browse through different regions of the
genome and adjust tracks on the fly. This can be done in Signac using
the CoverageBrowser()
function. This provides all the same
functionality of the CoveragePlot()
function, except that
we can scroll upstream/downstream, zoom in/out of regions, navigate to
new region, and adjust which tracks are shown or how the cells are
grouped. In exploring the data interactively, often you will find
interesting plots that you’d like to save for viewing later. We’ve
included a “Save plot” button that will add the current plot to a list
of plots that is returned when the interactive session is ended. Here’s
a recorded demonstration of the CoverageBrowser()
function:
## R version 4.3.1 (2023-06-16)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Sonoma 14.0
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: Asia/Singapore
## tzcode source: internal
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 IRanges_2.34.1
## [4] S4Vectors_0.38.2 BiocGenerics_0.46.0 ggplot2_3.4.4
## [7] Signac_1.11.0
##
## loaded via a namespace (and not attached):
## [1] fastmatch_1.1-4 gtable_0.3.4 xfun_0.40
## [4] bslib_0.5.1 RcppRoll_0.3.0 lattice_0.21-8
## [7] vctrs_0.6.3 tools_4.3.1 bitops_1.0-7
## [10] generics_0.1.3 parallel_4.3.1 tibble_3.2.1
## [13] fansi_1.0.5 pkgconfig_2.0.3 Matrix_1.6-1.1
## [16] data.table_1.14.8 RColorBrewer_1.1-3 desc_1.4.2
## [19] lifecycle_1.0.3 GenomeInfoDbData_1.2.10 farver_2.1.1
## [22] compiler_4.3.1 stringr_1.5.0 Rsamtools_2.16.0
## [25] textshaping_0.3.7 Biostrings_2.68.1 munsell_0.5.0
## [28] ggforce_0.4.1 codetools_0.2-19 SeuratObject_4.1.4
## [31] htmltools_0.5.6.1 sass_0.4.7 RCurl_1.98-1.12
## [34] yaml_2.3.7 tidyr_1.3.0 pillar_1.9.0
## [37] pkgdown_2.0.7 crayon_1.5.2 jquerylib_0.1.4
## [40] MASS_7.3-60 BiocParallel_1.34.2 cachem_1.0.8
## [43] parallelly_1.36.0 tidyselect_1.2.0 digest_0.6.33
## [46] future_1.33.0 stringi_1.7.12 listenv_0.9.0
## [49] dplyr_1.1.3 purrr_1.0.2 labeling_0.4.3
## [52] polyclip_1.10-6 rprojroot_2.0.3 fastmap_1.1.1
## [55] grid_4.3.1 colorspace_2.1-0 cli_3.6.1
## [58] magrittr_2.0.3 patchwork_1.1.3 utf8_1.2.3
## [61] future.apply_1.11.0 withr_2.5.1 scales_1.2.1
## [64] sp_2.1-0 rmarkdown_2.25 XVector_0.40.0
## [67] globals_0.16.2 progressr_0.14.0 ragg_1.2.6
## [70] pbapply_1.7-2 memoise_2.0.1 evaluate_0.22
## [73] knitr_1.44 irlba_2.3.5.1 rlang_1.1.1
## [76] Rcpp_1.0.11 glue_1.6.2 tweenr_2.0.2
## [79] rstudioapi_0.15.0 jsonlite_1.8.7 R6_2.5.1
## [82] systemfonts_1.0.5 fs_1.6.3 zlibbioc_1.46.0