Signac is an extension of Seurat for the analysis of single-cell chromatin data (DNA-based single-cell assays). We have extended the Seurat object to include information about the genome sequence and genomic coordinates of sequenced fragments per cell, and include functions needed for the analysis of single-cell chromatin data.
Signac uses the Seurat object structure, and so all the Seurat commands can be used when analysing data with Signac. See the Data Structures and Object Interaction vignette for an explanation of the classes defined in Signac and how to use them. See the Seurat documentation for more information about the Seurat object: https://satijalab.org/seurat/
See the merge and integration vignettes for information on combining multiple single-cell chromatin datasets.
The fragment file is provided in the output of popular single-cell data processing tools such as chromap, cellranger-atac, and cellranger-arc.
If you are using another method that does not provide a fragment file as output, you can use the Sinto package to generate one from the BAM file. For more information on generating a fragment file with Sinto, see: https://timoast.github.io/sinto/basic_usage.html#create-scatac-seq-fragments-file
Choosing the dimensionality is a general problem in single-cell analysis for which there is no simple solution. There has been discussion about this for scRNA-seq, and you can read our recommendations for scRNA-seq in the Seurat vignettes: https://satijalab.org/seurat/v3.1/pbmc3k_tutorial.html (see “Determine the ‘dimensionality’ of the dataset”).
Here are some general tips that might help guide your choice of the number of dimensions:
If you are studying an organism that does not have a BSgenome genome package or EnsDb annotation package available on Bioconductor, you can still use your own GTF file or FASTA files with Signac.
To use a GTF file, you can import it using rtracklayer, for example:
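library(GenomeInfoDb)  # provides seqlevelsStyle() and keepStandardChromosomes()
gtf <- rtracklayer::import('genes.gtf')
gene.coords <- gtf[gtf$type == 'gene']  # keep gene-level records only
seqlevelsStyle(gene.coords) <- 'UCSC'  # rename chromosomes, e.g. "1" becomes "chr1"
gene.coords <- keepStandardChromosomes(gene.coords, pruning.mode = 'coarse')  # drop non-standard sequences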
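To see what the filtering and renaming steps above do without needing a real GTF file, here is a minimal sketch on a hand-built GRanges object. The object `toy.gtf` and its gene names are made up for illustration; a real import from rtracklayer will carry many more metadata columns.

```r
library(GenomicRanges)  # also attaches GenomeInfoDb and IRanges

# Toy stand-in for an imported GTF (hypothetical records, NCBI-style names)
toy.gtf <- GRanges(
  seqnames = c("1", "1", "2", "MT"),
  ranges = IRanges(start = c(100, 100, 200, 50), end = c(900, 300, 800, 400)),
  strand = c("+", "+", "-", "+"),
  type = c("gene", "exon", "gene", "gene"),
  gene_name = c("geneA", "geneA", "geneB", "geneC")
)

# Keep only gene-level records, dropping the exon row
toy.genes <- toy.gtf[toy.gtf$type == "gene"]

# Convert chromosome names to UCSC style
seqlevelsStyle(toy.genes) <- "UCSC"
seqlevels(toy.genes)
```

The chromosome names need to match the style used in your fragment file and peak coordinates, which is why the style conversion is part of the import step.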
Alternatively, gene annotations for your species may be available on AnnotationHub.
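As a rough sketch of the AnnotationHub route, the helper below searches the hub for EnsDb records matching a species name and converts one of them to gene coordinates with Signac's `GetGRangesFromEnsDb()`. The helper name `fetch_ensdb_annotation` is hypothetical and not part of Signac, and taking the last matching record is only an assumption; inspect the query results and choose the Ensembl release that matches your genome build.

```r
# Hypothetical helper, not part of Signac or AnnotationHub.
# Requires network access, plus the AnnotationHub, ensembldb, and Signac packages.
fetch_ensdb_annotation <- function(species, hub = AnnotationHub::AnnotationHub()) {
  # Search hub records by keyword: EnsDb objects for the given species
  hits <- AnnotationHub::query(hub, c("EnsDb", species))
  if (length(hits) == 0) {
    stop("No EnsDb records found on AnnotationHub for: ", species)
  }
  # Assumption: use the last record returned; check `hits` and pick
  # the release that matches your reference genome instead
  record.id <- names(hits)[length(hits)]
  ensdb <- hits[[record.id]]
  # Convert the EnsDb object to a GRanges of gene annotations
  Signac::GetGRangesFromEnsDb(ensdb = ensdb)
}

# Example call (downloads data, so it is not run here):
# annotation <- fetch_ensdb_annotation("Danio rerio")
```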
You can use a FASTA file in place of a BSgenome package for functions that require access to the genome sequence. To do so, index the FASTA file using samtools faidx, and then create a FaFile object using the Rsamtools package:
fa <- Rsamtools::FaFile("path/to/fasta")
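The following is a small end-to-end sketch of the FASTA route using a throwaway two-sequence file. It indexes the file from within R using `Rsamtools::indexFa()` as a stand-in for running samtools faidx on the command line; the temporary file and its sequences are made up for illustration.

```r
library(Rsamtools)  # also attaches GenomicRanges and Biostrings

# Write a tiny FASTA file to a temporary location (toy sequences)
fasta.path <- tempfile(fileext = ".fa")
writeLines(c(">chr1", "ACGTACGTAC", ">chr2", "GGGGCCCCAA"), fasta.path)

# Create the .fai index next to the FASTA file
indexFa(fasta.path)

# Open the indexed FASTA as a FaFile object
fa <- FaFile(fasta.path)

# List the sequences recorded in the index
scanFaIndex(fa)

# Extract the first four bases of chr1
region <- GRanges("chr1", IRanges(start = 1, end = 4))
seqs <- getSeq(fa, region)
```

The `fa` object can then be supplied where a function asks for the genome sequence, in the same way as a BSgenome object.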
Alternatively, you can create your own BSgenome data package; see this vignette.
If you use Signac, please cite Stuart et al., 2021:
@ARTICLE{signac,
title = "Single-cell chromatin state analysis with Signac",
author = "Stuart, Tim and Srivastava, Avi and Madad, Shaista and Lareau,
Caleb A and Satija, Rahul",
journal = "Nat. Methods",
publisher = "Nature Publishing Group",
pages = "1--9",
month = nov,
year = 2021,
url = "https://www.nature.com/articles/s41592-021-01282-5",
language = "en"
}
Signac is an extension of Seurat, and uses the Seurat object structure, so you should consider citing the Seurat paper if you have used Signac.