Skip to contents

The Signac package is an extension of Seurat designed for the analysis of single-cell chromatin assays. This includes assays that generate signal mapped to genomic coordinates, such as scATAC-seq, scCUT&Tag, scNTT-seq, and other methods.

As the analysis of these single-cell chromatin datasets presents some unique challenges in comparison to the analysis of scRNA-seq data, we have created an extended Assay5 class to store the additional information needed, including:

  • Genomic ranges associated with the features (eg, peaks or genomic bins)
  • Gene annotations
  • Genome information
  • TF motifs
  • Genome-wide signal in a disk-based format (fragment files)
  • TF footprinting data
  • Tn5 insertion bias data
  • Linked genomic regions

A major advantage of the Signac design is its interoperability with existing functions in the Seurat package, and other packages that are able to use the Seurat object. This enables straightforward analysis of multimodal single-cell data through the addition of different assays to the Seurat object.

Here we outline the design of each class defined in the Signac package, and demonstrate methods that can be run on each class.

Differences between Signac v1 and v2

We first introduced Signac with the version 1 release in 2020. Recently we have released Signac v2, with some major updates to the package. The main changes include:

  • Updating the ChromatinAssay to use the Seurat Assay5 class: this update enables storing data in the object using different matrix types, including as a BPCells matrix. BPCells provides efficient on-disk matrix storage and computation methods, and enables scaling analyses to millions of cells with a small memory requirement.
  • More flexible object classes: we now provide options to use Signac assays without needing each feature to correspond to a single region of the genome. This is useful for storing different types of feature, including our recently-released universal feature set for the human genome, REMO
  • More flexible storage of genomic links: you can now store multiple sets of genomic links within the same assay. We have also moved to storing this information using the well-established GInteractions format from the InteractionSet package.
  • New peak calling methods: Signac now uses macs3 in CallPeaks(). We have also enabled parallelization of peak calling across groups of cells, and include an option to use the hmmratac peak calling method in macs3.
  • New data visualization methods: we added the ability to plot GWAS and QTL data alongside single-cell chromatin data in the CoveragePlot() function

For a full changelog please see the package release notes.

To update old ChromatinAssay assays you can use the as.GRangesAssay() function in Signac v2.

The ChromatinAssay5 Class

The ChromatinAssay5 class extends the standard Seurat Assay5 class and adds several additional slots for data useful for the analysis of single-cell chromatin datasets. The class includes all the slots present in a standard Seurat object with the following additional information:

  • motifs: A Motif object
  • fragments: A list of Fragment2 objects
  • annotation: A GRanges object containing gene annotations
  • bias: A vector containing Tn5 integration bias information (the frequency of Tn5 integration at different hexamers)
  • region.aggregation: A named list of RegionAggregation objects containing signal per cell aggregated across a set of genomic regions (for example, motif sites)
  • links: A named list of GInteractions objects describing linked genomic regions, such as co-accessible sites or enhancer-gene regulatory relationships.

The GRangesAssay Class

The GRangesAssay is an extended ChromatinAssay5 object class that adds a requirement that the features stored in the assay correspond to genomic ranges. It stores the genomic ranges associated with features in the assay as a GRanges object. The GRangesAssay enables some additional analyses that require that the assay features correspond to genomic regions, such as peak-gene linkage or motif enrichment.

Here we will focus on demonstrating use of the GRangesAssay, but all functions that run on a GRangesAssay object can also be run on a ChromatinAssay5 object. There are some cases where a ChromatinAssay5 object would be the preferred way to store your data. For example, you may wish to create an assay that does not store genomic peak intervals (perhaps quantifying REMO features or transcription factor activities), but still want the ability to create genome coverage plots.

Constructing the GRangesAssay

A GRangesAssay object can be constructed using the CreateGRangesAssay() function.

# get some data to use in the following examples
counts <- LayerData(atac_small, layer = "counts")

# create a standalone GRangesAssay object
gr_assay <- CreateGRangesAssay(counts = counts)

To create a Seurat object that contains a GRangesAssay rather than a standard Assay, we can initialize the object using the GRangesAssay rather than a count matrix.

# create a Seurat object containing a GRangesAssay
object <- CreateSeuratObject(counts = gr_assay)

Adding a GRangesAssay to a Seurat object

To add a new GRangesAssay object to an existing Seurat object, we can use the standard assignment operation used for adding standard Assay objects and other data types to the Seurat object.

object[["peaks"]] <- CreateGRangesAssay(counts = counts)

Getting and setting GRangesAssay data

We can get/set data for the GRangesAssay in much the same way we do for a standard Assay object: using the LayerData function defined in Seurat. For example:

## Getting

# access the data slot, found in standard Assays and GRangesAssays
data <- LayerData(atac_small, layer = "data")

# access the bias slot, unique to the GRangesAssay
bias <- LayerData(atac_small, layer = "bias")
## Warning: Layer 'bias' is empty
## Setting

# set the data slot
LayerData(atac_small, layer = "data") <- data

We also have a variety of convenience functions defined for getting/setting data in specific slots. This includes the Fragments(), Motifs(), Links(), Annotation(), RegionAggr(), and Bias() functions. For example, to get or set gene annotation data we can use the Annotation() getter and Annotation<- setter functions:

# first get some gene annotations
library(AnnotationHub)
ah <- AnnotationHub()

ensdb_v98 <- ah[["AH75011"]]

# convert EnsDb to GRanges
gene.ranges <- GetGRangesFromEnsDb(ensdb = ensdb_v98)

# set gene annotations
Annotation(atac_small) <- gene.ranges

# get gene annotation information
Annotation(atac_small)
## GRanges object with 3200207 ranges and 5 metadata columns:
##                   seqnames        ranges strand |           tx_id   gene_name
##                      <Rle>     <IRanges>  <Rle> |     <character> <character>
##   ENSE00001489430        X 276322-276394      + | ENST00000399012      PLCXD1
##   ENSE00001536003        X 276324-276394      + | ENST00000484611      PLCXD1
##   ENSE00002160563        X 276353-276394      + | ENST00000430923      PLCXD1
##   ENSE00001750899        X 281055-281121      + | ENST00000445062      PLCXD1
##   ENSE00001719251        X 281194-281256      + | ENST00000429181      PLCXD1
##               ...      ...           ...    ... .             ...         ...
##   ENST00000361739       MT     7586-8269      + | ENST00000361739      MT-CO2
##   ENST00000361789       MT   14747-15887      + | ENST00000361789      MT-CYB
##   ENST00000361851       MT     8366-8572      + | ENST00000361851     MT-ATP8
##   ENST00000361899       MT     8527-9207      + | ENST00000361899     MT-ATP6
##   ENST00000362079       MT     9207-9990      + | ENST00000362079      MT-CO3
##                           gene_id   gene_biotype     type
##                       <character>    <character> <factor>
##   ENSE00001489430 ENSG00000182378 protein_coding     exon
##   ENSE00001536003 ENSG00000182378 protein_coding     exon
##   ENSE00002160563 ENSG00000182378 protein_coding     exon
##   ENSE00001750899 ENSG00000182378 protein_coding     exon
##   ENSE00001719251 ENSG00000182378 protein_coding     exon
##               ...             ...            ...      ...
##   ENST00000361739 ENSG00000198712 protein_coding      cds
##   ENST00000361789 ENSG00000198727 protein_coding      cds
##   ENST00000361851 ENSG00000228253 protein_coding      cds
##   ENST00000361899 ENSG00000198899 protein_coding      cds
##   ENST00000362079 ENSG00000198938 protein_coding      cds
##   -------
##   seqinfo: 25 sequences (1 circular) from GRCh38 genome

The Fragments(), Motifs(), and Links() functions are demonstrated in other sections below.

Other GRangesAssay methods

As the GRangesAssay object uses the Bioconductor GRanges class to store information, we can also call standard Bioconductor functions defined in the IRanges, GenomicRanges, and Seqinfo packages on the GRangesAssay object (or a Seurat object with a GRangesAssay as the default assay).

The following methods use the genomic ranges stored in a GRangesAssay object.

# extract the genomic ranges associated with each feature in the data matrix
granges(atac_small)
## GRanges object with 100 ranges and 0 metadata columns:
##         seqnames          ranges strand
##            <Rle>       <IRanges>  <Rle>
##     [1]     chr1      9772-10660      *
##     [2]     chr1   180712-181178      *
##     [3]     chr1   181200-181607      *
##     [4]     chr1   191183-192084      *
##     [5]     chr1   267576-268461      *
##     ...      ...             ...    ...
##    [96]     chr1 1291564-1292473      *
##    [97]     chr1 1299784-1300670      *
##    [98]     chr1 1301689-1302543      *
##    [99]     chr1 1305198-1306109      *
##   [100]     chr1 1307720-1308738      *
##   -------
##   seqinfo: 1 sequence from an unspecified genome; no seqlengths
# find the nearest range
nearest(atac_small, subject = Annotation(atac_small))
##   [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
##  [26] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
##  [51] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
##  [76] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
# distance to the nearest range
distanceToNearest(atac_small, subject = Annotation(atac_small))
## Hits object with 0 hits and 1 metadata column:
##    queryHits subjectHits |  distance
##    <integer>   <integer> | <integer>
##   -------
##   queryLength: 100 / subjectLength: 3200207
# find overlaps with another set of genomic ranges
findOverlaps(atac_small, subject = Annotation(atac_small))
## Hits object with 0 hits and 0 metadata columns:
##    queryHits subjectHits
##    <integer>   <integer>
##   -------
##   queryLength: 100 / subjectLength: 3200207

Many other methods are defined, see the documentation for nearest-methods, findOverlaps-methods, inter-range-methods, and coverage in Signac for a full list.

For a full list of methods for the GRangesAssay class run:

methods(class = "GRangesAssay")
##  [1] [[<-              [<-               AddMotifs         coerce           
##  [5] coerce<-          colMeans          colSums           ConvertMotifID   
##  [9] countOverlaps     coverage          disjoin           disjointBins     
## [13] distance          distanceToNearest findOverlaps      follow           
## [17] gaps              genome            GetAssayData      granges          
## [21] isCircular        isDisjoint        merge             nearest          
## [25] precede           range             reduce            RegionStats      
## [29] rowMeans          rowSums           RunChromVAR       seqinfo          
## [33] seqlengths        seqlevels         seqnames          SetAssayData     
## [37] show              split             subset           
## see '?methods' for accessing help and source code

Subsetting a GRangesAssay

We can use the standard subset() function or the [ operator to subset Seurat object containing GRangesAssays. This works the same way as for standard Assay objects.

# subset using the subset() function
# this is meant for interactive use
subset.obj <- subset(atac_small, subset = nCount_peaks > 10)

# subset using the [ extract operator
# this can be used programmatically
subset.obj <- atac_small[, atac_small$nCount_peaks > 10]

Converting between Assay and GRangesAssay or ChromatinAssay5

To convert from a GRangesAssay to a standard Assay use the as() function

# convert a ChromatinAssay to an Assay
assay <- as(object = atac_small[["peaks"]], Class = "Assay5")
assay
## Assay (v5) data with 100 features for 100 cells
## First 10 features:
##  chr1:9772-10660, chr1:180712-181178, chr1:181200-181607,
## chr1:191183-192084, chr1:267576-268461, chr1:270850-271755,
## chr1:273946-274792, chr1:585753-586648, chr1:605079-605959,
## chr1:629538-630397 
## Layers:
##  counts, data

To convert from a standard Assay to a GRangesAssay we use the as.GRangesAssay() function. This takes a standard assay object, as well as information to fill the additional slots in the GRangesAssay class.

# convert an Assay to a ChromatinAssay
gr_assay <- as.GRangesAssay(assay)
gr_assay
## GRangesAssay data with 100 features for 100 cells
## Variable features: 0 
## Annotation present: FALSE 
## Fragment files: 0 
## Motifs present: FALSE 
## Links present: 0 
## Region aggregation matrices: 0

Similarly, we can use as.ChromatinAssay5() to convert to a ChromatinAssay5 object:

chr_assay <- as.ChromatinAssay5(assay)
chr_assay
## ChromatinAssay5 data with 100 features for 100 cells
## Variable features: 0 
## Annotation present: FALSE 
## Fragment files: 0 
## Motifs present: FALSE 
## Links present: 0 
## Region aggregation matrices: 0

The Fragment2 Class

The Fragment2 class is designed for storing and interacting with a fragment file commonly used for single-cell chromatin data. It contains: * The path to an indexed fragment file on disk
* The path to the fragment file index * An MD5 hash for the fragment file and the fragment file index * A vector of cell names contained in the fragment file. Importantly, this is a named vector where the elements of the vector are the cell names as they appear in the fragment file, and the name of each element is the cell name as it appears in the assay object storing the Fragment2 object. This allows a mapping of cell names on disk to cell names in R, and avoids the need to alter fragment files on disk. This path can also be a remote file accessible by http or ftp. * A vector of sequence names contained in the fragment file. This is a named vector of sequence levels (eg, chromosome name) where each element is the sequence name as it appears in the fragment file, and then name of each element is the corresponding sequence name as stored in the assay. This enables renaming sequence names in your assay object without needing to change your fragment file.

Constructing the Fragment2 class

A Fragment2 object can be constructed using the CreateFragmentObject() function.

frag.path <- system.file(
  "extdata", "fragments_header.tsv.gz", package = "Signac"
)
fragments <- CreateFragmentObject(
  path = frag.path,
  cells = colnames(atac_small),
  validate.fragments = TRUE
)
## Computing hash

The validate.fragments parameter controls whether the file is inspected to check whether the expected cell names are present. This can help avoid assigning the wrong fragment file to the object. If you’re sure that the file is correct, you can set this value to FALSE to skip this step and save some time. This check is typically only run once when the Fragment2 object is created, and is not normally run on existing Fragment2 files.

Inspecting the fragment file

To extract the first few lines of a fragment file on-disk, we can use the head() method defined for Fragment2 objects. This is useful for quickly checking the chromosome naming style in our fragment file, or checking how the cell barcodes are named:

head(fragments)
##   chrom  start    end            barcode readCount
## 1  chr1  10168  10216 AAACTGCTCTTCCACG-1         2
## 2  chr1  77156  77221 AAAGATGCACGCTGTG-1         1
## 3  chr1 181068 181131 AAACTGCCAGTAGGCA-1         1
## 4  chr1 181348 181514 AAACTGCAGGCAAGCT-1         1
## 5  chr1 190824 191033 AAAGATGAGACTAGGC-1         3
## 6  chr1 267990 268023 AAACTCGAGAGGCCTA-1         1

Some fragment files also contain header lines with additional information. This can be viewed using the header() function:

header(fragments)
##  [1] "# id=10k_pbmc_ATACv2_nextgem_Chromium_Controller"                                                         
##  [2] "# description=Human PBMCs"                                                                                
##  [3] "#"                                                                                                        
##  [4] "# pipeline_name=cellranger-atac"                                                                          
##  [5] "# pipeline_version=cellranger-atac-2.1.0"                                                                 
##  [6] "#"                                                                                                        
##  [7] "# reference_path=/mnt/scratch2/cellranger-atac-2.1.0/reference/refdata-cellranger-arc-GRCh38-2020-A-2.0.0"
##  [8] "# reference_fasta_hash=b6f131840f9f337e7b858c3d1e89d7ce0321b243"                                          
##  [9] "# reference_gtf_hash=3b4c36ca3bade222a5b53394e8c07a18db7ebb11"                                            
## [10] "# reference_version=2020-A"                                                                               
## [11] "# mkref_version=cellranger-arc-2.0.0"                                                                     
## [12] "#"                                                                                                        
## [13] "# primary_contig=chr1"                                                                                    
## [14] "# primary_contig=chr10"                                                                                   
## [15] "# primary_contig=chr11"                                                                                   
## [16] "# primary_contig=chr12"                                                                                   
## [17] "# primary_contig=chr13"                                                                                   
## [18] "# primary_contig=chr14"                                                                                   
## [19] "# primary_contig=chr15"                                                                                   
## [20] "# primary_contig=chr16"                                                                                   
## [21] "# primary_contig=chr17"                                                                                   
## [22] "# primary_contig=chr18"                                                                                   
## [23] "# primary_contig=chr19"                                                                                   
## [24] "# primary_contig=chr2"                                                                                    
## [25] "# primary_contig=chr20"                                                                                   
## [26] "# primary_contig=chr21"                                                                                   
## [27] "# primary_contig=chr22"                                                                                   
## [28] "# primary_contig=chr3"                                                                                    
## [29] "# primary_contig=chr4"                                                                                    
## [30] "# primary_contig=chr5"                                                                                    
## [31] "# primary_contig=chr6"                                                                                    
## [32] "# primary_contig=chr7"                                                                                    
## [33] "# primary_contig=chr8"                                                                                    
## [34] "# primary_contig=chr9"                                                                                    
## [35] "# primary_contig=chrX"                                                                                    
## [36] "# primary_contig=chrY"                                                                                    
## [37] "# primary_contig=KI270728.1"                                                                              
## [38] "# primary_contig=KI270727.1"                                                                              
## [39] "# primary_contig=GL000009.2"                                                                              
## [40] "# primary_contig=GL000194.1"                                                                              
## [41] "# primary_contig=GL000205.2"                                                                              
## [42] "# primary_contig=GL000195.1"                                                                              
## [43] "# primary_contig=GL000219.1"                                                                              
## [44] "# primary_contig=KI270734.1"                                                                              
## [45] "# primary_contig=GL000213.1"                                                                              
## [46] "# primary_contig=GL000218.1"                                                                              
## [47] "# primary_contig=KI270731.1"                                                                              
## [48] "# primary_contig=KI270721.1"                                                                              
## [49] "# primary_contig=KI270726.1"                                                                              
## [50] "# primary_contig=KI270711.1"                                                                              
## [51] "# primary_contig=KI270713.1"

Adding a Fragment2 object to the GRangesAssay

A GRangesAssay object can contain a list of Fragment2 objects. This avoids the need to merge fragment files on disk and simplifies processes of merging or integrating different Seurat objects containing GRangesAssays. To add a new Fragment2 object to a GRangesAssay, or a Seurat object containing a GRangesAssay, we can use the Fragments<- assignment function. This will do a few things:

  1. Re-compute the MD5 hash for the fragment file and index and verify that it matches the hash computed when the Fragment2 object was created.
  2. Check that none of the cells contained in the Fragment2 object being added are already contained in another Fragment2 object stored in the GRangesAssay. All fragments from a cell must be present in only one fragment file.
  3. Check that the same fragment file is not already stored in the assay.
  4. Append the Fragment2 object to the list of Fragment2 objects stored in the GRangesAssay.
Fragments(atac_small) <- fragments

The show() method for Fragment2-class objects prints the number of cells that the Fragment2 object contains data for.

fragments
## A Fragment v2 object for 100 cells
##  Unknown seqlevels

Alternatively, we can initialize the GRangesAssay with a Fragment2 object in a couple of ways. We can either pass a vector of Fragment2 objects to the fragments parameter in CreateGRangesAssay(), or pass the path to a single fragment file. If we pass the path to a fragment file we assume that the file contains fragments for all cells in the GRangesAssay and that the cell names are the same in the fragment file on disk and in the GRangesAssay. For example:

chrom_assay <- CreateGRangesAssay(
  counts = counts,
  fragments = frag.path
)
## Computing hash
object <- CreateSeuratObject(
  counts = chrom_assay,
  assay = "peaks"
)

This will create a Seurat object containing a GRangesAssay, with a single Fragment2 object.

Removing a Fragment2 object from the GRangesAssay

All the Fragment2 objects associated with a GRangesAssay can be removed by assigning NULL using the Fragment<- assignment function. For example:

Fragments(chrom_assay) <- NULL
Fragments(chrom_assay)
## list()

Changing the fragment file path in an existing Fragment2 object

The path to the fragment file can be updated using the UpdatePath() function. This can be useful if you move the fragment file to a new directory, or if you copy a stored Seurat object containing a GRangesAssay to a different server.

fragments <- UpdatePath(fragments, new.path = frag.path)

To change the path to fragment files in an object, you will need to remove the fragment objects, update the paths, and then add the fragment objects back to the object. For example:

frags <- Fragments(object) # get list of fragment objects
Fragments(object) <- NULL # remove fragment information from assay

# create a vector with all the new paths, in the correct order for your
# list of fragment objects
# In this case we only have 1
new.paths <- list(frag.path)
for (i in seq_along(frags)) {
  frags[[i]] <- UpdatePath(frags[[i]], new.path = new.paths[[i]]) # update path
}

Fragments(object) <- frags # assign updated list back to the object
Fragments(object)
## [[1]]
## A Fragment v2 object for 100 cells
##  Unknown seqlevels

Using remote fragment files

Fragment files hosted on remote servers accessible via http or ftp can also be added to the GRangesAssay in the same way as for locally-hosted fragment files. This can enable the exploration of large single-cell datasets without the need for downloading large files. For example, we can create a Fragment object using a file hosted on the 10x Genomics website:

fragments <- CreateFragmentObject(
  path = "http://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_v1_pbmc_10k/atac_v1_pbmc_10k_fragments.tsv.gz"
)
## Fragment file is on a remote server
## Warning in readLines(con = con, n = 10000): incomplete final line found on
## 'gzcon(http://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_v1_pbmc_10k/atac_v1_pbmc_10k_fragments.tsv.gz)'
## Computing hash
fragments
## A Fragment v2 object for 0 cells
##  Unknown seqlevels

When files are hosted remotely, the checks described in the section above (MD5 hash and expected cells) are not performed.

Getting and setting Fragment2 data

To access the cell names stored in a Fragment2 object, we can use the Cells() function. Importantly, this returns the cell names as they appear in the GRangesAssay, rather than as they appear in the fragment file itself.

fragments <- CreateFragmentObject(
  path = frag.path,
  cells = colnames(atac_small),
  validate.fragments = TRUE
)
## Computing hash
cells <- Cells(fragments)
head(cells)
## [1] "AAACGAAAGAGAGGTA-1" "AAACGAAAGCAGGAGG-1" "AAACGAAAGGAAGAAC-1"
## [4] "AAACGAAAGTCGACCC-1" "AAACGAACAAGCACTT-1" "AAACGAACAAGCGGTA-1"

Similarly, we can set the cell name information in a Fragment2 object using the Cells<- assignment function. This will set the named vector of cells stored in the Fragment2 object. Here we must supply a named vector.

names(cells) <- cells
Cells(fragments) <- cells

To extract any of the data stored in a Fragment2 object we can also use the GetFragmentData() function. For example, we can find the path to the fragment file on disk:

GetFragmentData(object = fragments, slot = "file.path")
## [1] "/charonfs/scratch/users/astar/gis/stuartt/mytmp/Rtmp1aKDCv/temp_libpath2d9a8f69f36444/Signac/extdata/fragments_header.tsv.gz"

For a full list of methods for the Fragment2 class run:

methods(class = "Fragment2")
##  [1] CallPeaks        Cells            Cells<-          head            
##  [5] RenameCells      renameSeqlevels  seqlevels        seqlevels<-     
##  [9] seqlevelsStyle   seqlevelsStyle<- show             subset          
## see '?methods' for accessing help and source code

The Motif Class

The Motif class stores information needed for DNA sequence motif analysis, and has the following slots:

  • data: a sparse feature by motif matrix, where entries are 1 if the feature contains the motif, and 0 otherwise
  • pwm: A named list of position weight or position frequency matrices
  • motif.names: a list of motif IDs and their common names
  • positions: A GRangesList object containing the exact positions of each motif
  • meta.data: Additional information about the motifs

Many of these slots are optional and do not need to be filled, but are only required when running certain functions. For example, the positions slot will be needed if running TF footprinting.

Constructing the Motif class

A Motif object can be constructed using the CreateMotifObject() function. Much of the data needed for constructing a Motif object can be generated using functions from the TFBSTools and motifmatchr packages. Position frequency matrices for motifs can be loaded using the JASPAR packages on Bioconductor or the chromVARmotifs package. For example:

library(JASPAR2020)
library(TFBSTools)
library(motifmatchr)

# Get a list of motif position frequency matrices from the JASPAR database
pfm <- getMatrixSet(
  x = JASPAR2020,
  opts = list(species = 9606) # 9606 is the species code for human
)

# Scan the DNA sequence of each peak for the presence of each motif
motif.matrix <- CreateMotifMatrix(
  features = granges(atac_small),
  pwm = pfm,
  genome = "hg38"
)

# Create a new Mofif object to store the results
motif <- CreateMotifObject(
  data = motif.matrix,
  pwm = pfm
)

The show() method for the Motif class prints the total number of motifs and regions included in the object:

motif
## A Motif object containing 633 motifs in 100 regions

Adding a Motif object to the GRangesAssay

We can add a Motif object to the GRangesAssay, or a Seurat object containing a GRangesAssay using the Motifs<- assignment operator.

Motifs(atac_small) <- motif

Getting and setting Motif data

Data stored in a Motif object can be accessed using the GetMotifData() and SetMotifData() functions.

# extract data from the Motif object
pfm <- GetMotifData(object = motif, slot = "pwm")

# set data in the Motif object
motif <- SetMotifData(object = motif, slot = "pwm", new.data = pfm)

We can access the set of motifs and set of features used in the Motif object using the colnames() and rownames() functions:

# look at the motifs included in the Motif object
head(colnames(motif))
## [1] "MA0030.1" "MA0031.1" "MA0051.1" "MA0057.1" "MA0059.1" "MA0066.1"
# look at the features included in the Motif object
head(rownames(motif))
## [1] "chr1:9772-10660"    "chr1:180712-181178" "chr1:181200-181607"
## [4] "chr1:191183-192084" "chr1:267576-268461" "chr1:270850-271755"

To quickly convert between motif IDs (like MA0497.1) and motif common names (like MEF2C), we can use the ConvertMotifID() function. For example:

# convert ID to common name
ids <- c("MA0030.1", "MA0031.1", "MA0051.1", "MA0057.1")
names <- ConvertMotifID(object = motif, id = ids)
names
## [1] "FOXF2"       "FOXD1"       "IRF2"        "MZF1(var.2)"
# convert names to IDs
ConvertMotifID(object = motif, name = names)
## [1] "MA0030.1" "MA0031.1" "MA0051.1" "MA0057.1"

For a full list of methods for the Motif class run:

methods(class = "Motif")
## [1] [              coerce         ConvertMotifID dim            dimnames      
## [6] GetMotifData   SetMotifData   show           subset        
## see '?methods' for accessing help and source code
Session Info
## R version 4.5.0 (2025-04-11)
## Platform: x86_64-pc-linux-gnu
## Running under: Red Hat Enterprise Linux 8.10 (Ootpa)
## 
## Matrix products: default
## BLAS:   /charonfs/home/users/astar/gis/stuartt/R-4.5.0/lib/libRblas.so 
## LAPACK: /charonfs/home/users/astar/gis/stuartt/R-4.5.0/lib/libRlapack.so;  LAPACK version 3.12.1
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Asia/Singapore
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] BSgenome.Hsapiens.UCSC.hg38_1.4.5 BSgenome_1.78.0                  
##  [3] rtracklayer_1.70.1                BiocIO_1.20.0                    
##  [5] Biostrings_2.78.0                 XVector_0.50.0                   
##  [7] GenomeInfoDb_1.46.2               motifmatchr_1.32.0               
##  [9] TFBSTools_1.48.0                  JASPAR2020_0.99.10               
## [11] ensembldb_2.34.0                  AnnotationFilter_1.34.0          
## [13] GenomicFeatures_1.62.0            AnnotationDbi_1.72.0             
## [15] Biobase_2.70.0                    GenomicRanges_1.62.1             
## [17] Seqinfo_1.0.0                     IRanges_2.44.0                   
## [19] S4Vectors_0.48.0                  AnnotationHub_4.0.0              
## [21] BiocFileCache_3.0.0               dbplyr_2.5.2                     
## [23] BiocGenerics_0.56.0               generics_0.1.4                   
## [25] Signac_1.9999.0                   Seurat_5.4.0                     
## [27] SeuratObject_5.3.0                sp_2.2-1                         
## 
## loaded via a namespace (and not attached):
##   [1] fs_1.6.7                    ProtGenerics_1.42.0        
##   [3] matrixStats_1.5.0           spatstat.sparse_3.1-0      
##   [5] bitops_1.0-9                DirichletMultinomial_1.52.0
##   [7] httr_1.4.8                  RColorBrewer_1.1-3         
##   [9] InteractionSet_1.38.0       tools_4.5.0                
##  [11] sctransform_0.4.3           backports_1.5.0            
##  [13] R6_2.6.1                    lazyeval_0.2.2             
##  [15] uwot_0.2.4                  withr_3.0.2                
##  [17] gridExtra_2.3               progressr_0.18.0           
##  [19] cli_3.6.5                   textshaping_1.0.4          
##  [21] spatstat.explore_3.7-0      fastDummies_1.7.5          
##  [23] sass_0.4.10                 S7_0.2.1                   
##  [25] spatstat.data_3.1-9         ggridges_0.5.7             
##  [27] pbapply_1.7-4               pkgdown_2.2.0              
##  [29] Rsamtools_2.26.0            systemfonts_1.3.2          
##  [31] foreign_0.8-91              dichromat_2.0-0.1          
##  [33] parallelly_1.46.1           rstudioapi_0.18.0          
##  [35] RSQLite_2.4.6               gtools_3.9.5               
##  [37] ica_1.0-3                   spatstat.random_3.4-4      
##  [39] dplyr_1.2.0                 Matrix_1.7-4               
##  [41] abind_1.4-8                 lifecycle_1.0.5            
##  [43] yaml_2.3.12                 SummarizedExperiment_1.40.0
##  [45] SparseArray_1.10.8          Rtsne_0.17                 
##  [47] grid_4.5.0                  blob_1.3.0                 
##  [49] promises_1.5.0              pwalign_1.6.0              
##  [51] crayon_1.5.3                miniUI_0.1.2               
##  [53] lattice_0.22-9              cowplot_1.2.0              
##  [55] cigarillo_1.0.0             KEGGREST_1.50.0            
##  [57] pillar_1.11.1               knitr_1.51                 
##  [59] rjson_0.2.23                future.apply_1.20.2        
##  [61] codetools_0.2-20            fastmatch_1.1-8            
##  [63] glue_1.8.0                  spatstat.univar_3.1-6      
##  [65] data.table_1.18.2.1         vctrs_0.7.1                
##  [67] png_0.1-9                   spam_2.11-3                
##  [69] gtable_0.3.6                cachem_1.1.0               
##  [71] xfun_0.56                   S4Arrays_1.10.1            
##  [73] mime_0.13                   survival_3.8-6             
##  [75] RcppRoll_0.3.1              fitdistrplus_1.2-6         
##  [77] ROCR_1.0-12                 nlme_3.1-168               
##  [79] bit64_4.6.0-1               filelock_1.0.3             
##  [81] RcppAnnoy_0.0.23            bslib_0.10.0               
##  [83] irlba_2.3.7                 KernSmooth_2.23-26         
##  [85] otel_0.2.0                  rpart_4.1.24               
##  [87] colorspace_2.1-2            seqLogo_1.76.0             
##  [89] DBI_1.3.0                   Hmisc_5.2-5                
##  [91] nnet_7.3-20                 tidyselect_1.2.1           
##  [93] bit_4.6.0                   compiler_4.5.0             
##  [95] curl_7.0.0                  httr2_1.2.2                
##  [97] htmlTable_2.4.3             desc_1.4.3                 
##  [99] DelayedArray_0.36.0         plotly_4.12.0              
## [101] checkmate_2.3.4             scales_1.4.0               
## [103] caTools_1.18.3              lmtest_0.9-40              
## [105] rappdirs_0.3.4              stringr_1.6.0              
## [107] digest_0.6.39               goftest_1.2-3              
## [109] spatstat.utils_3.2-1        rmarkdown_2.30             
## [111] htmltools_0.5.9             pkgconfig_2.0.3            
## [113] base64enc_0.1-6             sparseMatrixStats_1.22.0   
## [115] MatrixGenerics_1.22.0       fastmap_1.2.0              
## [117] rlang_1.1.7                 htmlwidgets_1.6.4          
## [119] UCSC.utils_1.6.1            shiny_1.13.0               
## [121] farver_2.1.2                jquerylib_0.1.4            
## [123] zoo_1.8-15                  jsonlite_2.0.0             
## [125] BiocParallel_1.44.0         VariantAnnotation_1.56.0   
## [127] RCurl_1.98-1.17             magrittr_2.0.4             
## [129] Formula_1.2-5               dotCall64_1.2              
## [131] patchwork_1.3.2             Rcpp_1.1.1                 
## [133] reticulate_1.45.0           stringi_1.8.7              
## [135] MASS_7.3-65                 plyr_1.8.9                 
## [137] parallel_4.5.0              listenv_0.10.1             
## [139] ggrepel_0.9.7               deldir_2.0-4               
## [141] splines_4.5.0               tensor_1.5.1               
## [143] igraph_2.2.2                spatstat.geom_3.7-0        
## [145] RcppHNSW_0.6.0              reshape2_1.4.5             
## [147] TFMPvalue_1.0.0             BiocVersion_3.22.0         
## [149] XML_3.99-0.23               evaluate_1.0.5             
## [151] biovizBase_1.58.0           BiocManager_1.30.27        
## [153] httpuv_1.6.16               RANN_2.6.2                 
## [155] tidyr_1.3.2                 purrr_1.2.1                
## [157] polyclip_1.10-7             future_1.70.0              
## [159] scattermore_1.2             ggplot2_4.0.2              
## [161] xtable_1.8-8                restfulr_0.0.16            
## [163] RSpectra_0.16-2             later_1.4.8                
## [165] viridisLite_0.4.3           ragg_1.5.0                 
## [167] tibble_3.3.1                memoise_2.0.1              
## [169] GenomicAlignments_1.46.0    cluster_2.1.8.2            
## [171] globals_0.19.1