Find peaks that are correlated with the expression of nearby genes. For each gene, this function computes the correlation coefficient between the gene expression and accessibility of each peak within a given distance from the gene TSS, and computes an expected correlation coefficient for each peak given the GC content, accessibility, and length of the peak. The expected coefficient values for the peak are then used to compute a z-score and p-value.
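The z-score and p-value step described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the data are simulated, all variable names are hypothetical, the background peaks are drawn at random rather than matched on GC content, accessibility, and length, and the one-sided normal tail used for the p-value is an assumption.

```r
set.seed(1)
n_cells  <- 200
n_sample <- 200  # number of background peaks, mirroring the n_sample argument

# Simulated data (hypothetical): one gene, one candidate peak whose
# accessibility tracks the gene, and a set of unrelated background peaks.
expression <- rnorm(n_cells)
peak       <- 0.6 * expression + rnorm(n_cells, sd = 0.8)
background <- matrix(rnorm(n_cells * n_sample), nrow = n_cells)

# Observed correlation between peak accessibility and gene expression
score <- cor(peak, expression, method = "pearson")

# Null distribution: correlation of each background peak with the same gene
null_scores <- as.vector(cor(background, expression, method = "pearson"))

# z-score of the observed coefficient relative to the background coefficients
zscore <- (score - mean(null_scores)) / sd(null_scores)

# Assumed convention: one-sided normal tail probability of |z|
pvalue <- pnorm(-abs(zscore))
```

In this sketch a peak that is truly coupled to the gene gets a large positive z-score, because its coefficient lies far outside the spread of the background coefficients.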
Usage
LinkPeaks(
object,
peak.assay,
expression.assay,
peak.layer = "counts",
expression.layer = "data",
method = "pearson",
key = "linkpeaks",
gene.coords = NULL,
distance = 5e+05,
min.distance = NULL,
min.cells = 10,
genes.use = NULL,
n_sample = 200,
pvalue_cutoff = 0.05,
score_cutoff = 0.05,
gene.id = FALSE,
verbose = TRUE,
peak.slot = deprecated(),
expression.slot = deprecated()
)
Arguments
- object
A Seurat object
- peak.assay
Name of assay containing peak information
- expression.assay
Name of assay containing gene expression information
- peak.layer
Name of layer to pull chromatin data from
- expression.layer
Name of layer to pull expression data from
- method
Correlation method to use. One of "pearson" or "spearman"
- key
Key to use when storing link information in the assay
- gene.coords
GRanges object containing coordinates of genes in the expression assay. If NULL, extract from gene annotations stored in the assay.
- distance
Maximum distance, in base pairs, between a peak and the gene TSS for the peak to be included in the regression model
- min.distance
Minimum distance between peak and TSS to include in regression model. If NULL (default), no minimum distance is used.
- min.cells
Minimum number of cells positive for the peak and gene needed to include in the results.
- genes.use
Genes to test. If NULL, determine from expression assay.
- n_sample
Number of peaks to sample at random when computing the null distribution.
- pvalue_cutoff
P-value threshold for retaining a link. Links with a p-value equal to or greater than this value will be removed from the output.
- score_cutoff
Minimum absolute value of the correlation coefficient required for a link to be retained
- gene.id
Set to TRUE if genes in the expression assay are named using gene IDs rather than gene names.
- verbose
Display messages
- peak.slot
Deprecated (use peak.layer)
- expression.slot
Deprecated (use expression.layer)
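As a rough illustration of how pvalue_cutoff and score_cutoff act together, the base R sketch below filters a small hypothetical table of links. The peak names, gene names, and values are invented for the example, and treating a score exactly equal to score_cutoff as retained is an assumption.

```r
# Hypothetical link results: one row per peak-gene pair
links <- data.frame(
  peak   = c("chr1-100-600", "chr1-5000-5500", "chr2-300-800", "chr2-9000-9500"),
  gene   = c("GENE_A", "GENE_A", "GENE_B", "GENE_B"),
  score  = c(0.40, 0.02, -0.30, 0.25),
  pvalue = c(0.001, 0.010, 0.004, 0.200),
  stringsAsFactors = FALSE
)

pvalue_cutoff <- 0.05
score_cutoff  <- 0.05

# Remove links with p-value >= pvalue_cutoff, and links whose absolute
# correlation falls below score_cutoff (boundary handling is assumed).
keep     <- links$pvalue < pvalue_cutoff & abs(links$score) >= score_cutoff
retained <- links[keep, ]
```

Here the second link is dropped for its weak correlation and the fourth for its p-value, while the negatively correlated third link is kept because the score filter uses the absolute value.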
Value
Returns a Seurat object with results added to the links slot in the
assay, stored under the key specified in the function. The results are stored
as an InteractionSet::GInteractions object accessible via the Links()
function. This contains the GenomicRanges::GRanges for the pair of linked
regions (peak and gene), with anchor1 corresponding to the peak region and
anchor2 corresponding to the gene region linked to the peak. The following
metadata is also stored in the GInteractions object:
- anchor2.gene_id: The gene ID for the linked gene
- anchor2.gene_name: The name of the linked gene
- score: the correlation coefficient between the accessibility of the peak and expression of the gene
- zscore: the z-score of the correlation coefficient, computed based on the distribution of correlation coefficients from a set of background peaks
- pvalue: the p-value associated with the z-score for the link
Details
This function was inspired by the method originally described in the SHARE-seq paper (Sai Ma et al. 2020, Cell). If you use this function, please consider citing the original SHARE-seq work: doi: 10.1016/j.cell.2020.09.056