Find peaks that are correlated with the expression of nearby genes. For each gene, this function computes the correlation coefficient between the gene expression and accessibility of each peak within a given distance from the gene TSS, and computes an expected correlation coefficient for each peak given the GC content, accessibility, and length of the peak. The expected coefficient values for the peak are then used to compute a z-score and p-value.

Usage

LinkPeaks(
  object,
  peak.assay,
  expression.assay,
  peak.layer = "counts",
  expression.layer = "data",
  method = "pearson",
  key = "linkpeaks",
  gene.coords = NULL,
  distance = 5e+05,
  min.distance = NULL,
  min.cells = 10,
  genes.use = NULL,
  n_sample = 200,
  pvalue_cutoff = 0.05,
  score_cutoff = 0.05,
  gene.id = FALSE,
  verbose = TRUE,
  peak.slot = deprecated(),
  expression.slot = deprecated()
)

Arguments

object

A Seurat object

peak.assay

Name of assay containing peak information

expression.assay

Name of assay containing gene expression information

peak.layer

Name of layer to pull chromatin data from

expression.layer

Name of layer to pull expression data from

method

Correlation method to use. One of "pearson" or "spearman"

key

Key to use when storing link information in the assay

gene.coords

GRanges object containing coordinates of genes in the expression assay. If NULL, extract from gene annotations stored in the assay.

distance

Maximum distance from the gene TSS for a peak to be tested against that gene.

min.distance

Minimum distance between a peak and the gene TSS for the peak to be tested. If NULL (default), no minimum distance is used.

min.cells

Minimum number of cells positive for the peak and gene needed to include in the results.

genes.use

Genes to test. If NULL, determine from expression assay.

n_sample

Number of peaks to sample at random when computing the null distribution.

pvalue_cutoff

P-value cutoff for retaining a link. Links with a p-value equal to or greater than this value will be removed from the output.

score_cutoff

Minimum absolute value of the correlation coefficient for a link to be retained.

gene.id

Set to TRUE if genes in the expression assay are named using gene IDs rather than gene names.

verbose

Display messages

peak.slot

Deprecated (use peak.layer)

expression.slot

Deprecated (use expression.layer)
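
As a rough illustration of how distance, min.distance, pvalue_cutoff, and score_cutoff interact, the base R sketch below applies them to a small table of candidate links. The table, its column names (peak, dist_to_tss), and all values are hypothetical and are not produced by LinkPeaks. The p-value comparison follows the argument description (values equal to or greater than the cutoff are dropped); treating the distance and score boundaries as inclusive is an assumption. In the function itself the distance window is described as selecting which peaks are tested, whereas this sketch applies every filter in a single step for brevity.

```r
# Hypothetical candidate links for one gene; every value is made up for illustration.
links <- data.frame(
  peak = c("chr1-1000-1500", "chr1-90000-90500",
           "chr1-700000-700500", "chr1-20000-20500"),
  dist_to_tss = c(200, 60000, 650000, 15000),  # distance from the gene TSS
  score = c(0.30, 0.02, 0.40, -0.25),          # correlation coefficient
  pvalue = c(0.001, 0.010, 0.001, 0.050)
)

# Argument values matching the defaults shown in Usage
distance <- 5e+05
min.distance <- NULL
pvalue_cutoff <- 0.05
score_cutoff <- 0.05

# Distance window around the TSS (inclusive boundaries are an assumption)
in_window <- links$dist_to_tss <= distance
if (!is.null(min.distance)) {
  in_window <- in_window & links$dist_to_tss >= min.distance
}

# Drop links with pvalue >= pvalue_cutoff or with a weak absolute correlation
keep <- in_window &
  links$pvalue < pvalue_cutoff &
  abs(links$score) >= score_cutoff

retained <- links[keep, ]
print(retained)
```

In this toy table only the first peak is retained: the second has too small a score, the third lies outside the distance window, and the fourth has a p-value equal to the cutoff.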

Value

Returns a Seurat object with results added to the links slot in the assay, stored under the key specified in the function. The results are stored as an InteractionSet::GInteractions object accessible via the Links() function. This contains the GenomicRanges::GRanges for the pair of linked regions (peak and gene), with anchor1 corresponding to the peak region and anchor2 corresponding to the gene region linked to the peak. The following metadata is also stored in the GInteractions object:

  • anchor2.gene_id: The gene ID for the linked gene

  • anchor2.gene_name: The name of the linked gene

  • score: The correlation coefficient between the accessibility of the peak and expression of the gene

  • zscore: The z-score of the correlation coefficient, computed based on the distribution of correlation coefficients from a set of background peaks

  • pvalue: The p-value associated with the z-score for the link
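
To make the score, zscore, and pvalue fields more concrete, here is a minimal base R sketch for a single peak-gene pair using simulated data. It is not the package's implementation. The background peaks are drawn at random rather than matched on GC content, accessibility, and length, and the one-sided upper-tail p-value is an assumption, since the tail convention is not stated above. All variable names are hypothetical.

```r
set.seed(1)
n_cells <- 200
n_sample <- 200  # number of background peaks, mirroring the n_sample argument

# Simulated data: one gene, one test peak, and a set of background peaks
expr <- rnorm(n_cells)                          # gene expression per cell
peak <- 0.5 * expr + rnorm(n_cells)             # test peak, correlated with expr
background <- matrix(rnorm(n_cells * n_sample), # background peaks, one per column
                     nrow = n_cells)

# score: observed correlation between peak accessibility and gene expression
score <- cor(peak, expr, method = "pearson")

# Null distribution: correlation of each background peak with the same gene
null_cor <- as.vector(cor(background, expr, method = "pearson"))

# zscore: observed correlation standardized against the background distribution
zscore <- (score - mean(null_cor)) / sd(null_cor)

# pvalue: upper-tail probability under a standard normal (assumed convention)
pvalue <- 1 - pnorm(zscore)

print(c(score = score, zscore = zscore, pvalue = pvalue))
```

Because the simulated peak is constructed to track expression, its correlation sits well above the background distribution, giving a large positive z-score and a small p-value.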

Details

This function was inspired by the method originally described for SHARE-seq (Sai Ma et al. 2020, Cell). If you use this function, please consider citing the original SHARE-seq work: doi: 10.1016/j.cell.2020.09.056