Find peaks that are correlated with the expression of nearby genes. For each gene, this function computes the correlation coefficient between the gene expression and accessibility of each peak within a given distance from the gene TSS, and computes an expected correlation coefficient for each peak given the GC content, accessibility, and length of the peak. The expected coefficient values for the peak are then used to compute a z-score and p-value.
LinkPeaks(
  object,
  peak.assay,
  expression.assay,
  peak.slot = "counts",
  expression.slot = "data",
  method = "pearson",
  gene.coords = NULL,
  distance = 5e+05,
  min.distance = NULL,
  min.cells = 10,
  genes.use = NULL,
  n_sample = 200,
  pvalue_cutoff = 0.05,
  score_cutoff = 0.05,
  gene.id = FALSE,
  verbose = TRUE
)
object: A Seurat object
peak.assay: Name of assay containing peak information
expression.assay: Name of assay containing gene expression information
peak.slot: Name of slot to pull chromatin data from
expression.slot: Name of slot to pull expression data from
method: Correlation method to use. One of "pearson" or "spearman"
gene.coords: GRanges object containing coordinates of genes in the expression assay. If NULL, extract from gene annotations stored in the assay.
distance: Distance threshold for peaks to include in regression model
min.distance: Minimum distance between peak and TSS to include in regression model. If NULL (default), no minimum distance is used.
min.cells: Minimum number of cells positive for the peak and gene needed for a link to be included in the results.
genes.use: Genes to test. If NULL, determine from expression assay.
n_sample: Number of peaks to sample at random when computing the null distribution.
pvalue_cutoff: P-value threshold for retaining a link. Links with a p-value equal to or greater than this value are removed from the output.
score_cutoff: Minimum absolute value of the correlation coefficient for a link to be retained.
gene.id: Set to TRUE if genes in the expression assay are named using gene IDs rather than gene names.
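As a rough illustration of how pvalue_cutoff and score_cutoff act on candidate links, the following base R sketch filters a small invented table. The peak and gene names and all values are hypothetical, and treating score_cutoff as an inclusive bound is an assumption; this is not the package's own code.

```r
# Toy table of candidate links; every value here is invented for illustration.
links <- data.frame(
  peak = c("chr1-100-600", "chr1-5000-5500", "chr1-9000-9400", "chr2-300-800"),
  gene = c("GENE1", "GENE1", "GENE2", "GENE3"),
  score = c(0.32, 0.02, -0.21, 0.15),
  pvalue = c(0.001, 0.0004, 0.03, 0.05),
  stringsAsFactors = FALSE
)

pvalue_cutoff <- 0.05
score_cutoff <- 0.05

# Drop links whose p-value is equal to or greater than pvalue_cutoff, and
# links whose absolute correlation is below score_cutoff (inclusive bound
# assumed here).
keep <- links$pvalue < pvalue_cutoff & abs(links$score) >= score_cutoff
retained <- links[keep, ]
print(retained)
```

Note that the negatively correlated link is kept because the score filter uses the absolute value, while the link with a p-value exactly equal to the cutoff is removed.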
Returns a Seurat object with the Links information set. This is a GRanges object accessible via the Links function, with the following information:
score: the correlation coefficient between the accessibility of the peak and expression of the gene
zscore: the z-score of the correlation coefficient, computed based on the distribution of correlation coefficients from a set of background peaks
pvalue: the p-value associated with the z-score for the link
gene: name of the linked gene
peak: name of the linked peak
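The relationship between score, zscore, and pvalue can be sketched in base R with simulated data. This is a minimal illustration, not the function's implementation: the background peaks are plain random noise rather than peaks matched on GC content, accessibility, and length, and the two-sided normal p-value is an assumption, since the text above does not say which tail is used.

```r
# Toy illustration of a link z-score and p-value on simulated data.
set.seed(1)
n_cells <- 100

# Simulated gene expression and one peak whose accessibility tracks it.
expression <- rnorm(n_cells)
peak <- expression + rnorm(n_cells)

# Simulated background peaks, unrelated to the gene. In LinkPeaks these would
# be n_sample peaks matched on GC content, accessibility, and length.
n_sample <- 200
background <- matrix(rnorm(n_cells * n_sample), nrow = n_cells)

# score: observed correlation between peak accessibility and gene expression.
score <- cor(peak, expression, method = "pearson")

# Null distribution: correlation of each background peak with the gene.
null_scores <- as.vector(cor(background, expression, method = "pearson"))

# zscore: observed score relative to the background distribution.
zscore <- (score - mean(null_scores)) / sd(null_scores)

# pvalue: assumed here to be a two-sided normal tail probability.
pvalue <- 2 * pnorm(-abs(zscore))

print(c(score = score, zscore = zscore, pvalue = pvalue))
```

Passing method = "spearman" to cor in both calls would give the rank-based variant of the same calculation.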
This function was inspired by the method originally described by SHARE-seq (Sai Ma et al. 2020, Cell). Please consider citing the original SHARE-seq work if using this function: doi:10.1016/j.cell.2020.09.056