Compute the term-frequency inverse-document-frequency
Source: R/generics.R, R/preprocessing.R
Run term frequency inverse document frequency (TF-IDF) normalization on a matrix.
Usage
RunTFIDF(object, ...)
# Default S3 method
RunTFIDF(
object,
assay = NULL,
method = 1,
scale.factor = 10000,
idf = NULL,
verbose = TRUE,
...
)
# S3 method for class 'Assay5'
RunTFIDF(
object,
assay = NULL,
method = 1,
scale.factor = 10000,
idf = NULL,
layer = "counts",
save = "data",
verbose = TRUE,
...
)
# S3 method for class 'StdAssay'
RunTFIDF(
object,
assay = NULL,
method = 1,
scale.factor = 10000,
idf = NULL,
layer = "counts",
save = "data",
verbose = TRUE,
...
)
# S3 method for class 'Seurat'
RunTFIDF(
object,
assay = NULL,
method = 1,
scale.factor = 10000,
idf = NULL,
layer = "counts",
save = "data",
verbose = TRUE,
...
)
Arguments
- object
A Seurat object, an assay object, or a count matrix
- ...
Arguments passed to other methods
- assay
Name of assay to use.
- method
Which TF-IDF implementation to use. Choice of:
1: The TF-IDF implementation used by Stuart & Butler et al. 2019 (doi:10.1101/460147). This computes \(\log(TF \times IDF)\).
2: The TF-IDF implementation used by Cusanovich & Hill et al. 2018 (doi:10.1016/j.cell.2018.06.052). This computes \(TF \times (\log(IDF))\).
3: The log-TF method used by Andrew Hill. This computes \(\log(TF) \times \log(IDF)\).
4: The 10x Genomics method (no TF normalization). This computes \(IDF\).
- scale.factor
Which scale factor to use. Default is 10000.
- idf
A precomputed IDF vector to use. If NULL, compute based on the input data matrix.
- verbose
Print progress messages
- layer
Name of layer to use.
- save
Name of layer to save results in.
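The four `method` formulas above can be sketched in base R. This is a minimal illustration, not the package's implementation: `tfidf_sketch()` is a hypothetical helper, and it assumes that TF is each cell's counts divided by that cell's total, that IDF is the number of cells divided by each feature's total count, that logarithms are taken as `log1p` so zero counts stay at zero (consistent with the zeros in the example output), that `scale.factor` multiplies the value inside the log, and that method 4 weights the raw counts by IDF.

```r
# Hypothetical helper for illustration only; not part of the package.
# `counts` is a dense features x cells matrix with no all-zero rows or columns.
tfidf_sketch <- function(counts, method = 1, scale.factor = 1e4) {
  stopifnot(is.matrix(counts), method %in% 1:4)
  # Term frequency: divide each column (cell) by its total count
  tf <- sweep(counts, 2, colSums(counts), "/")
  # Inverse document frequency: number of cells over each feature's total count
  idf <- ncol(counts) / rowSums(counts)
  # A vector times a matrix recycles down columns, so row i is scaled by idf[i]
  switch(
    as.character(method),
    "1" = log1p(tf * idf * scale.factor),        # log(TF x IDF)
    "2" = tf * log1p(idf),                       # TF x log(IDF)
    "3" = log1p(tf * scale.factor) * log1p(idf), # log(TF) x log(IDF)
    "4" = counts * idf                           # no TF normalization (assumed)
  )
}

counts <- matrix(c(1, 0, 2, 1, 0, 3, 1, 1, 2, 2, 0, 1), nrow = 4)
norm1 <- tfidf_sketch(counts, method = 1)
norm2 <- tfidf_sketch(counts, method = 2)
norm4 <- tfidf_sketch(counts, method = 4)
```

Comparing `norm1`, `norm2`, and `norm4` on the same small matrix shows how the placement of the logarithm changes the scale of the output while zero entries remain zero in every variant.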
Value
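Returns the normalized data in a form that depends on the method: a matrix for the default method, an assay for the assay methods, or a SeuratObject::Seurat() object for the Seurat method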
Examples
mat <- matrix(data = rbinom(n = 25, size = 5, prob = 0.2), nrow = 5)
RunTFIDF(object = mat)
#> Performing TF-IDF normalization
#> 5 x 5 Matrix of class "dgeMatrix"
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 6.949537 7.759934 7.488134 7.354682 0.000000
#> [2,] 7.929766 0.000000 8.181001 0.000000 7.236979
#> [3,] 6.831874 7.236979 7.370421 7.236979 7.524481
#> [4,] 7.419181 7.131699 0.000000 8.229778 0.000000
#> [5,] 7.082948 7.488134 0.000000 6.795546 8.181001
RunTFIDF(atac_small[["peaks"]])
#> Processing layer: counts
#> Performing TF-IDF normalization
#> Warning: Some cells contain 0 total counts
#> Warning: Some features contain 0 total counts
#> GRangesAssay data with 100 features for 100 cells
#> Variable features: 0
#> Annotation present: TRUE
#> Fragment files: 0
#> Motifs present: TRUE
#> Links present: 0
#> Region aggregation matrices: 0
RunTFIDF(object = atac_small)
#> Processing layer: counts
#> Performing TF-IDF normalization
#> Warning: Some cells contain 0 total counts
#> Warning: Some features contain 0 total counts
#> An object of class Seurat
#> 150 features across 100 samples within 2 assays
#> Active assay: peaks (100 features, 0 variable features)
#> 2 layers present: counts, data
#> 1 other assay present: RNA
#> 2 dimensional reductions calculated: lsi, umap
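The `idf` argument takes a precomputed IDF vector instead of deriving one from the input matrix. The base R sketch below illustrates the idea under stated assumptions: the `reference` and `query` matrices are made-up data, IDF is assumed to be the number of cells divided by each feature's total count, and the final step mimics the method 1 formula with `log1p` and a scale factor of 10000. It does not call the package itself.

```r
# Made-up count matrices (features x cells); rows must match between the two
reference <- matrix(c(1, 0, 2, 0, 3, 1, 2, 2, 0), nrow = 3)
query <- matrix(c(0, 1, 1, 2, 0, 1), nrow = 3)

# IDF computed once from the reference: one value per feature (assumed definition)
ref_idf <- ncol(reference) / rowSums(reference)

# Term frequency of the query: each cell's counts divided by that cell's total
query_tf <- sweep(query, 2, colSums(query), "/")

# Weight the query by the reference IDF, then log-transform (method 1 style)
query_norm <- log1p(query_tf * ref_idf * 1e4)
```

A numeric vector such as `ref_idf`, with one entry per feature, is the kind of input the `idf` argument describes. Reusing it keeps feature weights identical across datasets that share the same features.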