Public Documentation
Documentation for Severo.jl's public interface.
See the Internals section of the manual for internal package docs covering all submodules.
Index
- `Severo.agreement`
- `Severo.alignment`
- `Severo.cluster`
- `Severo.convert_counts`
- `Severo.filter_cells`
- `Severo.filter_counts`
- `Severo.filter_features`
- `Severo.filter_rank_markers`
- `Severo.find_markers`
- `Severo.find_variable_features`
- `Severo.jaccard_index`
- `Severo.mean_var`
- `Severo.nearest_neighbours`
- `Severo.normalize_cells`
- `Severo.prefilter_markers`
- `Severo.purity`
- `Severo.read_10X`
- `Severo.read_10X_h5`
- `Severo.read_csv`
- `Severo.read_data`
- `Severo.read_dge`
- `Severo.read_h5`
- `Severo.read_h5ad`
- `Severo.read_loom`
- `Severo.scale_features`
- `Severo.shared_nearest_neighbours`
- `Severo.umap`
Public Interface
Severo.agreement — Method
`agreement(rng::AbstractRNG, X::AbstractMatrix, Y::Union{AbstractMatrix, LinearEmbedding}; k::Int64)`
A metric for quantifying how much a transformation/factorization distorts the geometry of the original dataset. The greater the agreement, the less the geometry is distorted.
This is calculated by performing dimensionality reduction on the original and transformed dataset, and measuring the similarity between the k nearest neighbors of each cell in the two datasets. The Jaccard index is used to quantify similarity, and the final metric is its average across all cells.
Arguments:
- `rng`: random number generator used for k-NN
- `X`: low dimensional embedding for reference dataset
- `Y`: low dimensional embedding for transformed dataset
- `k`: number of neighbours to find (default=15)
Return values: The agreement score
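The calculation described above can be sketched in a few lines. This is a minimal illustration and not Severo's implementation: it assumes cells are stored as rows, uses a brute-force Euclidean search instead of the randomized k-NN the `rng` argument suggests, and the names `knn_sets`, `jaccard` and `agreement_sketch` are hypothetical.

```julia
using LinearAlgebra

# Brute-force k nearest neighbours (Euclidean); rows of X are cells (assumption).
# Returns, for each cell, the set of indices of its k nearest other cells.
function knn_sets(X::AbstractMatrix, k::Integer)
    n = size(X, 1)
    map(1:n) do i
        d = [norm(X[i, :] .- X[j, :]) for j in 1:n]
        d[i] = Inf                       # exclude the cell itself
        Set(partialsortperm(d, 1:k))
    end
end

# Jaccard index: size of the intersection divided by size of the union.
jaccard(a::Set, b::Set) = length(intersect(a, b)) / length(union(a, b))

# Mean Jaccard similarity between the neighbour sets found in two embeddings.
function agreement_sketch(X::AbstractMatrix, Y::AbstractMatrix; k::Integer=15)
    nx, ny = knn_sets(X, k), knn_sets(Y, k)
    sum(jaccard(a, b) for (a, b) in zip(nx, ny)) / length(nx)
end

X = [0.0 0.0; 1.0 0.0; 0.0 1.0; 5.0 5.0; 6.0 5.0; 5.0 6.0]
score = agreement_sketch(X, 2 .* X; k=2)   # uniform scaling keeps every neighbour set
```

Because uniform scaling does not change which cells are nearest to each other, the two embeddings here agree perfectly; a transformation that reshuffles neighbourhoods would lower the score.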
Severo.agreement — Method
`agreement(rng::AbstractRNG, X::Union{AbstractMatrix, LinearEmbedding}, Ys::Union{AbstractMatrix, LinearEmbedding}...; k::Int64)`
A metric for quantifying how much a transformation/factorization distorts the geometry of the original dataset. The greater the agreement, the less the geometry is distorted.
This is calculated by performing dimensionality reduction on the original and transformed dataset, and measuring the similarity between the k nearest neighbors of each cell in the two datasets. The Jaccard index is used to quantify similarity, and the final metric is its average across all cells.
Arguments:
- `rng`: random number generator used for k-NN
- `X`: low dimensional embedding for reference dataset
- `Ys`: low dimensional embedding for transformed datasets
- `k`: number of neighbours to find (default=15)
Return values: The agreement score
Severo.agreement — Method
`agreement(X::Union{AbstractMatrix, LinearEmbedding}, Ys::Union{AbstractMatrix, LinearEmbedding}...; k::Int64)`
A metric for quantifying how much a transformation/factorization distorts the geometry of the original dataset. The greater the agreement, the less the geometry is distorted.
This is calculated by performing dimensionality reduction on the original and transformed dataset, and measuring the similarity between the k nearest neighbors of each cell in the two datasets. The Jaccard index is used to quantify similarity, and the final metric is its average across all cells.
Arguments:
- `X`: low dimensional embedding for reference dataset
- `Ys...`: low dimensional embedding for transformed dataset(s)
- `k`: number of neighbours to find (default=15)
Return values: The agreement score
Severo.alignment — Method
`alignment(X::AbstractMatrix, datasets::AbstractVector{T}...; k::Union{Nothing,Int64}=nothing) where T`
Calculates the alignment score as defined by Butler 2018 [doi: 10.1038/nbt.4096]. It is a quantitative metric for the alignment of datasets and is calculated as follows:
1. Randomly downsample the datasets to have the same number of cells as the smallest dataset
2. Construct a nearest-neighbor graph based on the cells’ embedding in some low dimensional space `X`.
3. For every cell, calculate how many of its k nearest-neighbors belong to the same dataset and average this over all cells.
4. We then normalize by the expected number of same-dataset cells and scale to range from 0 to 1.
If the datasets are well-aligned, we would expect each cell's nearest neighbors to be evenly shared across all datasets.
Arguments:
- `X`: low dimensional embedding of the aligned datasets
- `datasets`: the split into datasets
- `k`: number of neighbours to find. By default: 1% of the total number of cells, with a minimum of 10 and a maximum of the total number of samples drawn
Return values: The alignment score
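Steps 3 and 4 above can be illustrated as follows. This is a sketch under stated assumptions, not the package's code: the neighbour lists are taken as given, the datasets are assumed to be already downsampled to equal size, and the normalisation `1 - (x̄ - k/N) / (k - k/N)` (with `x̄` the mean number of same-dataset neighbours and `N` the number of datasets) is assumed to be the one intended. The function name `alignment_sketch` is hypothetical.

```julia
# neighbours[i]: indices of the k nearest neighbours of cell i (excluding i itself)
# labels[i]:     dataset id of cell i
function alignment_sketch(neighbours::AbstractVector{<:AbstractVector{<:Integer}},
                          labels::AbstractVector{<:Integer}, k::Integer)
    N = length(unique(labels))                       # number of datasets
    # Step 3: per cell, count neighbours that come from the same dataset.
    same = [count(j -> labels[j] == labels[i], nb) for (i, nb) in enumerate(neighbours)]
    xbar = sum(same) / length(same)
    # Step 4: compare with the k/N expected under perfect mixing and rescale.
    expected = k / N
    1 - (xbar - expected) / (k - expected)
end

labels = [1, 1, 1, 2, 2, 2]
mixed     = [[2, 4], [1, 5], [1, 6], [5, 1], [4, 2], [4, 3]]   # one neighbour per dataset
separated = [[2, 3], [1, 3], [1, 2], [5, 6], [4, 6], [4, 5]]   # neighbours from own dataset only

score_mixed = alignment_sketch(mixed, labels, 2)
score_separated = alignment_sketch(separated, labels, 2)
```

In the mixed case every cell has exactly the expected share of same-dataset neighbours, so the score is at its maximum; in the separated case all neighbours come from the cell's own dataset and the score drops to zero.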
Severo.alignment — Method
`alignment(rng::AbstractRNG, X::AbstractMatrix, datasets::AbstractVector{T}...; k::Union{Nothing,Int64}=nothing) where T`
Calculates the alignment score as defined by Butler 2018 [doi: 10.1038/nbt.4096]. It is a quantitative metric for the alignment of datasets and is calculated as follows:
1. Randomly downsample the datasets to have the same number of cells as the smallest dataset
2. Construct a nearest-neighbor graph based on the cells’ embedding in some low dimensional space `X`.
3. For every cell, calculate how many of its k nearest-neighbors belong to the same dataset and average this over all cells.
4. We then normalize by the expected number of same-dataset cells and scale to range from 0 to 1.
If the datasets are well-aligned, we would expect each cell's nearest neighbors to be evenly shared across all datasets.
Arguments:
- `rng`: random number generator used by downsampling and k-NN
- `X`: low dimensional embedding
- `datasets`: the split into datasets
- `k`: number of neighbours to find. By default: 1% of the total number of cells, with a minimum of 10 and a maximum of the total number of samples drawn
Return values: The alignment score
Severo.cluster — Method
`cluster(SNN::NeighbourGraph; algorithm=:louvain, resolution=0.8, nstarts=1, niterations=10) where T`
Cluster cells based on a neighbourhood graph.
Arguments:
- `SNN`: shared neighbours graph
- `algorithm`: clustering algorithm to use (louvain)
- `resolution`: values above 1 lead to a larger number of communities, values below 1 to a smaller number
- `nstarts`: number of random starts
- `niterations`: maximum number of iterations per random start
- `group_singletons`: group singletons into the nearest cluster; if false, singletons are kept
Return values:
Cluster assignment per cell
Severo.cluster — Method
`cluster(rng::AbstractRNG, SNN::NeighbourGraph; algorithm=:louvain, resolution=0.8, nstarts=1, niterations=10) where T`
Cluster cells based on a neighbourhood graph.
Arguments:
- `rng`: random number generator
- `SNN`: shared neighbours graph
- `algorithm`: clustering algorithm to use (louvain)
- `resolution`: values above 1 lead to a larger number of communities, values below 1 to a smaller number
- `nstarts`: number of random starts
- `niterations`: maximum number of iterations per random start
- `group_singletons`: group singletons into the nearest cluster; if false, singletons are kept
Return values:
Cluster assignment per cell
Severo.convert_counts — Method
`convert_counts(X::AbstractMatrix, features::AbstractVector, barcodes::AbstractVector; unique_features::Bool=true)`
Convert a count matrix and labels into its labeled representation.
Arguments:
- `X`: a count matrix (features x barcodes)
- `features`: list of feature names
- `barcodes`: list of barcodes
- `unique_features`: should feature names be made unique (default: true)
Return values:
A labeled sparse matrix containing the counts
Severo.convert_counts — Method
`convert_counts(X::AbstractMatrix)`
Convert a count matrix into its labeled representation by generating unique labels.
Arguments:
- `X`: a count matrix (features x barcodes)
Return values:
A labeled sparse matrix containing the counts
Severo.filter_cells — Method
`filter_cells(A::NamedCountMatrix; min_features=0, min_feature_count=0, min_umi=0)`
Filter a labeled count matrix, removing cells for which the metrics fall below the given thresholds.
Arguments:
- `A`: the count matrix
- `min_features`: include cells where at least this many features are detected
- `min_feature_count`: threshold on the count for which a feature is marked "detected"
- `min_umi`: include cells where the total of UMI counts is at least this value
Return value:
The filtered, labeled matrix with cells removed
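To make the three thresholds concrete, here is a small sketch on a plain sparse matrix (features x cells). It is not Severo's implementation: it works on an unlabeled `SparseMatrixCSC`, the name `filter_cells_sketch` is hypothetical, and treating a feature as "detected" when its count is strictly greater than `min_feature_count` is an assumption.

```julia
using SparseArrays

# Keep cells (columns) with enough detected features and enough total counts.
function filter_cells_sketch(A::SparseMatrixCSC; min_features=0, min_feature_count=0, min_umi=0)
    # Assumption: a feature counts as detected when its count exceeds min_feature_count.
    detected = vec(sum(A .> min_feature_count; dims=1))   # detected features per cell
    umis = vec(sum(A; dims=1))                            # total counts per cell
    keep = (detected .>= min_features) .& (umis .>= min_umi)
    A[:, keep]
end

# 3 features x 3 cells; the second cell is empty.
A = sparse([1 0 3; 0 0 2; 4 0 1])
nonempty = filter_cells_sketch(A; min_features=1)
strict = filter_cells_sketch(A; min_features=2, min_umi=6)
```

With `min_features=1` only the empty cell is dropped, while the stricter call also removes the first cell because its total count of 5 is below `min_umi=6`.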
Severo.filter_counts — Method
`filter_counts(A::NamedCountMatrix; min_cells=0, min_features=0, min_feature_count=0, min_umi=0)`
Filter a labeled count matrix, removing cells and features for which the metrics fall below the given thresholds.
First cells are removed using filter_cells and then features using filter_features. This order can be important!
Arguments:
- `A`: the count matrix
- `min_cells`: include features detected in at least this many cells
- `min_features`: include cells where at least this many features are detected
- `min_feature_count`: threshold on the count for which a feature is marked "detected"
- `min_umi`: include cells where the total of UMI counts is at least this value
Return value:
The filtered, labeled matrix with cells and features removed
Severo.filter_features — Method
`filter_features(A::NamedCountMatrix; min_cells=0)`
Filter a count matrix, removing features for which the metrics fall below the given thresholds.
Arguments:
- `A`: the count matrix
- `min_cells`: include features detected in at least this many cells
Return value:
The filtered matrix with features removed
Severo.filter_rank_markers — Method
`filter_rank_markers(de::DataFrame; pval_thresh::Real=1e-2, ngenes::Integer=typemax(Int64))`
Filters and ranks a list of markers (differentially expressed genes).
Arguments:
- `de`: list of markers returned by `find_markers`
- `pval_thresh`: only keep markers with pval < pval_thresh
- `ngenes`: the number of highest-ranked markers to keep
- `rankby_abs`: rank based on absolute value of the scores
Return values:
A DataFrame containing a ranked list of filtered markers.
Severo.find_markers — Method
`find_markers(X::Union{NamedCountMatrix, NamedDataMatrix}, idents::NamedVector{<:Integer}; method=:wilcoxon, selection::Union{Nothing, NamedArray{Bool, 2}, AbstractArray{Bool,2}}=nothing, log::Bool=false, kw...)`
Finds markers (differentially expressed genes) for each of the classes in a dataset.
Arguments:
- `X`: count or data matrix
- `idents`: class identity for each cell
- `method`: which test to use; supported are: [wilcoxon, t]
- `selection`: a selection of features and groups that should be considered
- `log`: the data is in log-scale (default = false)
- `kw...`: additional parameters passed down to the method
Return values:
A DataFrame containing a list of putative markers with associated statistics (p-values and scores) and log fold-changes.
Severo.find_variable_features — Function
`find_variable_features(counts::NamedCountMatrix, nfeatures=2000; method=:vst, kw...)`
Identification of highly variable features: find features that exhibit high cell-to-cell variation in the dataset (i.e., they are highly expressed in some cells, and lowly expressed in others).
Arguments:
- `counts`: count matrix
- `nfeatures`: the number of top-ranking features to return
- `method`: how to choose top variable features
- `kw`: additional keyword arguments to pass along to the method
Methods:
- `:vst`: fits a line to the log(mean) - log(variance) relationship, then standardizes the feature values using the observed mean and expected variance. Finally, feature variance is calculated using the standardized values.
    - `loess_span`: span parameter for loess regression when fitting the mean-variance relationship
- `:dispersion`: selects the genes with the highest dispersion values
- `:meanvarplot`: calculates the feature mean and dispersion, then bins the features by mean into `num_bins` bins. Finally, returns the z-scores for dispersion within each bin.
    - `num_bins`: total number of bins to use
    - `binning_method`: specifies how the bins should be computed. Available: `:width` for equal width and `:frequency` for equal frequency binning
Return value:
The `nfeatures` top-ranked features
Severo.jaccard_index — Method
`jaccard_index(X::NamedArray{T,2}; prune::Real=1/15) where T`
Compute a graph with edges defined by the Jaccard index. The Jaccard index measures similarity between nearest neighbour sets, and is defined as the size of the intersection divided by the size of the union. A value of 0 indicates no overlap and 1 indicates full overlap.
Arguments:
- `nn`: a nearest neighbour graph
- `prune`: cutoff for the Jaccard index, edges with values below this cutoff are removed from the resulting graph
Return values:
A shared nearest neighbours graph represented by a sparse matrix. Weights of the edges indicate similarity of the neighbourhoods of the cells as computed with the Jaccard index.
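The definition above translates directly into a few sparse-matrix operations. The following is a minimal sketch, not Severo's code: it assumes the neighbour graph is a plain `SparseMatrixCSC` in which entry `(j, i)` is non-zero when cell `j` is a neighbour of cell `i`, it keeps self-similarities on the diagonal, and the name `jaccard_sketch` is hypothetical.

```julia
using SparseArrays

# nn[j, i] != 0 means cell j is one of the nearest neighbours of cell i (assumption).
function jaccard_sketch(nn::SparseMatrixCSC; prune::Real=1/15)
    B = Int.(nn .!= 0)                    # binary membership matrix
    sizes = vec(sum(B; dims=1))           # size of each cell's neighbour set
    shared = copy(B') * B                 # shared[a, b] = |N(a) ∩ N(b)|
    I, J, V = findnz(shared)
    # |N(a) ∪ N(b)| = |N(a)| + |N(b)| - |N(a) ∩ N(b)|
    w = [v / (sizes[i] + sizes[j] - v) for (i, j, v) in zip(I, J, V)]
    keep = w .>= prune                    # drop edges below the cutoff
    n = size(nn, 2)
    sparse(I[keep], J[keep], w[keep], n, n)
end

# Neighbour sets: cell 1 -> {1, 2}, cell 2 -> {1, 2}, cell 3 -> {2, 3}
nn = sparse([1, 2, 1, 2, 2, 3], [1, 1, 2, 2, 3, 3], ones(6), 3, 3)
S = jaccard_sketch(nn)
S_pruned = jaccard_sketch(nn; prune=0.5)
```

Cells 1 and 2 share both neighbours and get weight 1, while cells 1 and 3 share one of three distinct neighbours and get weight 1/3, which the stricter `prune=0.5` removes.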
Severo.jaccard_index — Method
`jaccard_index(X::NamedArray{T,2}, k::Int64; prune::Real=1/15) where T`
Compute a graph with edges defined by the Jaccard index. The Jaccard index measures similarity between nearest neighbour sets, and is defined as the size of the intersection divided by the size of the union. A value of 0 indicates no overlap and 1 indicates full overlap.
Arguments:
- `nn`: a nearest neighbour graph
- `k`: maximum number of neighbours
- `prune`: cutoff for the Jaccard index, edges with values below this cutoff are removed from the resulting graph
Return values:
A shared nearest neighbours graph represented by a sparse matrix. Weights of the edges indicate similarity of the neighbourhoods of the cells as computed with the Jaccard index.
Severo.mean_var — Method
Variance-to-mean ratio (VMR) in non-log space.
Severo.nearest_neighbours — Method
`nearest_neighbours(em::LinearEmbedding, k::Int64; dims=:, metric::SemiMetric=Euclidean(), include_self::Bool=true, ntables::Int64=2*size(X,2)) where T`
Compute a k-nearest neighbours graph based on a linear embedding.
Arguments:
- `em`: embedding containing the transformed coordinates for each cell
- `k`: number of nearest neighbours to find
- `dims`: which dimensions to use
- `metric`: distance metric to use
- `include_self`: include the cell in its k-nearest neighbours
- `ntables`: number of tables to use in the k-NN algorithm; controls the precision (higher is more accurate)
Return values:
A k-nearest neighbours graph represented by a sparse matrix. Each column corresponds to a cell, and the rows of that column hold its k nearest neighbours.
Severo.nearest_neighbours — Method
`nearest_neighbours(rng::AbstractRNG, em::LinearEmbedding, k::Int64; dims=:, metric::SemiMetric=Euclidean(), include_self::Bool=true, ntables::Int64=2*size(X,2)) where T`
Compute a k-nearest neighbours graph based on a linear embedding.
Arguments:
- `rng`: random number generator
- `em`: embedding containing the transformed coordinates for each cell
- `k`: number of nearest neighbours to find
- `dims`: which dimensions to use
- `metric`: distance metric to use
- `include_self`: include the cell in its k-nearest neighbours
- `ntables`: number of tables to use in the k-NN algorithm; controls the precision (higher is more accurate)
Return values:
A k-nearest neighbours graph represented by a sparse matrix. Each column corresponds to a cell, and the rows of that column hold its k nearest neighbours.
Severo.nearest_neighbours — Method
`nearest_neighbours(X::NamedArray{T,2}, k::Int64; dims=:, metric::SemiMetric=Euclidean(), include_self::Bool=true, ntables::Int64=2*size(X,2)) where T`
Compute a k-nearest neighbours graph based on coordinates for each cell.
Arguments:
- `X`: a labelled matrix with coordinates for each cell
- `k`: number of nearest neighbours to find
- `dims`: which dimensions to use
- `metric`: distance metric to use
- `include_self`: include the cell in its k-nearest neighbours
- `ntables`: number of tables to use in the k-NN algorithm; controls the precision (higher is more accurate)
Return values:
A k-nearest neighbours graph represented by a sparse matrix. Each column corresponds to a cell, and the rows of that column hold its k nearest neighbours.
Severo.nearest_neighbours — Method
`nearest_neighbours(rng::AbstractRNG, X::NamedArray{T,2}, k::Int64; dims=:, metric::SemiMetric=Euclidean(), include_self::Bool=true, ntables::Int64=2*size(X,2)) where T`
Compute a k-nearest neighbours graph based on coordinates for each cell.
Arguments:
- `rng`: random number generator
- `X`: a labelled matrix with coordinates for each cell
- `k`: number of nearest neighbours to find
- `dims`: which dimensions to use
- `metric`: distance metric to use
- `include_self`: include the cell in its k-nearest neighbours
- `ntables`: number of tables to use in the k-NN algorithm; controls the precision (higher is more accurate)
Return values:
A k-nearest neighbours graph represented by a sparse matrix. Each column corresponds to a cell, and the rows of that column hold its k nearest neighbours.
Severo.normalize_cells — Method
`normalize_cells(X::NamedCountMatrix; method=:lognormalize, scale_factor=1.0)`
Normalize count data with different methods:
- `lognormalize`: feature counts are divided by the total count per cell, scaled by `scale_factor` and then log1p transformed.
- `relativecounts`: feature counts are divided by the total count per cell and scaled by `scale_factor`.
Arguments:
- `X`: the labelled count matrix to normalize
- `method`: normalization method to apply
- `scale_factor`: the scaling factor
- `dtype`: datatype to be used for the output
Return values:
A labelled data matrix
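As an illustration of the `lognormalize` method described above, the sketch below applies the same three steps to a plain sparse matrix. It is not the package's implementation: it assumes a features x cells layout with no empty cells, ignores labels and the `dtype` option, and the name `lognormalize_sketch` is hypothetical.

```julia
using SparseArrays

# Divide by the per-cell total, scale, then apply log1p; zeros stay zero, so sparsity is kept.
function lognormalize_sketch(X::SparseMatrixCSC; scale_factor::Real=1.0)
    totals = vec(sum(X; dims=1))          # total count per cell (column)
    Y = Float64.(X)                       # sparse copy with floating-point values
    vals = nonzeros(Y)
    for j in 1:size(Y, 2), p in nzrange(Y, j)
        vals[p] = log1p(vals[p] / totals[j] * scale_factor)
    end
    return Y
end

# 2 features x 2 cells: cell 1 has 4 counts in total, cell 2 has 2.
X = sparse([1 0; 3 2])
Y = lognormalize_sketch(X)
```

Only the stored non-zero entries are visited, which is why the transformation is cheap on typical single-cell count matrices; `relativecounts` would be the same loop without the `log1p`.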
Severo.prefilter_markers — Function
`prefilter_markers(X::Union{NamedCountMatrix, NamedDataMatrix}, idents::NamedVector{<:Integer}; logfc_threshold::Real=0.0, min_pct::Real=0.0, min_diff_pct::Real=-Inf, only_pos::Bool=false, log::Bool=false)`
Filter features for each of the classes in a dataset.
Arguments:
- `X`: count or data matrix
- `idents`: class identity for each cell
- `logfc_threshold`: limit testing to features which show, on average, at least X-fold difference (log-scale) between the two groups of cells
- `min_pct`: only test features that are detected in a minimum fraction of `min_pct` cells in either of the two populations
- `min_diff_pct`: only test features that show a minimum difference in the fraction of detection between the two groups
- `only_pos`: only return features with positive log fold-change
- `log`: the data is in log-scale (default = false)
Return values:
Selection matrix for each feature and class
Severo.purity — Method
`purity(clusters::IntegerArray, classes::IntegerArray)`
Calculates purity between clusters and external clustering (true clusters/classes).
Arguments:
- `clusters`: clustering for which to calculate purity
- `classes`: clustering/classes with which to compare
Return values:
Purity score in the range [0, 1], with a score of 1 representing a pure/accurate clustering
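For reference, the usual definition of purity assigns each cluster to its most frequent class and reports the fraction of cells that match that assignment. The sketch below implements this standard definition; whether Severo computes exactly this is an assumption, and the name `purity_sketch` is hypothetical.

```julia
# purity = (1 / n) * sum over clusters of the size of the largest class within the cluster
function purity_sketch(clusters::AbstractVector{<:Integer}, classes::AbstractVector{<:Integer})
    length(clusters) == length(classes) || throw(ArgumentError("inputs must have equal length"))
    total = 0
    for c in unique(clusters)
        members = classes[clusters .== c]          # true classes of the cells in cluster c
        counts = Dict{eltype(members),Int}()
        for m in members
            counts[m] = get(counts, m, 0) + 1
        end
        total += maximum(values(counts))           # cells belonging to the dominant class
    end
    total / length(clusters)
end

# Cluster 1 holds classes [1, 1, 2] and cluster 2 holds [2, 2, 2]: (2 + 3) / 6
p = purity_sketch([1, 1, 1, 2, 2, 2], [1, 1, 2, 2, 2, 2])
```

Note that purity alone rewards over-clustering: putting every cell in its own cluster trivially yields a score of 1.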
Severo.read_10X — Method
`read_10X(dirname::AbstractString; unique_features=true)`
Read a count matrix from 10X Genomics output.
Arguments:
- `dirname`: path to directory containing matrix.mtx, genes.tsv (or features.tsv), and barcodes.tsv from 10X
- `unique_features`: should feature names be made unique (default: true)
Return values:
A labeled sparse matrix containing the counts
Severo.read_10X_h5 — Method
`read_10X_h5(fname::AbstractString; dataset::AbstractString="/mm10", unique_features=true)`
Read a count matrix from a 10X CellRanger hdf5 file.
Arguments:
- `fname`: path to hdf5 file
- `dataset`: name of dataset to load (default: "/mm10")
- `unique_features`: should feature names be made unique (default: true)
Return values:
A labeled sparse matrix containing the counts
Severo.read_csv — Method
`read_csv(dirname::AbstractString; unique_features=true)`
Read a count matrix from CSV.
Arguments:
- `fname`: path to csv file
- `unique_features`: should feature names be made unique (default: true)
Return values:
A labeled sparse matrix containing the counts
Severo.read_data — Method
`read_data(path::AbstractString; kw...)`
Tries to identify and read a count matrix in any of the supported formats.
Arguments:
- `fname`: path
- `kw`: additional keyword arguments are passed on
Return values:
A labeled sparse matrix containing the counts
Severo.read_dge — Method
`read_dge(dirname::AbstractString; unique_features=true)`
Read a count matrix from digital gene expression (DGE) files.
Arguments:
- `fname`: path to dge file
- `unique_features`: should feature names be made unique (default: true)
Return values:
A labeled sparse matrix containing the counts
Severo.read_h5 — Method
`read_h5(fname::AbstractString; dataset::AbstractString="/mm10", unique_features=true)`
Read a count matrix from an hdf5 file.
Arguments:
- `fname`: path to hdf5 file
- `dataset`: name of dataset to load (default: "counts")
- `unique_features`: should feature names be made unique (default: true)
Return values:
A labeled sparse matrix containing the counts
Severo.read_h5ad — Method
`read_h5ad(fname::AbstractString, dataset::String="/mm10"; unique_features=true)`
Read a count matrix from an hdf5 file as created by AnnData.py. https://anndata.readthedocs.io/en/latest/fileformat-prose.html
Arguments:
- `fname`: path to hdf5 file
- `unique_features`: should feature names be made unique (default: true)
Return values:
A labeled sparse matrix containing the counts
Severo.read_loom — Method
`read_loom(fname::AbstractString; barcode_names::AbstractString="CellID", feature_names::AbstractString="Gene", unique_names::Bool=true, blocksize::Tuple{Int,Int}=(100,100))`
Read a count matrix from the loom format.
Arguments:
- `fname`: path to loom file
- `barcode_names`: key where the observation/cell names are stored
- `feature_names`: key where the variable/feature names are stored
- `unique_names`: should feature and barcode names be made unique (default: true)
- `blocksize`: blocksize to use when reading the matrix (tradeoff between memory and speed)
Return values:
A labeled sparse matrix containing the counts
Severo.scale_features — Method
`scale_features(X::NamedArray{T, 2, SparseMatrixCSC{T, Int64}}; scale_max=Inf, dtype::Type{<:AbstractFloat})`
Scale and center a count/data matrix along the cells such that each feature is standardized.
Arguments:
- `X`: the labelled count/data matrix to scale
- `scale_max`: maximum value of the scaled data
Return values:
A centered matrix
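The standardization described above can be sketched on a dense matrix as follows. This is an illustration under assumptions, not Severo's implementation: rows are taken to be features and columns cells, features with zero variance are left at zero, only the upper bound `scale_max` is applied, and the name `scale_features_sketch` is hypothetical.

```julia
using Statistics

# Standardize each feature (row) across cells, then cap values at scale_max.
function scale_features_sketch(X::AbstractMatrix; scale_max::Real=Inf)
    mu = mean(X; dims=2)
    sd = std(X; dims=2)
    sd[sd .== 0] .= 1                 # avoid division by zero for constant features
    Z = (X .- mu) ./ sd
    min.(Z, scale_max)
end

X = [1.0 2.0 3.0; 10.0 10.0 40.0]
Z = scale_features_sketch(X)
Zc = scale_features_sketch(X; scale_max=1.0)
```

After scaling, every feature has mean zero and unit standard deviation; the capped variant limits the influence of the outlying value in the second feature.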
Severo.shared_nearest_neighbours — Method
`shared_nearest_neighbours(em::LinearEmbedding, k::Int64; dims=:, metric::SemiMetric=Euclidean(), include_self::Bool=true, ntables::Int64=2*size(X,2))`
Compute a k-nearest neighbours graph based on an embedding of cells and its Jaccard index.
The Jaccard index measures similarity between nearest neighbour sets, and is defined as the size of the intersection divided by the size of the union. A value of 0 indicates no overlap and 1 indicates full overlap.
Arguments:
- `em`: embedding containing the transformed coordinates for each cell
- `k`: number of nearest neighbours to find
- `dims`: which dimensions to use
- `include_self`: include the cell in its k-nearest neighbours
- `ntables`: number of tables to use in knn algorithm: controls the precision (higher is more accurate)
- `prune`: cutoff for the Jaccard index, edges with values below this cutoff are removed from the resulting graph
Return values:
A shared nearest neighbours graph represented by a sparse matrix. Weights of the edges indicate similarity of the neighbourhoods of the cells as computed with the Jaccard index.
Severo.shared_nearest_neighbours — Method
`shared_nearest_neighbours(rng::AbstractRNG, em::LinearEmbedding, k::Int64; dims=:, metric::SemiMetric=Euclidean(), include_self::Bool=true, ntables::Int64=2*size(X,2))`
Compute a k-nearest neighbours graph based on an embedding of cells and its Jaccard index.
The Jaccard index measures similarity between nearest neighbour sets, and is defined as the size of the intersection divided by the size of the union. A value of 0 indicates no overlap and 1 indicates full overlap.
Arguments:
- `rng`: random number generator
- `em`: embedding containing the transformed coordinates for each cell
- `k`: number of nearest neighbours to find
- `dims`: which dimensions to use
- `include_self`: include the cell in its k-nearest neighbours
- `ntables`: number of tables to use in knn algorithm: controls the precision (higher is more accurate)
- `prune`: cutoff for the Jaccard index, edges with values below this cutoff are removed from the resulting graph
Return values:
A shared nearest neighbours graph represented by a sparse matrix. Weights of the edges indicate similarity of the neighbourhoods of the cells as computed with the Jaccard index.
Severo.shared_nearest_neighbours — Method
`shared_nearest_neighbours(X::NamedArray{T,2}, k::Int64; dims=:, metric::SemiMetric=Euclidean(), include_self::Bool=true, ntables::Int64=2*size(X,2)) where T`
Compute a k-nearest neighbours graph based on coordinates for each cell and its Jaccard index.
The Jaccard index measures similarity between nearest neighbour sets, and is defined as the size of the intersection divided by the size of the union. A value of 0 indicates no overlap and 1 indicates full overlap.
Arguments:
- `X`: a labelled matrix with coordinates for each cell
- `k`: number of nearest neighbours to find
- `dims`: which dimensions to use
- `include_self`: include the cell in its k-nearest neighbours
- `ntables`: number of tables to use in knn algorithm: controls the precision (higher is more accurate)
- `prune`: cutoff for the Jaccard index, edges with values below this cutoff are removed from the resulting graph
Return values:
A shared nearest neighbours graph represented by a sparse matrix. Weights of the edges indicate similarity of the neighbourhoods of the cells as computed with the Jaccard index.
Severo.shared_nearest_neighbours — Method
`shared_nearest_neighbours(rng::AbstractRNG, X::NamedArray{T,2}, k::Int64; dims=:, metric::SemiMetric=Euclidean(), include_self::Bool=true, ntables::Int64=2*size(X,2)) where T`
Compute a k-nearest neighbours graph based on coordinates for each cell and its Jaccard index.
The Jaccard index measures similarity between nearest neighbour sets, and is defined as the size of the intersection divided by the size of the union. A value of 0 indicates no overlap and 1 indicates full overlap.
Arguments:
- `rng`: random number generator
- `X`: a labelled matrix with coordinates for each cell
- `k`: number of nearest neighbours to find
- `dims`: which dimensions to use
- `include_self`: include the cell in its k-nearest neighbours
- `ntables`: number of tables to use in knn algorithm: controls the precision (higher is more accurate)
- `prune`: cutoff for the Jaccard index, edges with values below this cutoff are removed from the resulting graph
Return values:
A shared nearest neighbours graph represented by a sparse matrix. Weights of the edges indicate similarity of the neighbourhoods of the cells as computed with the Jaccard index.
Severo.umap — Function
`umap(X::NamedMatrix, ncomponents::Int64=2; dims=:, metric=:cosine, nneighbours::Int=30, min_dist::Real=.3, nepochs::Int=300, kw...) where T`
Performs a Uniform Manifold Approximation and Projection (UMAP) dimensional reduction on the coordinates.
For a more in-depth discussion of the mathematics underlying UMAP, see the arXiv paper: https://arxiv.org/abs/1802.03426
Arguments:
- `X`: a labelled matrix with coordinates for each cell
- `ncomponents`: the dimensionality of the embedding
- `dims`: which dimensions to use
- `metric`: distance metric to use
- `nneighbours`: the number of neighboring points used in local approximations of manifold structure.
- `min_dist`: controls how tightly the embedding is allowed to compress points together.
- `nepochs`: number of training epochs to be used while optimizing the low dimensional embedding
- `kw`: additional parameters for the umap algorithm. See `UMAP.umap`.
Return values:
A low-dimensional embedding of the cells
Severo.umap — Function
`umap(em::LinearEmbedding, ncomponents::Int64=2; dims=:, metric=:cosine, nneighbours::Int=30, min_dist::Real=.3, nepochs::Int=300, kw...) where T`
Performs a Uniform Manifold Approximation and Projection (UMAP) dimensional reduction on the coordinates in the linear embedding.
For a more in-depth discussion of the mathematics underlying UMAP, see the arXiv paper: https://arxiv.org/abs/1802.03426
Arguments:
- `em`: embedding containing the transformed coordinates for each cell
- `ncomponents`: the dimensionality of the embedding
- `dims`: which dimensions to use
- `metric`: distance metric to use
- `nneighbours`: the number of neighboring points used in local approximations of manifold structure.
- `min_dist`: controls how tightly the embedding is allowed to compress points together.
- `nepochs`: number of training epochs to be used while optimizing the low dimensional embedding
- `kw`: additional parameters for the umap algorithm. See `UMAP.umap`.
Return values:
A low-dimensional embedding of the cells
Severo.umap — Method
`umap(X::AbstractMatrix, ncomponents::Int64=2; dims=:, metric=:cosine, nneighbours::Int=30, min_dist::Real=.3, nepochs::Int=300, kw...) where T`
Performs a Uniform Manifold Approximation and Projection (UMAP) dimensional reduction on the coordinates.
For a more in-depth discussion of the mathematics underlying UMAP, see the arXiv paper: https://arxiv.org/abs/1802.03426
Arguments:
- `X`: an unlabelled matrix with coordinates for each cell
- `ncomponents`: the dimensionality of the embedding
- `dims`: which dimensions to use
- `metric`: distance metric to use
- `nneighbours`: the number of neighboring points used in local approximations of manifold structure.
- `min_dist`: controls how tightly the embedding is allowed to compress points together.
- `nepochs`: number of training epochs to be used while optimizing the low dimensional embedding
- `kw`: additional parameters for the umap algorithm. See `UMAP.umap`.
Return values:
A low-dimensional embedding of the cells