Skip to contents

Annotate single cells using scmap.

Usage

RunScmap(
  srt_query,
  srt_ref,
  ref_group = NULL,
  query_assay = "RNA",
  ref_assay = "RNA",
  method = "scmapCluster",
  nfeatures = 500,
  threshold = 0.5,
  k = 10
)

Arguments

srt_query

An object of class Seurat to be annotated with cell types.

srt_ref

An object of class Seurat storing the reference cells.

ref_group

A character vector specifying the column name in the `srt_ref` metadata that represents the cell grouping.

query_assay

A character vector specifying the assay to be used for the query data. Defaults to the default assay of the `srt_query` object.

ref_assay

A character vector specifying the assay to be used for the reference data. Defaults to the default assay of the `srt_ref` object.

method

The method to be used for scmap analysis. Can be any of "scmapCluster" or "scmapCell". The default value is "scmapCluster".

nfeatures

The number of top features to be selected. The default value is 500.

threshold

The threshold value on similarity to determine if a cell is assigned to a cluster. This should be a value between 0 and 1. The default value is 0.5.

k

Number of clusters per group for k-means clustering when method is "scmapCell".

Examples

data("panc8_sub")
# Simply convert genes from human to mouse and preprocess the data
genenames <- make.unique(capitalize(rownames(panc8_sub), force_tolower = TRUE))
panc8_sub <- RenameFeatures(panc8_sub, newnames = genenames)
#> Rename features for the assay: RNA
panc8_sub <- check_srtMerge(panc8_sub, batch = "tech")[["srtMerge"]]
#> [2023-11-21 07:50:22.635964] Spliting srtMerge into srtList by column tech... ...
#> [2023-11-21 07:50:22.813549] Checking srtList... ...
#> Data 1/5 of the srtList is raw_normalized_counts. Perform NormalizeData(LogNormalize) on the data ...
#> Perform FindVariableFeatures on the data 1/5 of the srtList...
#> Data 2/5 of the srtList is raw_normalized_counts. Perform NormalizeData(LogNormalize) on the data ...
#> Perform FindVariableFeatures on the data 2/5 of the srtList...
#> Data 3/5 of the srtList is raw_normalized_counts. Perform NormalizeData(LogNormalize) on the data ...
#> Perform FindVariableFeatures on the data 3/5 of the srtList...
#> Data 4/5 of the srtList is raw_counts. Perform NormalizeData(LogNormalize) on the data ...
#> Perform FindVariableFeatures on the data 4/5 of the srtList...
#> Data 5/5 of the srtList is raw_counts. Perform NormalizeData(LogNormalize) on the data ...
#> Perform FindVariableFeatures on the data 5/5 of the srtList...
#> Use the separate HVF from srtList...
#> Number of available HVF: 2000
#> [2023-11-21 07:50:26.11454] Finished checking.

# Annotation
data("pancreas_sub")
pancreas_sub <- Standard_SCP(pancreas_sub)
#> [2023-11-21 07:50:27.795783] Start Standard_SCP
#> [2023-11-21 07:50:27.795974] Checking srtList... ...
#> Data 1/1 of the srtList is raw_counts. Perform NormalizeData(LogNormalize) on the data ...
#> Perform FindVariableFeatures on the data 1/1 of the srtList...
#> Use the separate HVF from srtList...
#> Number of available HVF: 2000
#> [2023-11-21 07:50:28.485316] Finished checking.
#> [2023-11-21 07:50:28.485507] Perform ScaleData on the data...
#> [2023-11-21 07:50:28.563784] Perform linear dimension reduction (pca) on the data...
#> Warning: The following arguments are not used: force.recalc
#> Warning: The following arguments are not used: force.recalc
#> [2023-11-21 07:50:29.137888] Perform FindClusters (louvain) on the data...
#> [2023-11-21 07:50:29.212299] Reorder clusters...
#> [2023-11-21 07:50:29.275877] Perform nonlinear dimension reduction (umap) on the data...
#> Non-linear dimensionality reduction(umap) using Reduction(Standardpca, dims:1-13) as input
#> Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
#> Also defined by ‘spam’
#> Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
#> Also defined by ‘spam’
#> Non-linear dimensionality reduction(umap) using Reduction(Standardpca, dims:1-13) as input
#> Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
#> Also defined by ‘spam’
#> Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
#> Also defined by ‘spam’
#> [2023-11-21 07:50:37.419439] Standard_SCP done
#> Elapsed time: 9.62 secs 
pancreas_sub <- RunScmap(
  srt_query = pancreas_sub, srt_ref = panc8_sub,
  ref_group = "celltype", method = "scmapCluster"
)
#> Detected srt_query data type: log_normalized_counts
#> Detected srt_ref data type: log_normalized_counts
#> Perform selectFeatures on the data...
#> Perform indexCluster on the data...
#> Perform scmapCluster on the data...
#> Warning: Features Adcyap1, Cfc1, Col6a2, Col6a3, Crp, Ctrc, Gsta1, Lcn2, Loxl4, Mafa, Mt-atp6, Mt-co1, Mt-co2, Mt-co3, Mt-nd1, Mt-nd2, Mt-nd4, Mt-nd4l, Mt-nd5, Muc13, Serpina5, Tacstd2 are not present in the 'SCESet' object and therefore were not set.
CellDimPlot(pancreas_sub, group.by = "scmap_annotation")


pancreas_sub <- RunScmap(
  srt_query = pancreas_sub, srt_ref = panc8_sub,
  ref_group = "celltype", method = "scmapCell"
)
#> Detected srt_query data type: log_normalized_counts
#> Detected srt_ref data type: log_normalized_counts
#> Perform selectFeatures on the data...
#> Perform indexCell on the data...
#> Perform scmapCell on the data...
#> Perform scmapCell2Cluster on the data...
CellDimPlot(pancreas_sub, group.by = "scmap_annotation")