This function calculates gene-set scores from the specified database (db) for each lineage using the specified scoring method (score_method).
It then treats these scores as expression values and uses them as input to the RunDynamicFeatures function to identify dynamically enriched terms along the lineage.
Usage
RunDynamicEnrichment(
srt,
lineages,
score_method = "AUCell",
slot = "data",
assay = NULL,
min_expcells = 20,
r.sq = 0.2,
dev.expl = 0.2,
padjust = 0.05,
IDtype = "symbol",
species = "Homo_sapiens",
db = "GO_BP",
db_update = FALSE,
db_version = "latest",
convert_species = TRUE,
Ensembl_version = 103,
mirror = NULL,
TERM2GENE = NULL,
TERM2NAME = NULL,
minGSSize = 10,
maxGSSize = 500,
BPPARAM = BiocParallel::bpparam(),
seed = 11
)
Arguments
- srt
A Seurat object that contains the lineages to analyze (in the example below, these are computed by RunSlingshot).
- lineages
A character vector specifying the lineages to analyze.
- score_method
The method to use for scoring. Can be "Seurat", "AUCell", or "UCell". Defaults to "AUCell".
- slot
A character vector specifying the slot in the Seurat object to use. Default is "data".
- assay
A character vector specifying the assay in the Seurat object to use. Default is NULL.
- min_expcells
A numeric value specifying the minimum number of cells in which a feature must be expressed. Default is 20.
- r.sq
A numeric value specifying the R-squared threshold. Default is 0.2.
- dev.expl
A numeric value specifying the deviance explained threshold. Default is 0.2.
- padjust
A numeric value specifying the adjusted p-value threshold. Default is 0.05.
- IDtype
A character vector specifying the type of gene IDs in the srt object or geneID argument. This argument is used to convert the gene IDs to a different type if IDtype is different from result_IDtype.
- species
A character vector specifying the species for which the analysis is performed.
- db
A character vector specifying the name of the database to be used for enrichment analysis.
- db_update
A logical value indicating whether the gene annotation databases should be forcefully updated. If set to FALSE, the function will attempt to load the cached databases instead. Default is FALSE.
- db_version
A character vector specifying the version of the database to be used. This argument is ignored if db_update is TRUE. Default is "latest".
- convert_species
A logical value indicating whether to use a species-converted database when the annotation is missing for the specified species. The default value is TRUE.
- Ensembl_version
Ensembl database version. If NULL, use the current release version.
- mirror
Specify an Ensembl mirror to connect to. The valid options here are 'www', 'uswest', 'useast', 'asia'.
- TERM2GENE
A data frame specifying the gene-term mapping for a custom database. The first column should contain the term IDs, and the second column should contain the gene IDs.
- TERM2NAME
A data frame specifying the term-name mapping for a custom database. The first column should contain the term IDs, and the second column should contain the corresponding term names.
- minGSSize
A numeric value specifying the minimum size of a gene set to be considered in the enrichment analysis.
- maxGSSize
A numeric value specifying the maximum size of a gene set to be considered in the enrichment analysis.
- BPPARAM
A BiocParallelParam object specifying the parallel back-end to be used for parallel computation. Defaults to BiocParallel::bpparam().
- seed
The random seed for reproducibility. Defaults to 11.
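The two-column layout described for TERM2GENE and TERM2NAME can be sketched as follows. This is a minimal illustration only: the term IDs and term names are hypothetical placeholders, and the size filter at the end merely illustrates what lower and upper gene-set size bounds (as in minGSSize and maxGSSize) mean. It is not the package's internal implementation.

```r
# Hypothetical custom database with two made-up terms (illustration only).
# First column: term IDs; second column: gene IDs.
TERM2GENE <- data.frame(
  term = c(rep("TERM:0001", 3), rep("TERM:0002", 2)),
  gene = c("Gcg", "Iapp", "Pyy", "Chga", "Chgb"),
  stringsAsFactors = FALSE
)

# First column: term IDs; second column: human-readable term names.
TERM2NAME <- data.frame(
  term = c("TERM:0001", "TERM:0002"),
  name = c("Example term A", "Example term B"),
  stringsAsFactors = FALSE
)

# Count the genes in each term, then keep terms whose size falls within
# illustrative bounds (assumed values, chosen small for this toy example).
set_sizes <- lengths(split(TERM2GENE$gene, TERM2GENE$term))
min_size <- 3
max_size <- 500
kept_terms <- names(set_sizes)[set_sizes >= min_size & set_sizes <= max_size]
print(kept_terms)
```

Data frames shaped like these could then be supplied through the TERM2GENE and TERM2NAME arguments shown in Usage. How they interact with the db argument is not stated on this page, so check that before relying on it.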
Examples
data("pancreas_sub")
pancreas_sub <- RunSlingshot(pancreas_sub, group.by = "SubCellType", reduction = "UMAP")
#> Warning: Removed 8 rows containing missing values (`geom_path()`).
#> Warning: Removed 8 rows containing missing values (`geom_path()`).
pancreas_sub <- RunDynamicFeatures(pancreas_sub, lineages = "Lineage1", n_candidates = 200)
#> [2023-11-21 07:36:51.180215] Start RunDynamicFeatures
#> Workers: 2
#> Number of candidate features(union): 199
#> Calculate dynamic features for Lineage1...
#> [2023-11-21 07:37:04.273068] RunDynamicFeatures done
#> Elapsed time:13.09 secs
ht1 <- DynamicHeatmap(
srt = pancreas_sub,
lineages = "Lineage1",
cell_annotation = "SubCellType",
n_split = 4
)
#> 146 features from Lineage1 passed the threshold (exp_ncells>20 & r.sq>0.2 & dev.expl>0.2 & padjust<0.05):
#> Gcg,Iapp,Pyy,Chgb,Slc38a5,Rbp4,Lrpprc,Cck,2810417H13Rik,Chga...
#> 'magick' package is suggested to install to give better rasterization.
#>
#> Set `ht_opt$message = FALSE` to turn off this message.
#>
#> The size of the heatmap is fixed because certain elements are not scalable.
#> The width and height of the heatmap are determined by the size of the current viewport.
#> If you want to have more control over the size, you can manually set the parameters 'width' and 'height'.
ht1$plot
pancreas_sub <- RunDynamicEnrichment(
srt = pancreas_sub,
lineages = "Lineage1",
score_method = "AUCell",
db = "GO_BP",
species = "Mus_musculus"
)
#> [2023-11-21 07:37:07.211287] Start RunDynamicFeatures
#> Workers: 2
#> Species: Mus_musculus
#> Loading cached db: GO_BP version:3.17.0 nterm:16027 created:2023-11-21 07:14:20.545084
#> Convert ID types for the database: GO_BP
#> Connect to the Ensembl archives...
#> Using the 103 version of biomart...
#> Connecting to the biomart...
#> Searching the dataset mmusculus ...
#> Connecting to the dataset mmusculus_gene_ensembl ...
#> Converting the geneIDs...
#> Error in collect(., Inf): Failed to collect lazy table.
#> Caused by error in `db_collect()`:
#> ! Arguments in `...` must be used.
#> ✖ Problematic argument:
#> • ..1 = Inf
#> ℹ Did you misspell an argument name?
ht2 <- DynamicHeatmap(
srt = pancreas_sub,
assay = "GO_BP",
lineages = "Lineage1_GO_BP",
cell_annotation = "SubCellType",
n_split = 4,
split_method = "kmeans-peaktime"
)
#> Error in DynamicHeatmap(srt = pancreas_sub, assay = "GO_BP", lineages = "Lineage1_GO_BP", cell_annotation = "SubCellType", n_split = 4, split_method = "kmeans-peaktime"): lineages: Lineage1_GO_BP is not in the meta data of the Seurat object
ht2$plot
#> Error in eval(expr, envir, enclos): object 'ht2' not found