AnnotateFeatures Annotate features in a Seurat object with additional metadata from databases or a GTF file.
Source: R/SCP-feature_annotation.R, AnnotateFeatures.Rd
Usage
AnnotateFeatures(
  srt,
  species = "Homo_sapiens",
  IDtype = c("symbol", "ensembl_id", "entrez_id"),
  db = NULL,
  db_update = FALSE,
  db_version = "latest",
  convert_species = TRUE,
  Ensembl_version = 103,
  mirror = NULL,
  gtf = NULL,
  merge_gtf_by = "gene_name",
  columns = c("seqname", "feature", "start", "end", "strand", "gene_id", "gene_name",
    "gene_type"),
  assays = "RNA",
  overwrite = FALSE
)
Arguments
- srt
Seurat object to be annotated.
- species
Name of the species to be used for annotation. Default is "Homo_sapiens".
- IDtype
Type of identifier to use for annotation. One of "symbol", "ensembl_id", or "entrez_id". Default is "symbol".
- db
Vector of database names to be used for annotation. Default is NULL.
- db_update
Logical value indicating whether to update the database. Default is FALSE.
- db_version
Version of the database to use. Default is "latest".
- convert_species
Logical value indicating whether to use a species-converted database when the annotation is missing for the specified species. Default is TRUE.
- Ensembl_version
Version of the Ensembl database to use. Default is 103.
- mirror
URL of the mirror to use for the Ensembl database. Default is NULL.
- gtf
Path to the GTF file to be used for annotation. Default is NULL.
- merge_gtf_by
Column name to merge the GTF file by. Default is "gene_name".
- columns
Vector of column names to be used from the GTF file. Default is c("seqname", "feature", "start", "end", "strand", "gene_id", "gene_name", "gene_type").
- assays
Character vector of assay names to be annotated. Default is "RNA".
- overwrite
Logical value indicating whether to overwrite existing metadata. Default is FALSE.
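To illustrate how merge_gtf_by and columns relate to each other, the following is a minimal base R sketch, not the package's actual implementation. It assumes a GTF file has already been parsed into a data frame; the table gtf_tbl, the gene names, and the Ensembl-style IDs are all made-up placeholders. Feature names are matched against the merge_gtf_by column, and only the requested columns are kept.

```r
# Toy stand-in for a parsed GTF table (hypothetical values, gene-level rows only).
gtf_tbl <- data.frame(
  seqname   = c("chr1", "chr1", "chr4"),
  feature   = "gene",
  start     = c(100L, 5000L, 900L),
  end       = c(900L, 7000L, 2500L),
  strand    = c("+", "-", "+"),
  gene_id   = c("ENSMUSG0001", "ENSMUSG0002", "ENSMUSG0003"),
  gene_name = c("GeneA", "GeneB", "GeneC"),
  gene_type = c("protein_coding", "lncRNA", "protein_coding"),
  stringsAsFactors = FALSE
)

# Feature names as they might appear in an assay; "GeneX" is absent from the GTF.
features <- c("GeneB", "GeneX", "GeneA")

# Mirrors the roles of the merge_gtf_by and columns arguments.
merge_gtf_by <- "gene_name"
columns <- c("seqname", "start", "end", "strand", "gene_id", "gene_name", "gene_type")

# Look up each feature in the merge column; unmatched features yield NA rows.
idx <- match(features, gtf_tbl[[merge_gtf_by]])
annotation <- gtf_tbl[idx, columns, drop = FALSE]
rownames(annotation) <- features

print(annotation)
```

Features without a match in the merge column (here "GeneX") end up with NA in every annotation column, so the choice of merge_gtf_by should correspond to the kind of identifier used as feature names.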
Examples
data("pancreas_sub")
pancreas_sub <- AnnotateFeatures(pancreas_sub,
  species = "Mus_musculus", IDtype = "symbol",
  db = c("Chromosome", "GeneType", "Enzyme", "TF", "CSPA", "VerSeDa")
)
#> Species: Mus_musculus
#> Preparing database: Chromosome
#> Preparing database: GeneType
#> Preparing database: Enzyme
#> Preparing database: TF
#> Preparing database: CSPA
#> Preparing database: VerSeDa
#> Connect to the Ensembl archives...
#> Using the 103 version of biomart...
#> Connecting to the biomart...
#> Searching the dataset mmusculus ...
#> Connecting to the dataset mmusculus_gene_ensembl ...
#> Converting the geneIDs...
#> Error in collect(., Inf): Failed to collect lazy table.
#> Caused by error in `db_collect()`:
#> ! Arguments in `...` must be used.
#> ✖ Problematic argument:
#> • ..1 = Inf
#> ℹ Did you misspell an argument name?
head(pancreas_sub[["RNA"]]@meta.features)
#>               highly_variable_genes
#> Mrpl15                        False
#> Npbwr1                         <NA>
#> 4732440D04Rik                 False
#> Gm26901                       False
#> Sntg1                          True
#> Mybl1                         False
## Annotate features using a GTF file
# pancreas_sub <- AnnotateFeatures(pancreas_sub, gtf = "/data/reference/CellRanger/refdata-gex-mm10-2020-A/genes/genes.gtf")
# head(pancreas_sub[["RNA"]]@meta.features)
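The commented GTF example depends on a file that exists only on the author's machine. A guarded variant might look like the sketch below; the path in gtf_path is a hypothetical placeholder, and the flag annotated is introduced here only for illustration. The call is skipped when either the file or the SCP package is unavailable.

```r
# Hypothetical path; replace with a GTF that matches your reference genome.
gtf_path <- "path/to/genes.gtf"

# Illustrative flag recording whether the annotation step actually ran.
annotated <- FALSE

if (file.exists(gtf_path) && requireNamespace("SCP", quietly = TRUE)) {
  data("pancreas_sub", package = "SCP")
  # Arguments other than gtf and merge_gtf_by are left at their documented defaults.
  pancreas_sub <- SCP::AnnotateFeatures(
    pancreas_sub,
    gtf = gtf_path,
    merge_gtf_by = "gene_name"
  )
  annotated <- TRUE
} else {
  message("Skipping GTF annotation: GTF file or SCP package not available.")
}
```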