De Novo#

This subpackage contains functions to annotate cells de novo (from scratch), based on marker genes.

Generally, each module in this subpackage provides a function named ai_annotate_blank (where blank stands for the task, e.g. ai_annotate_cell_type) that takes an AnnData object and internally calls ai_blank (e.g. ai_cell_type), a function that takes a gene list.

For convenience, we document both forms of each function, so you can either use an AnnData object as input or pass lists of genes directly when you need more flexibility or customized inputs.
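The relationship between the two forms can be illustrated with a minimal sketch. Everything below is hypothetical: toy_label_genes stands in for a gene-list function (it uses a small lookup table instead of an LLM), and toy_annotate_groups stands in for the AnnData-level wrapper (it takes a plain dict of marker genes per cluster rather than an AnnData object). It only shows the calling pattern, not the library's implementation.

```python
# Hypothetical stand-in for a gene-list-level function such as ai_cell_type.
# A real implementation would query an LLM; this toy uses a lookup table.
TOY_MARKERS = {
    "T cell": {"CD3D", "CD3E", "CD2"},
    "B cell": {"CD79A", "MS4A1", "CD19"},
}


def toy_label_genes(gene_list):
    """Return the label whose toy marker set overlaps most with gene_list."""
    genes = set(gene_list)
    best = max(TOY_MARKERS, key=lambda label: len(TOY_MARKERS[label] & genes))
    # Fall back to "unknown" when no toy marker is present at all.
    return best if TOY_MARKERS[best] & genes else "unknown"


def toy_annotate_groups(markers_by_group, label_func=toy_label_genes):
    """Wrapper-level form: apply the gene-list function to every group."""
    return {group: label_func(genes) for group, genes in markers_by_group.items()}


# Assumed input: top marker genes per cluster, keyed by cluster name.
markers_by_group = {
    "0": ["CD3D", "CD3E", "IL7R"],
    "1": ["MS4A1", "CD79A", "HLA-DRA"],
    "2": ["ALB", "APOA1"],
}
labels = toy_annotate_groups(markers_by_group)
print(labels)
```

Calling the gene-list form directly (here, toy_label_genes(["CD3D", "CD3E"])) is the route to take when the gene lists come from somewhere other than an AnnData object.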

Annotation by marker genes#

This module handles LLM querying to annotate cell types in the context of all other cells, based on marker genes alone.

ai_cell_type(gene_list[, tissue])

Returns the cell type based on a list of marker genes as determined by LLM.

ai_annotate_cell_type(adata, groupby, ...[, ...])

Annotate cell types based on the top marker genes for each cluster.

ai_annotate_cell_sub_type(adata, ...[, ...])

Annotate cell subtypes using an LLM.

Annotation by marker genes (in the context of the other marker genes)#

This module annotates a collection of gene sets by considering each gene set in the context of the others.

ai_cell_types_by_comparison(gene_lists[, ...])

Returns cell type labels for multiple lists of marker genes (in the context of each other) as determined by an LLM.

ai_annotate_cell_type_by_comparison(adata, ...)

Uses an LLM to annotate cell types based on their enriched genes, by considering each gene set in the context of the other sets of genes.
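To make the difference from one-list-at-a-time annotation concrete, here is a rough sketch of how the two kinds of request could be assembled. The prompt wording and both helper functions are assumptions made for illustration; this page does not state how the library actually builds its prompts.

```python
def build_single_prompts(gene_lists):
    """One independent request per gene list (hypothetical wording)."""
    return [
        f"Which cell type do these marker genes indicate? {', '.join(genes)}"
        for genes in gene_lists
    ]


def build_comparison_prompt(gene_lists):
    """One request that shows every gene list at once (hypothetical wording)."""
    lines = ["Label each gene list with a cell type, considering the other lists:"]
    for i, genes in enumerate(gene_lists):
        lines.append(f"List {i}: {', '.join(genes)}")
    return "\n".join(lines)


gene_lists = [["CD3D", "CD3E"], ["CD79A", "MS4A1"]]
single_prompts = build_single_prompts(gene_lists)
comparison_prompt = build_comparison_prompt(gene_lists)
print(comparison_prompt)
```

In the comparison form the model sees all lists in a single request, so labels for closely related clusters can be chosen relative to one another.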

Annotation based on expected cell types#

This module annotates gene lists with cell types based on a list of expected cell types. You can optionally pass tissue information. When looking for subtypes with subtype=True, you can also pass cell type information.

Based on ai_cell_types_by_comparison().

ai_from_expected_cell_types(gene_lists[, ...])

Returns cell type labels for multiple lists of marker genes.

ai_annotate_from_expected_cell_types(adata, ...)

Annotate cell types based on the top differentially expressed genes in each cluster.

Automatically calculate cell type gene module scores#

This module contains functions that calculate cell type marker gene scores automatically (i.e. you supply only the cell type, not the marker genes).

cell_type_marker_gene_score(adata[, ...])

Compute marker gene scores for specified cell types.
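As a rough illustration of the idea, the sketch below scores each cell by the mean expression of a cell type's marker genes. The scoring rule (a plain mean), the toy_marker_genes lookup, and the dict-based expression data are all assumptions; this page does not describe how the library obtains marker genes or computes its scores.

```python
from statistics import mean


def toy_marker_genes(cell_type):
    """Hypothetical stand-in for the automatic marker gene lookup."""
    table = {"T cell": ["CD3D", "CD3E"], "B cell": ["CD79A", "MS4A1"]}
    return table[cell_type]


def toy_marker_score(expression_by_cell, cell_type):
    """Score each cell as the mean expression of the markers it has data for."""
    markers = toy_marker_genes(cell_type)
    scores = {}
    for cell, expr in expression_by_cell.items():
        values = [expr[g] for g in markers if g in expr]
        # Cells with no measured marker get a score of 0.0.
        scores[cell] = mean(values) if values else 0.0
    return scores


# Assumed input: per-cell expression values keyed by gene symbol.
expression_by_cell = {
    "cell_a": {"CD3D": 4.0, "CD3E": 2.0, "MS4A1": 0.0},
    "cell_b": {"CD3D": 0.0, "CD79A": 3.0, "MS4A1": 5.0},
}
t_scores = toy_marker_score(expression_by_cell, "T cell")
b_scores = toy_marker_score(expression_by_cell, "B cell")
print(t_scores, b_scores)
```

Note that the caller supplies only the cell type name; the marker genes come from the lookup step.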

Annotate groups of cells by Biological Process#

This module annotates groups of cells with a biological process, based on the group’s enriched genes.

ai_biological_process(gene_list)

Describes the most prominent biological process represented by a list of genes using an LLM.

ai_annotate_biological_process(adata, ...[, ...])

Annotate biological processes based on the top n marker genes for each cluster.

Automatically determine (Leiden) clustering resolution#

This module contains functions to automatically determine the clustering resolution of a UMAP.

ai_determine_leiden_resolution(adata, ...)

Adjusts the Leiden clustering resolution of an AnnData object based on LLM feedback.
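A feedback-driven adjustment of this kind can be sketched as a simple loop. The feedback vocabulary ("increase", "decrease", "unchanged"), the step size, and toy_feedback (which stands in for the LLM's judgment) are all assumptions for illustration. In a real workflow each iteration would also recluster the data at the new resolution, for example with scanpy.tl.leiden and its resolution parameter, before asking for feedback again.

```python
def adjust_resolution(initial, get_feedback, step=0.25, max_iter=10):
    """Nudge the resolution up or down until the feedback says to stop."""
    resolution = initial
    for _ in range(max_iter):
        feedback = get_feedback(resolution)
        if feedback == "increase":
            resolution += step
        elif feedback == "decrease":
            # Never drop below one step, so the resolution stays positive.
            resolution = max(step, resolution - step)
        else:
            break  # "unchanged": accept the current resolution
    return resolution


def toy_feedback(resolution, target=1.0):
    """Hypothetical stand-in for LLM feedback on the current clustering."""
    if resolution < target:
        return "increase"
    if resolution > target:
        return "decrease"
    return "unchanged"


final = adjust_resolution(0.5, toy_feedback)
print(final)
```

The max_iter cap is a design choice for the sketch: it guarantees termination even if the feedback keeps oscillating between "increase" and "decrease".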

Core Functions#

This module contains core functions for de novo annotation of cells based on marker genes and LLMs. The functions in this module are called by the other annotation functions. We include them in the docs for reference, but you generally should not call them directly.

ai_annotate(func, adata, groupby, ...[, ...])

Annotate clusters based on the top marker genes for each cluster.

ai_annotate_by_comparison(func, adata, ...)

Annotate clusters based on the top marker genes for each cluster, in the context of the other clusters' marker genes.