anndict.annotate.cells.de_novo.ai_annotate
anndict.annotate.cells.de_novo.ai_annotate(func, adata, groupby, n_top_genes, new_label_column, tissue_of_origin_col=None, **kwargs)
Annotate clusters based on the top marker genes for each cluster.
For each cluster, func is applied to the top n_top_genes marker genes to determine the label for that cluster. The resulting annotations are added to the AnnData object, and a DataFrame is returned.
If rank_genes_groups has not been run on adata, this function runs sc.tl.rank_genes_groups automatically.
Parameters:
- func (callable): A function that takes gene_list: list[str] and returns annotation: str.
- adata (AnnData): An AnnData object.
- groupby (str): Column in adata.obs to group by for differential expression analysis.
- n_top_genes (int): The number of top marker genes to consider for each cluster.
- new_label_column (str): The name of the new column in adata.obs where the annotations will be stored.
- tissue_of_origin_col (str, default: None): Name of a column in adata.obs that contains the tissue of origin. Used to provide context to the LLM.
- **kwargs: Additional keyword arguments passed to func.
Return type: DataFrame
Returns: A pd.DataFrame with a column for the top marker genes for each cluster.
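To illustrate the func contract described under Parameters (a list of gene names in, a single label string out), here is a minimal sketch of a rule-based callable. It is not part of anndict: the MARKERS table, the name label_from_markers, and the min_overlap option are all hypothetical, and the catch-all **kwargs is an assumption made so the callable tolerates any extra keyword arguments that may be forwarded to it.

```python
from typing import Any

# Hypothetical marker table for illustration only; not part of anndict.
MARKERS: dict[str, set[str]] = {
    "T cell": {"CD3D", "CD3E", "CD2"},
    "B cell": {"CD79A", "MS4A1", "CD19"},
    "Monocyte": {"LYZ", "CD14", "S100A8"},
}


def label_from_markers(gene_list: list[str], min_overlap: int = 1, **kwargs: Any) -> str:
    """Return the cell type whose marker set overlaps gene_list the most.

    Extra keyword arguments are accepted and ignored (an assumption, so the
    callable does not fail if additional context is passed along).
    """
    genes = {g.upper() for g in gene_list}
    best_label, best_hits = "Unknown", 0
    for label, markers in MARKERS.items():
        hits = len(genes & markers)
        if hits > best_hits:  # strict '>' keeps the first label on ties
            best_label, best_hits = label, hits
    return best_label if best_hits >= min_overlap else "Unknown"


print(label_from_markers(["CD79A", "MS4A1", "ACTB"]))  # prints "B cell"
```

Going by the signature above, a callable like this would be passed as the first argument, for example ai_annotate(label_from_markers, adata, groupby, n_top_genes, new_label_column, min_overlap=2), with min_overlap reaching the callable through **kwargs.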
Notes
This function also modifies the input adata in place, adding annotations to adata.obs[new_label_column].
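The two effects described here, a new per-cell label column plus a returned per-cluster table, can be pictured with plain pandas. The sketch below is a stand-in and not the library's implementation: obs plays the role of adata.obs, the top_genes dictionary and toy_label function are made up for illustration, and the column names of the summary table are assumptions.

```python
import pandas as pd

# Stand-in for adata.obs: one row per cell, with a cluster column.
obs = pd.DataFrame(
    {"leiden": ["0", "1", "0", "2", "1"]},
    index=[f"cell_{i}" for i in range(5)],
)

# Hypothetical top marker genes per cluster (made up for illustration).
top_genes = {
    "0": ["CD3D", "CD3E"],
    "1": ["CD79A", "MS4A1"],
    "2": ["LYZ", "CD14"],
}


def toy_label(gene_list: list[str]) -> str:
    """Toy labelling function: name the cluster after its first marker gene."""
    return f"{gene_list[0]}+ cells"


# One label per cluster, obtained by applying the function to that cluster's genes.
cluster_to_label = {cluster: toy_label(genes) for cluster, genes in top_genes.items()}

# Analogue of the in-place step: every cell receives the label of its cluster.
obs["cell_type"] = obs["leiden"].map(cluster_to_label)

# Analogue of the returned table: one row per cluster with its top marker genes.
summary = pd.DataFrame(
    {"leiden": list(top_genes), "top_marker_genes": list(top_genes.values())}
)

print(obs)
print(summary)
```

The point of the sketch is that labels are assigned per cluster and then broadcast to cells, so every cell in a cluster shares the same annotation.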