anndict.annotate.ai_annotate_biological_process
- anndict.annotate.ai_annotate_biological_process(adata, groupby, n_top_genes, new_label_column='ai_biological_process')
Annotate biological processes based on the top n marker genes for each cluster.
This function performs differential expression analysis to identify marker genes for each group in adata.obs[groupby] and labels each group's list of genes with the biological process it represents. The results are added to adata and returned as a DataFrame.
- Parameters:
  - adata : AnnData
    An AnnData object.
  - groupby : str
    Column in adata.obs to group by for differential expression analysis.
  - n_top_genes : int
    The number of top marker genes to consider.
  - new_label_column : str (default: 'ai_biological_process')
    The name of the new column in adata.obs where the biological process annotations will be stored.
- Return type:
  DataFrame
- Returns:
  A pd.DataFrame with a column for the top marker genes for each cluster.
Notes
This function also modifies the input adata in place, adding annotations to adata.obs[new_label_column].
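The in-place behavior described in this note can be pictured with a small stand-in. The sketch below is not the library's implementation: it uses a plain dict of columns in place of adata.obs, and the helper name add_group_labels and the group-to-label mapping are hypothetical. It only shows the pattern of writing one label per cell into a new column while also returning a per-group summary.

```python
def add_group_labels(obs, groupby, group_to_label, new_label_column):
    """Write one label per cell into obs[new_label_column], in place.

    obs is a dict of equal-length column lists (a stand-in for adata.obs).
    group_to_label maps each group name to its annotation (hypothetical values).
    """
    obs[new_label_column] = [group_to_label[group] for group in obs[groupby]]
    # Also return a per-group summary so the caller gets a result object
    # in addition to the side effect on obs.
    return [{"group": g, "label": label} for g, label in group_to_label.items()]


# Three cells, two groups; labels are made up for illustration.
obs = {"treatment_vs_control": ["control", "treated", "treated"]}
group_to_label = {
    "control": "cell cycle regulation",
    "treated": "immune response",
}
summary = add_group_labels(
    obs, "treatment_vs_control", group_to_label, "ai_biological_process"
)
```

After the call, obs has gained the new column even though only summary was assigned, which is the same caller-visible effect the note describes for adata.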
Examples
import anndict as adt

# This will annotate the treatment group with biological processes based on the top 10 differentially expressed genes in each group
adt.ai_annotate_biological_process(adata, groupby='treatment_vs_control', n_top_genes=10, new_label_column='ai_biological_process')

adata.obs['ai_biological_process']
>>> ['immune response', 'cell cycle regulation', ...]
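To make the role of n_top_genes more concrete, here is a minimal sketch of picking the top n marker genes per group. This page does not state which differential expression test the function uses, so the score below (mean expression inside the group minus mean expression in all other cells) is a deliberately simple stand-in, and the helper name top_marker_genes and the toy data are hypothetical.

```python
from statistics import mean


def top_marker_genes(expression, groups, n_top_genes):
    """Rank genes per group by a simple mean-difference score.

    expression: dict mapping gene name -> list of per-cell values
    groups: list of per-cell group labels, same length as each value list
    Assumes at least two distinct groups, so "other cells" is never empty.
    """
    markers = {}
    for group in sorted(set(groups)):
        scores = {}
        for gene, values in expression.items():
            inside = [v for v, g in zip(values, groups) if g == group]
            outside = [v for v, g in zip(values, groups) if g != group]
            scores[gene] = mean(inside) - mean(outside)
        # Highest score first; keep only the n_top_genes best genes.
        ranked = sorted(scores, key=scores.get, reverse=True)
        markers[group] = ranked[:n_top_genes]
    return markers


# Toy data: four cells, two groups, three genes (all values made up).
groups = ["control", "control", "treated", "treated"]
expression = {
    "GeneA": [5.0, 6.0, 1.0, 0.0],
    "GeneB": [0.0, 1.0, 7.0, 8.0],
    "GeneC": [2.0, 2.0, 2.0, 3.0],
}
markers = top_marker_genes(expression, groups, n_top_genes=2)
```

Each group's resulting gene list is the kind of input that would then be labeled with a biological process; a larger n_top_genes passes more, and weaker, markers to that labeling step.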