anndict.ai_unify_labels
- anndict.ai_unify_labels(adata_dict, label_columns, new_label_column, simplification_level='unified, typo-fixed')
Unifies cell type labels across multiple AnnData objects by mapping them to a simplified, unified set of labels.
- Parameters:
  - adata_dict (AdataDict): An AdataDict.
  - label_columns (dict[tuple[str, ...], str]): A dict whose keys should be the same as the keys of adata_dict and whose values are the column names in the corresponding adata.obs containing the original labels.
  - new_label_column (str): Name of the new column to be created in each adata.obs for storing the unified labels.
  - simplification_level (str, default: 'unified, typo-fixed'): Instructions on how to unify the labels.
- Return type: dict
- Returns: A mapping dict where the keys are the original labels and the values are the unified labels.
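To make the shape of the returned mapping concrete, here is a minimal sketch in plain Python. It does not call anndict or an LLM; the labels, the dataset keys, and the contents of `mapping_dict` are made-up illustrations, and leaving unmapped labels unchanged is a choice made for this sketch rather than documented behavior of the function.

```python
# Hypothetical original labels from two datasets (stand-ins for adata.obs columns)
original_labels = {
    ("adata1",): ["T cell", "t-cell", "B cell"],
    ("adata2",): ["B cel", "T cells"],
}

# Hypothetical mapping of the documented shape: original label -> unified label
mapping_dict = {
    "T cell": "T cell",
    "t-cell": "T cell",
    "T cells": "T cell",
    "B cell": "B cell",
    "B cel": "B cell",
}

# Collect any original labels that the mapping does not cover
unmapped = {
    label
    for labels in original_labels.values()
    for label in labels
    if label not in mapping_dict
}

# Apply the mapping; in this sketch, unmapped labels are left unchanged
unified = {
    key: [mapping_dict.get(label, label) for label in labels]
    for key, labels in original_labels.items()
}
```

Checking `unmapped` before relying on the unified labels is a cheap way to notice original labels that were not covered by the mapping.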
Notes
Modifies each adata in adata_dict in-place by adding adata.obs[new_label_column] with the unified label mapping.
Example
# import the package
import anndict as adt

# configure the LLM backend
adt.configure_llm_backend('your-provider-name', 'your-provider-model-name', api_key='your-provider-api-key')

# load the data as an AdataDict
adata_paths = ['path/to/adata1.h5ad', 'path/to/adata2.h5ad']
adata_dict = adt.read_adata_dict_from_h5ad(adata_paths)

# define the label columns for each adata
label_columns = {
    ('adata1',): 'cell_type',
    ('adata2',): 'cell_type_label',  # this could be different in each adata
}
new_label_column = 'unified_cell_type'

# unify the labels across the adata
mapping_dict = adt.ai_unify_labels(
    adata_dict,
    label_columns=label_columns,
    new_label_column=new_label_column,
    simplification_level="unified, typo-fixed"
)

# Now each adata in adata_dict has a new column 'unified_cell_type'

# Write the adata_dict to disk (an adata_dict on disk is just a directory containing .h5ad files)
adt.write_adata_dict(adata_dict, 'path/to/unified_adata_dict/')
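After unification, it can be useful to confirm that the datasets now share a common label vocabulary. The following is a rough sketch of such a check in plain Python. The per-dataset label lists are hypothetical stand-ins for the values of adata.obs['unified_cell_type'] in each adata, and none of this is part of the anndict API.

```python
from collections import Counter

# Hypothetical unified labels per dataset (keys mirror the example above)
unified_labels = {
    ("adata1",): ["T cell", "T cell", "B cell"],
    ("adata2",): ["B cell", "NK cell", "T cell", "B cell"],
}

# Count cells per unified label in each dataset
counts = {key: Counter(labels) for key, labels in unified_labels.items()}

# Labels present in every dataset versus labels found in only some of them
label_sets = [set(labels) for labels in unified_labels.values()]
shared = set.intersection(*label_sets)
partial = set.union(*label_sets) - shared
```

Labels that end up in `partial` are not necessarily errors, since a cell type may simply be absent from one dataset, but they are a reasonable place to start when reviewing the result.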