anndict.map_gene_labels_to_simplified_set

anndict.map_gene_labels_to_simplified_set(labels, simplification_level='', batch_size=50)

Maps a list of gene labels to a simplified set of labels using an LLM, processing the list in batches of batch_size.

Parameters:
labels list[str]

The list of labels to be mapped.

simplification_level str (default: '')

A qualitative description of how much to simplify the labels, for example 'functional category level'.

batch_size int (default: 50)

The number of labels to process in each batch.

Return type:

dict

Returns:

A dict mapping each original label to its simplified label.
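
To illustrate what batch_size controls, the sketch below splits a label list into consecutive batches. The helper chunk_labels is hypothetical and is not part of anndict; this page does not show how the library batches internally, so treat this as an assumption about the general idea only.

```python
def chunk_labels(labels, batch_size=50):
    """Split labels into consecutive batches (hypothetical helper, not anndict API)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    # Slice the list in steps of batch_size; the last batch may be shorter.
    return [labels[i:i + batch_size] for i in range(0, len(labels), batch_size)]


gene_labels = ['HSP90AA1', 'HSPA1A', 'HSPA1B', 'CLOCK',
               'ARNTL', 'PER1', 'IL1A', 'IL6']

# With batch_size=3, eight labels are split into batches of 3, 3, and 2.
batches = chunk_labels(gene_labels, batch_size=3)
print(batches)
```

Under this assumption, a smaller batch_size means more, shorter requests, and a larger one means fewer, longer requests.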

Example

import anndict as adt

gene_labels = ['HSP90AA1',
               'HSPA1A',
               'HSPA1B',
               'CLOCK',
               'ARNTL',
               'PER1',
               'IL1A',
               'IL6'
               ]

label_mapping = adt.map_gene_labels_to_simplified_set(
    gene_labels,
    simplification_level='functional category level',
)

print(label_mapping)

> {
>     'HSP90AA1': 'Heat Shock Protein',
>     'HSPA1A': 'Heat Shock Protein',
>     'HSPA1B': 'Heat Shock Protein',
>     'CLOCK': 'Circadian Rhythm',
>     'ARNTL': 'Circadian Rhythm',
>     'PER1': 'Circadian Rhythm',
>     'IL1A': 'Interleukin',
>     'IL6': 'Interleukin'
> }
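
Because the return value is a plain dict, it can be inverted to list the genes under each simplified label. The sketch below makes no LLM call: it reuses the example output shown above as a literal, and the variable name genes_by_label is only illustrative. The labels an LLM returns in practice may be worded differently.

```python
from collections import defaultdict

# Example output copied from above; a real call may return different label text.
label_mapping = {
    'HSP90AA1': 'Heat Shock Protein',
    'HSPA1A': 'Heat Shock Protein',
    'HSPA1B': 'Heat Shock Protein',
    'CLOCK': 'Circadian Rhythm',
    'ARNTL': 'Circadian Rhythm',
    'PER1': 'Circadian Rhythm',
    'IL1A': 'Interleukin',
    'IL6': 'Interleukin',
}

# Invert the mapping: simplified label -> list of original gene labels.
genes_by_label = defaultdict(list)
for gene, label in label_mapping.items():
    genes_by_label[label].append(gene)

for label, genes in genes_by_label.items():
    print(f"{label}: {', '.join(genes)}")
```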