anndict.annotate.transfer_labels_using_classifier

anndict.annotate.transfer_labels_using_classifier(origin_adata, destination_adata, origin_label_key, feature_key, classifier_class, new_column_name='predicted_label', random_state=None, **kwargs)

Transfers labels from origin_adata to destination_adata using a classifier of type classifier_class.

Supported classifiers include any sklearn classifier inheriting from sklearn.base.ClassifierMixin.

Parameters:

origin_adata AnnData: An AnnData containing the original labels. A classifier will be trained on this adata.
destination_adata AnnData: An AnnData containing the new cells to be labeled. If feature_key is not 'use_X', destination_adata.obsm[feature_key] must hold the same feature representation (same key, same columns) as origin_adata.obsm[feature_key].
origin_label_key str: Key in origin_adata.obs containing the original labels.
feature_key Union[str, Literal['use_X']]: Key of data to use in origin_adata.obsm, or 'use_X' to use origin_adata.X.
classifier_class Type[ClassifierMixin]: Any classifier inheriting from sklearn.base.ClassifierMixin. Pass as a class, e.g. LogisticRegression, and not an already-instantiated object.
new_column_name str (default: 'predicted_label'): The name of the new column in destination_adata.obs where the predicted labels will be stored.
random_state int | None (default: None): Random seed passed to stable_label_adata().
**kwargs: Additional keyword arguments passed to the classifier constructor.
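
The class-versus-instance requirement on classifier_class, together with the forwarding of **kwargs to the constructor, can be illustrated with a minimal sketch. The helper build_classifier below is hypothetical and is not part of anndict; it only mirrors the contract described in the parameter list and assumes scikit-learn is installed.

```python
import inspect

from sklearn.base import ClassifierMixin
from sklearn.linear_model import LogisticRegression


def build_classifier(classifier_class, **kwargs):
    """Hypothetical helper (not anndict's code): validate and instantiate a classifier class."""
    # Reject already-instantiated objects such as LogisticRegression()
    if not inspect.isclass(classifier_class):
        raise TypeError("classifier_class must be a class, not an instance")
    # Require a scikit-learn classifier
    if not issubclass(classifier_class, ClassifierMixin):
        raise TypeError("classifier_class must inherit from sklearn.base.ClassifierMixin")
    # Remaining keyword arguments go straight to the constructor
    return classifier_class(**kwargs)


# Correct: pass the class itself, plus constructor keyword arguments
clf = build_classifier(LogisticRegression, fit_intercept=True)

# Incorrect: passing an instance raises TypeError
try:
    build_classifier(LogisticRegression())
    instance_rejected = False
except TypeError:
    instance_rejected = True
```

Passing the class rather than an instance lets the caller's keyword arguments configure a fresh, unfitted estimator on every call.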

Return type:

AdataPredicoder

Returns:

An AdataPredicoder that contains the trained classifier and automatically decodes predicted labels into text labels. It can be used to calculate class membership probabilities or to predict on other AnnData objects.
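
The idea of pairing a classifier with a label encoder/decoder can be sketched with plain scikit-learn. This is an illustration under stated assumptions, not anndict's implementation and not the AdataPredicoder API: the synthetic arrays stand in for origin_adata.obsm['X_pca'], origin_adata.obs['cell_type'], and destination_adata.obsm['X_pca'], and all variable names are made up for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

rng = np.random.default_rng(0)

# Synthetic stand-ins for the origin features and text labels (assumption)
origin_features = np.vstack([
    rng.normal(loc=-3.0, scale=0.5, size=(50, 5)),
    rng.normal(loc=3.0, scale=0.5, size=(50, 5)),
])
origin_labels = np.array(["T cell"] * 50 + ["B cell"] * 50)

# Synthetic stand-in for the destination features (assumption)
destination_features = np.vstack([
    rng.normal(loc=-3.0, scale=0.5, size=(10, 5)),
    rng.normal(loc=3.0, scale=0.5, size=(10, 5)),
])

# Encode text labels as integers before training
encoder = LabelEncoder()
encoded_labels = encoder.fit_transform(origin_labels)

# Train on the origin data
classifier = LogisticRegression(max_iter=1000)
classifier.fit(origin_features, encoded_labels)

# Predict on the destination data, then decode integers back into text labels
predicted_labels = encoder.inverse_transform(classifier.predict(destination_features))

# Class membership probabilities; columns follow encoder.classes_
probabilities = classifier.predict_proba(destination_features)
```

In this sketch, predicted_labels plays the role of the column written to destination_adata.obs[new_column_name], and probabilities has one column per class in encoder.classes_.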

Notes

Modifies destination_adata in-place.

See also

AdataPredicoder: The container class for classifier+label encoder/decoder.
train_label_classifier(): The function that trains the classifier on origin_adata.

Examples

Case 1: Using a logistic regression classifier

import anndict as adt
from sklearn.linear_model import LogisticRegression

adt.annotate.transfer_labels_using_classifier(
    origin_adata=origin_adata,
    destination_adata=destination_adata,
    origin_label_key='cell_type',
    feature_key='X_pca',
    classifier_class=LogisticRegression,
    new_column_name='predicted_label',
    penalty='l2', #one kwarg for LogisticRegression
    fit_intercept=True, #another kwarg for LogisticRegression
)

Case 2: Using a random forest classifier

import anndict as adt
from sklearn.ensemble import RandomForestClassifier

adt.annotate.transfer_labels_using_classifier(
    origin_adata=origin_adata,
    destination_adata=destination_adata,
    origin_label_key='cell_type',
    feature_key='X_pca',
    classifier_class=RandomForestClassifier,
    new_column_name='predicted_label',
    n_estimators=1000, #one kwarg for RandomForestClassifier
    max_features='sqrt', #another kwarg for RandomForestClassifier
)
