anndict.annotate.train_label_classifier

anndict.annotate.train_label_classifier(adata, label_key, feature_key, classifier_class, *, random_state=None, **kwargs)

Trains a classifier on the given adata; used internally by transfer_labels_using_classifier().

Parameters:

adata AnnData: An AnnData containing the original labels. A classifier will be trained on this adata.
label_key str: Key in adata.obs containing the original labels.
feature_key Union[str, Literal['use_X']]: Key of data to use in adata.obsm, or 'use_X' to use adata.X.
classifier_class Type[ClassifierMixin]: Any classifier inheriting from sklearn.base.ClassifierMixin. Pass as a class, e.g. LogisticRegression, and not an already-instantiated object.
random_state int | None (default: None): Random seed passed to stable_label_adata().
**kwargs: Additional keyword arguments passed to the classifier constructor.
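
The calling convention described by these parameters (a feature source chosen by feature_key, plus a classifier class and its constructor kwargs) can be illustrated with a minimal, standard-library-only sketch. The helpers select_features and build_classifier, the ToyClassifier class, and the SimpleNamespace stand-in for an AnnData are all hypothetical and not part of anndict; the real function's internals may differ.

```python
from types import SimpleNamespace


class ToyClassifier:
    """Hypothetical stand-in for an sklearn-style classifier class."""

    def __init__(self, penalty="l2", fit_intercept=True):
        self.penalty = penalty
        self.fit_intercept = fit_intercept


def select_features(adata, feature_key):
    """Return adata.X for 'use_X', otherwise the matching entry of adata.obsm."""
    if feature_key == "use_X":
        return adata.X
    return adata.obsm[feature_key]


def build_classifier(classifier_class, **kwargs):
    """Instantiate the classifier from its class, forwarding constructor kwargs."""
    if not isinstance(classifier_class, type):
        raise TypeError("pass the classifier class itself, not an instance")
    return classifier_class(**kwargs)


# Stand-in for an AnnData: only the attributes this sketch touches.
adata = SimpleNamespace(
    X=[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    obsm={"X_pca": [[0.1, 0.2], [0.3, 0.4]]},
)

features = select_features(adata, "X_pca")            # embedding stored in obsm
clf = build_classifier(ToyClassifier, penalty="l1")   # a class plus kwargs, not an instance
```

Taking a class rather than an instance lets the caller supply constructor arguments through **kwargs, so the classifier is built fresh inside the function.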

Return type:

AdataPredicoder

Returns:

An AdataPredicoder containing the trained classifier and label encoder/decoder.

See also

AdataPredicoder: The container class for classifier+label encoder/decoder.
stable_label_adata(): The function that trains the classifier on adata.

Examples

Case 1: Using a logistic regression classifier

import anndict as adt
from sklearn.linear_model import LogisticRegression

adt.annotate.train_label_classifier(
    adata=adata,
    label_key='cell_type',
    feature_key='X_pca',
    classifier_class=LogisticRegression,
    penalty='l2',  # one kwarg for LogisticRegression
    fit_intercept=True,  # another kwarg for LogisticRegression
)

Case 2: Using a random forest classifier

import anndict as adt
from sklearn.ensemble import RandomForestClassifier

adt.annotate.train_label_classifier(
    adata=adata,
    label_key='cell_type',
    feature_key='X_pca',
    classifier_class=RandomForestClassifier,
    n_estimators=1000,  # one kwarg for RandomForestClassifier
    max_features='sqrt',  # another kwarg for RandomForestClassifier
)
