anndict.annotate.transfer_labels_using_classifier
anndict.annotate.transfer_labels_using_classifier(origin_adata, destination_adata, origin_label_key, feature_key, classifier_class, new_column_name='predicted_label', random_state=None, **kwargs)
Transfers labels from origin_adata to destination_adata using a classifier of type classifier_class.
Supported classifiers include any sklearn classifier inheriting from sklearn.base.ClassifierMixin.
- Parameters:
- origin_adata (AnnData): An AnnData containing the original labels. A classifier will be trained on this adata.
- destination_adata (AnnData): An AnnData containing the new cells to be labeled. Must contain the same .obsm[feature_key] as origin_adata if feature_key is not 'use_X'.
- origin_label_key (str): Key in origin_adata.obs containing the original labels.
- feature_key (Union[str, Literal['use_X']]): Key of data to use in origin_adata.obsm, or 'use_X' to use origin_adata.X.
- classifier_class (Type[ClassifierMixin]): Any classifier inheriting from sklearn.base.ClassifierMixin. Pass as a class, e.g. LogisticRegression, and not an already-instantiated object.
- new_column_name (str, default: 'predicted_label'): The name of the new column in destination_adata.obs where the predicted labels will be stored.
- random_state (int | None, default: None): Random-state seed passed to stable_label_adata().
- **kwargs: Additional keyword arguments passed to the classifier constructor.
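The classifier_class and **kwargs parameters follow a common pattern: the caller passes the class itself and its constructor arguments separately, and the callee builds the instance. The sketch below illustrates that pattern with scikit-learn only. build_classifier is a hypothetical helper written for this illustration; it is not part of anndict and does not claim to reflect how anndict validates its arguments.

```python
from sklearn.base import ClassifierMixin
from sklearn.linear_model import LogisticRegression


def build_classifier(classifier_class, **kwargs):
    """Hypothetical helper: instantiate a scikit-learn classifier from its class."""
    # Reject instances such as LogisticRegression(); a class is expected.
    if not isinstance(classifier_class, type):
        raise TypeError("pass the classifier class, not an instance")
    # Reject classes that are not scikit-learn classifiers.
    if not issubclass(classifier_class, ClassifierMixin):
        raise TypeError("classifier_class must inherit from ClassifierMixin")
    # Extra keyword arguments go straight to the constructor.
    return classifier_class(**kwargs)


# Pass the class (no parentheses) and the constructor arguments as keywords.
clf = build_classifier(LogisticRegression, fit_intercept=True, max_iter=200)
```

Passing the class rather than an instance lets the callee construct a fresh, unfitted estimator each time it trains.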
- Return type: AdataPredicoder
- Returns: An AdataPredicoder that contains the trained classifier and automatically decodes predicted labels into text labels. It can be used to calculate class membership probabilities or to predict on other AnnData objects.
Notes
Modifies destination_adata in place.
See also
- AdataPredicoder: The container class for classifier+label encoder/decoder.
- train_label_classifier(): The function that trains the classifier on origin_adata.
Examples
Case 1: Using a logistic regression classifier
    import anndict as adt
    from sklearn.linear_model import LogisticRegression

    adt.annotate.transfer_labels_using_classifier(
        origin_adata=origin_adata,
        destination_adata=destination_adata,
        origin_label_key='cell_type',
        feature_key='X_pca',
        classifier_class=LogisticRegression,
        new_column_name='predicted_label',
        penalty='l2',        # one kwarg for LogisticRegression
        fit_intercept=True,  # another kwarg for LogisticRegression
    )
Case 2: Using a random forest classifier
    import anndict as adt
    from sklearn.ensemble import RandomForestClassifier

    adt.annotate.transfer_labels_using_classifier(
        origin_adata=origin_adata,
        destination_adata=destination_adata,
        origin_label_key='cell_type',
        feature_key='X_pca',
        classifier_class=RandomForestClassifier,
        new_column_name='predicted_label',
        n_estimators=1000,    # one kwarg for RandomForestClassifier
        max_features='sqrt',  # another kwarg for RandomForestClassifier
    )
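Conceptual sketch: label transfer without AnnData

To show the idea behind label transfer independently of AnnData, the following is a minimal sketch using plain NumPy arrays and scikit-learn. It is not anndict's implementation: transfer_labels_sketch and the toy data are assumptions made for illustration. It encodes the text labels as integers, fits a classifier on the origin features, predicts on the destination features, and decodes the predictions back into text labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder


def transfer_labels_sketch(origin_features, origin_labels,
                           destination_features, classifier_class, **kwargs):
    """Hypothetical stand-in for the train / predict / decode steps."""
    # Encode text labels as integers for training.
    encoder = LabelEncoder()
    encoded_labels = encoder.fit_transform(origin_labels)
    # Build the classifier from its class, forwarding constructor kwargs.
    classifier = classifier_class(**kwargs)
    classifier.fit(origin_features, encoded_labels)
    # Predict integer classes, then decode them back into text labels.
    predicted = classifier.predict(destination_features)
    return classifier, encoder, encoder.inverse_transform(predicted)


# Toy data (an assumption): two well-separated clusters in a 2-D feature space.
rng = np.random.default_rng(0)
origin_features = np.vstack([
    rng.normal(loc=-5.0, scale=0.5, size=(20, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(20, 2)),
])
origin_labels = np.array(["B cell"] * 20 + ["T cell"] * 20)
destination_features = np.array([[-5.0, -5.0], [5.0, 5.0]])

classifier, encoder, predicted_labels = transfer_labels_sketch(
    origin_features, origin_labels, destination_features,
    LogisticRegression, max_iter=200,
)
```

Returning the fitted classifier together with the encoder mirrors the role described above for AdataPredicoder: the pair can be reused to decode predictions on further data.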