anndict.annotate.transfer_labels_using_classifier
anndict.annotate.transfer_labels_using_classifier(origin_adata, destination_adata, origin_label_key, feature_key, classifier_class, new_column_name='predicted_label', random_state=None, **kwargs)
Transfers labels from origin_adata to destination_adata using a classifier of type classifier_class. Supported classifiers include any sklearn classifier inheriting from sklearn.base.ClassifierMixin.
- Parameters:
  - origin_adata (AnnData): An AnnData containing the original labels. A classifier will be trained on this adata.
  - destination_adata (AnnData): An AnnData containing the new cells to be labeled. Must contain the same .obsm[feature_key] as origin_adata if feature_key is not 'use_X'.
  - origin_label_key (str): Key in origin_adata.obs containing the original labels.
  - feature_key (Union[str, Literal['use_X']]): Key of data to use in origin_adata.obsm, or 'use_X' to use origin_adata.X.
  - classifier_class (Type[ClassifierMixin]): Any classifier inheriting from sklearn.base.ClassifierMixin. Pass as a class, e.g. LogisticRegression, and not an already-instantiated object.
  - new_column_name (str, default: 'predicted_label'): The name of the new column in destination_adata.obs where the predicted labels will be stored.
  - random_state (int | None, default: None): Random state seed passed to stable_label_adata().
  - **kwargs: Additional keyword arguments passed to the classifier constructor.
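The classifier_class and **kwargs parameters suggest that the classifier is constructed inside the function from the class and the extra keyword arguments. A minimal sketch of that pattern follows. The helper name build_classifier is hypothetical and is not part of anndict; it only illustrates why a class, not an instance, is expected.

```python
from sklearn.base import ClassifierMixin
from sklearn.linear_model import LogisticRegression


def build_classifier(classifier_class, **kwargs):
    """Hypothetical helper (not anndict API): instantiate a classifier class with kwargs."""
    # An already-instantiated object is not a class, so it is rejected here.
    if not (isinstance(classifier_class, type)
            and issubclass(classifier_class, ClassifierMixin)):
        raise TypeError("classifier_class must be a class inheriting from ClassifierMixin")
    # Extra keyword arguments go straight to the constructor.
    return classifier_class(**kwargs)


# Pass the class itself plus constructor arguments.
clf = build_classifier(LogisticRegression, C=0.5, fit_intercept=True)
```

Passing LogisticRegression() (an instance) to this sketch raises TypeError, which mirrors the requirement stated for classifier_class.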
- Return type: AdataPredicoder
- Returns: An AdataPredicoder that contains the trained classifier and automatically decodes predicted labels into text labels. It can be used to calculate class membership probabilities or to predict on other AnnData objects.
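To make the idea of a "classifier plus label encoder/decoder" container concrete, here is a small sketch built only on scikit-learn. The class TinyPredicoder is a hypothetical stand-in and does not reproduce the real AdataPredicoder interface, whose methods are not described here; the toy data is made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder


class TinyPredicoder:
    """Hypothetical stand-in, NOT the real AdataPredicoder: classifier + label encoder."""

    def __init__(self, classifier, label_encoder):
        self.classifier = classifier
        self.label_encoder = label_encoder

    def predict(self, features):
        # Predict integer codes, then decode them back into text labels.
        codes = self.classifier.predict(features)
        return self.label_encoder.inverse_transform(codes)

    def predict_proba(self, features):
        # Columns follow the order of label_encoder.classes_.
        return self.classifier.predict_proba(features)


# Toy, well-separated training data (illustrative values only).
train_x = np.array([[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]])
train_labels = np.array(["B cell", "B cell", "T cell", "T cell"])

encoder = LabelEncoder()
codes = encoder.fit_transform(train_labels)  # text labels -> integer codes
model = TinyPredicoder(LogisticRegression().fit(train_x, codes), encoder)

new_x = np.array([[0.1, 0.1], [5.1, 5.0]])
predicted = model.predict(new_x)              # decoded text labels
probabilities = model.predict_proba(new_x)    # one row per cell, one column per class
```

Keeping the encoder next to the classifier means callers never handle integer codes directly, which is the convenience the Returns description points to.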
Notes
Modifies destination_adata in-place.
See also
- AdataPredicoder: The container class for classifier+label encoder/decoder.
- train_label_classifier(): The function that trains the classifier on origin_adata.
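The overall workflow (train on the origin data, predict on the destination data, store the result in a new column in-place) can be sketched without anndict. This is an illustration under stated assumptions, not anndict's implementation: plain dictionaries stand in for the .obs and .obsm attributes of AnnData, and transfer_labels_sketch is a hypothetical name.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def transfer_labels_sketch(origin, destination, label_key, feature_key,
                           classifier_class, new_column_name="predicted_label",
                           **kwargs):
    """Illustrative only (not anndict's code): train on origin, label destination."""
    classifier = classifier_class(**kwargs)
    classifier.fit(origin["obsm"][feature_key], origin["obs"][label_key])
    # Write predictions into the destination object itself (in-place modification).
    destination["obs"][new_column_name] = classifier.predict(
        destination["obsm"][feature_key]
    )
    return classifier


# Dict-based stand-ins for AnnData objects (assumed layout, toy values).
origin = {
    "obs": {"cell_type": np.array(["B cell", "B cell", "T cell", "T cell"])},
    "obsm": {"X_pca": np.array([[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]])},
}
destination = {
    "obs": {},
    "obsm": {"X_pca": np.array([[0.1, 0.1], [5.1, 5.0]])},
}

transfer_labels_sketch(
    origin, destination,
    label_key="cell_type",
    feature_key="X_pca",
    classifier_class=KNeighborsClassifier,
    n_neighbors=1,  # forwarded to the KNeighborsClassifier constructor
)
```

After the call, destination["obs"]["predicted_label"] holds the transferred labels, while origin is left unchanged.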
Examples
Case 1: Using a logistic regression classifier
from anndict.annotate import transfer_labels_using_classifier
from sklearn.linear_model import LogisticRegression

# origin_adata and destination_adata are AnnData objects that already exist
transfer_labels_using_classifier(
    origin_adata=origin_adata,
    destination_adata=destination_adata,
    origin_label_key='cell_type',
    feature_key='X_pca',
    classifier_class=LogisticRegression,
    new_column_name='predicted_label',
    penalty='l2',  # one kwarg for LogisticRegression
    fit_intercept=True,  # another kwarg for LogisticRegression
)
Case 2: Using a random forest classifier
from anndict.annotate import transfer_labels_using_classifier
from sklearn.ensemble import RandomForestClassifier

# origin_adata and destination_adata are AnnData objects that already exist
transfer_labels_using_classifier(
    origin_adata=origin_adata,
    destination_adata=destination_adata,
    origin_label_key='cell_type',
    feature_key='X_pca',
    classifier_class=RandomForestClassifier,
    new_column_name='predicted_label',
    n_estimators=1000,  # one kwarg for RandomForestClassifier
    max_features='sqrt',  # another kwarg for RandomForestClassifier
)