anndict.annotate.train_label_classifier
- anndict.annotate.train_label_classifier(adata, label_key, feature_key, classifier_class, *, random_state=None, **kwargs)
Trains a classifier on the given adata; used internally by transfer_labels_using_classifier().
- Parameters:
  - adata (AnnData): An AnnData containing the original labels. A classifier will be trained on this adata.
  - label_key (str): Key in adata.obs containing the original labels.
  - feature_key (Union[str, Literal['use_X']]): Key of the data to use in adata.obsm, or 'use_X' to use adata.X.
  - classifier_class (Type[ClassifierMixin]): Any classifier inheriting from sklearn.base.ClassifierMixin. Pass it as a class, e.g. LogisticRegression, not as an already-instantiated object.
  - random_state (int | None, default: None): Random state seed passed to stable_label_adata().
  - **kwargs: Additional keyword arguments passed to the classifier constructor.
- Return type:
  AdataPredicoder
- Returns:
  An AdataPredicoder containing the trained classifier and label encoder/decoder.
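To illustrate what a label encoder/decoder does alongside a classifier, here is a minimal pure-Python sketch. It is not the AdataPredicoder implementation; the helper name make_label_codec and the example labels are hypothetical. It only shows the general idea of mapping string labels to integer codes for training and mapping predictions back to the original labels afterwards.

```python
def make_label_codec(labels):
    """Hypothetical helper: build encode/decode lookups for string labels."""
    # Sort the unique labels so the integer codes are deterministic.
    classes = sorted(set(labels))
    encode = {label: code for code, label in enumerate(classes)}
    decode = {code: label for label, code in encode.items()}
    return encode, decode


# Illustrative labels, standing in for a column such as adata.obs['cell_type'].
labels = ["T cell", "B cell", "T cell", "NK cell"]
encode, decode = make_label_codec(labels)

# Integer codes are what a classifier would typically be trained on.
codes = [encode[label] for label in labels]

# Decoding maps integer predictions back to the original label strings.
restored = [decode[code] for code in codes]
```

Keeping the decoder together with the trained classifier means predictions can be reported in the original label vocabulary without the caller tracking the mapping separately.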
See also
AdataPredicoder
    The container class for the classifier and label encoder/decoder.
stable_label_adata()
    The function that trains the classifier on adata.
Examples
Case 1: Using a logistic regression classifier
    import anndict as adt
    from sklearn.linear_model import LogisticRegression

    # adata is an existing AnnData with 'cell_type' in .obs and 'X_pca' in .obsm
    adt.annotate.train_label_classifier(
        adata=adata,
        label_key='cell_type',
        feature_key='X_pca',
        classifier_class=LogisticRegression,
        penalty='l2',        # one kwarg for LogisticRegression
        fit_intercept=True,  # another kwarg for LogisticRegression
    )
Case 2: Using a random forest classifier
    import anndict as adt
    from sklearn.ensemble import RandomForestClassifier

    # adata is an existing AnnData with 'cell_type' in .obs and 'X_pca' in .obsm
    adt.annotate.train_label_classifier(
        adata=adata,
        label_key='cell_type',
        feature_key='X_pca',
        classifier_class=RandomForestClassifier,
        n_estimators=1000,    # one kwarg for RandomForestClassifier
        max_features='sqrt',  # another kwarg for RandomForestClassifier
    )
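Both cases pass the classifier as a class and supply its constructor arguments as extra keyword arguments. The sketch below shows that pattern in isolation using scikit-learn only. The helper build_classifier is hypothetical and not part of anndict, and the validation it performs is an assumption for illustration; whether train_label_classifier performs similar checks is not stated here.

```python
import inspect

from sklearn.base import ClassifierMixin
from sklearn.linear_model import LogisticRegression


def build_classifier(classifier_class, **kwargs):
    """Hypothetical helper (not part of anndict): build a classifier from a class.

    Assumption: keyword arguments are forwarded unchanged to the constructor.
    """
    # Reject instances such as LogisticRegression(); only the class is accepted.
    if not inspect.isclass(classifier_class):
        raise TypeError("classifier_class must be a class, not an instance")
    # Require the scikit-learn classifier interface.
    if not issubclass(classifier_class, ClassifierMixin):
        raise TypeError("classifier_class must inherit from ClassifierMixin")
    return classifier_class(**kwargs)


# Correct: pass the class and let the helper instantiate it with the kwargs.
clf = build_classifier(LogisticRegression, C=0.5, max_iter=200)

# Incorrect: an already-instantiated object is rejected.
try:
    build_classifier(LogisticRegression(C=0.5))
    rejected_instance = False
except TypeError:
    rejected_instance = True
```

Deferring instantiation this way lets the training function own construction of the estimator, so every call starts from a fresh, unfitted classifier.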