anndict.annotate.train_label_classifier
anndict.annotate.train_label_classifier(adata, label_key, feature_key, classifier_class, *, random_state=None, **kwargs)
Trains a classifier on the given adata; used internally by transfer_labels_using_classifier().

Parameters:
- adata (AnnData): An AnnData containing the original labels. A classifier will be trained on this adata.
- label_key (str): Key in adata.obs containing the original labels.
- feature_key (Union[str, Literal['use_X']]): Key of data to use in adata.obsm, or 'use_X' to use adata.X.
- classifier_class (Type[ClassifierMixin]): Any classifier inheriting from sklearn.base.ClassifierMixin. Pass as a class, e.g. LogisticRegression, and not an already-instantiated object.
- random_state (int | None, default: None): Random state seed passed to stable_label_adata().
- **kwargs: Additional keyword arguments passed to the classifier constructor.
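The feature_key and classifier_class conventions described above can be illustrated with a minimal sketch. The helpers select_features and build_classifier are hypothetical names introduced here for illustration and are not part of anndict; the plain arrays stand in for adata.X and adata.obsm, and the type check is an assumption about how a class-versus-instance mistake might be caught.

```python
import inspect

import numpy as np
from sklearn.base import ClassifierMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression


def select_features(X, obsm, feature_key):
    """Hypothetical helper: choose the feature matrix named by feature_key."""
    # 'use_X' selects the main matrix; any other value is a key into obsm.
    return X if feature_key == "use_X" else obsm[feature_key]


def build_classifier(classifier_class, **kwargs):
    """Hypothetical helper: instantiate classifier_class with forwarded kwargs."""
    # Only the class itself is accepted, not an instance such as LogisticRegression().
    is_classifier_class = inspect.isclass(classifier_class) and issubclass(
        classifier_class, ClassifierMixin
    )
    if not is_classifier_class:
        raise TypeError("classifier_class must be a class inheriting from ClassifierMixin")
    return classifier_class(**kwargs)


X = np.zeros((4, 10))              # stand-in for adata.X
obsm = {"X_pca": np.ones((4, 2))}  # stand-in for adata.obsm

pca_features = select_features(X, obsm, "X_pca")
raw_features = select_features(X, obsm, "use_X")

# Extra keyword arguments go straight to the classifier constructor.
forest = build_classifier(RandomForestClassifier, n_estimators=1000, max_features="sqrt")

# Passing an already-instantiated object is rejected in this sketch.
try:
    build_classifier(LogisticRegression())
    instance_rejected = False
except TypeError:
    instance_rejected = True
```

Accepting a class plus keyword arguments, rather than an instance, lets the caller's settings be applied to a fresh, unfitted estimator each time one is needed.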
Return type: AdataPredicoder
Returns: An AdataPredicoder containing the trained classifier and label encoder/decoder.
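To make the idea of bundling a classifier with a label encoder/decoder concrete, here is a rough sketch using scikit-learn's LabelEncoder. PredicoderSketch, its attribute names, and its predict_labels method are assumptions made for illustration; the actual structure of AdataPredicoder is not shown in this page and may differ. The toy data is synthetic.

```python
from dataclasses import dataclass

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder


@dataclass
class PredicoderSketch:
    """Hypothetical stand-in for AdataPredicoder; attribute names are assumptions."""

    classifier: object
    label_encoder: LabelEncoder

    def predict_labels(self, features):
        # Predict integer codes, then decode them back to the original labels.
        codes = self.classifier.predict(features)
        return self.label_encoder.inverse_transform(codes)


# Synthetic, well-separated two-class data standing in for adata.obsm['X_pca'].
rng = np.random.default_rng(0)
features = np.vstack([rng.normal(-3, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
labels = np.array(["B cell"] * 20 + ["T cell"] * 20)

# Encode string labels as integers before fitting.
encoder = LabelEncoder()
codes = encoder.fit_transform(labels)
classifier = LogisticRegression().fit(features, codes)

bundle = PredicoderSketch(classifier=classifier, label_encoder=encoder)
predicted = bundle.predict_labels(features)
```

Keeping the encoder next to the classifier means predictions can always be mapped back to the label strings the model was trained on.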
See also
- AdataPredicoder: The container class for classifier + label encoder/decoder.
- stable_label_adata(): The function that trains the classifier on adata.
Examples
Case 1: Using a logistic regression classifier
    from anndict.annotate import train_label_classifier
    from sklearn.linear_model import LogisticRegression

    # adata is an existing AnnData with labels in adata.obs['cell_type']
    # and features in adata.obsm['X_pca'].
    train_label_classifier(
        adata=adata,
        label_key='cell_type',
        feature_key='X_pca',
        classifier_class=LogisticRegression,
        penalty='l2',        # one kwarg for LogisticRegression
        fit_intercept=True,  # another kwarg for LogisticRegression
    )
Case 2: Using a random forest classifier
    from anndict.annotate import train_label_classifier
    from sklearn.ensemble import RandomForestClassifier

    # adata is an existing AnnData with labels in adata.obs['cell_type']
    # and features in adata.obsm['X_pca'].
    train_label_classifier(
        adata=adata,
        label_key='cell_type',
        feature_key='X_pca',
        classifier_class=RandomForestClassifier,
        n_estimators=1000,    # one kwarg for RandomForestClassifier
        max_features='sqrt',  # another kwarg for RandomForestClassifier
    )