anndict.utils.sample_and_drop
anndict.utils.sample_and_drop(adata, strata_keys, min_num_cells=0, n_largest_groups=None, **kwargs)
Sample adata based on the specified strata keys and drop strata with fewer than min_num_cells cells. Can optionally retain only the n_largest_groups largest strata.
- Parameters:
- adata : AnnData
  An AnnData.
- strata_keys : list[str] | str
  List of column names in adata.obs to use for stratification.
- min_num_cells : int (default: 0)
  Minimum number of cells required to retain a stratum.
- n_largest_groups : int | None (default: None)
  If specified, keep only the n_largest_groups.
- kwargs
  Additional keyword arguments passed to sample_adata() and sc.pp.subsample().
- Return type:
AnnData
- Returns:
Concatenated AnnData object after resampling and filtering.
- Raises:
ValueError – If any of the specified strata_keys do not exist in adata.obs.
Notes
In the case of ties when selecting the largest groups, all tied groups are kept, so you may end up with more than n_largest_groups groups.
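The filtering rules described above can be illustrated with a small standard-library sketch. This is not the anndict implementation: select_strata is a hypothetical helper, the labels are made-up data, and applying the min_num_cells filter before the n_largest_groups selection is an assumption. It only shows how keeping every tied stratum can yield more than n_largest_groups groups.

```python
from collections import Counter


def select_strata(labels, min_num_cells=0, n_largest_groups=None):
    """Hypothetical helper: return {stratum: size} for strata that survive filtering."""
    counts = Counter(labels)
    # Drop strata with fewer than min_num_cells cells.
    kept = {s: n for s, n in counts.items() if n >= min_num_cells}
    if n_largest_groups is not None and 0 < n_largest_groups < len(kept):
        # Size of the n-th largest stratum. Every stratum at least this
        # large is kept, so strata tied at the cutoff all survive.
        cutoff = sorted(kept.values(), reverse=True)[n_largest_groups - 1]
        kept = {s: n for s, n in kept.items() if n >= cutoff}
    return kept


# Made-up per-cell stratum labels: T=5, B=3, NK=3, Mono=1.
labels = ["T"] * 5 + ["B"] * 3 + ["NK"] * 3 + ["Mono"] * 1

# Mono is dropped (fewer than 2 cells); B and NK tie for second place,
# so three strata are kept even though n_largest_groups=2.
print(select_strata(labels, min_num_cells=2, n_largest_groups=2))
```

With these labels, asking for the two largest strata returns three, because B and NK have the same size.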