sfaira.data.DatasetSuperGroup
- class sfaira.data.DatasetSuperGroup(dataset_groups: Union[None, List[sfaira.data.dataloaders.base.dataset_group.DatasetGroup], List[sfaira.data.dataloaders.base.dataset_group.DatasetSuperGroup]])
Container for multiple DatasetGroup instances.
Used to manipulate structured dataset collections. Primarily designed for this kind of manipulation; convert to DatasetGroup via flatten() for more functionality.
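The relationship between a super group and the flat group returned by flatten() can be illustrated with a toy model. The sketch below does not use sfaira itself: ToyGroup and ToySuperGroup are hypothetical stand-ins, and representing each data set as an ID mapped to a cell count is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyGroup:
    """Hypothetical stand-in for a DatasetGroup: maps a data set ID to a cell count."""
    datasets: Dict[str, int] = field(default_factory=dict)


@dataclass
class ToySuperGroup:
    """Hypothetical stand-in for a DatasetSuperGroup: an ordered list of groups."""
    dataset_groups: List[ToyGroup] = field(default_factory=list)

    def flatten(self) -> ToyGroup:
        # Merge the data sets of every group into a single flat group.
        merged: Dict[str, int] = {}
        for group in self.dataset_groups:
            merged.update(group.datasets)
        return ToyGroup(datasets=merged)


lung = ToyGroup({"lung_a": 1200, "lung_b": 800})
liver = ToyGroup({"liver_a": 500})

# flatten() returns a group, not another super group.
flat = ToySuperGroup([lung, liver]).flatten()
print(sorted(flat.datasets))  # ['liver_a', 'lung_a', 'lung_b']
```

The toy keeps the grouping only on the super group; once flattened, the group boundaries are gone, which is why the structured container is the better place for group-wise manipulation.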
Attributes
Returns DatasetGroup (rather than self = DatasetSuperGroup) containing all listed data sets.
Methods
Collapse count matrix along duplicated index.

download(**kwargs)

extend_dataset_groups(dataset_groups)

flatten()
Returns DatasetGroup (rather than self = DatasetSuperGroup) containing all listed data sets.

get_gc([genome])

load([annotated_only, load_raw, ...])
Loads data set homosapiens into anndata object.

load_config(fn)
Loads a config file and recreates a data sub-setting.

ncells([annotated_only])

ncells_bydataset([annotated_only])
List of lists of the lengths of all data sets, by data set group.

ncells_bydataset_flat([annotated_only])
Flattened list of the lengths of all data sets.

Project free text cell type names to ontology based on mapping table.

remove_duplicates([supplier_hierarchy])
Remove duplicate data loaders from super group.

set_dataset_groups(dataset_groups)

streamline_features([match_to_release, ...])
Subset and sort genes to genes defined in an assembly or genes of a particular type, such as protein coding.

streamline_metadata([schema, clean_obs, ...])
Streamline the adata instance in each group and each data set to output format.

subset(key, values)
Subset list of adata objects based on match to values in key property.

subset_cells(key, values)
Subset list of adata objects based on cell-wise properties.

write_config(fn)
Writes a config file that describes the current data sub-setting.

write_distributed_store(dir_cache[, ...])
Write data set into a format that allows distributed access to data set on disk.
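The difference between the nested and flattened length listings, and the idea of subsetting by a key property, can be sketched on plain Python data. This is a toy illustration, not sfaira code: the functions below only borrow the method names, the dictionary layout and the "organ" key are hypothetical, and keeping empty groups after subsetting is an assumption rather than documented behaviour.

```python
from typing import Dict, List

# Hypothetical layout: a list of groups, each group a list of data set records.
Record = Dict[str, object]
groups: List[List[Record]] = [
    [{"id": "lung_a", "organ": "lung", "ncells": 1200},
     {"id": "lung_b", "organ": "lung", "ncells": 800}],
    [{"id": "liver_a", "organ": "liver", "ncells": 500}],
]


def ncells_bydataset(groups: List[List[Record]]) -> List[List[object]]:
    # One inner list per group, one length per data set.
    return [[d["ncells"] for d in group] for group in groups]


def ncells_bydataset_flat(groups: List[List[Record]]) -> List[object]:
    # Same lengths with the group level removed.
    return [n for group in ncells_bydataset(groups) for n in group]


def subset(groups: List[List[Record]], key: str, values: List[object]) -> List[List[Record]]:
    # Keep only data sets whose value for `key` is in `values` (toy semantics).
    return [[d for d in group if d[key] in values] for group in groups]


nested = ncells_bydataset(groups)
flat = ncells_bydataset_flat(groups)
lung_only = subset(groups, "organ", ["lung"])

print(nested)  # [[1200, 800], [500]]
print(flat)    # [1200, 800, 500]
print([[d["id"] for d in group] for group in lung_only])  # [['lung_a', 'lung_b'], []]
```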