sfaira.data.Universe

class sfaira.data.Universe(data_path: Optional[str] = None, meta_path: Optional[str] = None, cache_path: Optional[str] = None, exclude_databases: bool = True)

Attributes

adata

adata_ls

additional_annotation_key

datasets

Returns DatasetGroup (rather than self = DatasetSuperGroup) containing all listed data sets.

ids

Methods

collapse_counts()

Collapse count matrix along duplicated index.

download(**kwargs)

extend_dataset_groups(dataset_groups)

flatten()

Returns DatasetGroup (rather than self = DatasetSuperGroup) containing all listed data sets.

get_gc([genome])

load([annotated_only, load_raw, ...])

Loads each data set into an anndata object.

load_config(fn)

Load a config file and recreate a data sub-setting.

ncells([annotated_only])

ncells_bydataset([annotated_only])

Nested list of the lengths (number of cells) of all data sets, with one inner list per data set group.

ncells_bydataset_flat([annotated_only])

Flattened list of the lengths (number of cells) of all data sets.

project_celltypes_to_ontology([...])

Project free text cell type names to ontology based on mapping table.

remove_duplicates([supplier_hierarchy])

Remove duplicate data loaders from super group, e.g.

set_dataset_groups(dataset_groups)

show_summary()

streamline_features([match_to_release, ...])

Subset and sort genes to genes defined in an assembly or genes of a particular type, such as protein coding.

streamline_metadata([schema, clean_obs, ...])

Streamline the adata instance in each group and each data set to output format.

subset(key, values)

Subset list of adata objects based on match to values in key property.

subset_cells(key, values)

Subset list of adata objects based on cell-wise properties.

write_config(fn)

Writes a config file that describes the current data sub-setting.

write_distributed_store(dir_cache[, ...])

Write data set into a format that allows distributed access to data set on disk.
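
Two of the summaries above can be illustrated with a minimal sketch in plain Python. It does not call sfaira and is not its implementation: the helper name collapse_rows_by_index, the toy gene identifiers, and the toy cell counts are all hypothetical, and summing the duplicated rows is an assumption about what "collapse" means for collapse_counts(). The second part only shows how a per-group nested list (as described for ncells_bydataset) relates to its flattened form (as described for ncells_bydataset_flat).

```python
from itertools import chain


def collapse_rows_by_index(index, rows):
    """Sum rows that share an index label, keeping first-seen label order.

    Hypothetical helper: assumes duplicated entries are merged by summation.
    """
    summed = {}
    for label, row in zip(index, rows):
        if label in summed:
            # Label seen before: add this row element-wise to the running sum.
            summed[label] = [a + b for a, b in zip(summed[label], row)]
        else:
            summed[label] = list(row)
    return list(summed.keys()), list(summed.values())


# Toy count matrix: three feature rows, two cells; "GENE_A" is duplicated.
toy_index = ["GENE_A", "GENE_B", "GENE_A"]
toy_counts = [[1, 0], [2, 3], [4, 5]]
collapsed_index, collapsed_counts = collapse_rows_by_index(toy_index, toy_counts)

# Toy cell counts: one inner list per data set group.
ncells_by_group = [[100, 250], [40]]
# Flattening drops the grouping but keeps the data set order.
ncells_flat = list(chain.from_iterable(ncells_by_group))
ncells_total = sum(ncells_flat)
```

In the toy input the two GENE_A rows merge into a single row, so the collapsed index has no duplicates, and the flattened list sums to the same total as the nested one.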