sfaira.data.Universe

class sfaira.data.Universe(data_path: Optional[str] = None, meta_path: Optional[str] = None, cache_path: Optional[str] = None, exclude_databases: bool = True)

Attributes

`adata`
`adata_ls`
`additional_annotation_key`
`datasets`	Returns DatasetGroup (rather than self = DatasetSuperGroup) containing all listed data sets.
`ids`

Methods

`collapse_counts`()	Collapse count matrix along duplicated index.
`download`(**kwargs)
`extend_dataset_groups`(dataset_groups)
`flatten`()	Returns DatasetGroup (rather than self = DatasetSuperGroup) containing all listed data sets.
`get_gc`([genome])
`load`([annotated_only, load_raw, ...])	Loads the data sets into anndata objects.
`load_config`(fn)	Loads a config file and recreates a data sub-setting.
`ncells`([annotated_only])
`ncells_bydataset`([annotated_only])	List of lists of the lengths of all data sets, by data set group.
`ncells_bydataset_flat`([annotated_only])	Flattened list of the lengths of all data sets.
`project_celltypes_to_ontology`([...])	Project free text cell type names to ontology based on mapping table.
`remove_duplicates`([supplier_hierarchy])	Remove duplicate data loaders from super group.
`set_dataset_groups`(dataset_groups)
`show_summary`()
`streamline_features`([match_to_release, ...])	Subset and sort genes to genes defined in an assembly or genes of a particular type, such as protein coding.
`streamline_metadata`([schema, clean_obs, ...])	Streamline the adata instance in each group and each data set to output format.
`subset`(key, values)	Subset list of adata objects based on match to values in key property.
`subset_cells`(key, values)	Subset list of adata objects based on cell-wise properties.
`write_config`(fn)	Writes a config file that describes the current data sub-setting.
`write_distributed_store`(dir_cache[, ...])	Write data set into a format that allows distributed access to data set on disk.
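The summaries of `collapse_counts`, `ncells_bydataset` and `ncells_bydataset_flat` are terse, so the following is a minimal pure-Python sketch of the ideas they describe. It does not call sfaira and is not its implementation. The helper names `collapse_counts_sketch` and `flatten_lengths` are hypothetical, the toy data is made up for illustration, and summing duplicated entries is an assumption, since the summary above does not state how duplicates are aggregated.

```python
def collapse_counts_sketch(index, rows):
    """Merge rows that share an index label by summing them element-wise.

    Assumption: duplicates are summed. Labels keep their first-seen order.
    """
    collapsed = {}
    for label, row in zip(index, rows):
        if label in collapsed:
            # Duplicate label: add this row onto the row already stored.
            collapsed[label] = [a + b for a, b in zip(collapsed[label], row)]
        else:
            collapsed[label] = list(row)
    return list(collapsed.keys()), list(collapsed.values())


def flatten_lengths(nested):
    """Turn a per-group list of data set lengths into one flat list."""
    return [n for group in nested for n in group]


# Toy count matrix with a duplicated index label ("GeneA" appears twice).
index = ["GeneA", "GeneB", "GeneA"]
rows = [[1, 0, 2], [0, 3, 1], [4, 1, 0]]
new_index, new_rows = collapse_counts_sketch(index, rows)

# Hypothetical cell counts: two data set groups holding 2 and 1 data sets.
ncells_by_dataset = [[100, 250], [75]]
ncells_flat = flatten_lengths(ncells_by_dataset)
total_cells = sum(ncells_flat)
```

In this sketch the nested list mirrors the shape described for `ncells_bydataset`, the flat list mirrors `ncells_bydataset_flat`, and their sum corresponds to a total cell count of the kind `ncells` would presumably report.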