sfaira.data.DatasetGroup.load
- DatasetGroup.load(annotated_only: bool = False, load_raw: bool = False, allow_caching: bool = True, processes: int = 1, func=None, kwargs_func: Optional[dict] = None, verbose: int = 0, **kwargs)
Load all datasets in group (option for temporary loading).
Note: This method automatically subsets the group to the data sets for which input files were found.
This method also allows temporarily loading data sets to execute a function on them (supply func). In this setting, data sets are removed from memory after the function has been executed.
- param annotated_only
- param load_raw
- param allow_caching
- param processes
Number of processes to parallelise loading over. Uses Python multiprocessing if > 1, a for loop otherwise.
- param func
Function to run on loaded datasets. func should take a Dataset instance as its only positional argument; any further keyword arguments are supplied via kwargs_func. The return can be empty:
    def func(dataset, **kwargs_func):
        # code manipulating dataset and generating output x.
        return x
- param kwargs_func
Kwargs of func.
- param verbose
Verbosity of description of loading failure.
0: no indication of failure
1: indication of which data set failed in warning
2: as in 1, with error report in warning
3: reporting as in 2, but aborts with OSError
- Parameters
remove_gene_version – Remove gene version string from ENSEMBL ID so that different versions in different data sets are superimposed.
match_to_reference – Reference genome name, or False to keep original feature space.
load_raw – Loads unprocessed version of data if available in data loader.
allow_caching – Whether to allow method to cache adata object for faster re-loading.
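The func and kwargs_func parameters described above can be illustrated with a small callback. This is a minimal sketch, not taken from the sfaira documentation: count_cells and min_cells are hypothetical names, the sketch assumes a loaded Dataset exposes its AnnData object as .adata, and a stand-in object is used so the callback can be exercised without sfaira installed.

```python
from types import SimpleNamespace


def count_cells(dataset, min_cells=0):
    """Hypothetical callback: return the cell count, or None if below min_cells."""
    # Assumption: the loaded Dataset exposes its AnnData object as `.adata`.
    n_obs = dataset.adata.n_obs
    return n_obs if n_obs >= min_cells else None


# Stand-in for a loaded Dataset, used only to exercise the callback here.
fake_dataset = SimpleNamespace(adata=SimpleNamespace(n_obs=1200))

kept = count_cells(fake_dataset, min_cells=100)
dropped = count_cells(fake_dataset, min_cells=5000)
print(kept, dropped)

# With a real DatasetGroup instance `group` (not constructed here), the call
# would follow the signature above:
# group.load(func=count_cells, kwargs_func={"min_cells": 100})
```

Because datasets are removed from memory after func has run, a callback of this kind should return only the summary it needs rather than a reference to the dataset itself.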