sfaira.data.DatasetInteractive.streamline_metadata

DatasetInteractive.streamline_metadata(schema: str = 'sfaira', clean_obs: bool = True, clean_var: bool = True, clean_uns: bool = True, clean_obs_names: bool = True, keep_orginal_obs: bool = False, keep_symbol_obs: bool = True, keep_id_obs: bool = True)

Streamline the adata instance to a defined output schema.

Output formats are defined in the ADATA_FIELDS* classes.

Note on ontology-controlled metadata: These are defined for a given format in ADATA_FIELDS*.ontology_constrained. They may appear in three different formats:

  • original (free text) annotation

  • ontology symbol

  • ontology ID

During streamlining, each ontology-controlled metadata item is projected to all three formats. The initially annotated column may be in any of them and is defined as “{attr}_obs_col”. The resulting three columns per metadata item are named:

  • ontology symbol: “{ADATA_FIELDS*.attr}”

  • ontology ID: “{ADATA_FIELDS*.attr}_{ADATA_FIELDS*.onto_id_suffix}”

  • original (free text) annotation: “{ADATA_FIELDS*.attr}_{ADATA_FIELDS*.onto_original_suffix}”
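
The naming pattern above can be sketched in plain Python. This is only an illustration of the templates listed here, not sfaira's own code: the helper name streamlined_columns and the example suffix values are hypothetical, and the real attribute names and suffixes come from the ADATA_FIELDS* class of the chosen schema.

```python
def streamlined_columns(attr, onto_id_suffix, onto_original_suffix):
    """Build the three column names for one ontology-controlled item.

    Follows the templates listed above: the symbol column is the bare
    attribute name, the other two append a suffix after an underscore.
    """
    return {
        "symbol": attr,
        "id": f"{attr}_{onto_id_suffix}",
        "original": f"{attr}_{onto_original_suffix}",
    }


# Hypothetical attribute name and suffixes, chosen only for illustration.
cols = streamlined_columns("cell_type", "ontology_id", "original")
for kind, name in cols.items():
    print(kind, "->", name)
```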

Parameters
  • schema – Export format, either “sfaira” or “cellxgene”.

  • clean_obs – Whether to delete non-streamlined fields in .obs, .obsm and .obsp.

  • clean_var – Whether to delete non-streamlined fields in .var, .varm and .varp.

  • clean_uns – Whether to delete non-streamlined fields in .uns.

  • clean_obs_names – Whether to replace obs_names with a string composed of the dataset ID and an increasing integer.

  • keep_orginal_obs – For ontology-constrained .obs columns, whether to keep a column with original annotation.

  • keep_symbol_obs – For ontology-constrained .obs columns, whether to keep a column with ontology symbol annotation.

  • keep_id_obs – For ontology-constrained .obs columns, whether to keep a column with ontology ID annotation.
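
A call using the parameters above might look like the following sketch. It assumes ds is an already constructed DatasetInteractive with its metadata annotated; how that object is built is not covered here. The wrapper name streamline_for_cellxgene is hypothetical. The keyword keep_orginal_obs is spelled as it appears in the signature, and since the Returns entry is empty, the sketch does not rely on a return value.

```python
def streamline_for_cellxgene(ds):
    """Streamline a dataset to the "cellxgene" schema.

    `ds` is assumed to be a DatasetInteractive instance (or any object
    exposing streamline_metadata with the signature documented above).
    """
    ds.streamline_metadata(
        schema="cellxgene",      # one of the two documented export formats
        clean_obs=True,          # drop non-streamlined .obs, .obsm, .obsp fields
        clean_var=True,          # drop non-streamlined .var, .varm, .varp fields
        clean_uns=True,          # drop non-streamlined .uns fields
        clean_obs_names=True,    # replace obs_names with dataset ID plus integer
        keep_orginal_obs=False,  # spelled as in the signature; drop free-text columns
        keep_symbol_obs=True,    # keep ontology symbol columns
        keep_id_obs=True,        # keep ontology ID columns
    )
```

All values except schema match the defaults in the signature, so in practice only schema="cellxgene" would need to be passed; they are written out here to show which flag controls what.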

Returns