sfaira.data.DatasetInteractive

class sfaira.data.DatasetInteractive(data: anndata._core.anndata.AnnData, feature_symbol_col: Optional[str] = 'index', feature_id_col: Optional[str] = None, feature_type_col: Optional[str] = None, dataset_id: str = 'interactive_dataset', data_path: Optional[str] = '.', meta_path: Optional[str] = '.', cache_path: Optional[str] = '.')

Attributes

additional_annotation_key

annotated

assay_differentiation

assay_sc

assay_type_differentiation

author

bio_sample

bio_sample_obs_key

cache_fn

cell_line

cell_type

celltypes_universe

citation

Return all information necessary to cite the data set.

data_dir

default_embedding

development_stage

directory_formatted_doi

disease

doi

All publication DOIs associated with the study, that is, the journal publication and the preprint.

doi_cleaned_id

doi_journal

The journal publication (main) DOI associated with the study.

doi_main

The main DOI associated with the study, which is the journal publication if available, otherwise the preprint.

doi_preprint

The preprint publication (secondary) DOI associated with the study.

download_url_data

Data download website(s).

download_url_meta

Meta data download website(s).

ethnicity

feature_reference

feature_type

id

individual

loaded

Whether the data set was loaded into memory.

meta

meta_fn

ncells

ontology_class_maps

organ

organism

primary_data

sample_source

sex

source

source_doi

state_exact

tech_sample

tech_sample_obs_key

title

year
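The relationship between doi_journal, doi_preprint, doi_main and doi described above can be sketched in plain Python. This is a minimal illustration of the documented fallback rule, not sfaira's implementation: the helper names resolve_main_doi and all_dois are hypothetical, and the DOI strings are placeholders.

```python
from typing import List, Optional


def resolve_main_doi(doi_journal: Optional[str], doi_preprint: Optional[str]) -> Optional[str]:
    """Return the journal DOI if available, otherwise the preprint DOI."""
    # Hypothetical helper mirroring the documented doi_main rule.
    return doi_journal if doi_journal else doi_preprint


def all_dois(doi_journal: Optional[str], doi_preprint: Optional[str]) -> List[str]:
    """Collect every DOI that is set, journal first (mirrors the documented doi attribute)."""
    return [d for d in (doi_journal, doi_preprint) if d]


# Placeholder DOIs, for illustration only.
journal = "10.1000/journal.example"
preprint = "10.1101/preprint.example"

main_with_journal = resolve_main_doi(journal, preprint)
main_without_journal = resolve_main_doi(None, preprint)
both = all_dois(journal, preprint)
```

With both DOIs set, the journal DOI is chosen as the main one; with only a preprint, the preprint DOI is used instead.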

Methods

clear()

Remove loaded .adata to reduce memory footprint.

collapse_counts()

Collapse count matrix along duplicated index.

download(**kwargs)

get_ontology(k)

load([load_raw, allow_caching])

param remove_gene_version: Remove the gene version string from the ENSEMBL ID so that different versions in different data sets are superimposed.

load_meta(fn)

project_free_to_ontology(attr)

Project free text cell type names to ontology based on mapping table.

read_ontology_class_maps(fns)

Load class maps of free text class labels to ontology classes.

set_dataset_id([idx])

show_summary()

streamline_features([match_to_release, ...])

Subset and sort genes to genes defined in an assembly or genes of a particular type, such as protein coding.

streamline_metadata([schema, clean_obs, ...])

Streamline the adata instance to a defined output schema.

subset_cells(key, values)

Subset list of adata objects based on cell-wise properties.

write_distributed_store(dir_cache[, ...])

Write data set into a format that allows distributed access to data set on disk.

write_meta([fn_meta, dir_out])

Write meta data object for data set.

write_ontology_class_maps(fn, attrs[, ...])

Write class maps of ontology-controlled fields to ontology classes.
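Two of the operations described above, removing the gene version string from ENSEMBL IDs and collapsing the count matrix along a duplicated index, can be illustrated without sfaira. The sketch below is a plain-Python approximation under stated assumptions: the version is taken to be a trailing ".<number>" suffix, duplicated features are assumed to be summed, and the functions strip_gene_version and collapse_duplicated_features are hypothetical names, not part of the sfaira API. The version suffixes in the example IDs are illustrative.

```python
from typing import Dict, List, Tuple


def strip_gene_version(ensembl_id: str) -> str:
    """Drop a trailing '.<version>' suffix from an ENSEMBL ID."""
    return ensembl_id.split(".", 1)[0]


def collapse_duplicated_features(
    feature_ids: List[str], counts: List[List[int]]
) -> Tuple[List[str], List[List[int]]]:
    """Sum the columns of a cells x features matrix that share a feature ID."""
    # Map each unique ID to its output column, in first-seen order.
    positions: Dict[str, int] = {}
    for fid in feature_ids:
        positions.setdefault(fid, len(positions))
    collapsed = [[0] * len(positions) for _ in counts]
    for row_in, row_out in zip(counts, collapsed):
        for fid, value in zip(feature_ids, row_in):
            row_out[positions[fid]] += value
    return list(positions), collapsed


# Two versions of the same gene plus one other gene.
versioned_ids = ["ENSG00000141510.17", "ENSG00000141510.18", "ENSG00000012048.23"]
counts = [
    [1, 2, 3],  # cell 0
    [0, 5, 7],  # cell 1
]

unversioned_ids = [strip_gene_version(i) for i in versioned_ids]
unique_ids, collapsed = collapse_duplicated_features(unversioned_ids, counts)
```

After stripping versions, the first two columns share an ID and are merged into one, so the matrix shrinks from three feature columns to two.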