Tasks
Cell Generation
To generate cells conditioned on cell type using a C2S model,
you can use the tasks.generate_cells_conditioned_on_cell_type() function.
This function will call the batched generation function of the CSModel class
with cell type generation prompts.
- tasks.generate_cells_conditioned_on_cell_type(csmodel: CSModel, cell_types_list: list, n_genes: int = 200, organism: str = 'Homo sapiens', inference_batch_size: int = 8, max_num_tokens: int = 1024, use_flash_attn: bool = False, **kwargs)
Generate new cells using a C2S model, conditioned on cell type.
- Parameters:
csmodel – a CSModel object wrapping the C2S model
cell_types_list – list of strings representing the cell type labels to generate from
n_genes – the number of genes to prompt the model to generate for each cell sentence
organism – the organism to generate cells for (‘Homo sapiens’, ‘Mus musculus’)
inference_batch_size – batch size of inference for text generation
max_num_tokens – maximum number of tokens to generate
use_flash_attn – if True, uses Flash Attention in model.generate() for faster inference
kwargs – additional arguments for Huggingface model.generate(). For generation options, see Huggingface docs: https://huggingface.co/docs/transformers/en/main_classes/text_generation
- Returns:
List of generated cells in the form of cell sentences
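As an illustration of how the arguments above fit together, the following sketch assembles them into one dictionary. The helper build_generation_request is hypothetical and not part of the library. The sketch assumes that repeating a label in cell_types_list requests one generated cell per entry. do_sample and top_k are standard Huggingface model.generate() options passed through kwargs. The call to the task function is left as a comment because it needs a loaded CSModel.

```python
def build_generation_request(cell_types, cells_per_type, n_genes=200,
                             organism="Homo sapiens"):
    """Hypothetical helper: collect the documented arguments in one dict."""
    # Assumption: one list entry corresponds to one generated cell,
    # so each label is repeated cells_per_type times.
    cell_types_list = [ct for ct in cell_types for _ in range(cells_per_type)]
    return {
        "cell_types_list": cell_types_list,
        "n_genes": n_genes,
        "organism": organism,
        "inference_batch_size": 8,
        "max_num_tokens": 1024,
        "use_flash_attn": False,
        # Extra keyword arguments are forwarded to model.generate()
        "do_sample": True,
        "top_k": 50,
    }


request = build_generation_request(["B cell", "T cell"], cells_per_type=2,
                                   n_genes=100)

# With a loaded CSModel named csmodel (not shown here), the call would be:
# generated = tasks.generate_cells_conditioned_on_cell_type(
#     csmodel=csmodel, **request)
```

Keeping the request in a dictionary makes it easy to log or reuse the exact generation settings across runs.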
Cell Type Annotation
To predict the cell types of a dataset, you can use the
tasks.predict_cell_types_of_data() function:
- tasks.predict_cell_types_of_data(csdata: CSData, csmodel: CSModel, n_genes: int = 200, **kwargs)
Predict cell types of data using a C2S model.
- Parameters:
csdata – a CSData object wrapping the dataset whose cell types are to be predicted
csmodel – a CSModel object wrapping the C2S model to predict cell types with
n_genes – the number of genes to use for each cell sentence
kwargs – additional arguments for Huggingface model.generate(). For generation options, see Huggingface docs: https://huggingface.co/docs/transformers/en/main_classes/text_generation
- Returns:
List of predicted cell types
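Since the function returns a plain list of predicted labels, one simple way to check it against known annotations is exact-match accuracy. The sketch below is not part of the library. It assumes the predictions are strings that may carry stray whitespace, a trailing period, or different capitalization, and that the reference labels are in the same order as the predictions. The names normalize_label and annotation_accuracy are hypothetical.

```python
def normalize_label(label):
    """Trim whitespace and a trailing period, and ignore case."""
    return label.strip().rstrip(".").strip().lower()


def annotation_accuracy(predicted, reference):
    """Fraction of predictions that match the reference label exactly."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on empty lists")
    matches = sum(
        normalize_label(p) == normalize_label(r)
        for p, r in zip(predicted, reference)
    )
    return matches / len(reference)


# Toy labels standing in for the output of tasks.predict_cell_types_of_data()
toy_predictions = ["B cell.", "t cell ", "NK cell"]
toy_reference = ["B cell", "T cell", "monocyte"]
accuracy = annotation_accuracy(toy_predictions, toy_reference)
```

Exact matching is strict: synonymous labels such as "T cell" and "T lymphocyte" count as mismatches unless they are mapped to a shared vocabulary first.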
Cell Embedding
To embed cells using C2S models, you can use the
tasks.embed_cells() function. This function loads a CSModel object, and
uses the C2S model to embed cell sentences from the CSData object into
embedding vectors.
- tasks.embed_cells(csdata: CSData, csmodel: CSModel, n_genes: int = 200, inference_batch_size: int = 8)
Embed cells using a C2S model.
- Parameters:
csdata – a CSData object wrapping the dataset to embed
csmodel – a CSModel object wrapping the C2S model to embed cells with
n_genes – the number of genes to use for each cell sentence
inference_batch_size – batch size for inference
- Returns:
Numpy array of embedded cells
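A common next step with cell embeddings is to compare cells by cosine similarity. The sketch below is not part of the library and uses only the Python standard library. It assumes the returned array has one row per cell, and it uses a small list of lists as a stand-in for that array. The names cosine_similarity and nearest_cell are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def nearest_cell(embeddings, query_index):
    """Index of the cell most similar to the query cell, excluding itself."""
    query = embeddings[query_index]
    best_index, best_score = None, -math.inf
    for i, vector in enumerate(embeddings):
        if i == query_index:
            continue
        score = cosine_similarity(query, vector)
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# Toy stand-in for the output of tasks.embed_cells(): one row per cell
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
neighbor_of_first = nearest_cell(toy_embeddings, 0)
```

For real datasets, a vectorized NumPy computation or a nearest-neighbor index would be far faster than this loop; the version here only shows the logic.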