Cell2Sentence: Single-cell Analysis With LLMs

Cell2Sentence (C2S) is a framework for directly adapting Large Language Models (LLMs) to single-cell biology. C2S applies a rank-ordering transformation that converts a cell's gene expression profile into a cell sentence: a sequence of space-separated gene names ordered by descending expression. Because cells are represented as plain text, LLMs can model single-cell data directly in natural language, and the same model can be applied to multiple single-cell tasks.
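
As a rough illustration of the rank-ordering idea, the following minimal sketch turns one cell's expression vector into a cell sentence. It is not the C2S package API: the function name `cell_to_sentence`, the `top_k` parameter, the toy gene list, and the choice to drop genes with zero expression are all assumptions made for this example.

```python
import numpy as np

def cell_to_sentence(expression, gene_names, top_k=None):
    """Hypothetical helper: order gene names by descending expression."""
    expression = np.asarray(expression, dtype=float)
    # Stable sort on negated values: highest expression first,
    # ties keep their original gene order.
    order = np.argsort(-expression, kind="stable")
    # Assumption for this sketch: genes with zero expression are omitted.
    order = [i for i in order if expression[i] > 0]
    if top_k is not None:
        order = order[:top_k]
    return " ".join(gene_names[i] for i in order)

# Toy data for a single cell (illustrative values only).
genes = ["CD3E", "MS4A1", "LYZ", "NKG7"]
counts = [5, 0, 12, 3]

sentence = cell_to_sentence(counts, genes)
print(sentence)  # LYZ CD3E NKG7
```

The resulting string can be passed to an LLM tokenizer like any other text, which is what lets a language model consume expression data without a custom input layer.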

C2S is developed by members of the van Dijk Lab at Yale University. See the Quickstart section to get started.

Note

We are actively adding more features and documentation to the C2S API. For any feature requests or issues, please leave a GitHub issue or reach out to us!
