{ "cells": [ { "cell_type": "markdown", "id": "destroyed-campaign", "metadata": {}, "source": [ "# V(D)J Trajectory\n", "\n", "This notebook mirrors the quickstart tutorial, with some additional information in the setup." ] }, { "cell_type": "code", "execution_count": 1, "id": "fewer-swing", "metadata": {}, "outputs": [], "source": [ "import dandelion as ddl\n", "import pandas as pd\n", "import scanpy as sc\n", "import numpy as np\n", "import warnings\n", "import os\n", "\n", "warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "markdown", "id": "likely-blood", "metadata": {}, "source": [ "This notebook makes use of [Pertpy](https://github.com/scverse/pertpy) (previously [Milopy](https://github.com/emdann/milopy) [[Dann2022]](https://doi.org/10.1038/s41587-021-01033-z)) and [Palantir](https://github.com/dpeerlab/Palantir) [[Setty2019]](https://doi.org/10.1038/s41587-019-0068-4), two packages that are not formally Dandelion's dependencies. V(D)J feature space applications are open-ended; this is just one of them. Be sure to install the packages beforehand if you want to follow along." ] }, { "cell_type": "code", "execution_count": 2, "id": "beginning-change", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2026-05-02 16:36:57 | [INFO] arviz_base not installed\n", "2026-05-02 16:36:57 | [INFO] arviz_stats not installed\n", "2026-05-02 16:36:57 | [INFO] arviz_plots not installed\n" ] } ], "source": [ "# import milopy.core as milo \n", "import pertpy as pt # see issue https://github.com/emdann/milopy/issues/54\n", "import palantir\n", "\n", "#required because of Palantir\n", "%matplotlib inline\n", "\n", "sc.settings.set_figure_params(dpi=80)" ] }, { "cell_type": "markdown", "id": "arbitrary-justice", "metadata": {}, "source": [ "We've prepared a demo object based on the TCR trajectory shown in the manuscript for you to use here. 
It has already had some analysis done on the GEX and has Dandelion-derived contig information merged into it. You can download it from the FTP site as shown below or from this [demo repo](https://github.com/zktuong/dandelion-demo-files).\n", "\n", "It's possible to use V(D)J information that comes from sources other than Dandelion processing; for example, the pseudobulking will work with Scirpy output. The functions are simply calibrated to work with Dandelion's structure by default." ] }, { "cell_type": "code", "execution_count": 3, "id": "municipal-galaxy", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloading demo-pseudobulk.h5ad → dandelion_tutorial/panfetal_trajectory/demo-pseudobulk.h5ad\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Downloading...\n", "From (original): https://drive.google.com/uc?id=1-LbAinwhAhJW3Y60wpO9GWJJcaMa_liy\n", "From (redirected): https://drive.google.com/uc?id=1-LbAinwhAhJW3Y60wpO9GWJJcaMa_liy&confirm=t&uuid=cce8eda1-1001-4da6-b052-3bb64edf2821\n", "To: /Users/uqztuong/Documents/GitHub/dandelion/docs/notebooks/polars/dandelion_tutorial/panfetal_trajectory/demo-pseudobulk.h5ad\n", "100%|██████████| 401M/401M [02:19<00:00, 2.87MB/s] \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Downloading demo-vdj-traj.tsv.gz → dandelion_tutorial/panfetal_trajectory/demo-vdj-traj.tsv.gz\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Downloading...\n", "From (original): https://drive.google.com/uc?id=1lyScJWdGopW2nLoIhZmfUGVSWLWI_qWg\n", "From (redirected): https://drive.google.com/uc?id=1lyScJWdGopW2nLoIhZmfUGVSWLWI_qWg&confirm=t&uuid=19be480b-7ed8-4421-8202-195d8b5eda22\n", "To: /Users/uqztuong/Documents/GitHub/dandelion/docs/notebooks/polars/dandelion_tutorial/panfetal_trajectory/demo-vdj-traj.tsv.gz\n", "100%|██████████| 28.4M/28.4M [00:10<00:00, 2.83MB/s]\n" ] } ], "source": [ "from dandelion.tutorial import setup_dandelion_tutorial_trajectory\n", "\n", 
"setup_dandelion_tutorial_trajectory()\n", "\n", "os.chdir(\"dandelion_tutorial/\")" ] }, { "cell_type": "markdown", "id": "70e10873", "metadata": {}, "source": [ "The full data used in the Nature Biotechnology paper is available at a separate [repository](https://github.com/zktuong/dandelion-demo-files/tree/master/dandelion_manuscript)." ] }, { "cell_type": "markdown", "id": "caroline-output", "metadata": {}, "source": [ "Prior to performing the pseudobulking, it is recommended to run `ddl.tl.setup_vdj_pseudobulk()`. This will subset the object to just cells with paired chains, and prepare appropriately named and formatted columns for the pseudobulking function to use as defaults.\n", "\n", "If working with non-Dandelion V(D)J processing, subset your cells to ones with at least a full pair of chains, and ensure that you have four columns in place which contain the V(D)J calls for both of the identified primary chains. Scirpy stores this information natively.\n", "\n", "If you are wanting to include D calls (disabled by default), the recommendation is to subset to only cells/contigs with d_call annotated otherwise the separation could be unreliable (due to missing d_call because of technical reasons rather than biology).\n", "\n", "Please look at the options for `ddl.tl.setup_vdj_pseudobulk()` carefully to tailor to your use case.\n", "\n", "