Base Backend: dandelion.base

Preprocessing: pp

assign_isotype(fasta[, org, evalue, ...])

Annotate contigs with constant region call using blastn.

assign_isotypes(fastas[, org, evalue, ...])

Annotate contigs with constant region call using blastn.

check_contigs(data[, adata, ...])

Check contigs for whether they can be considered as ambiguous or not.

format_fasta(fasta[, prefix, suffix, sep, ...])

Add prefix to the headers/contig ids in input fasta and annotation file.

format_fastas(fastas[, prefix, suffix, sep, ...])

Add prefix to the headers/contig ids in input fasta and annotation file.
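
As a rough illustration of what prefixing contig ids involves, the sketch below rewrites FASTA header lines in plain Python. It is not dandelion's implementation: the helper name `add_prefix_to_headers` and the default separator `"_"` are assumptions made for this example only.

```python
def add_prefix_to_headers(fasta_text: str, prefix: str, sep: str = "_") -> str:
    """Prepend `prefix` + `sep` to every FASTA header (lines starting with '>')."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Keep the '>' marker, then insert the prefix before the contig id.
            out.append(f">{prefix}{sep}{line[1:]}")
        else:
            # Sequence lines are left untouched.
            out.append(line)
    return "\n".join(out)


# Hypothetical two-contig FASTA used only for demonstration.
example = ">AAACCTG-1_contig_1\nACGT\n>AAACCTG-1_contig_2\nTTGA"
print(add_prefix_to_headers(example, "sampleA"))
```

Prefixing the ids this way keeps contigs from different samples distinguishable after they are combined.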

reannotate_genes(data[, igblast_db, ...])

Reannotate cellranger fasta files with igblastn and parse the output to AIRR format.

reassign_alleles(data, combined_folder[, ...])

Correct allele calls based on a personalized genotype using tigger.

Tools: tl

clone_centrality(vdj)

Calculate node closeness centrality in BCR/TCR network.

clone_degree(vdj[, weight])

Calculate node degree in BCR/TCR network.

clone_diversity(data, group_by[, method, ...])

Compute clonal diversity with bootstrapping.

clone_overlap(vdj, group_by[, ...])

A function to tabulate clonal overlap for input as a circos-style plot.

clone_rarefaction(data, group_by[, ...])

Compute sample-based rarefaction curves with asymptotic extrapolation and optional plotting.

clone_size(vdj[, group_by, max_size, ...])

Quantify clone sizes, globally or per group.
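
The idea of counting clone sizes globally or per group can be sketched with the standard library. This is a minimal illustration, not the library's code: the record layout (one dict per cell with a `clone_id` key) and the helper name `clone_sizes` are assumptions.

```python
from collections import Counter, defaultdict


def clone_sizes(records, group_by=None):
    """Count cells per clone, globally or within each group.

    `records` is a list of dicts, one per cell, with a 'clone_id' key
    (a hypothetical layout chosen for this sketch).
    """
    if group_by is None:
        # Global count: one tally across all cells.
        return Counter(r["clone_id"] for r in records)
    # Per-group count: one tally per distinct value of `group_by`.
    per_group = defaultdict(Counter)
    for r in records:
        per_group[r[group_by]][r["clone_id"]] += 1
    return dict(per_group)


# Hypothetical cells used only for demonstration.
cells = [
    {"cell_id": "c1", "clone_id": "A", "sample": "s1"},
    {"cell_id": "c2", "clone_id": "A", "sample": "s1"},
    {"cell_id": "c3", "clone_id": "B", "sample": "s1"},
    {"cell_id": "c4", "clone_id": "A", "sample": "s2"},
]
print(clone_sizes(cells))
print(clone_sizes(cells, group_by="sample"))
```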

clone_view(adata[, mode, ...])

Swap the 'active' connectivities, distances, and optionally embedding in AnnData.

concat(arrays[, check_unique, ...])

Concatenate data frames and return as Dandelion object.

define_clones(vdj, dist[, action, model, ...])

Find clones using changeo's DefineClones.py.

extract_edge_weights(vdj[, expanded_only])

Retrieve edge weights from graph.

find_clones(vdj[, identity, key, ...])

Find clones based on VDJ chain and VJ chain CDR3 junction hamming distance.
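
For readers unfamiliar with the metric, Hamming distance counts the positions at which two equal-length sequences differ. The sketch below shows one way a junction comparison could use it. It is not dandelion's algorithm: the helper names are hypothetical, and the 0.85 default is only an illustrative value for an identity-style threshold.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def similar_junctions(j1: str, j2: str, identity: float = 0.85) -> bool:
    """Return True if two junctions meet the similarity threshold.

    Junctions of different (or zero) length are treated as dissimilar,
    an assumption made for this sketch.
    """
    if len(j1) != len(j2) or not j1:
        return False
    # Convert the mismatch count into a fraction of matching positions.
    similarity = 1 - hamming(j1, j2) / len(j1)
    return similarity >= identity


# Hypothetical junction sequences used only for demonstration.
print(hamming("CARDYW", "CARDFW"))
print(similar_junctions("CARDYW", "CARDFW", identity=0.8))
```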

from_scirpy(data)

Convert data from scirpy format to Dandelion format.

generate_network(vdj[, adata, key, ...])

Generate a Levenshtein distance network based on VDJ and VJ sequences.
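
Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one sequence into another, so unlike Hamming distance it can compare sequences of different lengths. A minimal pure-Python sketch of the metric and of pairwise edge weights follows. The cell names and sequences are made up, and this is not how the library builds its network.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with a two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


# Hypothetical per-cell sequences used only for demonstration.
seqs = {"cell1": "CARDYW", "cell2": "CARDFW", "cell3": "CARGGDYW"}

# Pairwise distances serve as edge weights between cells.
edges = {
    (u, v): levenshtein(seqs[u], seqs[v])
    for u, v in combinations(sorted(seqs), 2)
}
print(edges)
```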

productive_ratio(adata, vdj, group_by[, ...])

Compute the cell-level productive/non-productive contig ratio.

project_pseudotime_to_cell(adata, pb_adata, ...)

Function to project pseudotime & branch probabilities from pb_adata (pseudobulk adata) to adata (cell adata).

pseudobulk_gex(adata_raw[, pbs, ...])

Function to pseudobulk gene expression (raw count).

pseudotime_transfer(adata, pr_res[, suffix])

Function to add pseudotime and branch probabilities into adata.obs in place.

setup_vdj_pseudobulk(adata[, mode, ...])

Function to prepare AnnData for computing pseudobulk vdj feature space.

to_scirpy(vdj[, transfer, to_mudata, ...])

Convert Dandelion data to scirpy-compatible format.

transfer(adata, vdj[, expanded, gex_key, ...])

vdj_pseudobulk(adata[, pbs, obs_to_bulk, ...])

Function for making pseudobulk vdj feature space.

vdj_sample(vdj_data[, size, adata, p, ...])

Resample vdj data and corresponding AnnData to a specified size.

vj_usage_pca(adata, group_by[, min_size, ...])

Extract productive V/J gene usage from single cell data and compute PCA.

Plotting: pl

barplot(data, color[, palette, figsize, ...])

A barplot function to plot usage of V/J genes in the data.

clone_circlepackplot(data, group_by[, ...])

A bubble plot to visualise clone sizes within groups using circle packing.

clone_network(adata[, basis, edges])

Using scanpy's plotting module to plot the network.

clone_overlap(adata, group_by[, color_by, ...])

A plot function to visualise clonal overlap as a circos-style plot.

productive_ratio(adata[, figsize, palette, ...])

Plot productive/non-productive contig ratio from AnnData (cell level).

spectratype(vdj, color, group_by, locus[, ...])

A spectratype function to plot usage of CDR3 length.

stackedbarplot(data, color, group_by[, ...])

A stacked bar plot function to plot usage of V/J genes in the data split by groups.

Reading: io

read([filename, distance_zarr, verbose])

Read in and return a Dandelion object from .h5ddl format.

read_10x_airr(file[, prefix, suffix, sep, ...])

Read the airr_rearrangement.tsv produced by Cell Ranger directly and return a Dandelion object.

read_10x_vdj(data[, filename_prefix, ...])

A parser to read .csv and .json files directly from folder containing 10x cellranger-outputs, or parse an existing pandas DataFrame.

read_airr(file[, prefix, suffix, sep, ...])

Reads a standard single-cell AIRR rearrangement file.

read_bd_airr(file[, prefix, suffix, sep, ...])

Read the TCR or BCR _AIRR.tsv produced from BD Rhapsody technology.

read_parse_airr(file[, prefix, suffix, sep, ...])

Read the TCR or BCR _annotation_airr.tsv produced from Parse Biosciences Evercode technology.

read_seekgene_vdj(data[, filename_prefix, ...])

A parser to read .csv and .json files directly from folder containing SeekGene VDJ outputs, or parse an existing pandas DataFrame.

read_ddl([filename, distance_zarr, verbose])

Read in and return a Dandelion object from .h5ddl format.

read_h5ddl([filename, distance_zarr, verbose])

Read in and return a Dandelion object from .h5ddl format.

Dandelion Class

Dandelion.add_cell_prefix(prefix[, sync, ...])

Add prefix to cell_id and optionally to sequence_id.

Dandelion.add_cell_suffix(suffix[, sync, ...])

Add suffix to cell_id and optionally to sequence_id.

Dandelion.add_sequence_prefix(prefix[, ...])

Add prefix to sequence_id and then apply to cell_id as well.

Dandelion.add_sequence_suffix(suffix[, ...])

Add suffix to sequence_id and then apply to cell_id as well.

Dandelion.compute()

Convert self.distances to a concrete csr matrix.

Dandelion.copy()

Performs a deep copy of all slots in Dandelion class.

Dandelion.data

One-dimensional annotation of contig observations.

Dandelion.data_names

Names of observations (alias for .data.index).

Dandelion.metadata

One-dimensional annotation of cell observations.

Dandelion.metadata_names

Names of observations (alias for .metadata.index).

Dandelion.simplify(**kwargs)

Disambiguate VDJ and C gene calls when there are multiple calls separated by commas and strip the alleles.

Dandelion.store_germline_reference([...])

Update germline reference with corrected sequences and store in Dandelion object.

Dandelion.update_data([skip])

Sync missing metadata columns into data via dictionary mapping.

Dandelion.update_metadata([retrieve, ...])

A Dandelion initialisation function to update and populate the .metadata slot.

Dandelion.update_plus([option])

Retrieve additional data columns that are useful.

Dandelion.write([filename, compression, ...])

Write a Dandelion object to .h5ddl format.

Dandelion.write_10x([folder, ...])

Alias for write_vdj() kept for backwards compatibility.

Dandelion.write_airr([filename])

Write a Dandelion object to an AIRR-formatted .tsv file.

Dandelion.write_ddl([filename, compression, ...])

Write a Dandelion object to .h5ddl format.

Dandelion.write_h5ddl([filename, ...])

Write a Dandelion object to .h5ddl format.

Dandelion.write_vdj([folder, ...])

Writes a Dandelion object to contig-annotation formatted files compatible with multiple platforms (10x Genomics, SeekGene, etc.) so that it can be ingested by other tools.

Polars Backend: dandelion.polars

Preprocessing: pp

assign_isotype(fasta[, org, evalue, ...])

Annotate contigs with constant region call using blastn.

assign_isotypes(fastas[, org, evalue, ...])

Annotate contigs with constant region call using blastn.

check_contigs(vdj[, adata, productive_only, ...])

Check contigs for whether they can be considered as ambiguous or not.

format_fasta(fasta[, prefix, suffix, sep, ...])

Add prefix to the headers/contig ids in input fasta and annotation file.

format_fastas(fastas[, prefix, suffix, sep, ...])

Add prefix to the headers/contig ids in input fasta and annotation file.

reannotate_genes(data[, igblast_db, ...])

Reannotate cellranger fasta files with igblastn and parse the output to AIRR format.

reassign_alleles(data, combined_folder[, ...])

Correct allele calls based on a personalized genotype using tigger.

Tools: tl

clone_centrality(vdj)

Calculate node closeness centrality in BCR/TCR network.

clone_degree(vdj[, weight])

Calculate node degree in BCR/TCR network.

clone_diversity(data, group_by[, method, ...])

Compute clonal diversity with bootstrapping.

clone_overlap(vdj, group_by[, ...])

A function to tabulate clonal overlap for input as a circos-style plot.

clone_rarefaction(data, group_by[, ...])

Compute sample-based rarefaction curves with asymptotic extrapolation and optional plotting.

clone_size(vdj[, group_by, max_size, ...])

Quantify clone sizes, globally or per group.

clone_view(adata[, mode, ...])

Swap the 'active' connectivities, distances, and optionally embedding in AnnData.

concat(arrays[, check_unique, ...])

Concatenate data frames and return as Dandelion object.

define_clones(vdj, dist[, action, model, ...])

Find clones using changeo's DefineClones.py.

extract_edge_weights(vdj[, expanded_only])

Retrieve edge weights from graph.

find_clones(vdj[, identity, hard_cutoff, ...])

Find clones based on VDJ chain and VJ chain CDR3 junction hamming distance.

from_scirpy(data)

Convert data from scirpy format to Dandelion format.

generate_network(vdj[, adata, key, ...])

Generate a Levenshtein distance network based on VDJ and VJ sequences.

productive_ratio(adata, vdj, group_by[, ...])

Compute the cell-level productive/non-productive contig ratio.

project_pseudotime_to_cell(adata, pb_adata, ...)

Function to project pseudotime & branch probabilities from pb_adata (pseudobulk adata) to adata (cell adata).

pseudobulk_gex(adata_raw[, pbs, ...])

Function to pseudobulk gene expression (raw count).

pseudotime_transfer(adata, pr_res[, suffix])

Function to add pseudotime and branch probabilities into adata.obs in place.

setup_vdj_pseudobulk(adata, vdj[, mode, ...])

Function to prepare AnnData for computing pseudobulk vdj feature space.

to_scirpy(data[, transfer, to_mudata, ...])

Convert Dandelion data to scirpy-compatible format.

transfer(adata, vdj[, main_view, gex_key, ...])

vdj_pseudobulk(adata[, vdj, pbs, ...])

Function for making pseudobulk vdj feature space.

vdj_sample(vdj_data[, size, adata, p, ...])

Resample vdj data and corresponding AnnData to a specified size.

vj_usage_pca(adata, vdj, group_by[, ...])

Extract productive V/J gene usage from single cell data and compute PCA.

Plotting: pl

barplot(data, color[, palette, figsize, ...])

A barplot function to plot usage of V/J genes in the data.

clone_circlepackplot(data, group_by[, ...])

A bubble plot to visualise clone sizes within groups using circle packing.

clone_network(adata[, basis, edges])

Using scanpy's plotting module to plot the network.

clone_overlap(adata, group_by[, color_by, ...])

A plot function to visualise clonal overlap as a circos-style plot.

productive_ratio(adata[, figsize, palette, ...])

Plot productive/non-productive contig ratio from AnnData (cell level).

spectratype(vdj, color, group_by, locus[, ...])

A spectratype function to plot usage of CDR3 length.

stackedbarplot(data, color, group_by[, ...])

A stacked bar plot function to plot usage of V/J genes in the data split by groups.

Reading: io

read(filename[, distance_zarr, verbose])

Read a Dandelion object from a .zipddl file (hybrid Zarr v3 ZipStore container).

read_10x_airr(file[, prefix, suffix, sep, ...])

Read the airr_rearrangement.tsv produced by Cell Ranger directly and return a DandelionPolars object.

read_10x_vdj([data, filename_prefix, ...])

A parser to read .csv and .json files directly from folder containing 10x cellranger-outputs, or parse an existing pandas/polars DataFrame.

read_airr(file[, prefix, suffix, sep, ...])

Reads a standard single-cell AIRR rearrangement file.

read_bd_airr(file[, prefix, suffix, sep, ...])

Read the TCR or BCR _AIRR.tsv produced from BD Rhapsody technology.

read_parse_airr(file[, prefix, suffix, sep, ...])

Read the TCR or BCR _annotation_airr.tsv produced from Parse Biosciences Evercode technology.

read_seekgene_vdj([data, filename_prefix, ...])

A parser to read .csv and .json files directly from folder containing SeekGene VDJ outputs, or parse an existing pandas/polars DataFrame.

read_ddl(filename[, distance_zarr, verbose])

Read a Dandelion object from a .zipddl file (hybrid Zarr v3 ZipStore container).

read_h5ddl([filename, distance_zarr, verbose])

Read in and return a Dandelion object from .h5ddl format.

read_zipddl(filename[, distance_zarr, verbose])

Read a Dandelion object from a .zipddl file (hybrid Zarr v3 ZipStore container).

Dandelion Class

Dandelion.add_cell_prefix(prefix[, sync, ...])

Add prefix to cell_id and optionally to sequence_id.

Dandelion.add_cell_suffix(suffix[, sync, ...])

Add suffix to cell_id and optionally to sequence_id.

Dandelion.add_sequence_prefix(prefix[, ...])

Add prefix to sequence_id and then apply to cell_id as well.

Dandelion.add_sequence_suffix(suffix[, ...])

Add suffix to sequence_id and then apply to cell_id as well.

Dandelion.clone()

Polars-style clone: duplicate frames and state without sharing cache handles.

Dandelion.compute()

Convert self.distances to a concrete csr matrix.

Dandelion.copy()

Performs a deep copy of all slots in Dandelion class.

Dandelion.data

One-dimensional annotation of contig observations.

Dandelion.data_names

Names of contig observations.

Dandelion.metadata

One-dimensional annotation of cell observations.

Dandelion.metadata_names

Names of cell observations.

Dandelion.n_contigs

Number of contigs.

Dandelion.n_obs

Number of observations.

Dandelion.reset_ids()

Reset both IDs to their original values.

Dandelion.simplify(**kwargs)

Disambiguate VDJ and C gene calls when there are multiple calls separated by commas and strip the alleles.

Dandelion.store_germline_reference([...])

Update germline reference with corrected sequences and store in Dandelion object.

Dandelion.to_anndata()

Convert DandelionPolars.metadata to AnnData.

Dandelion.to_eager()

Convert lazy slots to eager slots.

Dandelion.to_lazy(*[, chunks])

Convert eager slots to lazy slots.

Dandelion.to_pandas()

Convert self from Polars to Pandas implementation.

Dandelion.to_polars([lazy])

Convert self from Pandas to Polars implementation.

Dandelion.update_data([skip])

Sync metadata columns into data via dictionary mapping.

Dandelion.update_metadata([retrieve, ...])

A Dandelion initialisation function to update and populate the .metadata slot.

Dandelion.update_plus([option])

Retrieve additional data columns that are useful.

Dandelion.write([filename, compress])

Write a Dandelion object to a single .zipddl file (Zarr v3 ZipStore, hybrid storage) with optional compression.

Dandelion.write_10x([folder, ...])

Alias for write_vdj() kept for backwards compatibility.

Dandelion.write_airr([filename])

Write a Dandelion object to an AIRR-formatted .tsv file.

Dandelion.write_ddl([filename, compress])

Write a Dandelion object to a single .zipddl file (Zarr v3 ZipStore, hybrid storage) with optional compression.

Dandelion.write_h5ddl([filename, ...])

Write a Dandelion object to .h5ddl format.

Dandelion.write_vdj([folder, ...])

Writes a DandelionPolars object to contig-annotation formatted files compatible with multiple platforms (10x Genomics, SeekGene, etc.) so that it can be ingested by other tools.

Dandelion.write_zipddl([filename, compress])

Write a Dandelion object to a single .zipddl file (Zarr v3 ZipStore, hybrid storage) with optional compression.

Utilities

extract_edge_weights(vdj[, expanded_only])

Retrieve edge weights from graph.

makeblastdb(ref)

Run makeblastdb on constant region fasta file.

Tutorial

setup_dandelion_tutorial_bcr([path])

Download example BCR datasets for Dandelion tutorial.

setup_dandelion_tutorial_parse([path])

Download the extremely large dataset from Parse Biosciences for Dandelion tutorial.

setup_dandelion_tutorial_tcr([path])

Download example TCR datasets for Dandelion tutorial.

setup_dandelion_tutorial_trajectory([path])

Download example datasets for Dandelion V(D)J trajectory tutorial.

Set Backend

set_backend(mode)

Override the active backend at runtime.

Logging

print_header([dependencies])

Versions that are essential for dandelion's operation.

print_versions([dependencies])

Versions that are essential for dandelion's operation.

External

scanpy

recipe_scanpy_qc(adata[, layer, ...])

Recipe for running a standard scanpy QC workflow.

Immcantation

Wrappers for tools in Immcantation pipeline.

Base

changeo

assigngenes_igblast(fasta[, igblast_db, ...])

Reannotate with IgBLASTn.

creategermlines(airr_file[, germline, org, ...])

Wrapper for CreateGermlines.py for reconstructing germline sequences.

makedb_igblast(fasta[, igblast_output, ...])

Parse IgBLAST output to AIRR format.

parsedb_heavy(airr_file)

Parse AIRR tsv file (heavy chain contigs only).

parsedb_light(airr_file)

Parse AIRR tsv file (light chain contigs only).

shazam

calculate_threshold(data[, mode, ...])

Calculate nearest neighbor distances for tuning clonal assignment with shazam.

quantify_mutations(data[, split_locus, ...])

Run basic mutation load analysis.

scoper

identical_clones(vdj[, method, junction, ...])

Clonal assignment using sequence identity partitioning.

hierarchical_clones(vdj, threshold[, ...])

Hierarchical clustering approach to clonal assignment.

spectral_clones(vdj[, method, germline, ...])

Spectral clustering method for clonal partitioning.

Polars

changeo

assigngenes_igblast(fasta[, igblast_db, ...])

Reannotate with IgBLASTn.

creategermlines(airr_file[, germline, org, ...])

Wrapper for CreateGermlines.py for reconstructing germline sequences.

makedb_igblast(fasta[, igblast_output, ...])

Parse IgBLAST output to AIRR format.

parsedb_heavy(airr_file)

Parse AIRR tsv file (heavy chain contigs only).

parsedb_light(airr_file)

Parse AIRR tsv file (light chain contigs only).

shazam

calculate_threshold(data[, mode, ...])

Calculate nearest neighbor distances for tuning clonal assignment with shazam.

quantify_mutations(data[, split_locus, ...])

Run basic mutation load analysis.

scoper

identical_clones(vdj[, method, junction, ...])

Clonal assignment using sequence identity partitioning with Polars.

hierarchical_clones(vdj, threshold[, ...])

Hierarchical clustering approach to clonal assignment with Polars.

spectral_clones(vdj[, method, germline, ...])

Spectral clustering method for clonal partitioning with Polars.

tigger

tigger_genotype(airr_file[, v_germline, ...])

Reassign alleles with TIgGER in R.