dandelion.tools.generate_network

dandelion.tools.generate_network(vdj_data, key=None, clone_key=None, min_size=2, downsample=None, verbose=True, compute_layout=True, layout_method='sfdp', expanded_only=False, use_existing_graph=True, num_cores=1, **kwargs)

Generate a Levenshtein distance network based on full-length VDJ sequence alignments for heavy and light chain(s).

The per-chain distance matrices are then combined into a single matrix.
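The idea of computing one Levenshtein distance matrix per chain and merging them can be illustrated with a minimal pure-Python sketch. This is not dandelion's implementation: the helper names (`levenshtein`, `distance_matrix`, `combine`), the toy sequences, and the choice to combine matrices by element-wise summation are all assumptions made for illustration.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def distance_matrix(seqs: list[str]) -> list[list[int]]:
    """Symmetric pairwise Levenshtein distance matrix for one chain."""
    n = len(seqs)
    mat = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        mat[i][j] = mat[j][i] = levenshtein(seqs[i], seqs[j])
    return mat


def combine(matrices: list[list[list[int]]]) -> list[list[int]]:
    """Merge per-chain matrices by element-wise sum (an assumed rule)."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


# Hypothetical toy data: one heavy and one light chain sequence per cell.
heavy = ["CARDY", "CARDW", "CTRDY"]
light = ["CQQS", "CQQS", "CQHT"]

combined = combine([distance_matrix(heavy), distance_matrix(light)])
```

In this sketch, cells 0 and 1 differ by one residue in the heavy chain and share a light chain, so their combined distance is 1, while cell 2 is further from both.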

Parameters:
  • vdj_data (Dandelion | pd.DataFrame | str) – Dandelion object, pandas DataFrame in changeo/airr format, or file path to changeo/airr file after clones have been determined.

  • key (str | None, optional) – column name for distance calculations. None defaults to ‘sequence_alignment_aa’.

  • clone_key (str | None, optional) – column name to build network on.

  • min_size (int, optional) – For visualization purposes, two graphs are created: one containing all cells and a second, trimmed graph. This value specifies the minimum number of edges a node must have to be kept; nodes with fewer edges are trimmed from the second graph.

  • downsample (int | None, optional) – number of cells to randomly downsample to before constructing the network. If None, no downsampling is performed. If provided, a new Dandelion object is returned.

  • verbose (bool, optional) – whether or not to print the progress bars.

  • compute_layout (bool, optional) – whether or not to generate the layout. May be time-consuming when there are many cells.

  • layout_method (Literal["sfdp", "mod_fr"], optional) – accepts one of ‘sfdp’ or ‘mod_fr’. ‘sfdp’ refers to sfdp_layout from graph_tool (C++ implementation; fast) whereas ‘mod_fr’ refers to the modified Fruchterman-Reingold layout originally implemented in dandelion (Python implementation; slow).

  • expanded_only (bool, optional) – whether or not to only compute layout on expanded clonotypes.

  • use_existing_graph (bool, optional) – whether or not to just compute the layout using the existing graph if it exists in the Dandelion object.

  • num_cores (int, optional) – if more than 1, parallelise the minimum spanning tree calculation step.

  • **kwargs – additional keyword arguments passed on to networkx.drawing.layout.spring_layout or graph_tool.draw.sfdp_layout.
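
The trimming controlled by min_size can be pictured with a small sketch. This is one possible reading of the parameter description, not dandelion's actual code: the function name `trim_graph`, the edge-list representation, and the single-pass degree filter are assumptions made for illustration.

```python
from collections import Counter


def trim_graph(edges, min_size=2):
    """Keep only edges whose two endpoints each have at least min_size edges.

    Degrees are counted once on the full graph; the filter is not
    re-applied after nodes are removed (an assumed simplification).
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    keep = {node for node, count in degree.items() if count >= min_size}
    return [(u, v) for u, v in edges if u in keep and v in keep]


# Hypothetical toy graph: a triangle (a, b, c) with one dangling node d.
full_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
trimmed_edges = trim_graph(full_edges, min_size=2)
```

Here node d has a single edge, so it is dropped from the trimmed graph while the full edge list is left unchanged, mirroring the two-graph arrangement described for min_size.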

Returns:

Dandelion object with .edges, .layout, .graph initialized.

Return type:

Dandelion

Raises:

ValueError – if there are any errors with the Dandelion input.