dandelion.polars.preprocessing.assign_isotype

dandelion.polars.preprocessing.assign_isotype(fasta, org='human', evalue=0.0001, correct_c_call=True, correction_dict=None, plot=True, save_plot=False, show_plot=True, figsize=(4, 4), blastdb=None, filename_prefix=None, additional_args=[])

Annotate contigs with a constant region call using blastn.

Parameters:
  • fasta (Path | str) – path to fasta file.

  • org (Literal["human", "mouse"], optional) – organism of reference folder.

  • evalue (float, optional) – statistical significance (EXPECT) threshold for reporting matches against database sequences. Lower thresholds are more stringent and report only high-similarity matches. Choose a higher value (for example, 1 or more) if you expect low identity between your query sequence and the targets.

  • correct_c_call (bool, optional) – whether or not to adjust the c_calls after blast, based on the primers specified in the correction_dict option.

  • correction_dict (dict[str, dict[str, str]] | None, optional) – a nested dictionary containing isotypes/c_genes as keys and primer sequences as records, used for correcting annotated c_calls. Defaults to a curated dictionary for human sequences if left as None.

  • plot (bool, optional) – whether or not to plot reassignment summary metrics.

  • save_plot (bool, optional) – whether or not to save plot.

  • show_plot (bool, optional) – whether or not to show plot.

  • figsize (tuple[float, float], optional) – size of figure.

  • blastdb (Path | str | None, optional) – path to blast database. Defaults to the $BLASTDB environment variable.

  • filename_prefix (str | None, optional) – prefix of file name preceding ‘_contig’. None defaults to ‘all’.

  • additional_args (list[str], optional) – additional arguments to pass to blastn.
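
The parameters above can be combined as in the following minimal sketch. The import path and keyword names come from the signature on this page; the wrapper name annotate_isotype, the constant ISOTYPE_KWARGS, and the input path are hypothetical and only serve the illustration. The values shown are example choices, not recommendations.

```python
from pathlib import Path

# Keyword arguments named exactly as in the signature above.
# The values are illustrative choices, not recommendations.
ISOTYPE_KWARGS = {
    "org": "human",          # or "mouse"
    "evalue": 1e-4,          # the documented default
    "correct_c_call": True,
    "plot": False,           # skip the summary plot, e.g. in batch jobs
    "blastdb": None,         # None falls back to $BLASTDB
}


def annotate_isotype(fasta: Path) -> None:
    """Call assign_isotype on one FASTA file (hypothetical wrapper)."""
    # Imported lazily so this module loads even where dandelion is absent.
    from dandelion.polars.preprocessing import assign_isotype

    assign_isotype(fasta, **ISOTYPE_KWARGS)


# Hypothetical input path; replace with your own contig FASTA.
example_fasta = Path("sample1/all_contig.fasta")
if example_fasta.is_file():
    annotate_isotype(example_fasta)
```

Keeping the keyword arguments in one dictionary makes it easy to reuse the same settings across several samples.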

Raises:

FileNotFoundError – if the path to the fasta file does not exist.
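
When many samples are processed in a loop, it can help to check each input before handing it to assign_isotype. The helper below is a hypothetical pre-flight check, not part of dandelion; it raises the same exception type listed above so that callers can handle both cases uniformly.

```python
from pathlib import Path


def check_fasta(fasta) -> Path:
    """Return the path as a Path object, or raise if it is not a file.

    Hypothetical helper for illustration; it is not part of dandelion.
    """
    path = Path(fasta)
    if not path.is_file():
        raise FileNotFoundError(f"fasta file not found: {path}")
    return path


# Usage: report a missing input instead of stopping the whole batch.
try:
    check_fasta("missing_sample/all_contig.fasta")  # hypothetical path
except FileNotFoundError as err:
    print(f"skipping sample: {err}")
```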