dandelion.polars.preprocessing.assign_isotypes
- dandelion.polars.preprocessing.assign_isotypes(fastas, org='human', evalue=10000.0, correct_c_call=True, correction_dict=None, plot=True, save_plot=False, show_plot=True, figsize=(4, 4), blastdb=None, filename_prefix=None, additional_args=[])
Annotate contigs with constant region calls using blastn.
- Parameters:
fastas (list[Path | str] | Path | str) – path(s) to fasta file(s).
org (Literal["human", "mouse"], optional) – organism of the reference folder.
evalue (float, optional) – the statistical significance (E-value, or EXPECT) threshold for reporting matches against database sequences. Lower thresholds are more stringent and report only high-similarity matches. Choose a higher value (for example, 1 or more) if you expect low identity between your query sequence and the targets.
correct_c_call (bool, optional) – whether or not to adjust the c_calls after blast, based on the primers provided in the correction_dict option.
correction_dict (dict[str, dict[str, str]] | None, optional) – a nested dictionary containing isotypes/c_genes as keys and primer sequences as records, used for correcting annotated c_calls. Defaults to a curated dictionary for human sequences if left as None.
plot (bool, optional) – whether or not to plot reassignment summary metrics.
save_plot (bool, optional) – whether or not to save plots.
show_plot (bool, optional) – whether or not to show plots.
figsize (tuple[float, float], optional) – size of figure.
blastdb (Path | str | None, optional) – path to blast database. Defaults to the $BLASTDB environment variable.
filename_prefix (list[str] | str | None, optional) – list of prefixes of file names preceding '_contig'. None defaults to 'all'.
additional_args (list[str], optional) – additional arguments to pass to blastn.
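
Example

The parameters above can be put together roughly as follows. This is a minimal sketch, not taken from the library's own documentation: the FASTA paths are hypothetical, the primer sequences in correction_dict are placeholders rather than real constant region sequences, and the run_assign_isotypes helper is an illustrative wrapper. The nesting shown for correction_dict (isotype, then c_gene, then primer sequence) is an assumption based on the dict[str, dict[str, str]] type given above.

```python
from pathlib import Path

# Hypothetical input files; with filename_prefix=None the file names are
# expected to start with 'all' before '_contig'.
fastas = [Path("sample1/all_contig.fasta"), Path("sample2/all_contig.fasta")]

# Placeholder primers for illustration only, NOT real constant region sequences.
# Assumed layout: outer key = isotype, inner key = c_gene, value = primer sequence.
correction_dict = {
    "IGHG": {
        "IGHG1": "ACGTACGTACGT",
        "IGHG2": "ACGTACGTTCGT",
    },
}

# Keyword arguments mirroring the documented signature.
kwargs = {
    "org": "human",
    "evalue": 10000.0,          # documented default; lower values are more stringent
    "correct_c_call": True,     # adjust c_calls using correction_dict
    "correction_dict": correction_dict,
    "plot": False,              # skip the reassignment summary plot
    "blastdb": None,            # fall back to the $BLASTDB environment variable
    "filename_prefix": None,    # defaults to 'all'
}


def run_assign_isotypes(fastas, **kwargs):
    """Hypothetical wrapper: import lazily, then call assign_isotypes."""
    from dandelion.polars.preprocessing import assign_isotypes

    return assign_isotypes(fastas, **kwargs)


# Uncomment once dandelion, blastn and a blast database are available:
# run_assign_isotypes(fastas, **kwargs)
```

Setting plot=False keeps the call non-interactive, which may be preferable in batch pipelines; leave it as True to inspect the reassignment summary.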