dandelion.polars.preprocessing.assign_isotype
- dandelion.polars.preprocessing.assign_isotype(fasta, org='human', evalue=0.0001, correct_c_call=True, correction_dict=None, plot=True, save_plot=False, show_plot=True, figsize=(4, 4), blastdb=None, filename_prefix=None, additional_args=[])
Annotate contigs with constant region call using blastn.
- Parameters:
fasta (Path | str) – path to fasta file.
org (Literal[“human”, “mouse”], optional) – organism of reference folder.
evalue (float, optional) – statistical significance threshold for reporting matches against database sequences. Lower EXPECT thresholds are more stringent and report only high-similarity matches. Choose a higher EXPECT value (for example, 1 or more) if you expect low identity between your query sequence and the targets.
correct_c_call (bool, optional) – whether or not to adjust the c_calls after blast based on the primers provided in the correction_dict option.
correction_dict (dict[str, dict[str, str]] | None, optional) – a nested dictionary containing isotypes/c_genes as keys and primer sequences as records, used for correcting annotated c_calls. Defaults to a curated dictionary for human sequences if left as None.
plot (bool, optional) – whether or not to plot reassignment summary metrics.
save_plot (bool, optional) – whether or not to save plot.
show_plot (bool, optional) – whether or not to show plot.
figsize (tuple[float, float], optional) – size of figure.
blastdb (Path | str | None, optional) – path to blast database. Defaults to the $BLASTDB environment variable.
filename_prefix (str | None, optional) – prefix of file name preceding ‘_contig’. None defaults to ‘all’.
additional_args (list[str], optional) – additional arguments to pass to blastn.
- Raises:
FileNotFoundError – if path to fasta file is unknown.
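Example:
The parameters above can be put together as in the following minimal sketch. It is not taken from the dandelion source. The helpers check_correction_dict and build_call_kwargs, the primer sequences, and the file path are hypothetical and exist only for illustration. The nesting shown (isotype, then C gene name, then sequence) is one reading of the dict[str, dict[str, str]] type hint, and the DNA-alphabet check is an added assumption. The actual call is left commented out because it needs dandelion and blastn installed.

```python
from pathlib import Path

# Placeholder primer sequences, invented for illustration only.
# Assumed layout: isotype -> {C gene name -> primer sequence}.
example_correction_dict = {
    "IGHG": {
        "IGHG1": "ACGTACGTACGT",
        "IGHG2": "ACGTACGTTCGT",
    },
    "IGHA": {
        "IGHA1": "TTGACCGGTACA",
        "IGHA2": "TTGACCGGTGCA",
    },
}


def check_correction_dict(correction_dict):
    """Return True if the object looks like dict[str, dict[str, str]] of DNA strings."""
    if not isinstance(correction_dict, dict):
        return False
    for isotype, primers in correction_dict.items():
        if not isinstance(isotype, str) or not isinstance(primers, dict):
            return False
        for c_gene, seq in primers.items():
            if not isinstance(c_gene, str) or not isinstance(seq, str):
                return False
            # Assumption: primers are plain DNA strings (A, C, G, T, N).
            if not seq or set(seq.upper()) - set("ACGTN"):
                return False
    return True


def build_call_kwargs(fasta, correction_dict=None, blastdb=None, evalue=1e-4):
    """Check the inputs up front and collect keyword arguments for assign_isotype."""
    fasta = Path(fasta)
    if not fasta.is_file():
        # Mirrors the documented FileNotFoundError, raised before any blast run.
        raise FileNotFoundError(f"fasta file not found: {fasta}")
    if correction_dict is not None and not check_correction_dict(correction_dict):
        raise ValueError("correction_dict should be dict[str, dict[str, str]]")
    return {
        "fasta": fasta,
        "org": "human",
        "evalue": evalue,  # documented default is 0.0001
        "correct_c_call": True,
        "correction_dict": correction_dict,  # None falls back to the curated default
        "plot": False,
        "blastdb": blastdb,  # None falls back to $BLASTDB
    }


# With dandelion and blastn installed, the call would look like this:
# from dandelion.polars.preprocessing import assign_isotype
# assign_isotype(**build_call_kwargs("path/to/contigs.fasta", example_correction_dict))
```

Checking the inputs before the call surfaces a malformed dictionary or a missing fasta file without waiting on a blastn run.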