dandelion.polars.io.read_seekgene_vdj

dandelion.polars.io.read_seekgene_vdj(data=None, filename_prefix=None, prefix=None, suffix=None, sep='_', remove_malformed=True, remove_trailing_hyphen_number=False, verbose=False)

A parser that reads .csv and .json files directly from a folder containing SeekGene VDJ outputs, or parses an existing pandas/polars DataFrame.

SeekGene produces contig annotation files in the same format as 10x CellRanger VDJ output. This function is a convenience wrapper around read_10x_vdj() with SeekGene-specific naming for clarity.

When reading from a file path, at least one of {filename_prefix}_contig_annotations.csv or all_contig_annotations.json must be present.

If .fasta or .json files are found in the same folder, the additional information they contain is appended to the final table.
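The file requirement above can be sketched as a small check. This is only an illustration of the documented rule, not the library's own code: `find_contig_files` is a hypothetical helper, and it assumes the JSON file name is fixed while only the CSV name follows `filename_prefix`, as the text above is written.

```python
from pathlib import Path


def find_contig_files(folder, filename_prefix=None):
    """Return the annotation files present in `folder` (hypothetical helper)."""
    folder = Path(folder)
    # None falls back to "all", as documented for `filename_prefix`.
    stem = "all" if filename_prefix is None else filename_prefix
    candidates = {
        "csv": folder / f"{stem}_contig_annotations.csv",
        # Assumption: the JSON name is fixed, as written in the text above.
        "json": folder / "all_contig_annotations.json",
    }
    found = {kind: path for kind, path in candidates.items() if path.is_file()}
    if not found:
        # Mirrors the documented OSError when neither file is present.
        raise OSError(f"no contig annotation .csv or .json found in {folder}")
    return found
```

Running a check like this before calling the reader can give an earlier, more specific error when a SeekGene output folder is incomplete.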

Parameters:
  • data (Path | str | pandas.DataFrame | polars.DataFrame | polars.LazyFrame | None) – path to a folder containing .csv and/or .json files, path to a file directly, or a pandas/polars DataFrame containing the contig annotation data.

  • filename_prefix (str | None, optional) – prefix of file name preceding ‘_contig’. None defaults to ‘all’. Only used when data is a file/folder.

  • prefix (str | None, optional) – prefix to prepend to sequence_id and cell_id.

  • suffix (str | None, optional) – suffix to append to sequence_id and cell_id.

  • sep (str, optional) – separator placed between the prefix/suffix and the ID.

  • remove_malformed (bool, optional) – whether or not to remove malformed contigs.

  • remove_trailing_hyphen_number (bool, optional) – whether or not to remove the trailing hyphen number e.g. ‘-1’ from the cell/contig barcodes.

  • verbose (bool, optional) – whether or not to print messages during creation of the DandelionPolars object.
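The combined effect of prefix, suffix, sep and remove_trailing_hyphen_number on a cell barcode can be sketched as follows. `relabel` is a hypothetical helper written for illustration. The order of operations (strip the trailing number first, then add prefix and suffix) is an assumption, and contig IDs, which carry an extra `_contig_N` part, may be handled differently by the library.

```python
import re


def relabel(barcode, prefix=None, suffix=None, sep="_",
            remove_trailing_hyphen_number=False):
    """Illustrative only: apply the documented ID options to one barcode."""
    if remove_trailing_hyphen_number:
        # Drop a trailing "-<digits>" such as "-1".
        barcode = re.sub(r"-\d+$", "", barcode)
    if prefix is not None:
        barcode = f"{prefix}{sep}{barcode}"
    if suffix is not None:
        barcode = f"{barcode}{sep}{suffix}"
    return barcode


print(relabel("AAACCTGAGAATCTCC-1", prefix="sample1"))
# sample1_AAACCTGAGAATCTCC-1
print(relabel("AAACCTGAGAATCTCC-1", suffix="s1", sep="-",
              remove_trailing_hyphen_number=True))
# AAACCTGAGAATCTCC-s1
```

Giving each sample a distinct prefix or suffix keeps barcodes unique when several samples are later combined.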

Returns:

DandelionPolars object holding the parsed data.

Return type:

DandelionPolars

Raises:
  • OSError – if neither the contig_annotations.csv nor the all_contig_annotations.json file is found in the input folder.

  • TypeError – if data is not a valid type (Path, str, DataFrame, or LazyFrame).
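
A minimal usage sketch, using only the signature and exceptions documented above. `load_seekgene` is a hypothetical wrapper, and the choice to skip a sample on OSError is illustrative rather than recommended behaviour.

```python
from pathlib import Path


def load_seekgene(folder, sample_name=None):
    """Read one SeekGene output folder, or return None if files are missing."""
    # Imported lazily so the wrapper can be defined without dandelion installed.
    from dandelion.polars.io import read_seekgene_vdj

    try:
        return read_seekgene_vdj(
            data=Path(folder),
            prefix=sample_name,  # prepended to sequence_id and cell_id
            sep="_",
            remove_trailing_hyphen_number=True,
        )
    except OSError as err:
        # Documented case: no contig annotation .csv or .json in the folder.
        print(f"skipping {folder}: {err}")
        return None
```

TypeError is deliberately not caught here, since an invalid data type usually indicates a bug in the calling code rather than a missing file.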