dandelion.base.io.read_10x_vdj

dandelion.base.io.read_10x_vdj(data, filename_prefix=None, prefix=None, suffix=None, sep='_', remove_malformed=True, remove_trailing_hyphen_number=False, verbose=False)

A parser that reads .csv and .json files directly from a folder containing 10x cellranger outputs, or parses an existing pandas DataFrame.

This function parses the 10x output files into an AIRR compatible format.

When reading from a file path, at least one of {filename_prefix}_contig_annotations.csv or all_contig_annotations.json must be present.

If .fasta or .json files are found in the same folder, additional information is appended to the final table.
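The minimum file requirement described above can be sketched as a simple lookup. This is only an illustration under stated assumptions, not the library's actual implementation: find_contig_file is a hypothetical helper, and the search order (the .csv first, then the .json) is an assumption.

```python
from pathlib import Path
from typing import Optional


def find_contig_file(folder: Path, filename_prefix: Optional[str] = None) -> Path:
    """Hypothetical helper: return the first contig annotations file found in folder."""
    # Per the parameter description, None defaults to the 'all' prefix.
    prefix = "all" if filename_prefix is None else filename_prefix
    candidates = [
        folder / f"{prefix}_contig_annotations.csv",
        folder / "all_contig_annotations.json",
    ]
    # Assumed order: prefer the .csv, fall back to the .json.
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    # Mirrors the documented OSError when neither file is present.
    raise OSError(f"No contig annotations .csv or .json file found in {folder}")
```

Calling find_contig_file(Path("my_sample"), "filtered") would look for my_sample/filtered_contig_annotations.csv before falling back to my_sample/all_contig_annotations.json; the folder name here is illustrative.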

Parameters:

data (Path | str | pandas.DataFrame) – path to folder containing .csv and/or .json files, path to files directly, or a pandas DataFrame containing the contig annotations data.
filename_prefix (str | None, optional) – prefix of file name preceding ‘_contig’. None defaults to ‘all’. Only used when data is a file/folder.
prefix (str | None, optional) – Prefix to prepend to sequence_id and cell_id.
suffix (str | None, optional) – Suffix to append to sequence_id and cell_id.
sep (str, optional) – the separator placed between the prefix/suffix and the original sequence_id/cell_id.
remove_malformed (bool, optional) – whether or not to remove malformed contigs.
remove_trailing_hyphen_number (bool, optional) – whether or not to remove the trailing hyphen number e.g. ‘-1’ from the cell/contig barcodes.
verbose (bool, optional) – whether or not to print messages during creation of the Dandelion object.
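To make the interaction of prefix, suffix, sep and remove_trailing_hyphen_number concrete, here is a minimal sketch applied to a single cell barcode. relabel_barcode is a hypothetical helper written for illustration; the order of operations (strip the trailing hyphen number first, then add the prefix and suffix) is an assumption, and the real function may treat contig-level sequence_id values differently.

```python
import re
from typing import Optional


def relabel_barcode(
    barcode: str,
    prefix: Optional[str] = None,
    suffix: Optional[str] = None,
    sep: str = "_",
    remove_trailing_hyphen_number: bool = False,
) -> str:
    """Hypothetical helper: apply the renaming options to one cell barcode."""
    if remove_trailing_hyphen_number:
        # Drop a trailing '-<digits>' such as '-1' (assumed to happen first).
        barcode = re.sub(r"-\d+$", "", barcode)
    if prefix is not None:
        barcode = f"{prefix}{sep}{barcode}"
    if suffix is not None:
        barcode = f"{barcode}{sep}{suffix}"
    return barcode


print(relabel_barcode("AAACCTGAGAAACCAT-1", prefix="sample1"))
print(relabel_barcode("AAACCTGAGAAACCAT-1", suffix="sample1", remove_trailing_hyphen_number=True))
```

Under these assumptions, a prefix or suffix keeps barcodes unique when several samples are later combined, since the same 10x barcode can occur in more than one sample.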

Returns:

Dandelion object holding the parsed data.

Return type:

Dandelion

Raises:

OSError – if neither a contig_annotations.csv nor an all_contig_annotations.json file is found in the input folder.
TypeError – if data is not a valid type (Path, str, or pandas.DataFrame).