dandelion.base.tools.vj_usage_pca

dandelion.base.tools.vj_usage_pca(adata, group_by, min_size=20, mode='abT', use_vdj_v=True, use_vdj_j=True, use_vj_v=True, use_vj_j=True, transfer_mapping=None, n_comps=30, groups=None, allowed_chain_status=['Single pair', 'Extra pair', 'Extra pair-exception', 'Orphan VDJ-exception'], verbose=False, **kwargs)

Extract productive V/J gene usage from single-cell data and compute PCA.

Parameters:
  • adata (AnnData) – AnnData object holding the cell-level metadata with Dandelion VDJ info transferred.

  • group_by (str) – Column name in adata.obs to group cells by; the resulting groups are the observations for PCA.

  • min_size (int, optional) – Minimum number of cells a group must contain to be kept when computing the final matrix. Defaults to 20.

  • mode (Literal[“B”, “abT”, “gdT”], optional) – Mode for extracting the V/J genes. Defaults to “abT”.

  • use_vdj_v (bool, optional) – Whether to use V gene from VDJ contigs for tabulation. Defaults to True.

  • use_vdj_j (bool, optional) – Whether to use J gene from VDJ contigs for tabulation. Defaults to True.

  • use_vj_v (bool, optional) – Whether to use V genes from VJ contigs for tabulation. Defaults to True.

  • use_vj_j (bool, optional) – Whether to use J genes from VJ contigs for tabulation. Defaults to True.

  • transfer_mapping (None, optional) – If provided, the specified columns are mapped from the original AnnData to the output AnnData. Defaults to None.

  • n_comps (int, optional) – Number of principal components to compute. Defaults to 30.

  • groups (list[str] | None, optional) – If provided, only the listed groups/categories are used for computing the PCA. Defaults to None.

  • allowed_chain_status (list[str] | None, optional) – If provided, only the ones in this list are kept from the chain_status column. Defaults to [“Single pair”, “Extra pair”, “Extra pair-exception”, “Orphan VDJ-exception”].

  • verbose (bool, optional) – Whether to display progress. Defaults to False.

  • **kwargs – Additional keyword arguments passed to scanpy.pp.pca.

Returns:

AnnData object with obs as groups and V/J genes as features.

Return type:

AnnData
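
The "groups as observations, V/J genes as features" layout described above can be illustrated with a minimal pure-Python sketch. This is not Dandelion's implementation: the helper tabulate_vj_usage, the toy records, and the keys "group", "v_gene" and "j_gene" are all hypothetical, and the order in which the chain-status and min_size filters are applied is an assumption. Any normalisation of the counts and the PCA step itself are omitted.

```python
from collections import Counter, defaultdict

# Hypothetical toy records standing in for rows of adata.obs.
# The keys are illustrative and are NOT Dandelion's actual column names.
cells = [
    {"group": "donor1", "chain_status": "Single pair", "v_gene": "TRBV7-2", "j_gene": "TRBJ2-1"},
    {"group": "donor1", "chain_status": "Single pair", "v_gene": "TRBV7-2", "j_gene": "TRBJ2-7"},
    {"group": "donor1", "chain_status": "Extra pair", "v_gene": "TRBV20-1", "j_gene": "TRBJ2-1"},
    {"group": "donor2", "chain_status": "Single pair", "v_gene": "TRBV20-1", "j_gene": "TRBJ2-7"},
    {"group": "donor2", "chain_status": "Orphan VJ", "v_gene": "TRBV7-2", "j_gene": "TRBJ2-1"},
]


def tabulate_vj_usage(cells, group_key, gene_keys, min_size,
                      allowed_chain_status=None, groups=None):
    """Count gene usage per group (hypothetical helper, not the library's code)."""
    per_group = defaultdict(list)
    for cell in cells:
        # Assumption: chain-status and group filters are applied before min_size.
        if allowed_chain_status is not None and cell["chain_status"] not in allowed_chain_status:
            continue
        if groups is not None and cell[group_key] not in groups:
            continue
        per_group[cell[group_key]].append(cell)

    # Drop groups that have fewer than min_size cells left after filtering.
    kept = {g: members for g, members in per_group.items() if len(members) >= min_size}

    # Features are the sorted union of all genes seen in the kept groups.
    features = sorted({cell[k] for members in kept.values() for cell in members for k in gene_keys})

    # One row of counts per group, aligned to the feature order.
    matrix = {}
    for g, members in kept.items():
        counts = Counter(cell[k] for cell in members for k in gene_keys)
        matrix[g] = [counts[f] for f in features]
    return features, matrix


features, matrix = tabulate_vj_usage(
    cells,
    group_key="group",
    gene_keys=["v_gene", "j_gene"],
    min_size=2,
    allowed_chain_status=["Single pair", "Extra pair"],
)
print(features)
print(matrix)
```

In this toy run, donor2 retains only one cell after the chain-status filter and is therefore dropped by min_size=2, leaving a single row for donor1.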

Raises:

ValueError – if none of use_vdj_v, use_vdj_j, use_vj_v, use_vj_j is True.
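
The documented error condition can be sketched as a small guard. The helper check_gene_flags and its error message are hypothetical and only mirror the rule stated above; the library's own check and wording may differ.

```python
def check_gene_flags(use_vdj_v=True, use_vdj_j=True, use_vj_v=True, use_vj_j=True):
    """Return the names of the enabled flags, or raise if none is enabled.

    Hypothetical helper illustrating the documented ValueError condition.
    """
    flags = {
        "use_vdj_v": use_vdj_v,
        "use_vdj_j": use_vdj_j,
        "use_vj_v": use_vj_v,
        "use_vj_j": use_vj_j,
    }
    if not any(flags.values()):
        # Illustrative message; not taken from the library.
        raise ValueError(
            "At least one of use_vdj_v, use_vdj_j, use_vj_v, use_vj_j must be True."
        )
    return [name for name, enabled in flags.items() if enabled]


# Restricting tabulation to VDJ contigs only is still valid.
print(check_gene_flags(use_vj_v=False, use_vj_j=False))

# Disabling every flag triggers the error.
try:
    check_gene_flags(False, False, False, False)
except ValueError as err:
    print(f"ValueError: {err}")
```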