changeo.Alignment

Alignment manipulation

class changeo.Alignment.RegionDefinition(junction_length, amino_acid=False, definition='default')

Bases: object

FWR and CDR region boundary definitions

getRegions(seq)

Returns the IMGT-defined FWR and CDR regions.

Parameters:

seq – IMGT-gapped sequence.

Returns:

dictionary of FWR and CDR sequences.

Return type:

dict

changeo.Alignment.alignmentPositions(alignment)

Extracts the query and reference start positions and lengths from an alignment.

Parameters:

alignment – tuples of (operation, length) for each alignment operation.

Returns:

query (q) and reference (r) start (0-based) and length information with keys

{q_start, q_length, r_start, r_length}.

Return type:

dict
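The bookkeeping described above can be illustrated with a standalone sketch. It is not the changeo implementation: the function name alignment_positions is hypothetical, and the sketch assumes that leading S (soft clip) and N (skipped reference) operations encode the query and reference start offsets, with M, =, X, I and D following the usual CIGAR semantics.

```python
def alignment_positions(alignment):
    """Summarize start offsets and aligned lengths (illustrative only)."""
    result = {"q_start": 0, "q_length": 0, "r_start": 0, "r_length": 0}

    # Assumption: leading S operations offset the query and
    # leading N operations offset the reference.
    idx = 0
    while idx < len(alignment) and alignment[idx][0] in ("S", "N"):
        op, length = alignment[idx]
        if op == "S":
            result["q_start"] += length
        else:
            result["r_start"] += length
        idx += 1

    # Matches and mismatches consume both sequences, insertions consume
    # only the query, and deletions consume only the reference.
    for op, length in alignment[idx:]:
        if op in ("M", "=", "X"):
            result["q_length"] += length
            result["r_length"] += length
        elif op == "I":
            result["q_length"] += length
        elif op == "D":
            result["r_length"] += length

    return result


example = [("S", 5), ("N", 10), ("=", 20), ("X", 1), ("I", 2), ("D", 3), ("=", 4)]
positions = alignment_positions(example)
```

Here the query starts at offset 5 and spans 27 positions, while the reference starts at offset 10 and spans 28.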

changeo.Alignment.decodeBTOP(btop)

Parse a BTOP string into a list of tuples in CIGAR annotation.

Parameters:

btop – BTOP string.

Returns:

tuples of (operation, length) for each operation in the BTOP string using CIGAR annotation.

Return type:

list
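As a rough illustration of the conversion, the sketch below tokenizes a BTOP string and maps each token to a CIGAR-style operation. It is a standalone example, not the changeo code: decode_btop is a hypothetical name, and the sketch assumes each two-character pair is written query first and subject second, so a gap in the query column becomes D and a gap in the subject column becomes I. That orientation is an assumption and should be checked against the output of decodeBTOP before relying on it.

```python
import re


def decode_btop(btop):
    """Convert a BTOP string to (operation, length) tuples (illustrative only)."""
    operations = []
    # A token is either a run of digits (match count) or a two-character pair.
    for token in re.findall(r"\d+|[A-Za-z*-]{2}", btop):
        if token.isdigit():
            op, length = "=", int(token)
        elif token[0] == "-":
            # Assumption: gap in the query column, base in the subject.
            op, length = "D", 1
        elif token[1] == "-":
            # Assumption: base in the query, gap in the subject column.
            op, length = "I", 1
        else:
            op, length = "X", 1

        # Merge consecutive tokens of the same type into one operation.
        if operations and operations[-1][0] == op:
            operations[-1] = (op, operations[-1][1] + length)
        else:
            operations.append((op, length))
    return operations


simple = decode_btop("7AG39")
mixed = decode_btop("3AGTC-A-A2G-1")
```

Under these assumptions, "7AG39" reads as 7 matches, 1 mismatch, then 39 matches.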

changeo.Alignment.decodeCIGAR(cigar)

Parse a CIGAR string into a list of tuples.

Parameters:

cigar – CIGAR string.

Returns:

tuples of (operation, length) for each operation in the CIGAR string.

Return type:

list
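A minimal sketch of this kind of parsing, written independently of changeo, is shown below. The name decode_cigar is hypothetical, and the operation letters accepted are those of the SAM specification; whether decodeCIGAR accepts exactly the same set is an assumption.

```python
import re


def decode_cigar(cigar):
    """Split a CIGAR string into (operation, length) tuples (illustrative only)."""
    # Each element is a run length followed by a single operation character.
    pairs = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return [(op, int(length)) for length, op in pairs]


decoded = decode_cigar("5S10N20=1X2I3D4=")
```

The tuples put the operation first and the length second, matching the (operation, length) layout described above.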

changeo.Alignment.encodeCIGAR(alignment)

Encodes a list of tuples with alignment information into a CIGAR string.

Parameters:

alignment – tuples of (type, length) for each alignment operation.

Returns:

CIGAR string.

Return type:

str
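The encoding step amounts to writing each length followed by its operation character. The following standalone sketch shows the idea; encode_cigar is a hypothetical name and not the changeo function itself.

```python
def encode_cigar(alignment):
    """Join (operation, length) tuples into a CIGAR string (illustrative only)."""
    # CIGAR writes the run length before the operation character.
    return "".join(f"{length}{op}" for op, length in alignment)


encoded = encode_cigar([("S", 5), ("=", 20), ("X", 1)])
```

For the three operations above, the sketch yields the string 5S20=1X.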

changeo.Alignment.gapV(seq, v_germ_start, v_germ_length, v_call, references, asis_calls=False)

Constructs IMGT-gapped V segment sequences.

Parameters:
  • seq (str) – V(D)J sequence alignment (SEQUENCE_VDJ).

  • v_germ_start (int) – start position of the V segment alignment in the germline (V_GERM_START_VDJ, 1-based).

  • v_germ_length (int) – length of the V segment alignment against the germline (V_GERM_LENGTH_VDJ, 1-based).

  • v_call (str) – V segment allele assignment (V_CALL).

  • references (dict) – dictionary of IMGT-gapped reference sequences.

  • asis_calls (bool) – if True do not parse v_call for allele names and just split by comma.

Returns:

dictionary containing IMGT-gapped query sequences and germline positions.

Return type:

dict

Raises:

KeyError – raised if the v_call is not found in the reference dictionary.
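The gapping idea can be sketched in a simplified, standalone form: pad the query so it starts at germline position 1, then copy each gap character from the IMGT-gapped germline into the query at the same index. This is not the changeo implementation. The function gap_v_sketch, the use of "." as the padding character, and the result keys are all assumptions made for illustration, and the sketch takes a single gapped germline string rather than a v_call and a reference dictionary.

```python
def gap_v_sketch(seq, v_germ_start, v_germ_length, germline_gapped):
    """Insert germline gaps into a query sequence (illustrative only)."""
    # Pad the 5' end so index 0 lines up with the first germline position.
    gapped = "." * (v_germ_start - 1) + seq
    # Number of characters added to the V segment so far.
    added = v_germ_start - 1

    for i, char in enumerate(germline_gapped):
        if char != ".":
            continue
        # Stop once the gap lies beyond the aligned V segment.
        if i >= v_germ_length + added:
            break
        gapped = gapped[:i] + "." + gapped[i:]
        added += 1

    # Hypothetical result keys, chosen for this sketch.
    return {
        "sequence": gapped,
        "v_start": 1,
        "v_length": v_germ_length + added,
    }


# Toy germline "ACGTTA" with a three-position gap after "ACG".
result = gap_v_sketch("CGTTAGGG", 2, 5, "ACG...TTA")
```

In the toy example the query aligns to germline positions 2 through 6, so one padding character and three gap characters are added, and the V segment grows from 5 to 9 positions.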

changeo.Alignment.getRegions(seq, junction_length)

Identify FWR and CDR regions by IMGT definition.

Parameters:
  • seq – IMGT-gapped sequence.

  • junction_length – length of the junction region in nucleotides.

Returns:

dictionary of FWR and CDR sequences.

Return type:

dict
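As an illustration of slicing by IMGT definition, the sketch below uses the standard IMGT unique numbering boundaries in nucleotides (FWR1 1-78, CDR1 79-114, FWR2 115-165, CDR2 166-195, FWR3 196-312) and takes CDR3 as the junction minus its two flanking conserved codons. It is a standalone example: get_regions_sketch and the lowercase result keys are hypothetical, and the keys returned by getRegions may differ.

```python
def get_regions_sketch(seq, junction_length):
    """Slice an IMGT-gapped nucleotide sequence into regions (illustrative only)."""
    # The junction includes the conserved codons 104 and 118,
    # so CDR3 is 6 nucleotides shorter than the junction.
    cdr3_end = 312 + junction_length - 6
    return {
        "fwr1": seq[0:78],
        "cdr1": seq[78:114],
        "fwr2": seq[114:165],
        "cdr2": seq[165:195],
        "fwr3": seq[195:312],
        "cdr3": seq[312:cdr3_end],
        "fwr4": seq[cdr3_end:],
    }


# Synthetic sequence with a different letter for each neighboring region.
toy = "A" * 78 + "C" * 36 + "G" * 51 + "T" * 30 + "A" * 117 + "C" * 30 + "G" * 33
regions = get_regions_sketch(toy, junction_length=36)
```

With a junction of 36 nucleotides, the CDR3 slice covers 30 nucleotides and the remainder is assigned to FWR4.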

changeo.Alignment.inferJunction(seq, j_germ_start, j_germ_length, j_call, references, asis_calls=False, regions='default')

Identify junction region by IMGT definition.

Parameters:
  • seq (str) – IMGT-gapped V(D)J sequence alignment (SEQUENCE_IMGT).

  • j_germ_start (int) – start position of the J segment alignment in the germline (J_GERM_START, 1-based).

  • j_germ_length (int) – length of the J segment alignment against the germline (J_GERM_LENGTH).

  • j_call (str) – J segment allele assignment (J_CALL).

  • references (dict) – dictionary of IMGT-gapped reference sequences.

  • asis_calls (bool) – if True do not parse j_call for allele names and just split by comma.

  • regions (str) – name of the IMGT FWR/CDR region definitions to use.

Returns:

dictionary containing junction sequence, translation and length.

Return type:

dict

changeo.Alignment.padAlignment(alignment, q_start, r_start)

Pads the start of an alignment based on query and reference positions.

Parameters:
  • alignment – tuples of (operation, length) for each alignment operation.

  • q_start – query (input) start position (0-based).

  • r_start – reference (subject) start position (0-based).

Returns:

updated list of tuples of (operation, length) for the alignment.

Return type:

list
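One way to picture the padding is as prepending, or extending, a soft clip for the query offset and a skipped-reference operation for the reference offset. The sketch below is a standalone illustration under that assumption. The name pad_alignment is hypothetical, and the choice of S and N as the padding operations is assumed rather than taken from the text above.

```python
def pad_alignment(alignment, q_start, r_start):
    """Prepend query and reference offsets to an alignment (illustrative only)."""
    # Work on a copy so the caller's list is left unchanged.
    result = list(alignment)

    # Query offset: extend an existing leading soft clip or add a new one.
    if result and result[0][0] == "S":
        result[0] = ("S", result[0][1] + q_start)
    elif q_start > 0:
        result.insert(0, ("S", q_start))

    # Reference offset: placed directly after any leading soft clip.
    idx = 1 if result and result[0][0] == "S" else 0
    if idx < len(result) and result[idx][0] == "N":
        result[idx] = ("N", result[idx][1] + r_start)
    elif r_start > 0:
        result.insert(idx, ("N", r_start))

    return result


padded = pad_alignment([("=", 20), ("X", 1)], q_start=5, r_start=10)
extended = pad_alignment([("S", 2), ("N", 3), ("=", 4)], q_start=1, r_start=1)
```

The first call adds new S and N operations, while the second extends the ones already present at the start of the alignment.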