changeo.Parsers

Alignment tool parsing functions

class changeo.Parsers.IHMMuneReader(ihmmune, seq_dict, repo_dict, parse_scores=False, parse_regions=False, ig=True)

Bases: object

An iterator to read and parse iHMMune-Align output files.

__iter__()

Iterator initializer.

Returns:changeo.Parsers.IHMMuneReader
__next__()

Next method.

Returns:parsed iHMMune-Align result as an IgRecord (ig=True) or dictionary (ig=False).
Return type:changeo.Receptor.IgRecord
fields

List of ordered output field names.

ihmmune_fields = ['SEQUENCE_ID', 'V_CALL', 'D_CALL', 'J_CALL', 'V_SEQ', 'NP1_SEQ', 'D_SEQ', 'NP2_SEQ', 'J_SEQ', 'V_MUT', 'D_MUT', 'J_MUT', 'NX_COUNT', 'J_INFRAME', 'V_SEQ_START', 'STOP_COUNT', 'D_PROB', 'HMM_SCORE', 'RC', 'COMMON_MUT', 'COMMON_NX_COUNT', 'V_SEQ_START2', 'V_SEQ_LENGTH', 'A_SCORE']
parseRecord(record)

Parses a single row from the iHMMune-Align file.

Parameters:record – dictionary containing one row of the iHMMune-Align file.
Returns:database entry for the row.
Return type:dict
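
As a rough illustration of how the ordered names in ihmmune_fields can be mapped onto one row before it reaches parseRecord, the sketch below builds a dictionary with csv.DictReader. It does not call changeo. The semicolon delimiter, the absence of a header row, and the sample values are assumptions made for the example.

```python
import csv
import io

# Field names copied from the ihmmune_fields list above.
ihmmune_fields = ['SEQUENCE_ID', 'V_CALL', 'D_CALL', 'J_CALL', 'V_SEQ',
                  'NP1_SEQ', 'D_SEQ', 'NP2_SEQ', 'J_SEQ', 'V_MUT', 'D_MUT',
                  'J_MUT', 'NX_COUNT', 'J_INFRAME', 'V_SEQ_START',
                  'STOP_COUNT', 'D_PROB', 'HMM_SCORE', 'RC', 'COMMON_MUT',
                  'COMMON_NX_COUNT', 'V_SEQ_START2', 'V_SEQ_LENGTH', 'A_SCORE']

# Hypothetical one-row input: four made-up calls, the rest filled with 'NA'.
values = ['seq1', 'IGHV1-2*02', 'IGHD3-10*01', 'IGHJ4*02']
values += ['NA'] * (len(ihmmune_fields) - len(values))
sample = io.StringIO(';'.join(values) + '\n')

# Assumption: the file is ';'-delimited and has no header row.
reader = csv.DictReader(sample, fieldnames=ihmmune_fields, delimiter=';')
record = next(reader)
print(record['SEQUENCE_ID'], record['V_CALL'])
```

A dictionary of this shape is what the record parameter of parseRecord is described as receiving.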
class changeo.Parsers.IMGTReader(summary, gapped, ntseq, junction, parse_scores=False, parse_regions=False, parse_junction=False, ig=True)

Bases: object

An iterator to read and parse IMGT output files.

__iter__()

Iterator initializer.

Returns:changeo.Parsers.IMGTReader
__next__()

Next method.

Returns:parsed IMGT/HighV-QUEST result as an IgRecord (ig=True) or dictionary (ig=False).
Return type:changeo.Receptor.IgRecord
fields

List of ordered output field names.

parseRecord(summary, gapped, ntseq, junction)

Parses a single row from each IMGT file.

Parameters:
  • summary – dictionary containing one row of the ‘1_Summary’ file.
  • gapped – dictionary containing one row of the ‘2_IMGT-gapped-nt-sequences’ file.
  • ntseq – dictionary containing one row of the ‘3_Nt-sequences’ file.
  • junction – dictionary containing one row of the ‘6_Junction’ file.
Returns:database entry for the row.
Return type:dict

class changeo.Parsers.IgBLASTReader(igblast, seq_dict, repo_dict, parse_scores=False, parse_regions=False, parse_igblast_cdr3=False, ig=True)

Bases: object

An iterator to read and parse IgBLAST output files.

__iter__()

Iterator initializer.

Returns:changeo.Parsers.IgBLASTReader
__next__()

Next method.

Returns:parsed IgBLAST result as an IgRecord (ig=True) or dictionary (ig=False).
Return type:changeo.Receptor.IgRecord
fields

List of ordered output field names.

parseBlock(block)

Parses an IgBLAST result into separate sections.

Parameters:block – an iterator from itertools.groupby containing a single IgBLAST result.
Returns:a parsed results block with the keys ‘query’ (sequence identifier as a string), ‘summary’ (dictionary of the alignment summary), ‘subregion’ (dictionary of IgBLAST CDR3 sequences), and ‘hits’ (VDJ hit table as a list of dictionaries). Returns None if the block has no data that can be parsed.
Return type:dict
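
The block parameter is described as an iterator produced by itertools.groupby. A minimal sketch of how a stream of lines might be split into one group per result is shown below. It does not call changeo. The '# IGBLASTN' marker line, the helper name split_blocks, and the sample lines are assumptions for illustration.

```python
from itertools import groupby

# Hypothetical IgBLAST-like text; the header marker and rows are made up.
lines = [
    '# IGBLASTN 2.2.29+',
    '# Query: seq1',
    'V\tseq1\tIGHV1-2*02',
    '# IGBLASTN 2.2.29+',
    '# Query: seq2',
    'V\tseq2\tIGHV3-23*01',
]

def split_blocks(lines, marker='# IGBLASTN'):
    """Yield one list of lines per result, excluding the marker line."""
    # groupby alternates between runs of marker lines (key True)
    # and runs of content lines (key False); keep only the latter.
    for is_marker, group in groupby(lines, key=lambda x: x.startswith(marker)):
        if not is_marker:
            yield list(group)

blocks = list(split_blocks(lines))
for block in blocks:
    print(block[0])
```

Each group must be consumed before groupby advances to the next one, which is why the sketch copies it into a list immediately.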
parseSections(sections)

Parses IgBLAST sections into a db dictionary.

Parameters:sections – dictionary of parsed sections from parseBlock.
Returns:db entries.
Return type:dict
changeo.Parsers.decodeBTOP(btop)

Parse a BTOP string into a list of tuples.

Parameters:btop – BTOP string.
Returns:tuples of (type, length) for each operation in the BTOP string.
Return type:list
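
To make the (type, length) output concrete, here is a standalone sketch of BTOP decoding. It is not the changeo implementation. It assumes a BTOP string is a sequence of digit runs (matches) and two-character pairs (mismatches or gaps), and it labels the two gap cases neutrally as '-N' (gap in the first position of the pair) and 'N-' (gap in the second position) rather than deciding which is an insertion or a deletion. The function name and type labels are hypothetical.

```python
import re
from itertools import groupby

def decode_btop_sketch(btop):
    """Return (type, length) tuples for a BTOP string.

    Types used here (illustrative labels only):
      '='  run of matches
      'X'  mismatch
      '-N' gap in the first position of a pair
      'N-' gap in the second position of a pair
    """
    ops = []
    # A token is either a run of digits or a two-character pair.
    for token in re.findall(r'\d+|[A-Za-z-]{2}', btop):
        if token.isdigit():
            ops.append(('=', int(token)))
        elif token[0] == '-':
            ops.append(('-N', 1))
        elif token[1] == '-':
            ops.append(('N-', 1))
        else:
            ops.append(('X', 1))
    # Merge adjacent operations of the same type into one tuple.
    return [(op, sum(n for _, n in grp))
            for op, grp in groupby(ops, key=lambda t: t[0])]

print(decode_btop_sketch('3ACGT-A2'))
```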
changeo.Parsers.decodeCIGAR(cigar)

Parse a CIGAR string into a list of tuples.

Parameters:cigar – CIGAR string.
Returns:tuples of (type, length) for each operation in the CIGAR string.
Return type:list
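
A standalone sketch of the same idea, assuming a CIGAR string is a sequence of length-then-operation pairs such as 30S or 120=. It is not the changeo implementation and the function name is hypothetical.

```python
import re

def decode_cigar_sketch(cigar):
    """Return (type, length) tuples for each operation in a CIGAR string."""
    # Each operation is a run of digits followed by one operation character.
    return [(op, int(length))
            for length, op in re.findall(r'(\d+)([A-Za-z=])', cigar)]

print(decode_cigar_sketch('30S120=2X5I'))
# → [('S', 30), ('=', 120), ('X', 2), ('I', 5)]
```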
changeo.Parsers.encodeCIGAR(alignment)

Encodes a list of tuples with alignment information into a CIGAR string.

Parameters:alignment – tuples of (type, length) for each alignment operation.
Returns:CIGAR string.
Return type:str
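
The reverse direction can be sketched in one line, assuming each tuple is ordered (type, length) as described above. This is an illustration with a hypothetical function name, not the changeo implementation.

```python
def encode_cigar_sketch(alignment):
    """Join (type, length) tuples into a CIGAR string."""
    # CIGAR writes the length first, then the operation character.
    return ''.join('%d%s' % (length, op) for op, length in alignment)

print(encode_cigar_sketch([('S', 30), ('=', 120), ('X', 2)]))
# → 30S120=2X
```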
changeo.Parsers.gapV(db, repo_dict)

Construct IMGT-gapped V-region sequences.

Parameters:
  • db – database dictionary of parsed IgBLAST output.
  • repo_dict – dictionary of IMGT-gapped reference sequences.
Returns:database entries containing IMGT-gapped query sequences and germline positions.
Return type:dict
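
The core idea of gapping a query against an IMGT-gapped reference can be sketched as follows. This is a simplified illustration, not the changeo implementation: it assumes the query starts at the first germline position, that there are no insertions or deletions between query and germline, and that gaps are written as '.'. The function name is hypothetical.

```python
def gap_query_sketch(query, germline_gapped, gap_char='.'):
    """Insert gap characters into query wherever the germline is gapped."""
    out = []
    i = 0  # index into the ungapped query
    for base in germline_gapped:
        if i >= len(query):
            break  # query ended before the germline did
        if base == gap_char:
            out.append(gap_char)   # copy the germline gap into the query
        else:
            out.append(query[i])   # consume one query base
            i += 1
    # Keep any query bases that extend past the end of the germline.
    out.append(query[i:])
    return ''.join(out)

print(gap_query_sketch('ACGTTC', 'ACG...TTA'))
```

A real alignment would also need the germline start offset and indel handling, which the sketch leaves out.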

changeo.Parsers.getIDforIMGT(seq_file)

Create a sequence ID translation using IMGT truncation.

Parameters:seq_file – a fasta file of sequences input to IMGT.
Returns:a dictionary with the IMGT truncated ID as the key and the full sequence description as the value.
Return type:dict
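
The shape of the returned mapping can be illustrated with a small sketch that works on a list of description strings instead of a fasta file. It is not the changeo implementation. The function name is hypothetical and the 50-character limit is an assumption exposed as a parameter.

```python
def imgt_id_map_sketch(descriptions, max_len=50):
    """Map each truncated ID to the full description it came from."""
    ids = {}
    for desc in descriptions:
        # Assumption: the truncated ID is the first max_len characters.
        ids[desc[:max_len]] = desc
    return ids

long_desc = 'seq1|' + 'A' * 60
id_map = imgt_id_map_sketch([long_desc, 'seq2'])
print(id_map['seq2'])
```

Two descriptions that share the same truncated prefix would collide in such a mapping, with the later one overwriting the earlier.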
changeo.Parsers.getRegions(db)

Identify FWR and CDR regions by IMGT definition.

Parameters:db – database dictionary of parsed alignment output.
Returns:database entries containing FWR and CDR sequences.
Return type:dict
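
Under IMGT unique numbering, FWR1 spans codons 1-26, CDR1 27-38, FWR2 39-55, CDR2 56-65, and FWR3 66-104. A minimal sketch of slicing an IMGT-gapped nucleotide sequence at those boundaries is given below. It is an illustration, not the changeo implementation; the function name and output keys are hypothetical, and CDR3 and FWR4 are omitted because their boundaries depend on the junction length.

```python
# Nucleotide slice boundaries (0-based, end-exclusive) derived from
# IMGT codon positions: codon k covers nucleotides 3*(k-1) to 3*k.
IMGT_REGIONS = {
    'FWR1': (0, 78),     # codons 1-26
    'CDR1': (78, 114),   # codons 27-38
    'FWR2': (114, 165),  # codons 39-55
    'CDR2': (165, 195),  # codons 56-65
    'FWR3': (195, 312),  # codons 66-104
}

def get_regions_sketch(seq_imgt):
    """Slice an IMGT-gapped sequence into FWR and CDR regions."""
    return {name: seq_imgt[start:end]
            for name, (start, end) in IMGT_REGIONS.items()}

# Synthetic sequence with a different letter per region for readability.
seq = 'A' * 78 + 'C' * 36 + 'G' * 51 + 'T' * 30 + 'N' * 117
regions = get_regions_sketch(seq)
print({name: len(s) for name, s in regions.items()})
```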
changeo.Parsers.inferJunction(db, repo_dict)

Identify junction region by IMGT definition.

Parameters:
  • db – database dictionary of parsed IgBLAST output.
  • repo_dict – dictionary of IMGT-gapped reference sequences.
Returns:database entries containing junction sequence and length.
Return type:dict