changeo.IO
File I/O and parsers
- class changeo.IO.AIRRReader(handle)
Bases:
TSVReader
An iterator to read and parse AIRR formatted data.
- class changeo.IO.AIRRWriter(handle, fields=['sequence_id', 'sequence', 'sequence_alignment', 'germline_alignment', 'rev_comp', 'productive', 'stop_codon', 'vj_in_frame', 'locus', 'v_call', 'd_call', 'j_call', 'c_call', 'junction', 'junction_length', 'junction_aa', 'np1_length', 'np2_length', 'v_sequence_start', 'v_sequence_end', 'v_germline_start', 'v_germline_end', 'd_sequence_start', 'd_sequence_end', 'd_germline_start', 'd_germline_end', 'j_sequence_start', 'j_sequence_end', 'j_germline_start', 'j_germline_end'])
Bases:
TSVWriter
Writes AIRR formatted data.
- writeReceptor(records)
Writes a row from a Receptor object.
- Parameters:
records – a changeo.Receptor.Receptor object to write or an iterable of such objects.
- Returns:
None
- class changeo.IO.ChangeoReader(handle)
Bases:
TSVReader
An iterator to read and parse Change-O formatted data.
- class changeo.IO.ChangeoWriter(handle, fields=['SEQUENCE_ID', 'SEQUENCE_INPUT', 'FUNCTIONAL', 'IN_FRAME', 'STOP', 'MUTATED_INVARIANT', 'INDELS', 'LOCUS', 'V_CALL', 'D_CALL', 'J_CALL', 'SEQUENCE_VDJ', 'SEQUENCE_IMGT', 'V_SEQ_START', 'V_SEQ_LENGTH', 'V_GERM_START_VDJ', 'V_GERM_LENGTH_VDJ', 'V_GERM_START_IMGT', 'V_GERM_LENGTH_IMGT', 'NP1_LENGTH', 'D_SEQ_START', 'D_SEQ_LENGTH', 'D_GERM_START', 'D_GERM_LENGTH', 'NP2_LENGTH', 'J_SEQ_START', 'J_SEQ_LENGTH', 'J_GERM_START', 'J_GERM_LENGTH', 'JUNCTION', 'JUNCTION_LENGTH', 'GERMLINE_IMGT'], header=True)
Bases:
TSVWriter
Writes Change-O formatted data.
- writeReceptor(records)
Writes a row from a Receptor object
- Parameters:
records – a changeo.Receptor.Receptor object to write or an iterable of such objects.
- Returns:
None
- class changeo.IO.IHMMuneReader(ihmmune, sequences, references, receptor=True)
Bases:
object
An iterator to read and parse iHMMune-Align output files.
- __iter__()
Iterator initializer.
- Returns:
changeo.IO.IHMMuneReader
- __next__()
Next method.
- Returns:
parsed iHMMune-Align result as a Receptor (receptor=True) or dictionary (receptor=False).
- Return type:
- static customFields(scores=False, regions=False, cell=False, schema=None)
Returns non-standard Receptor attributes defined by the parser
- Parameters:
scores – if True include alignment scoring fields.
regions – if True include IMGT-gapped CDR and FWR region fields.
schema – schema class to pass fields through for conversion. If None, return changeo.Receptor.Receptor attribute names.
- Returns:
list of field names.
- Return type:
list
- ihmmune_fields = ['SEQUENCE_ID', 'V_CALL', 'D_CALL', 'J_CALL', 'V_SEQ', 'NP1_SEQ', 'D_SEQ', 'NP2_SEQ', 'J_SEQ', 'V_MUT', 'D_MUT', 'J_MUT', 'NX_COUNT', 'J_INFRAME', 'V_SEQ_START', 'STOP_COUNT', 'D_PROB', 'HMM_SCORE', 'RC', 'COMMON_MUT', 'COMMON_NX_COUNT', 'V_SEQ_START', 'V_SEQ_LENGTH', 'A_SCORE']
- class changeo.IO.IMGTReader(summary, gapped, ntseq, junction, receptor=True)
Bases:
object
An iterator to read and parse IMGT output files.
- __iter__()
Iterator initializer.
- Returns:
changeo.IO.IMGTReader
- __next__()
Next method.
- Returns:
parsed IMGT/HighV-QUEST result as a Receptor (receptor=True) or dictionary (receptor=False).
- Return type:
- static customFields(scores=False, regions=False, junction=False, schema=None)
Returns non-standard fields defined by the parser
- Parameters:
scores – if True include alignment scoring fields.
regions – if True include IMGT-gapped CDR and FWR region fields.
junction – if True include detailed junction annotation fields.
schema – schema class to pass fields through for conversion. If None, return changeo.Receptor.Receptor attribute names.
- Returns:
list of field names.
- Return type:
list
- parseRecord(summary, gapped, ntseq, junction)
Parses a single row from each IMGT file.
- Parameters:
summary – dictionary containing one row of the ‘1_Summary’ file.
gapped – dictionary containing one row of the ‘2_IMGT-gapped-nt-sequences’ file.
ntseq – dictionary containing one row of the ‘3_Nt-sequences’ file.
junction – dictionary containing one row of the ‘6_Junction’ file.
- Returns:
database entry for the row.
- Return type:
- class changeo.IO.IgBLASTReader(igblast, sequences, references, asis_calls=False, regions='default', receptor=True, infer_junction=False)
Bases:
object
An iterator to read and parse IgBLAST output files
- __iter__()
Iterator initializer.
- Returns:
changeo.IO.IgBLASTReader
- __next__()
Next method.
- Returns:
parsed IgBLAST result as a Receptor (receptor=True) or dictionary (receptor=False).
- Return type:
- static customFields(schema=None)
Returns non-standard fields defined by the parser
- Parameters:
schema – schema class to pass fields through for conversion. If None, return changeo.Receptor.Receptor attribute names.
- Returns:
list of field names.
- Return type:
list
- parseBlock(block)
Parses an IgBLAST result into separate sections
- Parameters:
block (iter) – an iterator from itertools.groupby containing a single IgBLAST result.
- Returns:
a parsed results block with the keys ‘query’ (sequence identifier as a string), ‘summary’ (dictionary of the alignment summary), ‘subregion’ (dictionary of IgBLAST CDR3 sequences), and ‘hits’ (VDJ hit table as a list of dictionaries). Returns None if the block has no data that can be parsed.
- Return type:
dict
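The block argument comes from itertools.groupby, so it may help to see how a stream of lines can be split into one group per result. The following is only a sketch under stated assumptions: the input lines are a simplified toy format, the "# IGBLASTN" marker and the helper is_header are illustrative choices, and none of this is taken from the changeo source.

```python
from itertools import groupby

# Toy input: each result starts with a marker line (assumed format).
lines = [
    "# IGBLASTN",
    "# Query: seq1",
    "V\tseq1\tIGHV1-2*02",
    "# IGBLASTN",
    "# Query: seq2",
    "V\tseq2\tIGHV3-23*01",
]


def is_header(line):
    """Hypothetical key function: True for lines that open a new result."""
    return line.startswith("# IGBLASTN")


# groupby yields alternating runs of marker lines and content lines;
# keeping only the content runs gives one block per result.
blocks = []
for header, group in groupby(lines, key=is_header):
    if not header:
        blocks.append(list(group))

# Pull the query identifier out of the first line of each block.
queries = [block[0].split(": ", 1)[1] for block in blocks]
```

Each group returned by groupby is a one-pass iterator, which is why the sketch copies it into a list before moving on to the next group.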
- class changeo.IO.IgBLASTReaderAA(igblast, sequences, references, asis_calls=False, regions='default', receptor=True, infer_junction=False)
Bases:
IgBLASTReader
An iterator to read and parse IgBLAST amino acid alignment output files
- static customFields(schema=None)
Returns non-standard fields defined by the parser
- Parameters:
schema – schema class to pass fields through for conversion. If None, return changeo.Receptor.Receptor attribute names.
- Returns:
list of field names.
- Return type:
list
- class changeo.IO.TSVReader(handle)
Bases:
object
Simple csv.DictReader wrapper to read format agnostic TSV files.
- reader
reader object.
- Type:
iter
- __iter__()
Iterator initializer
- Returns:
changeo.IO.TSVReader
- __next__()
Next method
- Returns:
row as a dictionary of field:value pairs.
- Return type:
dict
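To illustrate what a csv.DictReader wrapper of this kind looks like, here is a minimal standard-library sketch. The class name SimpleTSVReader and the sample data are hypothetical, and the code is a stand-in written for illustration rather than the changeo implementation.

```python
import csv
import io


class SimpleTSVReader:
    """Hypothetical stand-in for a csv.DictReader wrapper (not changeo code)."""

    def __init__(self, handle):
        self.handle = handle
        # The excel-tab dialect splits columns on tab characters.
        self.reader = csv.DictReader(handle, dialect="excel-tab")

    def __iter__(self):
        return self

    def __next__(self):
        # csv.DictReader raises StopIteration at the end of the input,
        # which ends iteration over this wrapper as well.
        return dict(next(self.reader))


# Toy tab-delimited input held in memory.
handle = io.StringIO(
    "sequence_id\tv_call\n"
    "seq1\tIGHV1-2*02\n"
    "seq2\tIGHV3-23*01\n"
)
rows = list(SimpleTSVReader(handle))
```

Each row comes back as a dictionary keyed by the header fields, matching the field:value description above.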
- class changeo.IO.TSVWriter(handle, fields, header=True)
Bases:
object
Simple csv.DictWriter wrapper to write format agnostic TSV files.
- writeDict(records)
Writes a row from a dictionary
- Parameters:
records – dictionary of row data or an iterable of such objects.
- Returns:
None
- writeHeader()
Writes the header
- Returns:
None
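The writer side can be sketched the same way with csv.DictWriter. The class name SimpleTSVWriter is hypothetical, and writing the header at construction time when header=True is an assumption made for this example, not a statement about how changeo behaves.

```python
import csv
import io


class SimpleTSVWriter:
    """Hypothetical stand-in for a csv.DictWriter wrapper (not changeo code)."""

    def __init__(self, handle, fields, header=True):
        self.handle = handle
        self.fields = fields
        self.writer = csv.DictWriter(
            handle,
            fieldnames=fields,
            dialect="excel-tab",
            extrasaction="ignore",  # silently drop keys not listed in fields
            lineterminator="\n",
        )
        # Assumption: the header is written once, up front.
        if header:
            self.writeHeader()

    def writeHeader(self):
        self.writer.writeheader()

    def writeDict(self, records):
        # Accept either a single dictionary or an iterable of dictionaries.
        if isinstance(records, dict):
            self.writer.writerow(records)
        else:
            self.writer.writerows(records)


buffer = io.StringIO()
writer = SimpleTSVWriter(buffer, fields=["sequence_id", "v_call"])
writer.writeDict({"sequence_id": "seq1", "v_call": "IGHV1-2*02"})
writer.writeDict([{"sequence_id": "seq2", "v_call": "IGHV3-23*01"}])
output = buffer.getvalue()
```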
- changeo.IO.checkFields(attributes, header, schema=<class 'changeo.Receptor.AIRRSchema'>)
Checks that a file header contains a required set of Receptor attributes
- Parameters:
- Returns:
True if the fields that all attributes map to are found in the header.
- Return type:
bool
- Raises:
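The idea of a header check can be sketched as follows. Everything here is an assumption for illustration: the function check_fields, the TOY_MAPPING dictionary standing in for a schema class, the lowercase attribute names, and the choice of LookupError are all hypothetical and are not documented above.

```python
# Hypothetical attribute-to-column mapping standing in for a schema class.
TOY_MAPPING = {"v_call": "V_CALL", "junction": "JUNCTION"}


def check_fields(attributes, header, mapping):
    """Return True if every attribute maps to a column present in header."""
    columns = [mapping[name] for name in attributes]
    missing = [column for column in columns if column not in header]
    if missing:
        # Assumption: the exception type is chosen for this sketch only.
        raise LookupError("Missing required fields: %s" % ", ".join(missing))
    return True


header = ["SEQUENCE_ID", "V_CALL", "JUNCTION"]
ok = check_fields(["v_call", "junction"], header, TOY_MAPPING)
```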
- changeo.IO.countDbFile(file)
Counts the records in database files
- Parameters:
file – tab-delimited database file.
- Returns:
count of records in the database file.
- Return type:
int
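Counting records in a tab-delimited file amounts to counting rows after the header. The sketch below uses only the standard library; count_records and the toy file are hypothetical, and skipping exactly one header row is an assumption.

```python
import csv
import os
import tempfile


def count_records(path):
    """Count data rows in a tab-delimited file, excluding the header row."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, dialect="excel-tab")
        next(reader, None)  # skip the header if the file is not empty
        return sum(1 for _ in reader)


# Write a toy file with a header and two records, then count them.
with tempfile.TemporaryDirectory() as work:
    path = os.path.join(work, "toy.tsv")
    with open(path, "w", newline="") as handle:
        handle.write("sequence_id\tv_call\nseq1\tA\nseq2\tB\n")
    count = count_records(path)
```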
- changeo.IO.extractIMGT(imgt_output)
Extract necessary files from IMGT/HighV-QUEST results.
- Parameters:
imgt_output – zipped file or unzipped folder output by IMGT/HighV-QUEST.
- Returns:
(temporary directory handle, dictionary with names of extracted IMGT files).
- Return type:
tuple
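The return value, a temporary directory handle paired with a dictionary of file names, can be illustrated with zipfile and tempfile. This is a sketch under assumptions: extract_archive is a hypothetical helper, the dictionary keys and the prefix matching are illustrative (the prefixes are borrowed from the parseRecord parameter descriptions), and the ".txt" names in the toy archive are made up.

```python
import os
import tempfile
import zipfile

# Keys are illustrative; prefixes come from the parseRecord descriptions.
PREFIXES = {
    "summary": "1_Summary",
    "gapped": "2_IMGT-gapped-nt-sequences",
    "ntseq": "3_Nt-sequences",
    "junction": "6_Junction",
}


def extract_archive(zip_path):
    """Unpack a zip into a temporary directory and locate the needed files."""
    temp_dir = tempfile.TemporaryDirectory()
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(temp_dir.name)
    found = {}
    for root, _, names in os.walk(temp_dir.name):
        for name in names:
            for key, prefix in PREFIXES.items():
                if name.startswith(prefix):
                    found[key] = os.path.join(root, name)
    # The caller keeps temp_dir alive for as long as the files are needed.
    return temp_dir, found


# Build a toy archive, extract it, and record what was found.
with tempfile.TemporaryDirectory() as work:
    zip_path = os.path.join(work, "toy.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        for prefix in PREFIXES.values():
            archive.writestr(prefix + ".txt", "placeholder\n")
        archive.writestr("other.txt", "ignored\n")
    temp_dir, files = extract_archive(zip_path)
    keys = sorted(files)
    all_exist = all(os.path.exists(path) for path in files.values())
    temp_dir.cleanup()
```

Returning the directory handle alongside the paths lets the caller decide when the extracted files are removed.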
- changeo.IO.getDbFields(file, add=None, exclude=None, reader=<class 'changeo.IO.TSVReader'>)
Get field names from a db file
- Parameters:
file – db file to pull base fields from.
add – fields to append to the field set.
exclude – fields to exclude from the field set.
reader – reader class.
- Returns:
list of field names
- Return type:
list
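The add and exclude behaviour can be sketched on top of csv.DictReader. Note the assumptions: get_fields is a hypothetical helper that takes an open handle instead of a file name and reader class, and appending before excluding is a choice made for this example.

```python
import csv
import io


def _as_list(value):
    """Normalize None, a single string, or an iterable into a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def get_fields(handle, add=None, exclude=None):
    """Read header fields, append extra fields, then drop excluded ones."""
    reader = csv.DictReader(handle, dialect="excel-tab")
    fields = list(reader.fieldnames or [])
    for name in _as_list(add):
        if name not in fields:
            fields.append(name)
    drop = set(_as_list(exclude))
    return [name for name in fields if name not in drop]


handle = io.StringIO("sequence_id\tv_call\tj_call\nseq1\tA\tB\n")
fields = get_fields(handle, add=["clone_id"], exclude="j_call")
```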
- changeo.IO.getFormatOperators(format)
Simple wrapper for fetching the set of operator classes for a data format
- changeo.IO.getOutputHandle(file, out_label=None, out_dir=None, out_name=None, out_type=None)
Opens an output file handle
- Parameters:
file – filename to base output file name on.
out_label – text to be inserted before the file extension; if None do not add a label.
out_type – the file extension of the output file; if None use input file extension.
out_dir – the output directory; if None use directory of input file.
out_name – the short filename to use for the output file; if None use input file short name.
- Returns:
File handle
- Return type:
file
- changeo.IO.getOutputName(file, out_label=None, out_dir=None, out_name=None, out_type=None)
Creates an output filename from an existing filename
- Parameters:
file – filename to base output file name on.
out_label – text to be inserted before the file extension; if None do not add a label.
out_type – the file extension of the output file; if None use input file extension.
out_dir – the output directory; if None use directory of input file.
out_name – the short filename to use for the output file; if None use input file short name.
- Returns:
file name.
- Return type:
str
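How the parameters combine into a file name can be sketched with os.path. The helper make_output_name is hypothetical, and joining the label to the name with an underscore is an assumption of this example rather than documented behaviour.

```python
import os


def make_output_name(file, out_label=None, out_dir=None, out_name=None,
                     out_type=None):
    """Build an output path from an input path and optional overrides."""
    directory, base = os.path.split(file)
    short_name, extension = os.path.splitext(base)
    # Fall back to the pieces of the input file when an option is None.
    if out_dir is None:
        out_dir = directory
    if out_name is None:
        out_name = short_name
    if out_type is None:
        out_type = extension.lstrip(".")
    # Assumption: the label is appended with an underscore separator.
    if out_label is not None:
        out_name = "%s_%s" % (out_name, out_label)
    filename = "%s.%s" % (out_name, out_type) if out_type else out_name
    return os.path.join(out_dir, filename)


labeled = make_output_name(os.path.join("data", "sample.tsv"),
                           out_label="db-pass")
renamed = make_output_name(os.path.join("data", "sample.tsv"),
                           out_dir="results", out_name="run1", out_type="tab")
```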
- changeo.IO.readGermlines(references, asis=False, warn=False)
Parses germline repositories
- Parameters:
- Returns:
Dictionary of germlines in the form {allele: sequence}.
- Return type:
dict
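The {allele: sequence} form of the result can be illustrated with a small FASTA parser. This is a sketch only: read_fasta is a hypothetical helper that reads from an open handle, and taking the second pipe-delimited header field as the allele name is an assumption about the header layout, not something stated above.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into a dictionary of {allele: sequence}."""
    germlines = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if name is not None:
                germlines[name] = "".join(chunks)
            header = line[1:]
            parts = header.split("|")
            # Assumption: pipe-delimited headers carry the allele second.
            name = parts[1] if len(parts) > 1 else header
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        germlines[name] = "".join(chunks)
    return germlines


fasta = io.StringIO(
    ">X00000|IGHV1-2*02|Homo sapiens\n"
    "CAGGTG\n"
    "CAGCTG\n"
    ">IGHJ4*02\n"
    "ACTAC\n"
)
germlines = read_fasta(fasta)
```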
- changeo.IO.splitName(file)
Extract the extension from a file name