CreateGermlines
Reconstructs germline sequences from alignment data.
usage: CreateGermlines [--version] [-h] -d DB_FILES [DB_FILES ...]
                       [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                       [--outname OUT_NAME] [--log LOG_FILE] [--failed]
                       [--format {changeo,airr}] -r REFERENCES
                       [REFERENCES ...]
                       [-g {full,dmask,vonly,regions} [{full,dmask,vonly,regions} ...]]
                       [--cloned] [--sf SEQ_FIELD] [--vf V_FIELD]
                       [--df D_FIELD] [--jf J_FIELD]
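As a minimal sketch of how the synopsis maps to an actual invocation, the Python helper below assembles an argument list from the two required options (-d and -r) and a few optional ones. The helper name `build_command`, the executable name passed as `program`, and the file names in the example call are assumptions for illustration only; adjust them to match your installation and data.

```python
import shlex


def build_command(db_files, references, fmt=None, germ_types=None,
                  cloned=False, program="CreateGermlines"):
    """Assemble an argument list that follows the usage synopsis.

    `program` is an assumed executable name, not taken from an installation.
    """
    if not db_files:
        raise ValueError("-d requires at least one database file")
    if not references:
        raise ValueError("-r requires at least one folder or FASTA file")

    cmd = [program, "-d", *db_files, "-r", *references]

    if fmt is not None:
        if fmt not in ("changeo", "airr"):
            raise ValueError("--format must be 'changeo' or 'airr'")
        cmd += ["--format", fmt]

    if germ_types:
        allowed = {"full", "dmask", "vonly", "regions"}
        unknown = [g for g in germ_types if g not in allowed]
        if unknown:
            raise ValueError(f"unknown -g value(s): {unknown}")
        cmd += ["-g", *germ_types]

    if cloned:
        cmd.append("--cloned")
    return cmd


# Hypothetical file and folder names, for illustration only.
cmd = build_command(["sample_db-pass.tab"], ["germlines/"],
                    germ_types=["dmask"], cloned=True)
print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)`; keeping the arguments as a list avoids shell quoting problems with file names that contain spaces.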
-
--version
show program’s version number and exit
-
-h, --help
show this help message and exit
-
-d <db_files>
A list of tab-delimited database files.
-
-o <out_files>
Explicit output file name(s). This argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, the output filename is based on the input filename(s).
-
--outdir <out_dir>
Changes the output directory to the location specified. The input file directory is used if this is not specified.
-
--outname <out_name>
Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
-
--log <log_file>
Specify to write verbose logging to a file. May not be specified with multiple input files.
-
--failed
If specified, create files containing records that fail processing.
-
--format {changeo,airr}
Specify input and output format.
-
-r <references>
List of folders and/or FASTA files (with .fasta, .fna or .fa extension) containing germline sequences. When using the default Change-O sequence and coordinate fields, these reference sequences must contain IMGT-numbering spacers (gaps) in the V segment. Alternative numbering schemes, or no numbering, may work for alternative sequence and coordinate definitions that define a valid alignment, but a warning will be issued.
-
-g {full,dmask,vonly,regions}
Specify the type(s) of germlines to include: full germline, germline with D segment masked, or germline for V segment only.
-
--cloned
Specify to create only one germline per clone. Note, if allele calls are ambiguous within a clonal group, this will place the germline call used for the entire clone within the GERMLINE_V_CALL, GERMLINE_D_CALL and GERMLINE_J_CALL fields.
-
--sf <seq_field>
Field containing the aligned sequence. Defaults to SEQUENCE_IMGT (changeo) or sequence_alignment (airr).
-
--vf <v_field>
Field containing the germline V segment call. Defaults to V_CALL (changeo) or v_call (airr).
-
--df <d_field>
Field containing the germline D segment call. Defaults to D_CALL (changeo) or d_call (airr).
-
--jf <j_field>
Field containing the germline J segment call. Defaults to J_CALL (changeo) or j_call (airr).
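Several of the options above interact: -o excludes --failed, --outdir, and --outname; --outname and --log may not be combined with multiple input files; and the --sf, --vf, --df, and --jf defaults depend on --format. The sketch below restates those rules in Python as a pre-flight check. It is not the tool's own validation logic; the function names `resolve_fields` and `check_options`, and the field name `V_CALL_GENOTYPED` in the usage line, are hypothetical.

```python
# Default field names per format, as stated in the option descriptions.
DEFAULT_FIELDS = {
    "changeo": {"sf": "SEQUENCE_IMGT", "vf": "V_CALL",
                "df": "D_CALL", "jf": "J_CALL"},
    "airr": {"sf": "sequence_alignment", "vf": "v_call",
             "df": "d_call", "jf": "j_call"},
}


def resolve_fields(fmt, **overrides):
    """Return the field names in effect for a format, applying overrides.

    Override keys mirror the option names: sf, vf, df, jf.
    """
    fields = dict(DEFAULT_FIELDS[fmt])
    for key, value in overrides.items():
        if key not in fields:
            raise KeyError(f"unknown field option: {key}")
        if value is not None:
            fields[key] = value
    return fields


def check_options(db_files, out_files=None, outdir=None, outname=None,
                  log=None, failed=False):
    """Return a list of rule violations described in the option text."""
    problems = []
    if out_files and (failed or outdir or outname):
        problems.append(
            "-o cannot be used with --failed, --outdir, or --outname")
    if len(db_files) > 1:
        if outname:
            problems.append(
                "--outname may not be specified with multiple input files")
        if log:
            problems.append(
                "--log may not be specified with multiple input files")
    return problems


# Hypothetical override of the V call field, for illustration only.
print(resolve_fields("changeo", vf="V_CALL_GENOTYPED"))
print(check_options(["a.tab", "b.tab"], outname="run1"))
```

Returning a list of problems rather than raising on the first one lets a wrapper script report every conflicting option in a single pass.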
- output files:
- germ-pass
- database with assigned germline sequences.
- germ-fail
- database with records failing germline assignment.
- required fields:
- SEQUENCE_ID, SEQUENCE_IMGT, V_CALL, D_CALL, J_CALL, V_SEQ_START, V_SEQ_LENGTH, V_GERM_START_IMGT, V_GERM_LENGTH_IMGT, D_SEQ_START, D_SEQ_LENGTH, D_GERM_START, D_GERM_LENGTH, J_SEQ_START, J_SEQ_LENGTH, J_GERM_START, J_GERM_LENGTH, NP1_LENGTH, NP2_LENGTH
- optional fields:
- N1_LENGTH, N2_LENGTH, P3V_LENGTH, P5D_LENGTH, P3D_LENGTH, P5J_LENGTH, CLONE
- output fields:
- GERMLINE_IMGT, GERMLINE_IMGT_D_MASK, GERMLINE_IMGT_V_REGION, GERMLINE_V_CALL, GERMLINE_D_CALL, GERMLINE_J_CALL, GERMLINE_REGIONS
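Before running the tool it can be useful to confirm that an input database carries every required field. The sketch below, which assumes a tab-delimited file with a single header row and the Change-O field names listed above, reports which required columns are absent. The function name `missing_required_fields` and the example header are hypothetical.

```python
import csv
import io

# Required fields exactly as listed above (Change-O naming).
REQUIRED_FIELDS = [
    "SEQUENCE_ID", "SEQUENCE_IMGT", "V_CALL", "D_CALL", "J_CALL",
    "V_SEQ_START", "V_SEQ_LENGTH", "V_GERM_START_IMGT", "V_GERM_LENGTH_IMGT",
    "D_SEQ_START", "D_SEQ_LENGTH", "D_GERM_START", "D_GERM_LENGTH",
    "J_SEQ_START", "J_SEQ_LENGTH", "J_GERM_START", "J_GERM_LENGTH",
    "NP1_LENGTH", "NP2_LENGTH",
]


def missing_required_fields(handle):
    """Read the header row of a tab-delimited file and list absent fields."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])  # an empty file yields an empty header
    present = set(header)
    return [name for name in REQUIRED_FIELDS if name not in present]


# Hypothetical, deliberately incomplete header for illustration.
example = io.StringIO("SEQUENCE_ID\tSEQUENCE_IMGT\tV_CALL\tD_CALL\tJ_CALL\n")
print(missing_required_fields(example))
```

For a file on disk, pass an open handle such as `open(path, newline="")`. An empty result means every required column is present; it does not check that the values in those columns are valid.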