CreateGermlines

Reconstructs germline sequences from alignment data

usage: CreateGermlines [-h] -d DB_FILES [DB_FILES ...] [--failed]
                       [--log LOG_FILE] [--outdir OUT_DIR]
                       [--outname OUT_NAME] [--version] -r REPO [REPO ...]
                       [-g {full,dmask,vonly,regions} [{full,dmask,vonly,regions} ...]]
                       [--cloned] [--vf V_FIELD] [--sf SEQ_FIELD]

-h, --help

show this help message and exit

-d <db_files>

A list of tab delimited database files.

--failed

If specified, creates files containing records that fail processing.

--log <log_file>

Specify to write verbose logging to a file. May not be specified with multiple input files.

--outdir <out_dir>

Specify to change the output directory to the given location. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--version

show program’s version number and exit

-r <repo>

List of folders and/or fasta files (with .fasta, .fna or .fa extension) with germline sequences.
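As an illustration of what a mixed list of folders and FASTA files might expand to, here is a minimal Python sketch. The function name `collect_fasta_files` is hypothetical and not part of the tool. The sketch assumes folders are scanned non-recursively and that extensions are matched case-insensitively; the tool's own behavior on both points is not stated here.

```python
from pathlib import Path

# Extensions listed in the -r description above.
FASTA_EXTENSIONS = {".fasta", ".fna", ".fa"}


def collect_fasta_files(repo):
    """Expand a list of folders and/or files into a list of FASTA file paths.

    Assumption: folders are searched one level deep only.
    """
    files = []
    for entry in map(Path, repo):
        if entry.is_dir():
            # Keep only regular files with a recognized extension.
            files.extend(sorted(
                p for p in entry.iterdir()
                if p.is_file() and p.suffix.lower() in FASTA_EXTENSIONS
            ))
        elif entry.is_file() and entry.suffix.lower() in FASTA_EXTENSIONS:
            files.append(entry)
    return files
```

Running a check like this before invoking the tool can help catch a repository path that contains no usable germline files.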

-g {full,dmask,vonly,regions}

Specify the type(s) of germlines to include: full germline (full), germline with D-region masked (dmask), or germline for V region only (vonly).

--cloned

Specify to create only one germline per clone. This assumes the input file is sorted by the clone column and will not yield correct results if the data is unsorted. Note that if allele calls are ambiguous within a clonal group, the germline call used for the entire clone is placed in the GERMLINE_V_CALL, GERMLINE_D_CALL and GERMLINE_J_CALL fields.
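Because --cloned depends on the input being sorted by clone, it can be worth checking the file first. The following Python sketch tests whether all records of each clone are contiguous, which is what a sort on the clone column guarantees. The function name `clones_are_grouped` is hypothetical, and the sketch assumes the clone column is named CLONE as in the optional fields list.

```python
import csv
import io


def clones_are_grouped(handle, clone_field="CLONE"):
    """Return True if every clone's records appear in one contiguous run."""
    reader = csv.DictReader(handle, delimiter="\t")
    seen = set()      # clones whose run has already started
    previous = None   # clone of the preceding record
    for row in reader:
        clone = row[clone_field]
        if clone != previous:
            if clone in seen:
                # This clone appeared earlier and was interrupted.
                return False
            seen.add(clone)
            previous = clone
    return True


# Small in-memory examples standing in for tab-delimited database files.
grouped = "SEQUENCE_ID\tCLONE\nA\t1\nB\t1\nC\t2\n"
interleaved = "SEQUENCE_ID\tCLONE\nA\t1\nC\t2\nB\t1\n"
print(clones_are_grouped(io.StringIO(grouped)))      # True
print(clones_are_grouped(io.StringIO(interleaved)))  # False
```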

--vf <v_field>

Specify the field to use for the germline V call.

--sf <seq_field>

Specify the field to use for the sequence.
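To show how the options above combine, here is a minimal Python sketch that assembles an argument list from the documented flags. The helper `build_command` and the file names `sample_db.tab` and `germlines/` are hypothetical, and the sketch assumes the program is available on the PATH as `CreateGermlines`. It only builds and prints the command; it does not run it.

```python
import shlex


def build_command(db_files, repo, germ_types=None, cloned=False,
                  v_field=None, seq_field=None, failed=False):
    """Assemble an argument list using only the flags documented above."""
    # -d and -r are the two required arguments in the usage line.
    cmd = ["CreateGermlines", "-d", *db_files, "-r", *repo]
    if germ_types:
        cmd += ["-g", *germ_types]
    if cloned:
        cmd.append("--cloned")
    if v_field:
        cmd += ["--vf", v_field]
    if seq_field:
        cmd += ["--sf", seq_field]
    if failed:
        cmd.append("--failed")
    return cmd


cmd = build_command(
    ["sample_db.tab"],            # hypothetical input database
    ["germlines/"],               # hypothetical germline folder
    germ_types=["full", "dmask"],
    cloned=True,
    v_field="V_CALL_GENOTYPED",
)
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.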

output files:
germ-pass
database with assigned germline sequences.
germ-fail
database with records failing germline assignment.

required fields:
SEQUENCE_ID, SEQUENCE_VDJ or SEQUENCE_IMGT, V_CALL or V_CALL_GENOTYPED, D_CALL, J_CALL, V_SEQ_START, V_SEQ_LENGTH, V_GERM_START_IMGT, V_GERM_LENGTH_IMGT, D_SEQ_START, D_SEQ_LENGTH, D_GERM_START, D_GERM_LENGTH, J_SEQ_START, J_SEQ_LENGTH, J_GERM_START, J_GERM_LENGTH, NP1_LENGTH, NP2_LENGTH

optional fields:
N1_LENGTH, N2_LENGTH, P3V_LENGTH, P5D_LENGTH, P3D_LENGTH, P5J_LENGTH, CLONE

output fields:
GERMLINE_VDJ, GERMLINE_VDJ_D_MASK, GERMLINE_VDJ_V_REGION, GERMLINE_IMGT, GERMLINE_IMGT_D_MASK, GERMLINE_IMGT_V_REGION, GERMLINE_V_CALL, GERMLINE_D_CALL, GERMLINE_J_CALL, GERMLINE_REGIONS
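The required fields listed above can be checked against a database header before running the tool. The Python sketch below is one way to do that; the function name `missing_required_fields` is hypothetical. Each tuple is a group of alternatives taken from the required fields list, and a group counts as satisfied when at least one of its names is present.

```python
# Each tuple holds acceptable alternatives for one required field.
REQUIRED_FIELDS = [
    ("SEQUENCE_ID",),
    ("SEQUENCE_VDJ", "SEQUENCE_IMGT"),
    ("V_CALL", "V_CALL_GENOTYPED"),
    ("D_CALL",), ("J_CALL",),
    ("V_SEQ_START",), ("V_SEQ_LENGTH",),
    ("V_GERM_START_IMGT",), ("V_GERM_LENGTH_IMGT",),
    ("D_SEQ_START",), ("D_SEQ_LENGTH",),
    ("D_GERM_START",), ("D_GERM_LENGTH",),
    ("J_SEQ_START",), ("J_SEQ_LENGTH",),
    ("J_GERM_START",), ("J_GERM_LENGTH",),
    ("NP1_LENGTH",), ("NP2_LENGTH",),
]


def missing_required_fields(header):
    """Return a description of each required field group absent from header."""
    present = set(header)
    return [
        " or ".join(group)
        for group in REQUIRED_FIELDS
        if not present.intersection(group)
    ]


# Example: a header line from a tab-delimited file lacking most fields.
header = "SEQUENCE_ID\tSEQUENCE_IMGT\tV_CALL\tD_CALL".split("\t")
print(missing_required_fields(header)[:2])  # ['J_CALL', 'V_SEQ_START']
```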