DefineClones

Assign Ig sequences into clones

usage: DefineClones [-h] [--version]  ...
-h, --help

show this help message and exit

--version

show program’s version number and exit

output files:
clone-pass
database with assigned clonal group numbers.
clone-fail
database with records failing clonal grouping.
required fields:

SEQUENCE_ID, V_CALL or V_CALL_GENOTYPED, D_CALL, J_CALL, JUNCTION

<field>
sequence field specified by the --sf parameter
output fields:
CLONE

DefineClones ademokun2011

Defines clones by the method specified in Ademokun, 2011.

usage: DefineClones ademokun2011 [-h] -d DB_FILES [DB_FILES ...] [--failed]
                                     [--log LOG_FILE]
                                     [--delim DELIMITER DELIMITER DELIMITER]
                                     [--nproc NPROC] [--outdir OUT_DIR]
                                     [--outname OUT_NAME]
-h, --help

show this help message and exit

-d <db_files>

A list of tab-delimited database files.

--failed

If specified, create files containing records that fail processing.

--log <log_file>

Specify to write verbose logging to a file. May not be specified with multiple input files.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.
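As a rough illustration of how three delimiters can partition an annotation string, the sketch below splits a header into blocks, then field names and values, then multiple values within a field. The delimiter characters (`|`, `=`, `,`), the example header, and the function name `parse_annotation` are assumptions chosen for illustration; they are not taken from this page.

```python
def parse_annotation(header, delimiters=("|", "=", ",")):
    """Split a header into an ID plus a dict of field name -> list of values.

    The three delimiters are, in order: between annotation blocks,
    between a field name and its value, and between values in a field.
    The defaults here are assumed for the example only.
    """
    block_delim, field_delim, value_delim = delimiters
    blocks = header.split(block_delim)
    record = {"ID": blocks[0]}  # the first block is treated as the identifier
    for block in blocks[1:]:
        name, _, value = block.partition(field_delim)
        record[name] = value.split(value_delim)
    return record


example = parse_annotation("SEQ1|PRIMER=IgG,IgM|COUNT=5")
print(example)
```

Passing a different triple, for example `(";", ":", "/")`, changes how the same kind of string is partitioned, which is the role `--delim` plays on the command line.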

--nproc <nproc>

The number of simultaneous computational processes to execute (CPU cores to utilize).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

DefineClones bygroup

Defines clones as having the same V assignment, J assignment, and junction length, using the specified substitution distance model.

usage: DefineClones bygroup [-h] -d DB_FILES [DB_FILES ...] [--failed]
                                [--log LOG_FILE]
                                [--delim DELIMITER DELIMITER DELIMITER]
                                [--nproc NPROC] [--outdir OUT_DIR]
                                [--outname OUT_NAME] [-f FIELDS [FIELDS ...]]
                                [--mode {allele,gene}] [--act {first,set}]
                                [--model {ham,aa,hh_s1f,hh_s5f,mk_rs1nf,mk_rs5nf,hs1f_compat,m1n_compat}]
                                [--dist DISTANCE] [--norm {len,mut,none}]
                                [--sym {avg,min}]
                                [--link {single,average,complete}]
                                [--sf SEQ_FIELD]
-h, --help

show this help message and exit

-d <db_files>

A list of tab-delimited database files.

--failed

If specified, create files containing records that fail processing.

--log <log_file>

Specify to write verbose logging to a file. May not be specified with multiple input files.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.

--nproc <nproc>

The number of simultaneous computational processes to execute (CPU cores to utilize).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

-f <fields>

Additional (non-VDJ) fields to use for grouping clones.

--mode {allele,gene}

Specifies whether to use the V(D)J allele or gene for initial grouping.

--act {first,set}

Specifies how to handle multiple V(D)J assignments for initial grouping.
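The interaction of `--mode` and `--act` during initial grouping can be pictured with a small sketch. It assumes that multiple assignments are comma-separated and that the allele is the suffix after `*` (for example `IGHV1-69*01`); both are assumptions for this example, as are the helper names `call_key` and `group_key`. It is not the tool's implementation.

```python
def call_key(call, mode="gene", act="first"):
    """Reduce a V or J call string to the key used for grouping.

    mode: "allele" keeps the full call, "gene" drops the "*allele" suffix.
    act:  "first" uses only the first assignment, "set" uses all of them.
    """
    calls = [c.strip() for c in call.split(",")]
    if mode == "gene":
        calls = [c.split("*")[0] for c in calls]
    if act == "first":
        return calls[0]
    # "set": an order-independent, hashable collection of unique calls
    return tuple(sorted(set(calls)))


def group_key(record, mode="gene", act="first"):
    """Initial grouping key: V assignment, J assignment, junction length."""
    return (
        call_key(record["V_CALL"], mode, act),
        call_key(record["J_CALL"], mode, act),
        len(record["JUNCTION"]),
    )


# Hypothetical record using the required field names listed above.
record = {
    "V_CALL": "IGHV1-69*01,IGHV1-69*06",
    "J_CALL": "IGHJ4*02",
    "JUNCTION": "TGTGCGAGAGATTACTGG",
}
print(group_key(record, mode="gene", act="first"))
print(group_key(record, mode="allele", act="set"))
```

Records that share a key would land in the same initial group, and the distance threshold is then applied within each group.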

--model {ham,aa,hh_s1f,hh_s5f,mk_rs1nf,mk_rs5nf,hs1f_compat,m1n_compat}

Specifies which substitution model to use for calculating distance between sequences. The “ham” model is nucleotide Hamming distance and “aa” is amino acid Hamming distance. The “hh_s1f” and “hh_s5f” models are human-specific single nucleotide and 5-mer content models, respectively, from Yaari et al, 2013. The “mk_rs1nf” and “mk_rs5nf” models are mouse-specific single nucleotide and 5-mer content models, respectively, from Cui et al, 2016. The “m1n_compat” and “hs1f_compat” models are deprecated models provided for backwards compatibility with the “m1n” and “hs1f” models in Change-O v0.3.3 and SHazaM v0.1.4. Both 5-mer models should be considered experimental.
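For orientation, the simplest of these models, nucleotide Hamming distance, counts the positions at which two equal-length sequences differ. The sketch below is a plain textbook version; it does not treat ambiguous characters such as `N` specially, which the actual tool may do, and the function name `hamming` is chosen for this example.

```python
def hamming(seq_a, seq_b):
    """Count mismatched positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Two hypothetical junction fragments differing at one position.
d = hamming("TGTGCGAGA", "TGTGCAAGA")
print(d)
```

The equal-length requirement matches the initial grouping step, which only compares sequences with the same junction length.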

--dist <distance>

The distance threshold for clonal grouping.

--norm {len,mut,none}

Specifies how to normalize distances. One of none (do not normalize), len (normalize by length), or mut (normalize by number of mutations between sequences).
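One way to read the three normalization choices is sketched below. The function name `normalize` and the handling of the zero-mutation case are assumptions made for this example; the tool's own behavior in edge cases is not described here.

```python
def normalize(distance, seq_a, seq_b, norm="len"):
    """Scale a raw distance according to the chosen normalization."""
    if norm == "none":
        return distance
    if norm == "len":
        return distance / len(seq_a)
    if norm == "mut":
        mutations = sum(a != b for a, b in zip(seq_a, seq_b))
        # Assumed convention: identical sequences have distance zero.
        return distance / mutations if mutations else 0.0
    raise ValueError(f"unknown normalization: {norm}")


# Hypothetical sequences of length 12 that differ at 3 positions.
a = "AAAACCCCGGGG"
b = "AAATCCCTGGGT"
print(normalize(3, a, b, "none"))
print(normalize(3, a, b, "len"))
print(normalize(1.5, a, b, "mut"))
```

With length normalization, a threshold passed to `--dist` is a fraction of the sequence length rather than an absolute count, so the two settings need to be chosen together.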

--sym {avg,min}

Specifies how to combine asymmetric distances. One of avg (average of A->B and B->A) or min (minimum of A->B and B->A).
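Some substitution models can give a different value for A->B than for B->A, so the two directions have to be reduced to one number before clustering. A minimal sketch of the two options, with a hypothetical function name `symmetrize`:

```python
def symmetrize(dist_ab, dist_ba, sym="avg"):
    """Combine the two directed distances into one symmetric value."""
    if sym == "avg":
        return (dist_ab + dist_ba) / 2
    if sym == "min":
        return min(dist_ab, dist_ba)
    raise ValueError(f"unknown symmetry option: {sym}")


print(symmetrize(1.0, 3.0, "avg"))
print(symmetrize(1.0, 3.0, "min"))
```

For a symmetric model such as Hamming distance the two directions are equal, so either option returns the same value.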

--link {single,average,complete}

Type of linkage to use for hierarchical clustering.
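The effect of the linkage choice can be seen in a small pure-Python sketch of agglomerative clustering cut at a distance threshold. It assumes length-normalized Hamming distance and an inclusive threshold comparison; the function names and example sequences are hypothetical, and the tool's own clustering code may differ.

```python
from itertools import combinations


def norm_hamming(a, b):
    """Hamming distance divided by sequence length."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# How the pairwise distances between two clusters are summarized.
LINKAGE = {
    "single": min,
    "complete": max,
    "average": lambda values: sum(values) / len(values),
}


def cluster(seqs, threshold, link="single"):
    """Merge the closest clusters until none are within the threshold."""
    combine = LINKAGE[link]
    clusters = [[s] for s in seqs]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            dists = [norm_hamming(a, b) for a in clusters[i] for b in clusters[j]]
            d = combine(dists)
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # the closest remaining pair is too far apart
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


seqs = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAATT", "CCCCCCCCCC"]
single = cluster(seqs, threshold=0.15, link="single")
complete = cluster(seqs, threshold=0.15, link="complete")
print(len(single), len(complete))
```

Single linkage chains the first three sequences together because each is close to its neighbor, while complete linkage keeps the third apart because it is too far from the first. This is why the same `--dist` value can yield larger clones under single linkage.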

--sf <seq_field>

The name of the field to be used to calculate distance between records.
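To show how the bygroup options fit together, the sketch below assembles an argument list from Python using only flags documented above. The input file name `sample_db-pass.tab` and the threshold `0.16` are placeholders, not recommended values, and the command is only printed, not executed.

```python
import shlex

# Placeholder input file and threshold; adjust for real data.
cmd = [
    "DefineClones", "bygroup",
    "-d", "sample_db-pass.tab",
    "--mode", "gene",
    "--act", "set",
    "--model", "ham",
    "--norm", "len",
    "--dist", "0.16",
    "--link", "single",
    "--sf", "JUNCTION",
    "--failed",
]
command_line = shlex.join(cmd)
print(command_line)
```

Keeping the arguments in a list makes it easy to hand the same invocation to `subprocess.run(cmd, check=True)` once the tool is installed.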

DefineClones chen2010

Defines clones by the method specified in Chen, 2010.

usage: DefineClones chen2010 [-h] -d DB_FILES [DB_FILES ...] [--failed]
                                 [--log LOG_FILE]
                                 [--delim DELIMITER DELIMITER DELIMITER]
                                 [--nproc NPROC] [--outdir OUT_DIR]
                                 [--outname OUT_NAME]
-h, --help

show this help message and exit

-d <db_files>

A list of tab-delimited database files.

--failed

If specified, create files containing records that fail processing.

--log <log_file>

Specify to write verbose logging to a file. May not be specified with multiple input files.

--delim <delimiter>

A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.

--nproc <nproc>

The number of simultaneous computational processes to execute (CPU cores to utilize).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.