BuildTrees
Converts TSV files into IgPhyML input files
usage: BuildTrees [--version] [-h] -d DB_FILES [DB_FILES ...]
                  [--outdir OUT_DIR] [--outname OUT_NAME] [--log LOG_FILE]
                  [--failed] [--format {changeo,airr}] [--collapse] [--ncdr3]
                  [--md META_DATA [META_DATA ...]]
                  [--clones TARGET_CLONES [TARGET_CLONES ...]]
                  [--minseq MIN_SEQ] [--sample SAMPLE_DEPTH]
                  [--append APPEND [APPEND ...]] [--igphyml] [--nproc NPROC]
                  [--clean {none,all}] [--optimize {n,r,l,lr,tl,tlr}]
                  [--omega {e,ce,e,e,ce,e,e,ce,ce,ce}] [-t {e,ce}]
                  [--motifs MOTIFS] [--hotness HOTNESS] [--oformat {tab,txt}]
                  [--nohlp]
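As an illustration of how the usage line maps onto an actual invocation, the following sketch assembles an argument list in Python using only flags that appear above. The helper name `build_command`, the file name `example_db.tsv`, and the executable name `BuildTrees.py` are assumptions made for the example; adjust them to match the local installation.

```python
import shlex


def build_command(db_files, outdir=None, collapse=False, minseq=None,
                  igphyml=False, nproc=None, executable="BuildTrees.py"):
    """Return an argument list for BuildTrees (hypothetical helper).

    Only flags shown in the usage line are emitted; the executable
    name is an assumption and may differ between installations.
    """
    if not db_files:
        raise ValueError("at least one database file is required for -d")
    cmd = [executable, "-d", *db_files]
    if outdir is not None:
        cmd += ["--outdir", outdir]
    if collapse:
        cmd.append("--collapse")
    if minseq is not None:
        cmd += ["--minseq", str(minseq)]
    if igphyml:
        cmd.append("--igphyml")
    if nproc is not None:
        cmd += ["--nproc", str(nproc)]
    return cmd


# Hypothetical input file; nothing is executed here, the command is only printed.
cmd = build_command(["example_db.tsv"], collapse=True, minseq=3,
                    igphyml=True, nproc=2)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the tool is installed; building a list rather than a single string avoids shell quoting problems with file names.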
--version
    show program’s version number and exit
-h, --help
    show this help message and exit
-d <db_files>
    A list of tab-delimited database files.
--outdir <out_dir>
    Changes the output directory to the location specified. The input file directory is used if this is not specified.
--outname <out_name>
    Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
--log <log_file>
    Specify to write verbose logging to a file. May not be specified with multiple input files.
--failed
    If specified, create files containing records that fail processing.
--format {changeo,airr}
    Specify input and output format.
--collapse
    If specified, collapse identical sequences before exporting to fasta.
--ncdr3
    If specified, remove CDR3 from all sequences.
--md <meta_data>
    List of fields containing metadata to include in output fasta file sequence headers.
--clones <target_clones>
    List of clone IDs to output, if specified.
--minseq <min_seq>
    Minimum number of data sequences. Any clones with fewer than the specified number of sequences will be excluded.
--sample <sample_depth>
    Depth of reads to be subsampled (before deduplication).
--append <append>
    List of columns to append to sequence ID to ensure uniqueness.
--igphyml
    If specified, run IgPhyML on the output.
--nproc <nproc>
    Number of threads to parallelize IgPhyML across.
--clean {none,all}
    Whether to delete intermediate files. none: leave all intermediate files; all: delete all intermediate files.
--optimize {n,r,l,lr,tl,tlr}
    Optimize a combination of topology (t), branch lengths (l), and parameters (r), or nothing (n), for IgPhyML.
--omega {e,ce,e,e,ce,e,e,ce,ce,ce}
    Omega parameters to estimate for FWR,CDR respectively: e = estimate, ce = estimate + confidence interval
-t {e,ce}
    Kappa parameters to estimate: e = estimate, ce = estimate + confidence interval
--motifs <motifs>
    Motifs for which to estimate mutability.
--hotness <hotness>
    Mutability parameters to estimate: e = estimate, ce = estimate + confidence interval
--oformat {tab,txt}
    IgPhyML output format.
--nohlp
    If specified, do not run the HLP model.
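To make the effect of the clone filters concrete, the sketch below shows one way the behavior described for --minseq and --clones could look: records are grouped by their CLONE value, and a clone is kept only if it has at least the minimum number of sequences and, when a target list is given, appears in that list. This is an illustration under those assumptions, not the tool's actual implementation; `select_clones` and the sample records are hypothetical.

```python
from collections import defaultdict


def select_clones(records, min_seq=1, target_clones=None):
    """Group records by CLONE and keep clones passing both filters.

    Hypothetical helper: mirrors the documented meaning of --minseq
    and --clones, not the internal logic of BuildTrees.
    """
    clones = defaultdict(list)
    for rec in records:
        clones[rec["CLONE"]].append(rec)

    kept = {}
    for clone_id, members in clones.items():
        # Skip clones that are not in the requested list, if one was given.
        if target_clones is not None and clone_id not in target_clones:
            continue
        # Exclude clones with fewer sequences than the minimum.
        if len(members) < min_seq:
            continue
        kept[clone_id] = members
    return kept


# Made-up records for illustration only.
records = [
    {"SEQUENCE_ID": "s1", "CLONE": "1"},
    {"SEQUENCE_ID": "s2", "CLONE": "1"},
    {"SEQUENCE_ID": "s3", "CLONE": "1"},
    {"SEQUENCE_ID": "s4", "CLONE": "2"},
]
kept = select_clones(records, min_seq=2)
print(sorted(kept))
```

With `min_seq=2`, clone "2" is dropped because it contains a single sequence, while clone "1" is retained with all three of its records.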
output files:
    <folder>
        folder containing fasta and partition files for each clone.
    lineages
        successfully processed records.
    lineages-fail
        database records that failed processing.
    igphyml-pass
        parameter estimates and lineage trees from running IgPhyML, if specified.
required fields:
    SEQUENCE_ID, SEQUENCE_INPUT, SEQUENCE_IMGT, GERMLINE_IMGT_D_MASK, V_CALL, J_CALL, CLONE, V_SEQ_START
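Before running the tool, it can be useful to confirm that an input file carries the required fields listed above. The following sketch reads the header row of a tab-delimited file and reports any missing names. The helper `missing_fields` and the in-memory sample are hypothetical, and the check covers only the field names exactly as written in this section.

```python
import csv
import io

# Field names copied from the "required fields" list above.
REQUIRED_FIELDS = [
    "SEQUENCE_ID", "SEQUENCE_INPUT", "SEQUENCE_IMGT", "GERMLINE_IMGT_D_MASK",
    "V_CALL", "J_CALL", "CLONE", "V_SEQ_START",
]


def missing_fields(handle, required=REQUIRED_FIELDS):
    """Return required column names absent from a TSV header row."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])  # an empty file yields an empty header
    present = set(header)
    return [name for name in required if name not in present]


# Made-up header with three required columns left out on purpose.
sample = io.StringIO("SEQUENCE_ID\tSEQUENCE_IMGT\tV_CALL\tJ_CALL\tCLONE\n")
print(missing_fields(sample))
```

For a file on disk, the same function can be called as `missing_fields(open(path, newline=""))`; an empty result means every listed field is present in the header.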