ConvertDb

Parses tab delimited database files

usage: ConvertDb [--version] [-h]  ...
--version

show program’s version number and exit

-h, --help

show this help message and exit

output files:
airr
AIRR formatted database files.
changeo
Change-O formatted database files.
sequences
FASTA formatted sequences output from the subcommands fasta and clip.
genbank
Feature tables and FASTA files containing MiAIRR-compliant input for tbl2asn.
required fields:
SEQUENCE_ID, SEQUENCE_INPUT, JUNCTION, V_CALL, D_CALL, J_CALL, V_SEQ_START, V_SEQ_LENGTH, D_SEQ_START, D_SEQ_LENGTH, J_SEQ_START, J_SEQ_LENGTH, NP1_LENGTH, NP2_LENGTH, SEQUENCE_IMGT, V_GERM_START_IMGT, V_GERM_LENGTH_IMGT
optional fields:
GERMLINE_IMGT, GERMLINE_IMGT_D_MASK, CLONE, C_CALL
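
Before running a conversion, it can help to confirm that an input file carries every required field. The following is a minimal sketch of such a check and is not part of ConvertDb; the helper name `missing_required_fields` and the two-column example input are hypothetical.

```python
import csv
import io

# Required column names, copied from the "required fields" list above.
REQUIRED_FIELDS = [
    "SEQUENCE_ID", "SEQUENCE_INPUT", "JUNCTION", "V_CALL", "D_CALL", "J_CALL",
    "V_SEQ_START", "V_SEQ_LENGTH", "D_SEQ_START", "D_SEQ_LENGTH",
    "J_SEQ_START", "J_SEQ_LENGTH", "NP1_LENGTH", "NP2_LENGTH",
    "SEQUENCE_IMGT", "V_GERM_START_IMGT", "V_GERM_LENGTH_IMGT",
]


def missing_required_fields(handle):
    """Return the required columns absent from a tab-delimited file's header."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])  # an empty file yields an empty header
    present = set(header)
    return [name for name in REQUIRED_FIELDS if name not in present]


# Hypothetical input with only two of the required columns.
example = io.StringIO("SEQUENCE_ID\tV_CALL\nseq1\tIGHV1-69*01\n")
missing = missing_required_fields(example)
print(missing)
```

For a real file, pass an open file handle (for example `open(path, newline="")`) instead of the in-memory example.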

ConvertDb airr

Converts input to an AIRR TSV file.

usage: ConvertDb airr [--version] [-h] -d DB_FILES [DB_FILES ...]
                      [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                      [--outname OUT_NAME]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-d <db_files>

A list of tab delimited database files.

-o <out_files>

Explicit output file name. Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
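
The output arguments shared by the subcommands carry two documented restrictions: -o cannot be combined with --outdir or --outname, and --outname may not be used with multiple input files. As a rough sketch, a wrapper script could enforce these before invoking the tool. The helper `build_convertdb_args` and all file names below are hypothetical, and the sketch assumes the `ConvertDb` executable is on the PATH.

```python
import shlex


def build_convertdb_args(subcommand, db_files, out_files=None,
                         outdir=None, outname=None):
    """Assemble a ConvertDb argument list, enforcing the documented conflicts."""
    if out_files and (outdir or outname):
        raise ValueError("-o cannot be combined with --outdir or --outname")
    if outname and len(db_files) > 1:
        raise ValueError("--outname may not be used with multiple input files")
    args = ["ConvertDb", subcommand, "-d", *db_files]
    if out_files:
        args += ["-o", *out_files]
    if outdir:
        args += ["--outdir", outdir]
    if outname:
        args += ["--outname", outname]
    return args


# Hypothetical file and directory names.
args = build_convertdb_args("airr", ["sample_db-pass.tab"],
                            outdir="converted", outname="sample")
print(shlex.join(args))
```

The resulting list can be handed to `subprocess.run(args, check=True)` to perform the conversion.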

ConvertDb baseline

Creates a BASELINe fasta file from database records.

usage: ConvertDb baseline [--version] [-h] -d DB_FILES [DB_FILES ...]
                          [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                          [--outname OUT_NAME] [--if ID_FIELD]
                          [--sf SEQ_FIELD] [--gf GERM_FIELD]
                          [--cf CLUSTER_FIELD]
                          [--mf META_FIELDS [META_FIELDS ...]]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-d <db_files>

A list of tab delimited database files.

-o <out_files>

Explicit output file name. Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--if <id_field>

The name of the field containing identifiers

--sf <seq_field>

The name of the field containing reads

--gf <germ_field>

The name of the field containing germline sequences

--cf <cluster_field>

The name of the field containing sorted clone IDs

--mf <meta_fields>

List of annotation fields to add to the sequence description

ConvertDb changeo

Converts input into a Change-O TSV file.

usage: ConvertDb changeo [--version] [-h] -d DB_FILES [DB_FILES ...]
                         [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                         [--outname OUT_NAME]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-d <db_files>

A list of tab delimited database files.

-o <out_files>

Explicit output file name. Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

ConvertDb fasta

Creates a fasta file from database records.

usage: ConvertDb fasta [--version] [-h] -d DB_FILES [DB_FILES ...]
                       [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                       [--outname OUT_NAME] [--if ID_FIELD] [--sf SEQ_FIELD]
                       [--mf META_FIELDS [META_FIELDS ...]]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-d <db_files>

A list of tab delimited database files.

-o <out_files>

Explicit output file name. Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--if <id_field>

The name of the field containing identifiers

--sf <seq_field>

The name of the field containing sequences

--mf <meta_fields>

List of annotation fields to add to the sequence description
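
To make the roles of --if, --sf, and --mf concrete, the sketch below converts tab-delimited records to FASTA entries in plain Python. It only illustrates the idea and does not reproduce ConvertDb's implementation. The header layout (`>id|FIELD=value`) is an assumption, and the function name and example data are hypothetical.

```python
import csv
import io


def records_to_fasta(handle, id_field, seq_field, meta_fields=()):
    """Yield FASTA entries built from rows of a tab-delimited file."""
    for row in csv.DictReader(handle, delimiter="\t"):
        # Assumed header layout: identifier, then |FIELD=value per annotation.
        header = row[id_field]
        for name in meta_fields:
            header += "|{}={}".format(name, row[name])
        yield ">{}\n{}\n".format(header, row[seq_field])


# Hypothetical single-record input.
example = io.StringIO(
    "SEQUENCE_ID\tSEQUENCE_INPUT\tV_CALL\n"
    "seq1\tACGTACGT\tIGHV1-69*01\n"
)
fasta = "".join(
    records_to_fasta(example, "SEQUENCE_ID", "SEQUENCE_INPUT", ["V_CALL"])
)
print(fasta, end="")
```

Here `id_field` corresponds to --if, `seq_field` to --sf, and `meta_fields` to --mf.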

ConvertDb genbank

Creates files for GenBank/TLS submissions.

usage: ConvertDb genbank [--version] [-h] -d DB_FILES [DB_FILES ...]
                         [-o OUT_FILES [OUT_FILES ...]] [--outdir OUT_DIR]
                         [--outname OUT_NAME] [--format {changeo,airr}]
                         [--mol MOLECULE] [--product PRODUCT] [--db DB_XREF]
                         [--inf INFERENCE] [--organism ORGANISM] [--sex SEX]
                         [--isolate ISOLATE] [--tissue TISSUE]
                         [--cell-type CELL_TYPE] [-y YAML_CONFIG]
                         [--label LABEL] [--cf C_FIELD] [--nf COUNT_FIELD]
                         [--if INDEX_FIELD] [--allow-stop] [--asis-id]
                         [--asis-calls] [--allele-delim ALLELE_DELIM] [--asn]
                         [--sbt ASN_TEMPLATE] [--exec TBL2ASN_EXEC]
--version

show program’s version number and exit

-h, --help

show this help message and exit

-d <db_files>

A list of tab delimited database files.

-o <out_files>

Explicit output file name. Note, this argument cannot be used with the --failed, --outdir, or --outname arguments. If unspecified, then the output filename will be based on the input filename(s).

--outdir <out_dir>

Changes the output directory to the location specified. The input file directory is used if this is not specified.

--outname <out_name>

Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.

--format {changeo,airr}

Specify input and output format.

--mol <molecule>

The source molecule type. Usually one of “mRNA” or “genomic DNA”.

--product <product>

The product name, such as “immunoglobulin heavy chain”.

--db <db_xref>

Name of the reference database used for alignment. Usually “IMGT/GENE-DB”.

--inf <inference>

Name and version of the inference tool used for reference alignment in the form tool:version.

--organism <organism>

The scientific name of the organism.

--sex <sex>

If specified, adds the given sex annotation to the fasta headers.

--isolate <isolate>

If specified, adds the given isolate annotation (sample label) to the fasta headers.

--tissue <tissue>

If specified, adds the given tissue-type annotation to the fasta headers.

--cell-type <cell_type>

If specified, adds the given cell-type annotation to the fasta headers.

-y <yaml_config>

A YAML file specifying sample features (BioSample attributes) in the form ‘variable: value’. If specified, any features provided in the YAML file will override those provided at the command line. Note, this config file applies to sample features only and cannot be used for required source features such as the --product or --mol argument.
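
The override rule can be pictured as a dictionary merge in which the file's values are applied last. The sketch below is illustrative only: the feature names and values are hypothetical, and `parse_simple_yaml` is a stand-in that handles flat 'variable: value' lines rather than general YAML.

```python
def parse_simple_yaml(text):
    """Parse flat 'variable: value' lines; not a general YAML parser."""
    features = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(":")
        features[key.strip()] = value.strip()
    return features


# Hypothetical features given on the command line and in the config file.
cli_features = {"sex": "female", "tissue": "blood"}
yaml_text = "tissue: spleen\ncell_type: B cell\n"

# Later entries win, so values from the file override the command line.
merged = {**cli_features, **parse_simple_yaml(yaml_text)}
print(merged)
```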

--label <label>

If specified, add a field name to the sequence identifier. Sequence identifiers will be output in the form <label>=<id>.

--cf <c_field>

Field containing the C region call. If unspecified, the C region gene call will be excluded from the feature table.

--nf <count_field>

If specified, use the provided column to add the AIRR_READ_COUNT note to the feature table.

--if <index_field>

If specified, use the provided column to add the AIRR_CELL_INDEX note to the feature table.

--allow-stop

If specified, retain records in the output with stop codons in the junction region. In such records the CDS will be removed and replaced with a similar misc_feature in the feature table.
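
As an illustration of the condition this flag concerns, the sketch below tests a junction nucleotide sequence for an in-frame stop codon. It assumes the reading frame starts at the first nucleotide of the junction, which may not match how ConvertDb evaluates records; the function name and sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def junction_has_stop(junction):
    """Return True if any in-frame codon of the junction is a stop codon."""
    seq = junction.upper()
    # Assumption: frame 0 begins at the first nucleotide of the junction.
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return any(codon in STOP_CODONS for codon in codons)


# Hypothetical junction sequences.
print(junction_has_stop("TGTGCGAGATAG"))  # final codon is TAG
print(junction_has_stop("TGTGCGAGATGG"))  # no stop codon in frame
```

A stop codon that appears only out of frame, as in `TTGAGC`, is not reported by this check.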

--asis-id

If specified, use the existing sequence identifier for the output identifier. By default, only the row number will be used as the identifier to avoid the 50 character limit.

--asis-calls

Specify to prevent alleles from being parsed using the IMGT nomenclature. Note, this requires the gene assignments to be exact matches to valid records in the reference database specified by the --db argument.

--allele-delim <allele_delim>

The delimiter to use for splitting the gene name from the allele number. Note, this only applies when specifying --asis-calls. By default, this argument will be ignored and allele numbers extracted under the expectation of IMGT nomenclature consistency.
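
Splitting a gene call on a delimiter can be sketched as follows. This is a simplified illustration, not ConvertDb's parser: it uses `*` as the default delimiter because IMGT names such as `IGHV1-69*01` separate gene and allele that way, and the non-IMGT name in the second call is hypothetical.

```python
def split_allele(call, delim="*"):
    """Split a gene call into (gene, allele); allele is None if delim is absent."""
    gene, sep, allele = call.partition(delim)
    return (gene, allele) if sep else (call, None)


print(split_allele("IGHV1-69*01"))           # IMGT-style name
print(split_allele("VH1.69_01", delim="_"))  # hypothetical custom naming scheme
print(split_allele("IGHJ4"))                 # no allele present
```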

--asn

If specified, run tbl2asn to generate the .sqn submission file after making the .fsa and .tbl files.

--sbt <asn_template>

If provided along with --asn, use the specified file for the template file argument to tbl2asn.

--exec <tbl2asn_exec>

The name or location of the tbl2asn executable.