Parsing 10X Genomics V(D)J data

Example data

10X Genomics provides an example data set of Ig V(D)J processed by the Cell Ranger pipeline, which is available for download from their Single Cell Immune Profiling support site.

Converting 10X V(D)J data into Change-O format

To process 10X V(D)J data, a combination of AssignGenes and MakeDb can be used to generate a Change-O compliant TSV file that incorporates annotation information provided by the Cell Ranger pipeline. The --10x filtered_contig_annotations.csv argument specifies the path to the contig annotations file generated by cellranger vdj, which can be found in the outs directory.

Generate Change-O formatted data from the 10X V(D)J FASTA files using the steps below:

AssignGenes.py igblast -s filtered_contig.fasta -b ~/share/igblast \
   --organism human --loci ig --format blast
MakeDb.py igblast -i filtered_contig_igblast.fmt7 -s filtered_contig.fasta \
   -r IMGT_Human_*.fasta --10x filtered_contig_annotations.csv \
   --regions --scores

To process all contigs rather than only the filtered set, substitute all_contig.fasta for filtered_contig.fasta and all_contig_annotations.csv for filtered_contig_annotations.csv.

Warning

The resulting table overwrites the V, D and J segment assignments generated by Cell Ranger with those generated by IgBLAST, IMGT or iHMMuneAlign.

See also

To process mouse data and/or TCR data, alter the --organism and --loci arguments to AssignGenes accordingly (e.g., --organism mouse, --loci tcr) and use the appropriate V, D and J IMGT reference databases (e.g., IMGT_Mouse_TR*.fasta).

See the IgBLAST usage guide for further details regarding the setup and use of IgBLAST with Change-O.

Joining Change-O data with 10X V(D)J annotations

Change-O compliant TSV files can also be merged with 10X annotations without the use of MakeDb. This approach involves a simple join operation in which columns specific to 10X are given the suffix _10X. For instance, the Cell Ranger V, D and J segment assignments are retained using this approach (as V_GENE_10X, D_GENE_10X, J_GENE_10X).

A simple script to perform the merge, merge10x.py, is provided on the Immcantation Bitbucket repository in the scripts directory. The following example shows how to merge the Change-O file (10x_igblast_db-pass.tab) with a 10X annotation file (filtered_contig_annotations.csv) and save it as 10x_annotated_db-pass.tab:

merge10x.py 10x_igblast_db-pass.tab filtered_contig_annotations.csv 10x_annotated_db-pass.tab
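The join described above can be illustrated with a minimal sketch using only the Python standard library. This is not the code of merge10x.py. It assumes that the Change-O SEQUENCE_ID column matches the contig_id column of the 10X annotation file, and that 10X column names are upper-cased before the _10X suffix is added; the function names join_10x and read_table are hypothetical.

```python
import csv
import io

def join_10x(changeo_rows, tenx_rows, changeo_key="SEQUENCE_ID", tenx_key="contig_id"):
    """Left-join 10X annotation columns onto Change-O rows with a _10X suffix."""
    # Index the 10X annotations by contig identifier (assumed join key).
    tenx_by_id = {row[tenx_key]: row for row in tenx_rows}
    # 10X columns to append, excluding the join key itself.
    tenx_fields = [f for f in (tenx_rows[0].keys() if tenx_rows else []) if f != tenx_key]
    merged = []
    for row in changeo_rows:
        out = dict(row)
        match = tenx_by_id.get(row[changeo_key], {})
        for field in tenx_fields:
            # Records without a 10X match receive empty values.
            out[field.upper() + "_10X"] = match.get(field, "")
        merged.append(out)
    return merged

def read_table(text, delimiter):
    """Parse delimited text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

# Toy inputs standing in for the Change-O TSV and the 10X annotation CSV.
changeo_text = "SEQUENCE_ID\tV_CALL\ncontig_1\tIGHV1-2*02\ncontig_2\tIGKV1-5*03\n"
tenx_text = "contig_id,v_gene,d_gene,j_gene\ncontig_1,IGHV1-2,IGHD3-10,IGHJ4\n"
merged = join_10x(read_table(changeo_text, "\t"), read_table(tenx_text, ","))
```

A left join is used here so that every Change-O record is kept, whether or not a matching 10X annotation exists.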

Identifying clones from B cells in Change-O format 10X V(D)J data

To group B cells into clones from Change-O format data, the output from MakeDb must be parsed into a light chain Change-O file and a heavy chain Change-O file:

ParseDb.py select -d 10x_igblast_db-pass.tab -f LOCUS -u "IGH" \
        --logic all --regex --outname heavy
ParseDb.py select -d 10x_igblast_db-pass.tab -f LOCUS -u "IG[LK]" \
        --logic all --regex --outname light

The heavy chain file must then be clonally clustered separately. See below for further details.
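As a rough illustration of what the two selections above achieve, the following sketch filters records by matching a regular expression against the LOCUS column. It is not the ParseDb implementation; the function name select_rows and the toy records are hypothetical, and the use of a substring-style search is an assumption.

```python
import re

def select_rows(rows, field, pattern):
    """Keep rows whose value in `field` matches the regular expression."""
    regex = re.compile(pattern)
    return [row for row in rows if regex.search(row.get(field, ""))]

# Toy records standing in for rows of 10x_igblast_db-pass.tab.
rows = [
    {"SEQUENCE_ID": "c1", "LOCUS": "IGH"},
    {"SEQUENCE_ID": "c2", "LOCUS": "IGK"},
    {"SEQUENCE_ID": "c3", "LOCUS": "IGL"},
    {"SEQUENCE_ID": "c4", "LOCUS": "TRB"},
]

heavy = select_rows(rows, "LOCUS", "IGH")     # heavy chain records
light = select_rows(rows, "LOCUS", "IG[LK]")  # kappa and lambda light chain records
```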

See also

See Assigning clones for further details on clustering the heavy chain output.

DefineClones currently does not support light chain cloning. However, light chain cloning can be performed after heavy chain cloning using light_cluster.py, provided on the Immcantation Bitbucket repository in the scripts directory:

light_cluster.py -d heavy_select-pass_clone-pass.tab -e light_select-pass.tab \
        -o 10X_clone-pass.tab

Here, heavy_select-pass_clone-pass.tab refers to the cloned heavy chain Change-O format file, light_select-pass.tab refers to the light chain Change-O format file, and 10X_clone-pass.tab is the resulting output file.

By default, light_cluster.py expects the Change-O columns V_CALL, J_CALL, JUNCTION_LENGTH, UMICOUNT, CELL, and CLONE. To process AIRR Rearrangement data (columns v_call, j_call, junction_length, umi_count, cell_id and clone_id), add the --format airr argument:

light_cluster.py -d heavy_select-pass_clone-pass.tab -e light_select-pass.tab \
        -o 10X_clone-pass.tab --format airr
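Before running the script, it can help to confirm that an input file carries the column names for the chosen format. The sketch below is a hypothetical pre-flight check, not part of light_cluster.py; the two column lists are taken from the text above, while the function name missing_columns is an assumption.

```python
# Column names as listed for each format.
CHANGEO_COLUMNS = ["V_CALL", "J_CALL", "JUNCTION_LENGTH", "UMICOUNT", "CELL", "CLONE"]
AIRR_COLUMNS = ["v_call", "j_call", "junction_length", "umi_count", "cell_id", "clone_id"]

def missing_columns(header, fmt="changeo"):
    """Return the expected columns that are absent from a file header."""
    required = AIRR_COLUMNS if fmt == "airr" else CHANGEO_COLUMNS
    return [column for column in required if column not in header]

# Example: a Change-O style header checked against both formats.
header = ["SEQUENCE_ID", "V_CALL", "J_CALL", "JUNCTION_LENGTH", "UMICOUNT", "CELL", "CLONE"]
changeo_missing = missing_columns(header, "changeo")
airr_missing = missing_columns(header, "airr")
```

An empty result for one format and a full list for the other indicates which value of --format matches the file.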

The algorithm will (1) remove cells associated with more than one heavy chain and (2) correct heavy chain clone definitions based on an analysis of the light chain partners associated with the heavy chain clone.
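The two steps can be sketched as follows. This is a simplified illustration, not the light_cluster.py implementation. Step 1 follows the description directly. For step 2 the exact correction rule is not given here, so the sketch assumes one plausible rule: within each heavy chain clone, cells are split by the V gene, J gene and junction length of their light chain with the highest UMICOUNT. All function names are hypothetical.

```python
from collections import Counter, defaultdict

def drop_multi_heavy_cells(heavy_rows):
    """Step 1: remove cells associated with more than one heavy chain."""
    counts = Counter(row["CELL"] for row in heavy_rows)
    return [row for row in heavy_rows if counts[row["CELL"]] == 1]

def gene(call):
    """Reduce an allele call such as 'IGKV1-5*03' to its gene name."""
    return call.split(",")[0].split("*")[0]

def split_clones_by_light(heavy_rows, light_rows):
    """Step 2 (assumed rule): split heavy chain clones by light chain partner."""
    # Keep one light chain per cell: the one with the highest UMI count (assumption).
    best = {}
    for row in light_rows:
        cell = row["CELL"]
        if cell not in best or int(row["UMICOUNT"]) > int(best[cell]["UMICOUNT"]):
            best[cell] = row
    subclone_ids = defaultdict(dict)
    result = []
    for row in heavy_rows:
        light = best.get(row["CELL"])
        # Cells without a light chain share the key None within their clone.
        key = None if light is None else (
            gene(light["V_CALL"]), gene(light["J_CALL"]), light["JUNCTION_LENGTH"])
        ids = subclone_ids[row["CLONE"]]
        if key not in ids:
            ids[key] = len(ids) + 1
        result.append(dict(row, CLONE=f'{row["CLONE"]}_{ids[key]}'))
    return result

# Toy data: cell C carries two heavy chains; A and B share a heavy chain clone.
heavy = [
    {"CELL": "A", "CLONE": "1"},
    {"CELL": "B", "CLONE": "1"},
    {"CELL": "C", "CLONE": "1"},
    {"CELL": "C", "CLONE": "2"},
]
light = [
    {"CELL": "A", "V_CALL": "IGKV1-5*03", "J_CALL": "IGKJ1*01", "JUNCTION_LENGTH": "33", "UMICOUNT": "5"},
    {"CELL": "B", "V_CALL": "IGLV2-14*01", "J_CALL": "IGLJ2*01", "JUNCTION_LENGTH": "30", "UMICOUNT": "3"},
]
filtered = drop_multi_heavy_cells(heavy)
corrected = split_clones_by_light(filtered, light)
```

In this toy example, cell C is dropped in step 1, and cells A and B end up in separate sub-clones because their light chains differ.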