changeo.Distance

Distance calculations

changeo.Distance.calcDistances(sequences, n, dist_mat, sym='avg', norm=None)

Calculate pairwise distances between input sequences

Parameters
  • sequences – List of sequences for which to calculate pairwise distances

  • n – Length of n-mers to be used in calculating distance

  • dist_mat – pandas.DataFrame of mutation distances

  • norm – Normalization method. One of None, ‘len’, or ‘mut’.

  • sym – Symmetry method. One of ‘avg’ or ‘min’.

Returns

numpy matrix of pairwise distances between input sequences

Return type

ndarray
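
The effect of the sym option can be illustrated in isolation. The sketch below is not the changeo implementation. It assumes a hypothetical asymmetric matrix raw, where raw[i, j] is the distance measured from sequence i to sequence j, and shows how averaging or taking the minimum of the two directions would yield a symmetric result. The helper name symmetrize and the input values are made up for this example.

```python
import numpy as np

def symmetrize(raw, sym="avg"):
    """Combine raw[i, j] and raw[j, i] into one symmetric distance (illustrative helper)."""
    raw = np.asarray(raw, dtype=float)
    if sym == "avg":
        # Mean of the two directional distances
        return (raw + raw.T) / 2.0
    if sym == "min":
        # Smaller of the two directional distances
        return np.minimum(raw, raw.T)
    raise ValueError("sym must be 'avg' or 'min'")

# Made-up directional distances between three sequences
raw = np.array([[0.0, 1.0, 4.0],
                [3.0, 0.0, 2.0],
                [2.0, 2.0, 0.0]])

avg = symmetrize(raw, "avg")
mn = symmetrize(raw, "min")
```

With these inputs, the pair (0, 1) has directional distances 1 and 3, so ‘avg’ gives 2 and ‘min’ gives 1.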

changeo.Distance.formClusters(dists, link, distance)

Form clusters based on hierarchical clustering of input distance matrix with linkage type and cutoff distance

Parameters
  • dists – numpy matrix of distances

  • link – Linkage type for hierarchical clustering

  • distance – Distance at which to cut into clusters

Returns

List of cluster assignments

Return type

list
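
As a rough illustration of cutting a hierarchical clustering at a fixed distance, the following sketch uses SciPy directly. Whether formClusters uses these exact calls internally is an assumption. The distance values, the ‘single’ linkage choice, and the 0.5 cutoff are made up for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Symmetric distance matrix with a zero diagonal for four sequences (made-up values)
dists = np.array([[0.0, 0.1, 0.9, 0.8],
                  [0.1, 0.0, 0.85, 0.9],
                  [0.9, 0.85, 0.0, 0.2],
                  [0.8, 0.9, 0.2, 0.0]])

# linkage() expects the condensed (upper-triangle) form of the matrix
condensed = squareform(dists)
links = linkage(condensed, method="single")

# Cut the tree so that members of a flat cluster have a
# cophenetic distance no greater than 0.5
clusters = fcluster(links, t=0.5, criterion="distance")
```

Here sequences 0 and 1 fall into one cluster and sequences 2 and 3 into another, because the two pairs only merge at a distance of 0.8, which is above the cutoff.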

changeo.Distance.getAADistMatrix(mat=None, mask_dist=0, gap_dist=0)

Generates an amino acid distance matrix

Parameters
  • mat – Input distance matrix to extend to full alphabet; if unspecified, creates Hamming distance matrix that incorporates IUPAC equivalencies

  • mask_dist – Score for all matches against an X character

  • gap_dist – Score for all matches against a gap (-, .) character

Returns

pandas.DataFrame of distances

Return type

DataFrame

changeo.Distance.getDNADistMatrix(mat=None, mask_dist=0, gap_dist=0)

Generates a DNA distance matrix

Parameters
  • mat – Input distance matrix to extend to full alphabet; if unspecified, creates Hamming distance matrix that incorporates IUPAC equivalencies

  • mask_dist – Distance for all matches against an N character

  • gap_dist – Distance for all matches against a gap (-, .) character

Returns

pandas.DataFrame of distances

Return type

DataFrame
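
To make the idea of a Hamming distance matrix with IUPAC equivalencies concrete, here is a minimal sketch built with pandas. It is not the changeo code. The function name dna_hamming_matrix is hypothetical, and the choices that two codes are at distance 0 when they share at least one base, and that a gap takes precedence over N, are assumptions made for this example.

```python
from itertools import product

import pandas as pd

# IUPAC nucleotide codes mapped to the bases they can stand for
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
}
GAPS = ("-", ".")

def dna_hamming_matrix(mask_dist=0, gap_dist=0):
    """Build a character-by-character distance table (illustrative helper)."""
    chars = list(GAPS) + list(IUPAC) + ["N"]
    mat = pd.DataFrame(0.0, index=chars, columns=chars)
    for a, b in product(chars, repeat=2):
        if a in GAPS or b in GAPS:
            # Any comparison involving a gap character
            mat.at[a, b] = gap_dist
        elif a == "N" or b == "N":
            # Any comparison involving the mask character
            mat.at[a, b] = mask_dist
        else:
            # Distance 0 when the two codes share at least one base, else 1
            mat.at[a, b] = 0.0 if set(IUPAC[a]) & set(IUPAC[b]) else 1.0
    return mat

mat = dna_hamming_matrix(mask_dist=0, gap_dist=0)
```

In this sketch, mat.at["A", "R"] is 0 because R can stand for A or G, while mat.at["C", "R"] is 1.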

changeo.Distance.getNmers(sequences, n)

Breaks input sequences down into n-mers

Parameters
  • sequences – List of sequences to be broken into n-mers

  • n – Length of n-mers to return

Returns

Dictionary mapping sequence to a list of n-mers

Return type

dict
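
The returned structure can be pictured with a small stand-alone sketch. The helper sliding_nmers is hypothetical and uses plain overlapping windows. Whether and how getNmers pads sequence ends is not stated here, so the sketch does no padding and its output may differ from the library's.

```python
def sliding_nmers(sequences, n):
    """Map each sequence to its overlapping windows of length n (illustrative helper)."""
    return {
        seq: [seq[i:i + n] for i in range(len(seq) - n + 1)]
        for seq in sequences
    }

nmers = sliding_nmers(["ACGTA", "GGC"], 3)
# nmers["ACGTA"] is ["ACG", "CGT", "GTA"]; nmers["GGC"] is ["GGC"]
```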

changeo.Distance.zip_equal(*iterables)

Zips iterables and raises exception if different lengths

Parameters

iterables – Iterables to zip together, passed as separate positional arguments

Returns

A generator of tuples with combined elements from the iterables

Return type

iter
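
One common way to get this behavior is to zip with a sentinel fill value and raise as soon as the sentinel appears. The sketch below shows that pattern. The name zip_equal_sketch and the choice of ValueError are assumptions, since the description above only says that an exception is raised.

```python
from itertools import zip_longest

def zip_equal_sketch(*iterables):
    """Yield tuples like zip(), but fail if the inputs differ in length (illustrative)."""
    sentinel = object()
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        # A sentinel in the tuple means at least one iterable ran out early
        if any(item is sentinel for item in combo):
            raise ValueError("Iterables have different lengths")
        yield combo

pairs = list(zip_equal_sketch("ACG", [1, 2, 3]))
# pairs is [("A", 1), ("C", 2), ("G", 3)]
```

Because the function is a generator, the length check happens lazily: a mismatch is only reported once iteration reaches the end of the shortest input.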