changeo.Distance

Distance calculations

changeo.Distance.calcDistances(sequences, n, dist_mat, sym='avg', norm=None)

Calculate pairwise distances between input sequences

Parameters:
  • sequences – List of sequences for which to calculate pairwise distances
  • n – Length of n-mers to be used in calculating distance
  • dist_mat – pandas.DataFrame of mutation distances
  • sym – Symmetry method; one of ‘avg’ or ‘min’.
  • norm – Normalization method; one of None, ‘len’, or ‘mut’.
Returns:

numpy matrix of pairwise distances between input sequences

Return type:

ndarray
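As a rough illustration of how the norm option can change the result, here is a minimal sketch. It is not the changeo implementation. It assumes a plain dict-of-dicts character distance table instead of a pandas.DataFrame, ignores n-mers and the sym option, and assumes that ‘len’ divides by sequence length and ‘mut’ divides by the number of mismatched positions. The name pairwise_distances_sketch is hypothetical.

```python
from itertools import combinations


def pairwise_distances_sketch(sequences, dist, norm=None):
    """Pairwise distances over mismatched positions (illustrative only)."""
    size = len(sequences)
    dists = [[0.0] * size for _ in range(size)]
    for j, k in combinations(range(size), 2):
        seq1, seq2 = sequences[j], sequences[k]
        # Only positions where the two sequences differ contribute
        mutated = [(a, b) for a, b in zip(seq1, seq2) if a != b]
        total = sum(dist[a][b] for a, b in mutated)
        # Assumed meaning of the normalization options
        if norm == 'len':
            norm_by = len(seq1)
        elif norm == 'mut':
            norm_by = max(len(mutated), 1)
        else:
            norm_by = 1
        dists[j][k] = dists[k][j] = total / norm_by
    return dists


# Simple Hamming table over the four unambiguous nucleotides
hamming = {a: {b: 0 if a == b else 1 for b in 'ACGT'} for a in 'ACGT'}
seqs = ['ACGT', 'ACGA', 'TCGA']
raw = pairwise_distances_sketch(seqs, hamming)
by_len = pairwise_distances_sketch(seqs, hamming, norm='len')
```

The sketch assumes all input sequences have the same length; zip silently truncates otherwise.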

changeo.Distance.formClusters(dists, link, distance)

Form clusters by hierarchical clustering of the input distance matrix, using the given linkage type and cutoff distance

Parameters:
  • dists – numpy matrix of distances
  • link – Linkage type for hierarchical clustering
  • distance – Distance at which to cut into clusters
Returns:

List of cluster assignments

Return type:

list
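To make the idea of cutting a hierarchical clustering at a distance concrete, the sketch below covers only the single-linkage case, where cutting at a cutoff is equivalent to joining any two items whose distance is at or below that cutoff and taking the connected groups. It is a standard-library illustration, not the changeo implementation. The function name and the numbering of cluster labels (from 1, in order of first appearance) are assumptions.

```python
def single_linkage_clusters_sketch(dists, distance):
    """Cluster labels for single linkage cut at `distance` (illustrative only)."""
    size = len(dists)
    parent = list(range(size))

    def find(i):
        # Follow parent links to the group representative, compressing the path
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair that lies within the cutoff distance
    for j in range(size):
        for k in range(j + 1, size):
            if dists[j][k] <= distance:
                parent[find(j)] = find(k)

    # Number the groups from 1 in order of first appearance (assumed convention)
    labels = {}
    return [labels.setdefault(find(i), len(labels) + 1) for i in range(size)]


dists = [[0, 1, 5],
         [1, 0, 6],
         [5, 6, 0]]
clusters = single_linkage_clusters_sketch(dists, 2)
```

Other linkage types (for example complete or average) do not reduce to this shortcut and need a full agglomerative procedure.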

changeo.Distance.getAADistMatrix(mat=None, mask_dist=0, gap_dist=0)

Generates an amino acid distance matrix

Parameters:
  • mat – Input distance matrix to extend to full alphabet; if unspecified, creates Hamming distance matrix that incorporates IUPAC equivalencies
  • mask_dist – Score for all matches against an X character
  • gap_dist – Score for all matches against a gap (-, .) character
Returns:

pandas.DataFrame of distances

Return type:

DataFrame

changeo.Distance.getDNADistMatrix(mat=None, mask_dist=0, gap_dist=0)

Generates a DNA distance matrix

Parameters:
  • mat – Input distance matrix to extend to full alphabet; if unspecified, creates Hamming distance matrix that incorporates IUPAC equivalencies
  • mask_dist – Distance for all matches against an N character
  • gap_dist – Distance for all matches against a gap (-, .) character
Returns:

pandas.DataFrame of distances

Return type:

DataFrame
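The default behavior described above, a Hamming matrix that respects IUPAC equivalencies, can be sketched as follows. This is an illustration under stated assumptions, not the changeo code: it returns a nested dict rather than a pandas.DataFrame, the name dna_dist_sketch is hypothetical, and the choice to let gap_dist take precedence over mask_dist when a gap meets an N is an assumption.

```python
# Standard IUPAC nucleotide codes mapped to the bases they can represent
IUPAC = {
    'A': 'A', 'C': 'C', 'G': 'G', 'T': 'T',
    'R': 'AG', 'Y': 'CT', 'S': 'CG', 'W': 'AT', 'K': 'GT', 'M': 'AC',
    'B': 'CGT', 'D': 'AGT', 'H': 'ACT', 'V': 'ACG',
}
GAPS = ('-', '.')


def dna_dist_sketch(mask_dist=0, gap_dist=0):
    """Nested-dict Hamming distances with IUPAC equivalencies (illustrative only)."""
    chars = list(IUPAC) + ['N'] + list(GAPS)
    dist = {}
    for a in chars:
        dist[a] = {}
        for b in chars:
            if a in GAPS or b in GAPS:
                d = gap_dist      # any comparison against a gap character
            elif a == 'N' or b == 'N':
                d = mask_dist     # any comparison against N
            elif set(IUPAC[a]) & set(IUPAC[b]):
                d = 0             # codes share at least one possible base
            else:
                d = 1             # no base in common
            dist[a][b] = d
    return dist


dna = dna_dist_sketch(mask_dist=0, gap_dist=2)
```

Here dna['A']['R'] is 0 because R stands for A or G, while dna['A']['Y'] is 1 because Y stands for C or T.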

changeo.Distance.getNmers(sequences, n)

Breaks input sequences down into n-mers

Parameters:
  • sequences – List of sequences to be broken into n-mers
  • n – Length of n-mers to return
Returns:

Dictionary mapping sequence to a list of n-mers

Return type:

dict
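A minimal sketch of the sequence-to-n-mers mapping described above, using a plain sliding window. It is not the changeo implementation: the real function may treat sequence ends differently (for example by padding so that each n-mer is centered on a position), and the name get_nmers_sketch is hypothetical.

```python
def get_nmers_sketch(sequences, n):
    """Map each sequence to the list of its overlapping n-mers (illustrative only)."""
    return {
        seq: [seq[i:i + n] for i in range(len(seq) - n + 1)]
        for seq in sequences
    }


nmers = get_nmers_sketch(['ACGTA', 'GGC'], 3)
# nmers['ACGTA'] is ['ACG', 'CGT', 'GTA']; nmers['GGC'] is ['GGC']
```

Because the result is keyed by sequence, duplicate input sequences collapse into a single entry.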

changeo.Distance.zip_equal(*iterables)

Zips iterables and raises an exception if they differ in length

Parameters:
  • iterables – Iterables to zip together
Returns:

Generator of tuples combining the elements of the iterables

Return type:

iter
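One common way to get this behavior with the standard library is to zip with a sentinel fill value and raise as soon as the sentinel appears. The sketch below shows that pattern. It is not the changeo source; the name zip_equal_sketch and the use of ValueError as the exception type are assumptions.

```python
from itertools import zip_longest

# Unique marker that cannot collide with real data
_SENTINEL = object()


def zip_equal_sketch(*iterables):
    """Yield tuples like zip, but fail if the inputs differ in length."""
    for combo in zip_longest(*iterables, fillvalue=_SENTINEL):
        if any(item is _SENTINEL for item in combo):
            raise ValueError('Iterables have different lengths')
        yield combo


pairs = list(zip_equal_sketch('AC', 'GT'))
# pairs is [('A', 'G'), ('C', 'T')]
```

Because this is a generator, the length check happens lazily: the error is raised only when iteration reaches the end of the shortest input.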