How are clusters created?
A combination of ESPRIT, SLP and mothur computes taxonomy-independent clusters (Operational Taxonomic Units, OTUs) from the total collection of available V6 sequences in VAMPS. The sequences are first binned into separate datasets for the Archaeal and Eukaryal domains, and into phylum-level datasets for Bacteria or class-level datasets for Proteobacteria. For each bin, the unique.seqs function in mothur eliminates duplicate sequences but retains the observed frequency of each unique read. The kmerdist module of ESPRIT (with default values) identifies all sequence pairs within each bin that are predicted to be at least 90% similar. The needledist module of ESPRIT then generates a sparse matrix of pairwise distances by performing a Needleman-Wunsch alignment on each of those sequence pairs and calculating the pairwise distances with quickdist. The SLP algorithm uses these pairwise distances to perform a modified single-linkage preclustering at 2% to reduce noise in the sequence data. Initially SLP orders sequen