Why does RAST complain that it cannot find the “phylogenetic neighborhood” of my submission?
• Usually, this is because the submitted sequence is too short, e.g., because you have submitted only a plasmid or a small fragment of a genome. RAST is designed for complete or near-complete genomes. It estimates a submission’s phylogenetic neighborhood, and builds its initial training set for gene calls, by first looking for members of the “Universal” protein families or, failing that, members of other large, highly conserved protein families. Experience suggests that RAST needs at least 40 kbp of sequence data to find enough highly conserved genes to reliably place a submission in its phylogenetic neighborhood and develop an initial training set. Below 40 kbp, RAST’s performance degrades rapidly; to be safe, we strongly recommend that submissions contain at least 100 kbp.
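One way to catch this before submitting is to total the bases in your FASTA file and compare the result with the two thresholds quoted above. The following is a minimal sketch, not part of RAST itself: the function names `total_bases` and `classify_submission` and the labels they return are hypothetical, and the check only counts sequence characters; it does not predict whether RAST will actually find enough conserved genes.

```python
# Thresholds taken from the answer above; everything else here is illustrative.
MIN_BP = 40_000          # below this, RAST's performance degrades rapidly
RECOMMENDED_BP = 100_000  # recommended minimum submission size


def total_bases(fasta_lines):
    """Count sequence characters in FASTA-formatted lines, skipping headers."""
    total = 0
    for line in fasta_lines:
        line = line.strip()
        # Skip blank lines and header lines (which start with '>').
        if not line or line.startswith(">"):
            continue
        total += len(line)
    return total


def classify_submission(n_bases):
    """Return a hypothetical label describing the submission size."""
    if n_bases < MIN_BP:
        return "too small"
    if n_bases < RECOMMENDED_BP:
        return "borderline"
    return "ok"
```

For a real file, something like `classify_submission(total_bases(open("contigs.fasta")))` would work, where `contigs.fasta` is a placeholder for your own file name.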