Why does the GeneSpeed Database list domains with very low similarity?
A4: It is true that the GeneSpeed Database contains many low-scoring domain hits, and some of these hits have such a high (insignificant) e-score that they are extremely unlikely to represent a homolog of the given protein domain. This, however, is one of the true strengths of the database: in many cases we have observed ‘true’ hits for domains with relatively insignificant e-scores. If we were to impose a default e-score cutoff, these ‘true’ hits would not reside in the database at all. We have therefore kept these low-scoring hits in the database and given the user the ability to set the e-score cutoff to a stringent or a lenient value. In this way, the user may set a stringent cutoff (low e-score) and thus eliminate most false positives. Alternatively, the user may set a lenient cutoff (high e-score), which admits false positives but may also reveal homologies that make biological sense. Indeed, we have observed in many cases, ‘true’ members of a fa