I found homologs with BLAST (or PSI-BLAST or FASTA) that are not reported by a SAM-T05 database search. Are BLAST, PSI-BLAST, and FASTA more sensitive than SAM-T05?

April 26, 2017
Tags: blast, database, fasta, homologs, psi-blast, reported, sam-t05, search, sensitive

1 Answer

You mentioned that FASTA, BLAST, and PSI-BLAST found a high-scoring similar sequence that SAM-T05 did not find. This happens fairly often; the most common causes are composition bias and long helices (particularly coiled-coils). FASTA, BLAST, and PSI-BLAST can all be fooled into reporting very strong scores for sequences whose only similarity is that both have long amphipathic helices. SAM-T05's reverse-sequence null model cancels this signal (as well as composition-bias and length signals), resulting in a method with far fewer false positives. A few true positives are lost, but not many. As an example, the leucine zipper 1ce0A gets only 25 sequences in the 1ce0A.t02.a2m alignment. The 19 PDB sequences in that alignment are all homologs (or at least have similar structure and somewhat similar sequence). Other methods are likely to report almost any coiled-coil as a strong hit. This is an example of the reverse-sequence null model removing a lot of trash (and possibly some g
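
To make the cancellation concrete, here is a minimal Python sketch of the reverse-sequence null idea: score the sequence against the model, score the reversed sequence against the same model, and report the difference. Because the reversed sequence has exactly the same composition and length, any score that comes only from those properties cancels. Everything in the sketch is an assumption made for illustration: the toy consensus `CONSENSUS`, the match/mismatch values, and the function names are hypothetical, and SAM-T05 itself works with hidden Markov models and log-probabilities rather than this ungapped scoring.

```python
# Toy illustration of a reverse-sequence null model.
# Assumption: an ungapped match/mismatch "profile" stands in for the HMM.
CONSENSUS = "LLELLKL"   # hypothetical leucine-rich toy profile, not a real SAM model
MATCH, MISMATCH = 2, -1  # hypothetical scoring values


def window_score(window):
    """Score one profile-length window against the toy consensus."""
    return sum(MATCH if a == b else MISMATCH for a, b in zip(window, CONSENSUS))


def raw_score(seq):
    """Best ungapped window score of seq against the toy profile."""
    n = len(CONSENSUS)
    if len(seq) < n:
        raise ValueError("sequence is shorter than the profile")
    return max(window_score(seq[i:i + n]) for i in range(len(seq) - n + 1))


def reverse_null_score(seq):
    """Raw score minus the score of the reversed sequence (the null)."""
    return raw_score(seq) - raw_score(seq[::-1])


if __name__ == "__main__":
    for name, seq in [("low-complexity", "LLLLLLL"), ("ordered match", "LLELLKL")]:
        print(name, raw_score(seq), reverse_null_score(seq))
```

In this toy setting the all-leucine sequence gets a high raw score purely from composition, yet its reverse-null score is zero because the reversed sequence scores identically. The ordered match keeps most of its score, since reversing it destroys the residue order that the profile rewards.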