What do the sequence IDs in a SAM-T02 a2m file mean?
Most of the sequence IDs in a SAM-T02 a2m file come from the IDs in the NR database. SAM may modify a sequence ID to indicate the first and last sequence positions that matched the SAM-T02 HMM. For example, in the following sequence ID taken from a SAM-T02 alignment,

>gi|16080670|ref|NP_391498.1|_1:234 (NC_000964) similar to hypothetical proteins [Bacillus subtilis] gi|7450240|pir||G70067 conserved hypothetical protein ywqL – Bacillus subtilis gi|1894750|emb|CAB07450.1| (Z92952) product similar to E.coli YjaF protein [Bacillus subtilis] gi|2636142|emb|CAB15634.1| (Z99122) similar to hypothetical proteins [Bacillus subtilis]

the original sequence name gi|16080670|ref|NP_391498.1| has had _1:234 appended to indicate that the SAM-T02 HMM for the alignment matched the sequence starting at sequence position 1 and ending at sequence position 234.

• I found homologs with BLAST (or PSI-BLAST or FASTA) that are not reported by a SAM-T02 database search. Are they BLAST (or PSI-BLAST, FA