How do you select which sequences are of sufficient quality and length for reporting to GenBank?

Vector sequences are automatically trimmed off the "raw sequence" after base calling is performed. Trimmed sequences shorter than 100 bases are rejected, as are sequences with a phred score below 15; in both cases the well is listed as a failure. Our typical phred scores are ~35-43. A phred score of 20 means 1 base-calling error is likely in every 100 bases; a score of 30 means 1 error is likely in every 1,000 bases; and a score of 40 means 1 error is likely in every 10,000 bases. We are performing single-pass, single-strand sequencing. To ensure that the "quality scores" are realistic, we hand-check some of the ESTs that match existing maize genes by performing a BLAST search against maize genes at GenBank. For ESTs from 100 to 600+ bases, identities are 97-100%. Some of the mismatches are likely to be true polymorphisms, and some are sequencing errors.
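
For readers who want to check the arithmetic, the phred figures quoted above follow from the standard definition Q = -10 * log10(P), where P is the probability that a base call is wrong. The short Python sketch below converts a score to an error rate and applies the two rejection rules from the answer. It is only an illustration: the function names are made up for this example, and it assumes the phred cutoff is applied to a single per-read score (for instance an average), which the answer does not spell out.

```python
# Illustrative sketch only; names and the per-read score assumption are
# hypothetical and not taken from the sequencing pipeline described above.

MIN_TRIMMED_LENGTH = 100  # bases, from the answer
MIN_PHRED_SCORE = 15      # from the answer


def phred_to_error_probability(q):
    """Standard phred definition: P(error) = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def expected_bases_per_error(q):
    """Average number of bases per miscalled base at phred score q."""
    return 1 / phred_to_error_probability(q)


def passes_filter(trimmed_length, read_phred):
    """Apply both rejection rules; a False result is a failed well.

    Assumption: read_phred is one summary score for the whole read.
    """
    return trimmed_length >= MIN_TRIMMED_LENGTH and read_phred >= MIN_PHRED_SCORE


if __name__ == "__main__":
    for q in (20, 30, 40):
        print(q, round(expected_bases_per_error(q)))
    print(passes_filter(450, 38))  # long read, typical score
    print(passes_filter(80, 38))   # too short after trimming
    print(passes_filter(450, 12))  # score below the cutoff
```

By the same formula, the rejection threshold of 15 corresponds to roughly one error in every 32 bases, well below the typical scores of ~35-43.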

