
Because dbSNP contains Dr. Jim Mullikin’s double-hit SNPs, can you tell me what method he uses to determine these double-hits?

April 26, 2017
Tags: dbsnp determine double-hit Dr jim mullikin method SNPs uses


1 Answer


In an email, Dr. Mullikin described his double-hit method (reprinted with permission):

First, I align the following sequences to the human reference sequence: all human traces from the trace archive; all clone sequences not used in the reference sequence; cDNA sequence; the Celera WGSA assembly; and Celera reads from non-donor B individuals. For any rsIDs, I look at the alignment, and count how many times I see each allele. If I see each allele two or more times in different DNAs, I classify it as a double-hit SNP.

I also use chimp to promote an allele from a count of one to two. For example, let’s say for an A/G SNP, A is seen in human DNA seven times, and G once. Then, if chimp is a G, it becomes a double-hit SNP. Also, if the chimp sequence is polymorphic, or does not agree with either human allele, the chimp allele(s) is not used.

For human DNA, if the sequence comes from a single individual, I do not allow that individual to contribute to the allele counts more than once per allel