How do I determine the refSNP (rs) ID for the snp_id in the SNPContigLocusId.bcp.gz file?
The refSNP (rs) ID is the number given in the “snp_id” field of the SNPContigLocusId.bcp.gz file, so snp_id 268 corresponds to rs268. You can confirm this by comparing the gene associated with an rs number on a refSNP page against the gene ID in the file. For example, go to the refSNP cluster report for rs268. At the very top of the “Geneview” section of this report, you will see that the SNP is associated with the LPL gene (Lipoprotein Lipase). If you click on the “LPL” link at the top of the “Geneview” section, you will get an Entrez Gene report for LPL, which states that the gene ID for LPL is 4023. Now, if you open the uncompressed SNPContigLocusId.bcp file, you will see a row that contains snp_id 268, and that same row will have gene_id 4023. Note that you will see two rows for rs268: one representing the reference contig and one representing the Celera contig.
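The file check described above can also be scripted. The following is only a minimal sketch: it assumes the file is tab-delimited and that snp_id is the first field, neither of which is stated here, so verify both against the dbSNP schema before relying on it. The function names rows_for_snp and rows_for_snp_in_file are hypothetical helpers, not part of any dbSNP tool.

```python
import gzip
from typing import Iterable, Iterator, List


def rows_for_snp(lines: Iterable[str], snp_id: str,
                 snp_col: int = 0, sep: str = "\t") -> Iterator[List[str]]:
    """Yield the split fields of every row whose snp_id field equals snp_id.

    snp_col (position of the snp_id field) and sep (field delimiter) are
    assumptions; adjust them to match the actual table layout.
    """
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        # Compare the whole field so that "268" does not match "2680".
        if len(fields) > snp_col and fields[snp_col] == snp_id:
            yield fields


def rows_for_snp_in_file(path: str, snp_id: str,
                         snp_col: int = 0, sep: str = "\t") -> List[List[str]]:
    """Scan a gzip-compressed file and return all rows for snp_id."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return list(rows_for_snp(handle, snp_id, snp_col, sep))
```

For example, rows_for_snp_in_file("SNPContigLocusId.bcp.gz", "268") would be expected to return the rows for rs268, one per contig, and you can then look for gene ID 4023 among the fields of each row.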