How do I determine the top/bottom strand in the Illumina SNP array data?
First, be sure to read the Illumina guide to their method for determining strand. There are two ways to get the top/bottom designation:
1. Compute the top/bottom designation yourself using the data in the /organisms/human_9606/GWAS_arrays/ directory on the dbSNP FTP site.
2. Look up dbSNP’s top/bottom assignment by downloading the SubSNP.bcp file located in the /database/organism_data/ directory for human. The field that holds the top/bottom data is SubSNP.top_or_bot_strand. The table DDL for SubSNP is in the /database/organism_schema directory. The downside of this approach is that you must download the entire SubSNP table, which includes more than 50 million submitted SNPs.
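If you take the first route, the rule you need to apply can be sketched roughly as follows. This is a minimal illustration of the TOP/BOT convention as we understand it from the Illumina note, not code supplied by Illumina or dbSNP, so check it against the guide before relying on it. The function name illumina_strand and its arguments are hypothetical, and the sketch assumes you already have the two alleles and the flanking sequence, all reported on the same strand.

```python
AT = set("AT")
CG = set("CG")


def illumina_strand(five_prime, alleles, three_prime):
    """Return "TOP" or "BOT" for a biallelic SNP.

    five_prime / three_prime: flanking sequence on the same strand as the
    alleles, written 5'->3'. alleles: a pair such as ("A", "G").
    Hypothetical helper, written for illustration only.
    """
    pair = {a.upper() for a in alleles}
    if len(pair) != 2 or not pair <= (AT | CG):
        raise ValueError("expected two distinct alleles from A, C, G, T")

    # Unambiguous SNP (one allele is A or T, the other is C or G):
    # an A allele means TOP, a T allele means BOT.
    if pair & AT and pair & CG:
        return "TOP" if "A" in pair else "BOT"

    # Ambiguous SNP (A/T or C/G): walk outward from the SNP, comparing the
    # base n positions upstream with the base n positions downstream, until
    # one of them is A/T and the other is C/G.
    upstream = five_prime.upper()[::-1]  # nearest base first
    downstream = three_prime.upper()
    for left, right in zip(upstream, downstream):
        if left in AT and right in CG:
            return "TOP"  # A or T sits on the 5' side
        if left in CG and right in AT:
            return "BOT"  # A or T sits on the 3' side
    raise ValueError("no unambiguous flanking pair found; flanks too short")
```

For example, illumina_strand("ACGT", ("A", "G"), "TTTT") returns "TOP" without looking at the flanks, whereas an A/T SNP is only resolved once the walk finds a flanking pair that mixes an A or T with a C or G.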
Related Questions
- How do I determine orientation of a SNP allele (rs9934438) in dbSNP, and then find the corresponding strand and position information in the UCSC genome browser?
- Based on the Illumina note, am I correct in thinking that the same strand could be designated both "top" and "bottom" depending on which SNP was being examined?
- Is Illumina data compatible with Bioconductor?