What is the NCBI ToolBox?
Internally, NCBI stores data in whichever way is most appropriate to the flow of the data and its semantics. These may include normalized relational databases (e.g., for ESTs), ASN.1 (e.g., for other types of sequences), or XML (e.g., for journal articles). Regardless of how the data is natively stored, NCBI also distributes it in a number of formats, such as GenBank, FASTA, ASN.1, and XML. For a particular nucleotide sequence of the human beta globin locus, the options are:

• GenBank format
• FASTA format
• ASN.1 format

The ToolBox itself has two parts:

• Data Encoding – A formal specification and encoding rules. The telecommunications standard ASN.1 is used for this, and the specification has also been mapped to XML.
• Programming Libraries – Originally written in a portable dialect of C, and since also written in C++.

The ToolBox model and code are used extensively within NCBI for internal pipelines and tools such as GenBank, Entrez, BLAST, Sequin, OMIM, RefSeq, and others. We make the same tools available to the public doma