How is an InChI created from the input information?
An InChI identifier is created from an input connection table (in MOL, SDF or CML format) in three steps:
• Normalization – conventions are removed while maintaining a complete description of the compound. The steps involved are:
    • Ignore electron density and use simple atom connectivity only.
    • Disconnect salts and metal atoms in organometallic compounds.
    • Normalize mobile hydrogens, variable protonation and charge.
• Canonicalization – a set of atom labels is algorithmically generated that does not depend on how the structure was initially drawn. The algorithm used for this step is based on the Morgan algorithm [1].
• Serialization – the set of labels derived during canonicalization is converted into a string of characters, the InChI.
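To illustrate why canonical labels do not depend on how a structure was drawn, here is a minimal Python sketch of Morgan-style extended connectivity. It is a simplified illustration only, not the InChI canonicalization itself, which is merely based on this idea. It ignores element types and bond orders and does not break ties between atoms that end up with the same value. The names `morgan_invariants`, `relabel`, `skeleton` and `perm` are hypothetical and chosen for this example.

```python
def morgan_invariants(adjacency):
    """Return a Morgan-style extended-connectivity value for each atom.

    `adjacency` maps an atom index to the list of its neighbours.
    Simplified sketch: element types and bond orders are ignored.
    """
    # Start from the number of neighbours of each atom.
    values = {atom: len(nbrs) for atom, nbrs in adjacency.items()}
    n_classes = len(set(values.values()))
    while True:
        # Each atom's new value is the sum of its neighbours' values.
        new_values = {
            atom: sum(values[n] for n in nbrs)
            for atom, nbrs in adjacency.items()
        }
        new_classes = len(set(new_values.values()))
        # Stop once another round no longer distinguishes more atoms.
        if new_classes <= n_classes:
            return values
        values, n_classes = new_values, new_classes


def relabel(adjacency, perm):
    """Renumber the atoms of a graph according to the mapping `perm`."""
    return {
        perm[atom]: [perm[n] for n in nbrs]
        for atom, nbrs in adjacency.items()
    }


# Carbon skeleton of 2-methylbutane, drawn with one arbitrary numbering.
skeleton = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}

# The same skeleton with the atoms numbered in a different order.
perm = {0: 3, 1: 0, 2: 4, 3: 1, 4: 2}
redrawn = relabel(skeleton, perm)

values_a = morgan_invariants(skeleton)
values_b = morgan_invariants(redrawn)

# Corresponding atoms receive the same value under either numbering.
same = all(values_a[atom] == values_b[perm[atom]] for atom in skeleton)
print(same)
```

Because the values are computed from connectivity alone, both numberings yield the same value for each corresponding atom. A full canonicalization would then use such values, together with further tie-breaking rules, to assign a unique ordering of the atoms.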