Simplified Molecular Input Line Entry System

Revision as edited by AxelBoldt at 02:00, 23 September 2002 (one-minute tutorial).

The Simplified Molecular Input Line Entry Specification, or SMILES for short, is a specification for unambiguously describing the structure of chemicals using ASCII character strings. With a little bit of practice, these strings can be written, read and understood directly; a number of molecular software packages are able to read or generate SMILES strings.

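Atoms are represented by the standard abbreviations of the chemical elements, written in square brackets, such as [Au] for gold. Hydroxide anion is [OH-]. For the common elements of organic chemistry (such as C, N, O and the halogens) the brackets may be omitted, in which case the proper number of implicit hydrogen atoms is assumed; for instance, the SMILES for water is simply O and that for ethanol is CCO. Adjacent atoms are assumed to be joined by single bonds; double bonds are written with =, so carbon dioxide is O=C=O. Rings are described by breaking one ring bond and labelling the two atoms it joined with the same digit: cyclohexane is C1CCCCC1, where the two carbons marked 1 are bonded to each other, closing a ring of six carbons. Branches are described using parentheses, as in CCC(=O)O for propionic acid and FC(F)F or alternatively C(F)(F)F for fluoroform.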
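The rules in this paragraph can be tried out with a short Python sketch. It is not a full SMILES parser: it assumes only the unbracketed atoms C, N, O and F, simple bracket atoms such as [Au] or [OH-], the bond symbols -, = and #, parentheses for branches, and single-digit ring labels. The names parse_smiles and formula are hypothetical helpers invented for this illustration, and charges are read but ignored when the molecular formula is built.

```python
import re
from collections import Counter

# Usual valences of the unbracketed atoms this sketch supports (an assumption
# of the sketch; real SMILES allows more elements and more valence states).
VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}
BOND_ORDER = {"-": 1, "=": 2, "#": 3}
TOKEN = re.compile(r"\[[^\]]+\]|[CNOF]|[-=#()]|\d")
# Bracket atom: element symbol, optional explicit hydrogens, optional charge.
BRACKET = re.compile(r"\[([A-Z][a-z]?)(H\d*)?([+-]\d*)?\]")

def parse_smiles(smiles):
    """Return (atoms, bonds); bonds are (i, j, order) index tuples."""
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unsupported SMILES: {smiles!r}")
    atoms, bonds = [], []
    prev = None       # atom that the next atom will be bonded to
    stack = []        # atoms to return to when a branch closes
    open_rings = {}   # ring digit -> (atom index, bond order)
    order = 1         # order of the pending bond (single by default)
    for tok in tokens:
        if tok in BOND_ORDER:
            order = BOND_ORDER[tok]
        elif tok == "(":
            stack.append(prev)
        elif tok == ")":
            prev = stack.pop()
        elif tok.isdigit():
            if tok in open_rings:   # second use of the digit closes the ring
                start, ring_order = open_rings.pop(tok)
                bonds.append((start, prev, max(order, ring_order)))
            else:                   # first use of the digit opens the ring
                open_rings[tok] = (prev, order)
            order = 1
        else:                       # an atom, bracketed or not
            atoms.append(tok)
            if prev is not None:
                bonds.append((prev, len(atoms) - 1, order))
            prev = len(atoms) - 1
            order = 1
    if open_rings or stack:
        raise ValueError("unclosed ring or branch")
    return atoms, bonds

def formula(smiles):
    """Molecular formula in Hill order, with implicit hydrogens filled in."""
    atoms, bonds = parse_smiles(smiles)
    used = [0] * len(atoms)         # valence consumed by written bonds
    for i, j, order in bonds:
        used[i] += order
        used[j] += order
    counts = Counter()
    for symbol, n_bonds in zip(atoms, used):
        if symbol.startswith("["):
            m = BRACKET.fullmatch(symbol)
            if m is None:
                raise ValueError(f"unsupported bracket atom: {symbol}")
            element, h = m.group(1), m.group(2)
            # Bracket atoms carry only the hydrogens that are written out.
            hydrogens = 0 if h is None else int(h[1:] or 1)
        else:
            element = symbol
            hydrogens = max(VALENCE[symbol] - n_bonds, 0)
        counts[element] += 1
        counts["H"] += hydrogens
    elements = sorted(e for e, n in counts.items() if n > 0)
    if "C" in elements:             # Hill order: C, then H, then alphabetical
        elements.sort(key=lambda e: (e != "C", e != "H", e))
    return "".join(e + (str(counts[e]) if counts[e] > 1 else "") for e in elements)

for s in ["O", "CCO", "O=C=O", "C1CCCCC1", "CCC(=O)O", "FC(F)F", "C(F)(F)F"]:
    print(s, formula(s))
```

Both spellings of fluoroform yield the same formula, which illustrates that one molecule can have several valid SMILES strings. Counting the atoms in a parsed string is also a quick sanity check on hand-written ring notation.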

