The simplified molecular input line entry specification (SMILES) is a specification for unambiguously describing the structure of chemicals using ASCII character strings. With a little practice, people can read and write these strings directly, and many molecular software packages can read or generate them.
In graph terms, a SMILES string is obtained by first breaking the cycles of the chemical graph to leave a spanning tree, and then writing out the atom symbols in the order in which a depth-first traversal of that tree encounters them. Where a cycle was broken to create the spanning tree, a matching pair of numeric labels marks the two atoms that the removed bond connects.
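The traversal described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the procedure used by any particular toolkit: the function name write_smiles and its input format (a list of element symbols plus a list of index pairs) are invented for this example, all bonds are treated as single, hydrogens are left implicit, and ring labels are limited to the digits 1 to 9.

```python
def write_smiles(atoms, bonds, start=0):
    """Write a SMILES-like string for a graph that has only single bonds.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) pairs of atom indices
    start: index of the atom at which the traversal begins
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    visited = set()
    tree_children = {i: [] for i in range(len(atoms))}  # spanning-tree edges
    ring_labels = {i: [] for i in range(len(atoms))}    # digits per atom
    closures = set()                                    # bonds left out of the tree

    def explore(atom, parent):
        # Pass 1: depth-first search. A bond leading to an atom that was
        # already visited is not part of the tree; it becomes a ring closure.
        visited.add(atom)
        for nbr in neighbors[atom]:
            if nbr == parent:
                continue
            if nbr in visited:
                edge = frozenset((atom, nbr))
                if edge not in closures:
                    closures.add(edge)
                    label = str(len(closures))
                    ring_labels[atom].append(label)
                    ring_labels[nbr].append(label)
            else:
                tree_children[atom].append(nbr)
                explore(nbr, atom)

    def emit(atom):
        # Pass 2: write the tree. Every child except the last is a branch
        # in parentheses; the last child continues the main chain.
        text = atoms[atom] + "".join(ring_labels[atom])
        children = tree_children[atom]
        for child in children[:-1]:
            text += "(" + emit(child) + ")"
        if children:
            text += emit(children[-1])
        return text

    explore(start, None)
    return emit(start)


ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
print(write_smiles(["C"] * 6, ring))                                     # C1CCCCC1
print(write_smiles(["F", "C", "F", "F"], [(0, 1), (1, 2), (1, 3)]))      # FC(F)F
print(write_smiles(["F", "C", "F", "F"], [(0, 1), (1, 2), (1, 3)], 1))   # C(F)(F)F
```

Starting the traversal at a different atom gives a different but equally valid string for the same molecule, as the last two lines show.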
Atoms are represented by the standard abbreviation of the chemical elements, in square brackets, such as [Au] for gold. Hydroxide anion is [OH-]. The brackets may be omitted for the common elements of organic chemistry (B, C, N, O, P, S, F, Cl, Br and I), in which case the proper number of implicit hydrogen atoms is assumed; for instance the SMILES for water is simply O and that for ethanol is CCO. A double bond is written = and a triple bond #, so the double-bonded carbon dioxide is represented as O=C=O and the triple-bonded hydrogen cyanide as C#N. Cyclohexane is represented as C1CCCCC1, the idea being that the two 1 digits mark the two atoms joined by the ring-closing bond, thus forming a ring with six carbons. Branches are described with parentheses, as in CCC(=O)O for propionic acid and FC(F)F, or alternatively C(F)(F)F, for fluoroform.
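As a rough illustration of how these rules combine, the following Python sketch parses a small subset of the notation and works out the implicit hydrogen count of each atom. It is a minimal sketch under stated assumptions: the names parse_smiles and implicit_hydrogens are invented here, only unbracketed C, N, O, F, Cl, Br and I with their lowest usual valences are handled, and bracket atoms, charges and aromatic (lowercase) atoms are rejected or unsupported.

```python
# Assumed default valences for the unbracketed atoms this sketch accepts.
VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1, "Cl": 1, "Br": 1, "I": 1}
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def parse_smiles(smiles):
    """Return (atoms, bonds) for a small SMILES subset.

    atoms is a list of element symbols; bonds is a list of (i, j, order).
    """
    atoms, bonds = [], []
    branch_stack = []   # atoms to return to when ")" closes a branch
    open_rings = {}     # ring digit -> (atom index, bond order given there)
    previous = None     # atom that the next atom will be bonded to
    order = 1           # order of the next bond; single unless marked
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch in BOND_ORDER:
            order = BOND_ORDER[ch]
        elif ch == "(":
            branch_stack.append(previous)
        elif ch == ")":
            previous = branch_stack.pop()
        elif ch.isdigit():
            if ch in open_rings:
                # Second occurrence of the digit: close the ring.
                partner, partner_order = open_rings.pop(ch)
                bonds.append((partner, previous, max(order, partner_order)))
            else:
                open_rings[ch] = (previous, order)
            order = 1
        else:
            # Prefer a two-letter symbol such as Cl or Br when one matches.
            symbol = smiles[i:i + 2] if smiles[i:i + 2] in VALENCE else ch
            if symbol not in VALENCE:
                raise ValueError(f"unsupported token {ch!r} at position {i}")
            atoms.append(symbol)
            if previous is not None:
                bonds.append((previous, len(atoms) - 1, order))
            previous = len(atoms) - 1
            order = 1
            i += len(symbol) - 1
        i += 1
    return atoms, bonds


def implicit_hydrogens(smiles):
    """Return the implicit hydrogen count of each atom, in writing order."""
    atoms, bonds = parse_smiles(smiles)
    used = [0] * len(atoms)
    for a, b, bond_order in bonds:
        used[a] += bond_order
        used[b] += bond_order
    return [VALENCE[symbol] - u for symbol, u in zip(atoms, used)]


print(implicit_hydrogens("O"))          # [2]              water, H2O
print(implicit_hydrogens("CCO"))        # [3, 2, 1]        ethanol
print(implicit_hydrogens("C#N"))        # [1, 0]           hydrogen cyanide
print(implicit_hydrogens("CCC(=O)O"))   # [3, 2, 0, 0, 1]  propionic acid
```

Summing the counts recovers the familiar formulas, for example six hydrogens for ethanol and twelve for cyclohexane written as C1CCCCC1.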
The SMILES specification was developed by David Weininger in the late 1980s. It has since been modified and extended by others, most notably by Daylight Chemical Information Systems Inc.
SMARTS is a modification of SMILES that allows wildcard atoms and bonds, which makes it suitable for specifying substructure search queries.
The same molecule can usually be written as several different valid SMILES strings. Canonical or unique SMILES are SMILES representations which are made unique by the application of canonicalization rules, so that each structure has exactly one string. Common applications of unique SMILES are exact matching of two structures and ensuring uniqueness among the molecules in a database.
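The idea behind canonicalization can be illustrated with a toy example in Python. This is not the set of rules used by Daylight or by any other toolkit: it is a brute-force sketch that tries every renumbering of the atoms and keeps the smallest description, which is only practical for a handful of atoms. The function name canonical_key and the graph format (element symbols plus (i, j, order) bonds) are assumptions made for this example.

```python
from itertools import permutations


def canonical_key(atoms, bonds):
    """Return a key that does not depend on the order the atoms were written in.

    atoms: list of element symbols
    bonds: list of (i, j, order) tuples
    """
    n = len(atoms)
    best = None
    for perm in permutations(range(n)):
        # perm[old] is the new index given to the atom at position old.
        symbols = [None] * n
        for old, new in enumerate(perm):
            symbols[new] = atoms[old]
        edges = sorted(
            tuple(sorted((perm[i], perm[j]))) + (order,)
            for i, j, order in bonds
        )
        candidate = (tuple(symbols), tuple(edges))
        if best is None or candidate < best:
            best = candidate
    return best


# Fluoroform written as FC(F)F and as C(F)(F)F: same molecule, same key.
first = canonical_key(["F", "C", "F", "F"], [(0, 1, 1), (1, 2, 1), (1, 3, 1)])
second = canonical_key(["C", "F", "F", "F"], [(0, 1, 1), (0, 2, 1), (0, 3, 1)])
print(first == second)   # True

# Ethanol (CCO) and dimethyl ether (COC) share a formula but not a key.
ethanol = canonical_key(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
ether = canonical_key(["C", "O", "C"], [(0, 1, 1), (1, 2, 1)])
print(ethanol == ether)  # False
```

Practical canonicalization schemes avoid the factorial cost by ranking atoms from their properties and connectivity, but the goal is the same: two inputs that describe the same structure must yield the same result.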
External links:
- SMILES tutorial, http://www.daylight.com/dayhtml/smiles/smiles-intro.html
- Web-based applications capable of rendering SMILES strings into 2D figures, http://www.daylight.com/daycgi/depict
- Molecule editor applet that can create SMILES, http://www.molinspiration.com/jme/index.html
- SMILES parsing, http://www.dalkescientific.com/writings/diary/archive/