The Simplified Molecular Input Line Entry System, or SMILES for short, is a specification for unambiguously describing the structure of molecules using short ASCII character strings. With a little practice, these strings can be written, read and understood directly, and many molecular software packages can read or generate them.
Atoms are represented by the standard symbols of the chemical elements, written in square brackets, such as [Au] for gold. Charges and attached hydrogens are given inside the brackets, so the hydroxide anion is [OH-]. The brackets may be omitted for the elements common in organic chemistry (B, C, N, O, P, S, F, Cl, Br and I); in that case the proper number of implicit hydrogen atoms is assumed. For instance, the SMILES for water is simply O and that for ethanol is CCO. Double bonds are written with =, so carbon dioxide is O=C=O. Rings are closed with matching digits: cyclohexane is C1CCCCC1, where the two carbons labelled 1 are bonded to each other, closing a ring of six carbons. Branches are described using parentheses, as in CCC(=O)O for propanoic acid and FC(F)F or alternatively C(F)(F)F for fluoroform.
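The notation described above can be made concrete with a small Python sketch that reads a SMILES string and counts its atoms, filling in the implicit hydrogens. This is only an illustration under stated assumptions, not a full SMILES parser: the function name atom_counts, the valence table and the regular expressions are hypothetical choices made for this example, and only unbracketed C, N, O, F, Cl, Br and I, simple bracket atoms, the bonds -, = and #, parentheses and single-digit ring closures are handled.

```python
import re

# Default valences for a few unbracketed atoms (an illustrative subset only).
VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1, "Cl": 1, "Br": 1, "I": 1}
BOND_ORDER = {"-": 1, "=": 2, "#": 3}
# Tokens: bracket atom, two-letter halogens, one-letter atoms, bonds/branches, ring digit.
TOKEN = re.compile(r"\[[^\]]+\]|Cl|Br|[CNOFI]|[-=#()]|\d")
# Bracket contents: element symbol, optional hydrogen count, optional charge.
BRACKET = re.compile(r"([A-Z][a-z]?)(?:H(\d*))?([+-]\d*)?")


def atom_counts(smiles):
    """Return {element: count} for a simple SMILES string, hydrogens included."""
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unsupported SMILES: {smiles!r}")
    counts = {}
    bonds = []     # sum of bond orders for each atom
    implicit = []  # element symbol if implicit hydrogens apply, else None
    stack = []     # atoms to return to when a branch closes
    rings = {}     # open ring-closure digit -> (atom index, bond order)
    prev, order = None, 1
    for tok in tokens:
        if tok in BOND_ORDER:
            order = BOND_ORDER[tok]
        elif tok == "(":
            stack.append(prev)
        elif tok == ")":
            prev = stack.pop()
        elif tok.isdigit():
            if tok in rings:
                # Second occurrence of the digit: bond the two labelled atoms.
                other, other_order = rings.pop(tok)
                ring_order = max(order, other_order)
                bonds[prev] += ring_order
                bonds[other] += ring_order
            else:
                rings[tok] = (prev, order)
            order = 1
        else:
            if tok.startswith("["):
                match = BRACKET.fullmatch(tok[1:-1])
                if match is None:
                    raise ValueError(f"unsupported bracket atom: {tok}")
                element, hydrogens = match.group(1), match.group(2)
                if hydrogens is not None:
                    # A bare "H" inside the brackets means one hydrogen.
                    extra = int(hydrogens) if hydrogens else 1
                    counts["H"] = counts.get("H", 0) + extra
                implicit.append(None)  # bracket atoms get no implicit hydrogens
            else:
                element = tok
                implicit.append(tok)
            counts[element] = counts.get(element, 0) + 1
            bonds.append(0)
            index = len(bonds) - 1
            if prev is not None:
                bonds[prev] += order
                bonds[index] += order
            prev, order = index, 1
    # Fill each unbracketed atom up to its default valence with hydrogens.
    for element, used in zip(implicit, bonds):
        if element is not None:
            missing = VALENCE[element] - used
            if missing > 0:
                counts["H"] = counts.get("H", 0) + missing
    return counts


print(atom_counts("CCO"))       # ethanol
print(atom_counts("C1CCCCC1"))  # cyclohexane
print(atom_counts("CCC(=O)O"))  # propanoic acid
```

Under these assumptions, CCO yields two carbons, six hydrogens and one oxygen, and the two spellings of fluoroform, FC(F)F and C(F)(F)F, produce the same counts. Aromatic atoms, isotopes, stereochemistry and disconnected fragments are deliberately left out of the sketch.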
External links:
- SMILES tutorial, http://www.daylight.com/dayhtml/smiles/smiles-intro.html
- Web-based application that renders SMILES strings as 2D figures, http://www.daylight.com/daycgi/depict
- Molecule editor applet which can create SMILES, http://www.molinspiration.com/jme/index.html