m →Examples: Citework
Because the article used several different cite formats, migrated the five "further reading" sources to more current inline references. Because date formats were mixed with no clear MOS:RETAIN precedent, chose {{Use mdy dates}} per MOS:STRONGTIES.
Line 1:
{{Redirect|SMILES|other uses|Smiles (disambiguation)}}
{{Use mdy dates|date=July 2020}}
{{Infobox file format
| name = SMILES
Line 19 ⟶ 20:
==History==
The original SMILES specification was initiated by David Weininger at the USEPA Mid-Continent Ecology Division Laboratory in [[Duluth, Minnesota|Duluth]] in the 1980s.<ref name="
It has since been modified and extended by others, most notably by [[Daylight Chemical Information Systems]]. In 2007, an [[open standard]] called "OpenSMILES" was developed by the [[Blue Obelisk]] open-source chemistry community. Other 'linear' notations include the [[Wiswesser Line Notation]] (WLN), [[ROSDAL]] and [[SYBYL Line Notation|SLN]] (Tripos Inc).
Line 26 ⟶ 27:
== Terminology ==
The term SMILES refers to a line notation for encoding molecular structures, and specific instances should strictly be called SMILES strings. However, the term SMILES is also commonly used to refer to both a single SMILES string and a number of SMILES strings; the exact meaning is usually apparent from the context. The terms "canonical" and "isomeric" can cause confusion when applied to SMILES: they describe different attributes of SMILES strings and are not mutually exclusive.
Typically, a number of equally valid SMILES strings can be written for a molecule. For example, <code>CCO</code>, <code>OCC</code> and <code>C(O)C</code> all specify the structure of [[ethanol]]. Algorithms have been developed to generate the same SMILES string for a given molecule; of the many possible strings, these algorithms choose exactly one. This string is unique for each structure, although it depends on the [[canonicalization]] algorithm used to generate it, and is termed the canonical SMILES. These algorithms first convert the SMILES to an internal representation of the molecular structure; an algorithm then examines that structure and produces a unique SMILES string. Various algorithms for generating canonical SMILES have been developed, including those by [[Daylight Chemical Information Systems]], [[OpenEye Scientific Software]], [[MEDIT]], [[Chemical Computing Group]], [[MolSoft LLC]], and the [[Chemistry Development Kit]]. A common application of canonical SMILES is indexing and ensuring uniqueness of molecules in a [[Chemical database|database]].
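The idea that several strings describe one structure, and that a canonicalization step picks a single representative, can be illustrated with a minimal sketch. It is not CANGEN or any vendor's algorithm: it assumes a tiny subset of SMILES (the one-letter atoms C, N and O, single bonds, branches, no rings), and the function names <code>parse_smiles</code>, <code>encode</code> and <code>canonical_key</code> are hypothetical. The key is obtained by encoding the molecular tree from every possible root with sorted branches and keeping the lexicographically smallest result.

```python
def parse_smiles(smiles):
    """Parse a tiny SMILES subset into (atoms, bonds).

    Assumption: only the atoms C, N, O, single bonds and branch
    parentheses are supported; rings and charges are not.
    """
    atoms = []    # element symbol for each atom index
    bonds = {}    # atom index -> set of neighbouring atom indices
    prev = None   # atom that the next atom will be bonded to
    stack = []    # branch points remembered at each '('
    for ch in smiles:
        if ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        elif ch in "CNO":
            idx = len(atoms)
            atoms.append(ch)
            bonds[idx] = set()
            if prev is not None:
                bonds[idx].add(prev)
                bonds[prev].add(idx)
            prev = idx
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return atoms, bonds


def encode(atoms, bonds, node, parent):
    """Encode the subtree below `node`; sorting removes input-order effects."""
    branches = sorted(
        encode(atoms, bonds, nb, node) for nb in bonds[node] if nb != parent
    )
    return atoms[node] + "".join(f"({b})" for b in branches)


def canonical_key(smiles):
    """Return one key per acyclic structure, whatever the input ordering."""
    atoms, bonds = parse_smiles(smiles)
    # Try every atom as the root and keep the smallest encoding.
    return min(encode(atoms, bonds, root, None) for root in range(len(atoms)))
```

Under these assumptions, <code>canonical_key</code> returns the same key for <code>CCO</code>, <code>OCC</code> and <code>C(O)C</code>, and a different key for dimethyl ether (<code>COC</code>). Production canonicalizers must also handle rings, bond orders, charges and stereochemistry, which this sketch deliberately leaves out.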
The original paper that described the CANGEN<ref name="
SMILES notation allows the specification of [[molecular configuration|configuration at tetrahedral centers]] and of double bond geometry. These are structural features that cannot be specified by connectivity alone, and therefore SMILES strings that encode this information are termed isomeric SMILES. A notable feature of these rules is that they allow rigorous partial specification of chirality. The term isomeric SMILES is also applied to SMILES in which [[isomer]]s are specified.
== Graph-based definition ==
In terms of a graph-based computational procedure, SMILES is a string obtained by printing the symbol nodes encountered in a [[depth-first search|depth-first]] [[tree traversal]] of a [[chemical graph]]. The chemical graph is first trimmed to remove hydrogen atoms and cycles are broken to turn it into a [[spanning tree (mathematics)|spanning tree]]. Where cycles have been broken, numeric suffix labels are included to indicate the connected nodes. Parentheses are used to indicate points of branching on the tree.
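The procedure described above can be sketched in a few lines. This is an illustrative sketch rather than a reference implementation: the function name <code>write_smiles</code> and its input layout (a dict of element symbols plus a list of edges) are assumptions, all bonds are treated as single, only the connected component containing <code>start</code> is written, <code>start</code> must be a non-hydrogen atom, and ring-closure labels are limited to single digits.

```python
def write_smiles(symbols, edges, start):
    """Write a SMILES-like string by depth-first traversal of a chemical graph.

    symbols: dict mapping node id -> element symbol (hypothetical layout)
    edges:   iterable of (a, b) node-id pairs, each bond listed once
    start:   node id of a non-hydrogen atom at which to begin
    """
    # Trim hydrogen atoms from the graph.
    heavy = {n for n, s in symbols.items() if s != "H"}
    adj = {n: [] for n in heavy}
    for a, b in edges:
        if a in heavy and b in heavy:
            adj[a].append(b)
            adj[b].append(a)

    children = {n: [] for n in heavy}   # spanning-tree edges
    closures = {n: [] for n in heavy}   # ring-closure labels per atom
    visited = set()
    closed = set()                      # cycle-closing edges already labelled

    def explore(node, parent):
        """Pass 1: build the spanning tree and label the broken cycle edges."""
        visited.add(node)
        for nb in adj[node]:
            if nb == parent:
                continue
            if nb not in visited:
                children[node].append(nb)
                explore(nb, node)
            elif frozenset((node, nb)) not in closed:
                closed.add(frozenset((node, nb)))
                label = str(len(closed))  # single digits only in this sketch
                closures[node].append(label)
                closures[nb].append(label)

    def emit(node):
        """Pass 2: print symbols; every branch except the last is parenthesized."""
        text = symbols[node] + "".join(closures[node])
        kids = children[node]
        for kid in kids[:-1]:
            text += "(" + emit(kid) + ")"
        if kids:
            text += emit(kids[-1])
        return text

    explore(start, None)
    return emit(start)
```

With these assumptions, a ring of six carbon atoms started at any atom gives <code>C1CCCCC1</code>, and a three-carbon chain with an oxygen on the middle carbon, started at a terminal carbon, gives <code>CC(O)C</code>. The two passes are kept separate because a ring-closure label on an earlier atom is only known after the traversal has reached the atom that closes the ring.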
Line 231 ⟶ 230:
== Conversion ==
SMILES can be converted back to two-dimensional representations using structure diagram generation (SDG) algorithms.<ref
SMILES can be converted back to two-dimensional representations using structure diagram generation (SDG) algorithms (Helson, 1999). This conversion is not always unambiguous. Conversion to three-dimensional representation is achieved by energy-minimization approaches. There are many downloadable and web-based conversion utilities.
== See also ==
Line 244 ⟶ 242:
== References ==
{{Reflist|33em}}
== External links ==