Revision as of 15:51, 9 October 2020 edit WikiCleanerBot (talk \| contribs) Bots 1,007,764 edits m v2.03b - Bot 22 LintError/bogus-image-options - WP:WCW project (Bogus image options) Tag: WPCleaner ← Previous edit		Revision as of 00:35, 20 October 2020 edit undo Monkbot (talk \| contribs) Bots 3,695,952 edits m →Terminology: Task 17: replace to-be-deprecated: \|name-list-format= (1× replaced; usage: 1 of 9); Tag: AWB Next edit →
Line 31: Typically, a number of equally valid SMILES strings can be written for a molecule. For example, <code>CCO</code>, <code>OCC</code> and <code>C(O)C</code> all specify the structure of [[ethanol]]. Algorithms have been developed to generate the same SMILES string for a given molecule; of the many possible strings, these algorithms select exactly one. This SMILES is unique for each structure, although dependent on the [[canonicalization]] algorithm used to generate it, and is termed the canonical SMILES. These algorithms first convert the SMILES to an internal representation of the molecular structure; an algorithm then examines that structure and produces a unique SMILES string. Various algorithms for generating canonical SMILES have been developed, including those by [[Daylight Chemical Information Systems]], [[OpenEye Scientific Software]], [[MEDIT]], [[Chemical Computing Group]], [[MolSoft LLC]], and the [[Chemistry Development Kit]]. A common application of canonical SMILES is indexing and ensuring uniqueness of molecules in a [[Chemical database\|database]]. The original paper that described the CANGEN<ref name="Weininger-1989" /> algorithm claimed to generate unique SMILES strings for graphs representing molecules, but the algorithm fails for a number of simple cases (e.g. [[cuneane]], 1,2-dicyclopropylethane) and cannot be considered a correct method for representing a graph canonically.<ref>{{cite book \|publisher=Springer \|location=Berlin \|isbn=978-3-540-27967-9 \|volume=3615 \|pages=145–157 \| editor-first = Bertram \| editor-last=Ludäscher \| last1 = Hutchison \| first1 = David \| first2 = Takeo \| last2 = Kanade \| first3 = Josef \| last3 = Kittler \| first4 = Jon M. \| last4 = Kleinberg \| author-link4 = Jon Kleinberg \| first5 = Friedemann \| last5 = Mattern \| first6 = John C. \| last6 = Mitchell \| first7 = Moni \| last7 = Naor \| author-link7 = Moni Naor \| first8 = Oscar \| last8 = Nierstrasz \| first9 = C. Pandu \| last9 = Rangan \| first10 = Bernhard \| last10 = Steffen \| author-link10 = Bernhard Steffen (computer scientist) \| first11 = Madhu \| last11 = Sudan \| author-link11 = Madhu Sudan \| first12 = Demetri \| last12 = Terzopoulos \| first13 = Doug \| last13 = Tygar \| first14 = Moshe Y. \| last14 = Vardi \| author-link14 = Moshe Y. Vardi \| first15 = Gerhard \| last15 = Weikum \| first16 = Louiqa \| last16 = Raschid \|author16-link=Louiqa Raschid \| first17 = Greeshma \| last17 = Neglur \| first18 = Robert L. \| last18 = Grossman \| first19 = Bing \| last19 = Liu \| name-list-~~format~~style = vanc \| series = Lecture Notes in Computer Science \|title=Data Integration in the Life Sciences \|chapter=Assigning Unique Keys to Chemical Compounds for Data Integration: Some Interesting Counter Examples \|accessdate=2013-02-12 \|year=2005 \|chapterurl=https://doi.org/10.1007%2F11530084_13 \|doi=10.1007/11530084_13 }}</ref> There is currently no systematic comparison across commercial software packages to test whether they share such flaws. SMILES notation allows the specification of [[molecular configuration\|configuration at tetrahedral centers]] and of double bond geometry. These are structural features that cannot be specified by connectivity alone, and therefore SMILES that encode this information are termed isomeric SMILES. A notable feature of these rules is that they allow rigorous partial specification of chirality. The term isomeric SMILES is also applied to SMILES in which [[isomer]]s are specified.
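
The two-step idea described above (parse the string into an internal structure, then write out one fixed string for that structure) can be illustrated with a minimal Python sketch. It is a toy, not CANGEN or any vendor's algorithm: it assumes a tiny SMILES subset (single uppercase-letter atoms, implicit single bonds, parenthesised branches, no rings, charges, or stereochemistry), and the function names <code>parse_smiles</code>, <code>write_from</code> and <code>canonical_smiles</code> are hypothetical. Because it handles only acyclic molecules, it sidesteps the ring cases on which CANGEN is reported to fail.

```python
def parse_smiles(smiles):
    """Parse a toy SMILES subset into (atoms, bonds).

    atoms: list of element symbols, one per atom index.
    bonds: dict mapping each atom index to the set of its neighbours.
    """
    atoms = []
    bonds = {}
    branch_stack = []   # atoms at which currently open branches started
    prev = None         # atom that the next atom will be bonded to
    for ch in smiles:
        if ch == "(":
            branch_stack.append(prev)
        elif ch == ")":
            prev = branch_stack.pop()
        elif ch.isalpha() and ch.isupper():
            idx = len(atoms)
            atoms.append(ch)
            bonds[idx] = set()
            if prev is not None:
                bonds[idx].add(prev)
                bonds[prev].add(idx)
            prev = idx
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return atoms, bonds


def write_from(atoms, bonds, atom, parent):
    """Write the subtree rooted at `atom`, with child strings in sorted order."""
    parts = sorted(
        write_from(atoms, bonds, nbr, atom)
        for nbr in bonds[atom]
        if nbr != parent
    )
    if not parts:
        return atoms[atom]
    # All but the last child become parenthesised branches.
    branches = "".join("(" + p + ")" for p in parts[:-1])
    return atoms[atom] + branches + parts[-1]


def canonical_smiles(smiles):
    """Return one fixed string per acyclic structure (toy ordering rule)."""
    atoms, bonds = parse_smiles(smiles)
    # Try every atom as the starting point and keep the smallest string.
    return min(write_from(atoms, bonds, i, None) for i in range(len(atoms)))


if __name__ == "__main__":
    for s in ("CCO", "OCC", "C(O)C"):
        print(s, "->", canonical_smiles(s))
```

All three ethanol inputs yield the same output string, while a different structure such as <code>COC</code> yields a different one. Which string is chosen depends entirely on the ordering rule (here, plain lexicographic comparison), which is why canonical SMILES from different toolkits are generally not interchangeable.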

Simplified Molecular Input Line Entry System: Difference between revisions