[[Image:Aqua-distillata.jpg|thumb|250px|Bottle for distilled water in the Real Farmacia in Madrid.]]
{{cleanup-laundry}}
'''Distilled water''' is [[water]] that has had virtually all of its [[Impurity|impurities]] removed through [[distillation]]. Distillation involves [[boiling]] the water and condensing the [[steam]] into a clean container, leaving most contaminants behind.
==Applications==
'''Optical character recognition''', usually abbreviated to '''OCR''', is a type of [[computer]] software designed to translate [[image]]s of handwritten or typewritten text (usually captured by a [[Image scanner|scanner]]) into machine-editable text, or to translate pictures of characters into a standard encoding scheme representing them (e.g. [[ASCII]] or [[Unicode]]). OCR began as a field of research in [[pattern recognition]], [[artificial intelligence]] and [[machine vision]]. Though academic research in the field continues, the focus on OCR has shifted to implementation of proven techniques.
{{Unreferencedsect|date=July 2007}}
In chemical and biological laboratories, as well as industry, cheaper alternatives such as [[deionized water]] are preferred over distilled water.{{Fact|date=February 2007}} However, if these alternatives are not sufficiently pure, distilled water is used. Where exceptionally high purity water is required, [[double distilled water]] is used.
Distilled water is also commonly used to top up the [[lead acid batteries]] used in cars and trucks. Other ions commonly found in tap water drastically reduce a battery's lifespan.
Optical character recognition (using optical techniques such as mirrors and lenses) and digital character recognition (using scanners and computer algorithms) were originally considered separate fields. Because very few applications survive that use true optical techniques, the optical character recognition term has now been broadened to cover digital character recognition as well.
Distilled water is preferable to tap water for use in automotive cooling systems. The minerals and ions typically found in tap water can be corrosive to internal engine components, and can cause a more rapid depletion of the anti-corrosion additives found in most [[antifreeze]] formulations.{{Fact|date=May 2007}}
Early systems required training (the provision of known samples of each character) to read a specific [[typeface|font]]. "Intelligent" systems with a high degree of recognition accuracy for most fonts are now common. Some systems are even capable of reproducing formatted output that closely approximates the original scanned page including images, columns and other non-textual components.
Using distilled water in [[steam iron]]s for pressing clothes can help reduce mineral build-up and make the iron last longer. However, many iron manufacturers say that distilled water is no longer necessary in their irons.{{Fact|date=February 2007}}
== History ==
Some people use distilled water for household [[aquariums]] because it lacks the chemicals found in [[tap water]] supplies. It is important to supplement distilled water when using it for [[fishkeeping]]; it is too pure to sustain proper chemistry to support an aquarium ecosystem.{{Fact|date=May 2007}}
In 1929, G. Tauschek obtained a patent on OCR in Germany. Handel followed with a US [[patent]] on OCR in 1933 (U.S. Patent 1,915,993), and in 1935 Tauschek was also granted a US patent on his method (U.S. Patent 2,026,329).
==Drinking distilled water==
Tauschek's machine was a mechanical device that used templates. A photodetector was placed so that when the template and the character to be recognised were lined up for an exact match and a light was directed towards them, no light would reach the photodetector.
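The template principle described above can be illustrated with a loose digital analogy. This is only a sketch under stated assumptions, not a reconstruction of Tauschek's device: the 3×3 glyphs, the function names and the idea of modelling the template as an opaque mask with a character-shaped cut-out are all hypothetical choices made for illustration.

```python
# Loose digital analogy of a template-and-photodetector matcher.
# All names and glyphs here are hypothetical, chosen for illustration.
# Grid cells: 1 = dark ink, 0 = blank paper.

GLYPHS = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
}


def make_template(glyph):
    """Build a mask that is opaque (1) everywhere except a cut-out shaped like the glyph."""
    return [[1 - cell for cell in row] for row in glyph]


def light_reaching_detector(character, template):
    """Count cells where light gets through: the template is open and there is no ink."""
    return sum(
        1
        for char_row, mask_row in zip(character, template)
        for ink, opaque in zip(char_row, mask_row)
        if not ink and not opaque
    )


def recognise(character, templates):
    """Return the labels of all templates that leave the detector completely dark."""
    return [
        label
        for label, template in templates.items()
        if light_reaching_detector(character, template) == 0
    ]


TEMPLATES = {label: make_template(glyph) for label, glyph in GLYPHS.items()}
print(recognise(GLYPHS["L"], TEMPLATES))
```

In this simplified model, a character whose ink covers more than the cut-out would also leave the detector dark, and any misalignment between character and template is ignored.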
{{Unreferencedsect|date=February 2007}}
Drinking distilled water is quite common.
Many beverage manufacturers use distilled water to ensure a drink's purity and taste. Bottled distilled water is sold as well, and can usually be found in [[supermarkets]]. [[Water purification]], such as distillation, is especially important in regions where water resources or tap water are not suitable for drinking without boiling or chemical treatment.
In 1950, David Shepard, a cryptanalyst at the [[Armed Forces Security Agency]] in the [[United States]], was asked by Frank Rowlett, who had broken the [[Purple code|Japanese PURPLE diplomatic code]], to work with Dr. Louis Tordella to recommend data automation procedures for the Agency. This included the problem of converting printed messages into machine language for computer processing. Shepard decided it must be possible to build a machine to do this, and, with the help of Harvey Cook, a friend, built "Gismo" in his attic during evenings and weekends. This was reported in the Washington Daily News on [[April 27]] [[1951]] and in the New York Times on [[December 26]] [[1953]] after his U.S. Patent Number 2,663,758 was issued. Shepard then founded [[Intelligent Machines Research Corporation]] (IMR), which went on to deliver the world's first several OCR systems used in commercial operation. While both Gismo and the later IMR systems used image analysis, as opposed to character matching, and could accept some font variation, Gismo was limited to reasonably close vertical registration, whereas the following commercial IMR scanners analyzed characters anywhere in the scanned field, a practical necessity on real world documents.
Water filtration devices are common in many households. Most of these devices do not distill water, though sales and use of consumer-oriented [[water distiller]]s and reverse osmosis machines continue to increase. Municipal water supplies often contain trace impurities, some of them added deliberately, at levels which are regulated to be safe for consumption. Many of these impurities, such as [[volatile organic compounds]], [[fluoride]], and an estimated 75,000+ other chemical compounds,{{Fact|date=February 2007}} are not removed through conventional filtration; however, distillation does eliminate nearly all of them.
The first commercial system was installed at the [[Readers Digest]] in 1955; many years later, Readers Digest donated it to the [[Smithsonian]], where it was put on display. The second system was sold to the [[Standard Oil]] Company of [[California]] for reading [[credit card]] imprints for billing purposes, with many more systems sold to other oil companies. Other systems sold by IMR during the late 1950s included a bill stub reader for the [[Ohio Bell Telephone Company]] and a page scanner for the [[United States Air Force]] that read typewritten messages and transmitted them by teletype. [[IBM]] and others were later licensed on Shepard's OCR patents.
Distilled water is also used as drinking water in arid seaside areas which do not have sufficient freshwater, by distilling seawater. It is quite common on ships, especially [[nuclear ship|nuclear powered ships]], which require a large supply of distilled water as coolant. The drinking water is produced in [[desalination plant]]s, although it is very expensive due to the large amount of fuel needed to boil water. Alternative technologies like [[reverse osmosis]] are becoming increasingly important in this regard due to their greatly reduced costs.
The [[United States Postal Service]] has been using OCR machines to sort mail since 1965 based on technology devised primarily by the prolific inventor [[Jacob Rabinow]]. The first use of OCR in Europe was by the British General Post Office or [[General Post Office (United Kingdom)|GPO]]. In 1965 it began planning an entire banking system, the [[National Giro]], using OCR technology, a process that revolutionized bill payment systems in the UK. [[Canada Post]] has been using OCR systems since 1971. OCR systems read the name and address of the addressee at the first mechanized sorting center, and print a routing bar code on the envelope based on the [[postal code]]. After that the letters need only be sorted at later centers by less expensive sorters which need only read the [[bar code]]. To avoid interference with the human-readable address field which can be located anywhere on the letter, special ink is used that is clearly visible under [[ultraviolet light]]. This ink looks orange in normal lighting conditions. Envelopes marked with the [[machine readable]] bar code may then be processed.
== Current state of OCR technology ==
The accurate recognition of [[Latin alphabet|Latin-script]], typewritten text is now considered largely a solved problem.
===Pros and cons===
The drinking of distilled water has been both advocated and discouraged for health reasons. The lack of naturally-occurring minerals in distilled water has raised some concerns.
The Journal of General Internal Medicine<ref>{{Citation
| last1=Azoulay | first1=Arik
| last2=Garzon | first2=Philippe
| last3=Eisenberg | first3=Mark
| year=2001
| title=Comparison of the Mineral Content of Tap Water and Bottled Waters
| periodical=Journal of General Internal Medicine
| volume=16
| issue=3
| pages=168-175
| url=http://www.blackwell-synergy.com/links/doi/10.1111/j.1525-1497.2001.04189.x/enhancedabs/
}}</ref> published a study on the mineral contents of different waters available in the US. The study concluded, "drinking water sources available to North Americans may contain high levels of [[Calcium]], [[Magnesium]], and [[Sodium]] and may provide clinically important portions of the recommended dietary intake of these minerals. Physicians should encourage patients to check the mineral content of their drinking water, whether tap or bottled, and choose water most appropriate for their needs." Since distilled water doesn't contain minerals, supplemental mineral intake through diet is needed to maintain proper health.
Recognition of hand printing, cursive handwriting, and even the printed typewritten versions of some other scripts (especially those with a very large number of characters), is still the subject of active research.
It is often observed that consumption of "hard" water, or water that has some minerals, may have beneficial cardiovascular effects. As noted in the American Journal of Epidemiology, consumption of hard drinking water is negatively correlated with atherosclerotic [[heart disease]].<ref>{{Citation
| last=Voors
| first=A. W.
| year=1971
| title=Mineral in the municipal water and atherosclerotic heart death
| periodical=American Journal of Epidemiology
| volume=93
| issue=4
| pages=259-266
| url=http://aje.oxfordjournals.org/cgi/content/abstract/93/4/259
}}</ref> Since distilled water is devoid of minerals, it will not have these potential benefits.
It has been suggested that drinking distilled water may increase the risk of tooth decay, because distilled water lacks the [[fluoride]] ions that many governments (e.g. municipalities in the United States) add at water treatment plants through [[fluoridation]] for its supposed effect in inhibiting [[caries|cavity]] formation.<ref>[http://www.medpagetoday.com/PrimaryCare/DentalHealth/tb/1756 ''Bottled Water Cited as Contributing to Cavity Comeback'' at MedPage Today]</ref>
Systems for [[handwriting recognition|recognizing hand-printed text]] on the fly have enjoyed commercial success in recent years. Among these are the input devices for [[personal digital assistant]]s such as those running [[Palm OS]]. The [[Apple Newton]] pioneered this technology. The algorithms used in these devices take advantage of the fact that the order, speed, and direction of individual line segments at input are known. Also, the user can be retrained to use only specific letter shapes. These methods cannot be used in software that scans paper documents, so accurate recognition of hand-printed documents is still largely an open problem. Accuracy rates of 80% to 90% on neat, clean hand-printed characters can be achieved, but that accuracy rate still translates to dozens of errors per page, making the technology useful only in very limited contexts. This variety of OCR is now commonly known in the industry as ICR, or [[Intelligent Character Recognition]].
A purported benefit of drinking water in its pure form is that it acts as a 'more powerful solvent' that helps cleanse toxins from the body.{{Fact|date=February 2007}}
Recognition of [[cursive]] text is an active area of research, with recognition rates even lower than those for hand-printed text. Higher rates of recognition of general cursive script will likely not be possible without the use of contextual or grammatical information. For example, recognizing entire words from a dictionary is easier than trying to parse individual characters from script. Reading the ''Amount'' line of a [[cheque]] (which is always a written-out number) is an example where using a smaller dictionary can increase recognition rates greatly. Knowledge of the grammar of the language being scanned can also help determine if a word is likely to be a verb or a noun, for example, allowing greater accuracy. The shapes of individual cursive characters themselves simply do not contain enough information to recognize all handwritten cursive script accurately (with greater than 98% accuracy).
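The benefit of a small dictionary, as in the cheque ''Amount'' line example, can be sketched as follows. This is a minimal illustration under stated assumptions rather than a description of any real recognition engine: the vocabulary, the function name and the sample misreadings are hypothetical, and fuzzy string matching from Python's standard <code>difflib</code> module stands in for whatever word-level scoring an actual system would use.

```python
import difflib

# Hypothetical small vocabulary for a written-out amount line.
AMOUNT_WORDS = [
    "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
    "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
    "seventeen", "eighteen", "nineteen", "twenty", "thirty", "forty", "fifty",
    "sixty", "seventy", "eighty", "ninety", "hundred", "thousand", "and",
]


def constrain_to_vocabulary(raw_words, vocabulary=AMOUNT_WORDS, cutoff=0.6):
    """Replace each raw recognised word with its closest vocabulary entry.

    Words with no sufficiently similar entry are left unchanged.
    """
    corrected = []
    for word in raw_words:
        matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return corrected


# Hypothetical character-level misreadings of "two hundred fifty".
print(constrain_to_vocabulary(["tw0", "hundrcd", "fiftv"]))
```

Because only a few dozen words are possible, a reading with one or two wrong characters can still be resolved to a single plausible word; with a general dictionary the same misreading would have many more competing candidates.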
The cost of distilling water (about 0.04 to 0.10 Euro or USD per litre in 2005) prohibits its use by most households worldwide.{{Fact|date=February 2007}}
A particularly difficult problem for computers and humans is that of old church baptismal and marriage records containing mostly names. The pages may be damaged by age, water or fire and the names may be obsolete or contain rare spellings.
Another research area is cooperative approaches, where computers assist humans and vice-versa. Computer image processing techniques can assist humans in reading extremely difficult texts such as the [[Archimedes Palimpsest]] or the [[Dead Sea Scrolls]].
==Myths==
For more complex recognition problems, [[Artificial neural network|neural networks]] are commonly used, as they can generally be made indifferent to both [[affine]] and [[non-linear]] transformations.<ref>http://yann.lecun.com/exdb/lenet/</ref>
A popular myth about distilled water is that it is dangerous because it is more easily heated above its normal [[boiling point]] without actually boiling (as seen in "Mythbusters"), a process known as [[superheating]]. When superheated water is disturbed or has impurities added to it, a nucleation center for bubbles forms. These bubbles then act as new nucleation centers, and a sudden, explosive boiling can occur, possibly causing serious injury to those nearby. However, distilled water and tap water do not differ in how easily or how dangerously they can be superheated. The dissolved impurities in motionless tap water do not present enough disturbance to inhibit superheating.
==References==
<references/>
== Music OCR ==
{{main|Music OCR}}
Early research into recognition of printed sheet music was performed in the mid [[1970s]] at [[Massachusetts Institute of Technology|MIT]] and other institutions. Successive efforts were made to localize and remove musical staff lines leaving symbols to be recognized and parsed. The first proprietary music-scanning program, MIDISCAN, was released in [[1991]]. Three proprietary products are now available but music OCR software does not recognize handwritten scores.
==See also==
* [[Deionized water]]
* [[Atmospheric water generator]] ''(Make distilled water from air)''
* [[Heavy water]]
* [[Double distilled water]]
[[Category:Liquid water]]
[[Category:Distillation]]
[[Category:Drinking water]]
[[de:Destilliertes Wasser]]
[[es:Agua destilada]]
[[gl:Auga destilada]]
[[it:Acqua distillata]]
[[he:מים מזוקקים]]
[[nl:Gedestilleerd water]]
[[pl:Woda destylowana]]
[[pt:Água destilada]]
[[ru:Дистиллированная вода]]
[[sl:Destilirana voda]]
[[sv:Destillerat vatten]]
[[zh:蒸馏水]]
== MICR ==
One area where the accuracy and speed of computer input of character information exceed those of humans is [[magnetic ink character recognition]], where error rates are around one read error for every 20,000 to 30,000 checks.
== Optical Character Recognition in Unicode ==
In [[Unicode]], ''Optical Character Recognition'' symbol characters are placed in the [[hexadecimal]] range 0x'''2440'''–0x'''245F''', as shown below (see also [[Unicode Symbols]]):
{| class="wikitable" {{CT-1}}
! colspan="4" rowspan="3" {{CT-2}}| || <small>'''Symbol'''</small>|| rowspan="2" {{CT-3}}| Name|| colspan="4" rowspan="3" {{CT-4}}| 
|-
! Hex
|-
! colspan="2" {{CT-2}}| <small>Symbol's Picture</small>
|- class="Unicode"
| width="0*" {{CT-7}}| ⑀|| rowspan="2" {{CT-3}}| OCR Hook || width="0*" {{CT-7}}| ⑁|| rowspan="2" {{CT-3}}| OCR Chair || width="0*" {{CT-7}}| ⑂|| rowspan="2" {{CT-3}}| OCR Fork || width="0*" {{CT-7}}| ⑃|| rowspan="2" {{CT-3}}| OCR Inverted Fork|| width="0*" {{CT-7}}| ⑄|| rowspan="2" {{CT-3}}| OCR Belt Buckle
|-
| 0x2440|| 0x2441|| 0x2442|| 0x2443|| 0x2444
|-
| colspan="2" width="20%" {{CT-2}}| [[Image:U+2440.gif]] || colspan="2" width="20%" {{CT-2}}| [[Image:U+2441.gif]]|| colspan="2" width="20%" {{CT-2}}| [[Image:U+2442.gif]]|| colspan="2" width="20%" {{CT-2}}| [[Image:U+2443.gif]]|| colspan="2" width="20%" {{CT-2}}| [[Image:U+2444.gif]]
|- class="Unicode"
| {{CT-7}}| ⑅|| rowspan="2" {{CT-3}}| OCR Bow Tie|| {{CT-7}}| ⑆|| rowspan="2" {{CT-3}}| OCR Branch Bank Identification|| {{CT-7}}| ⑇|| rowspan="2" {{CT-3}}| OCR Amount Of Check|| {{CT-7}}| ⑈|| rowspan="2" {{CT-3}}| OCR Customer Account Number|| {{CT-7}}| ⑉|| rowspan="2" {{CT-3}}| OCR Dash
|-
| 0x2445|| 0x2446|| 0x2447|| 0x2448|| 0x2449
|-
| colspan="2" {{CT-2}}| [[Image:U+2445.gif]] || colspan="2" {{CT-2}}| [[Image:U+2446.gif]]|| colspan="2" {{CT-2}}| [[Image:U+2447.gif]]|| colspan="2" {{CT-2}}| [[Image:U+2448.gif]]|| colspan="2" {{CT-2}}| [[Image:U+2449.gif]]
|- class="Unicode"
| {{CT-7}}| ⑊|| rowspan="2" {{CT-3}}| OCR Double Backslash|| || rowspan="2" {{CT-3}}| <small>Not Defined</small>|| || rowspan="2" {{CT-3}}| <small>Not Defined</small>|| || rowspan="2" {{CT-3}}| <small>Not Defined</small>|| || rowspan="2" {{CT-3}}| <small>Not Defined</small>
|-
| 0x244A|| 0x244B|| 0x244C|| 0x244D|| 0x244E
|-
| colspan="2" {{CT-3}}| [[Image:U+244A.gif]] || colspan="2" {{CT-3}}| - || colspan="2" {{CT-3}}| - || colspan="2" {{CT-3}}| - || colspan="2" {{CT-3}}| -
|}
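The assigned characters in this range can also be listed programmatically. The following is a minimal sketch using Python's standard <code>unicodedata</code> module; the function name and the row layout are hypothetical. It reports the formal character names from the Unicode database bundled with the interpreter, which for some code points (such as 0x2448 and 0x2449) need not match the descriptive labels used in the table above.

```python
import unicodedata


def ocr_block_characters(start=0x2440, end=0x245F):
    """Return (hex code, character, formal name) for each assigned code point in the range."""
    rows = []
    for code_point in range(start, end + 1):
        character = chr(code_point)
        # unicodedata.name() returns the default (None) for unassigned code points.
        name = unicodedata.name(character, None)
        if name is not None:
            rows.append((f"0x{code_point:04X}", character, name))
    return rows


for hex_code, character, name in ocr_block_characters():
    print(hex_code, character, name)
```

Code points without a name in the interpreter's database are skipped, which corresponds to the "Not Defined" cells in the table.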
== Software ==
=== Proprietary software ===
* [[Abbyy]] FineReader - growing in the market. In recent years it has been the default OCR software bundled with many scanner brands.
* [http://www.axios.es/ Axios] Agile Guard - recognition of personal identification cards.
* [http://www.ocr.com/ Cuneiform]
* [http://www.docAssist.com/ docAssist] - OCR for full text search and document management
* [http://www.intelliant.fr/en/ocr-document-management-systems-software.html Intelliant OCR] - a command-line OCR utility, based on [http://www.ocr.com Tiger OCR].
* [[NovoDynamics]]'s [http://www.novodynamics.com VERUS] - High-performance Optical Character Recognition and image enhancement for Arabic-based scripts, including Farsi, Pashto, Urdu and Arabic OCR.
*[http://www.arhungary.hu OCR Document Readers] Highest performance readers from Adaptive Recognition Hungary
* [[OmniPage]] - for years the most recognized OCR software suite and the market leader. Received a [[PC Magazine]] Editor's Choice award in [[2003]].
* [[Readiris]] - reads European languages, Arabic, Hebrew, Asian languages.
* [http://www.odt-oce.com RecoStar] A high performance OCR Engine
* [http://www.simpleocr.com/ SimpleOCR] - relatively simple [[freeware]] that is accurate on common fonts. The engine is also available as an [[SDK]] (supports English, French and Dutch language recognition).
* [http://www.pegasusimaging.com/icr.htm SmartZone OCR] - offers developers the ability to perform zonal OCR.
* [[TeleForm]] - for capturing data from handwritten forms.
* [http://www.nuance.com/textbridge/ TextBridge] - bundled with many scanners; simpler and less resource-intensive than its sister product OmniPage.
* [http://www.rerecognition.com/htm/english.htm KADMOS] An easily integrated professional OCR/ICR engine, with highest handprint recognition accuracy and support for many languages and fonts.
* [http://www.ideatechnosoft.com/ocr_icr.html ixFormCL, ixtract] Custom OCR/ICR/Barcode solutions for special purposes and higher accuracy in specific applications
=== Free and open source software ===
*[[Tesseract (software)|Tesseract]] is an open source OCR, initially developed by [[HP]], and released under the [[Apache License]], Version 2.0. It can be compiled using MSVC 6.0 or GCC (~120000 [[Source_lines_of_code|LOC]])
*[[Clara (OCR software)|Clara]] - [http://www.geocities.com/claraocr/], [http://freshmeat.net/projects/claraocr/] (~50000 [[Source_lines_of_code|LOC]])
*[[GOCR]] - (~20000 [[Source_lines_of_code|LOC]] + [http://unpaper.berlios.de Unpaper] + [http://www.linux-speakup.org/socrates.html Socrates]) - GOCR included in [[Debian]] and other distributions
*[[Ocrad]] - [http://www.gnu.org/software/ocrad/ocrad.html] - (~9900 [[Source_lines_of_code|LOC]]) - "is an OCR [...] program based on a feature extraction method".
*[http://www.simpleocr.com Simple OCR] Royalty Free
*[http://www.isri.unlv.edu/ISRI/Software ISRI Software] - some experimental OCR tools
*[http://http.cs.berkeley.edu/%7Efateman/kathey/ocrchie.html OCRchie] - dormant since 1996
*[http://oocr.sourceforge.net OOCR] OOCR is an OCR program still in development, under the [[GPL]].
*[http://www.phpclasses.org/browse/package/2874.html phpOCR] A base implementation for an OCR tool in PHP
*[[Kognition]] - [http://kognition.sourceforge.net/]
== See also ==
* [[Automatic number plate recognition]]
* [[Barcode]] and [[barcode scanner]]s
* [[Captcha]]
* [[Computer vision]]
* [[Digital image processing]]
* [[Intelligent Character Recognition|ICR]]
* [[Machine learning]]
* [[Machine vision]]
* [[Magnetic ink character recognition]] (MICR)
* [[Mapping of Unicode characters]]
* [[Optical mark recognition]] (OMR)
* [[Pattern recognition]]
* [[Raster to vector]]
* [[Raymond Kurzweil]]
* [[Speech recognition]]
* [[SmartScore]]
* [[VueScan]] - has [http://www.simpleocr.com Simple OCR] embedded
== External links ==
* [http://www.icdar2007.org ICDAR] ICDAR is one of the most comprehensive conferences on all aspects of document recognition, including OCR, and is held every two years.
* [http://www.phpclasses.org/browse/package/2874.html phpOCR] A base implementation for an OCR tool in PHP
* [http://www.gnu.org/software/ocrad/ocrad.html GNU Ocrad] "is an OCR [...] program based on a feature extraction method".
* [http://www.drr2007.org DRR] SPIE DRR is an annual conference on OCR and document retrieval.
* [http://www.isri.unlv.edu/ISRI/Software#Experimental_Open_Source_OCR Reference OCR Engine] An open-source OCR project.
*[http://oocr.sourceforge.net OOCR] OOCR is an OCR program still in development, under the [[GPL]].
* [http://jocr.sourceforge.net/ GOCR] GOCR is an OCR program, developed under the [[GPL]].
*[http://sourceforge.net/projects/tesseract-ocr Tesseract] Tesseract is an open source OCR, initially developed by [[HP]], and released under the [[Apache License]], Version 2.0. It can be compiled using MSVC 6.0 or GCC.
{{SpecialChars}}
{{Paper data storage media}}
[[Category:Artificial intelligence applications]]
[[Category:Applications of computer vision]]
[[Category:Optical character recognition|*]]
[[Category:Information technology]]
[[Category:Unicode]]
[[Category:Symbols]]
[[cs:OCR]]
[[de:Texterkennung]]
[[el:Οπτική Αναγνώριση Χαρακτήρων]]
[[es:Reconocimiento óptico de caracteres]]
[[eo:Optika signorekono]]
[[fa:تشخیص نوری نویسهها]]
[[fr:Reconnaissance optique de caractères]]
[[gl:Optical Character Recognition]]
[[hr:Optičko prepoznavanje znakova]]
[[is:Ljóslestur]]
[[it:Optical Character Recognition]]
[[he:זיהוי תווים אופטי]]
[[hu:Optikai karakterfelismerés]]
[[nl:Optical Character Recognition]]
[[ja:光学文字認識]]
[[pl:OCR]]
[[pt:OCR]]
[[fi:Tekstintunnistus]]
[[sv:Optical character recognition]]
[[th:โอซีอาร์]]
[[tr:OCR]]
[[zh:光学字符识别]]