{{Short description|Process of converting data from written forms into electronic format}}
'''Forms processing''' is a process by which one can capture information entered into data fields and convert it into an electronic format. This can be done manually or automatically, but the general process is that [[hard copy]] data is filled out by humans and then "captured" from their respective fields and entered into a database or other electronic format.
==Manual data entry==
This method of [[data processing]] involves human operators keying in data found on the form. The manual process of data entry has many disadvantages in speed, accuracy and cost.
==Automated forms processing==
This method can automate data processing by using pre-defined templates and configurations. A template, in this case, is a ''map'' of the document that details where the data fields are located within the form or document. As compared to the manual data entry process, automatic form input systems are
Automatic form input systems use different types of recognition methods such as [[optical character recognition]] (OCR) for machine print, [[optical mark recognition|optical mark reading]] (OMR) for check/mark sense boxes, [[bar code]] recognition (BCR) for barcodes, and [[intelligent character recognition]] (ICR) for hand print.
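The idea of a template as a ''map'' of the form, with a recognition method chosen per field, can be sketched as follows. This is a minimal illustration, not a real recognition engine: the page is modelled as rows of text rather than pixels, and the names <code>Field</code>, <code>crop</code>, <code>read_text</code>, <code>read_mark</code> and <code>process</code> are hypothetical stand-ins for the OCR/ICR and OMR components.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumption: a page is a list of text rows; a real system would hold pixels.
Page = List[str]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


@dataclass(frozen=True)
class Field:
    """One entry of the template: where a field is and how to read it."""
    name: str
    box: Box
    method: str  # "OCR", "ICR" or "OMR" in this sketch


def crop(page: Page, box: Box) -> List[str]:
    """Cut the field's region out of the page."""
    top, left, bottom, right = box
    return [row[left:right] for row in page[top:bottom]]


def read_text(region: List[str]) -> str:
    """Stand-in for OCR/ICR: join the non-blank text found in the region."""
    return " ".join(row.strip() for row in region if row.strip())


def read_mark(region: List[str]) -> bool:
    """Stand-in for OMR: the box counts as checked if it contains a mark."""
    return any(ch in "xX" for row in region for ch in row)


# Hypothetical dispatch table from recognition method to reader function.
RECOGNIZERS: Dict[str, Callable[[List[str]], object]] = {
    "OCR": read_text,
    "ICR": read_text,
    "OMR": read_mark,
}


def process(page: Page, template: List[Field]) -> Dict[str, object]:
    """Apply the template to a page and return the captured field values."""
    return {f.name: RECOGNIZERS[f.method](crop(page, f.box)) for f in template}


page = [
    "Name: Ada Lovelace      ",
    "Subscribe: [x]          ",
]
template = [
    Field("name", (0, 6, 1, 24), "OCR"),
    Field("subscribe", (1, 11, 2, 14), "OMR"),
]
captured = process(page, template)
```

Because the template only stores positions and a method name, the same <code>process</code> function can be reused for any form whose layout has been mapped in advance.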
Forms processing has developed beyond basic data capture. It encompasses not only a recognition process but also management of the complete [[:wikt:life cycle|life cycle]] of documents, from scanning the document through extraction of the data and, often, delivery into a back-end system. In some cases it may also include processing or generating well-formatted results through calculations and analysis. An automated forms processing system can be valuable if there is a need to process hundreds or thousands of images every day.
=== First Step: Assessment of the form structure ===
The first step in understanding automated forms processing is to analyze the type of form from which data is to be extracted. For the purpose of extracting data, forms can be classified into one of two high-level categories. Four categories have been proposed,<ref>{{Cite book|url=https://books.google.com/books?id=44arCAAAQBAJ&q=example+of+a+fixed+form+for+extraction&pg=PA425|title=Pattern Recognition and Machine Intelligence: 4th International Conference, PReMI 2011, Moscow, Russia, June 27 - July 1, 2011, Proceedings|last1=Kuznetsov|first1=Sergei O.|last2=Mandal|first2=Deba P.|last3=Kundu|first3=Malay K.|last4=Pal|first4=Sankar Kumar|date=2011-06-25|publisher=Springer|isbn=9783642217869|language=en}}</ref> but the document capture industry has settled on these two:
# Fixed forms. This type of form is defined as one in which the data to be extracted is always found in the same absolute position on a page. This allows a type of lens grid to be applied to the document and every subsequent occurrence of this document in order to extract the data. An example of a fixed form is a typical credit application form.<ref>{{Cite web|url=http://www.bfma.org/resource/resmgr/articles/05_04.pdf|title=CAPTURING SEMI-STRUCTURED FORMS AND DOCUMENTS: CHALLENGES AND AVAILABLE TECHNOLOGIES|last=Vassylyev|first=Artur|date=10 June 2008|archive-url=https://web.archive.org/web/20170428144034/http://www.bfma.org/resource/resmgr/articles/05_04.pdf|archive-date=2017-04-28|url-status=dead|access-date=4 April 2017}}</ref>
# Semi-structured (or unstructured) form. This form is one in which the location of the data and fields holding the data vary from document to document. This type of document is perhaps most easily defined by the fact that it is not a fixed form. In the document capture industry, a semi-structured form is also called an unstructured form. Examples of these types of forms include letters, contracts, and invoices. According to a study by AIIM, about 80% of the documents in an organization fall under the semi-structured definition.<ref>{{Cite web|url=https://www.aiim.org/pdfdocuments/MIWP_Forms-Processing_2012.pdf|title=Forms Processing- user experiences of text and handwriting recognition (OCR/ICR)|access-date=4 April 2017|archive-date=28 April 2017|archive-url=https://web.archive.org/web/20170428142430/http://www.aiim.org/pdfdocuments/MIWP_Forms-Processing_2012.pdf|url-status=dead}}</ref>
Although the components (described below) used for the extraction of data from either type of form are the same, the way in which they are applied varies considerably based upon the type of document.
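The difference between the two categories can be illustrated with a small sketch. It assumes, purely for illustration, that recognition has already produced a list of words with grid positions. The <code>Word</code> type and the functions <code>fixed_lookup</code> and <code>anchored_lookup</code> are hypothetical, and searching to the right of a label is only one simple strategy for semi-structured documents, not the industry's method.

```python
from typing import List, NamedTuple, Optional


class Word(NamedTuple):
    """A recognized word and its (assumed) grid position on the page."""
    text: str
    row: int
    col: int


def fixed_lookup(words: List[Word], row: int, col: int) -> Optional[str]:
    """Fixed form: the value is always at the same absolute position."""
    for w in words:
        if (w.row, w.col) == (row, col):
            return w.text
    return None


def anchored_lookup(words: List[Word], label: str) -> Optional[str]:
    """Semi-structured form: find the label, then take the nearest word
    to its right on the same row, wherever the label happens to be."""
    for anchor in words:
        if anchor.text.lower() == label.lower():
            right = [w for w in words
                     if w.row == anchor.row and w.col > anchor.col]
            if right:
                return min(right, key=lambda w: w.col).text
    return None


# Two invoices that place the total in different locations.
invoice_a = [Word("Invoice", 0, 0), Word("Total:", 5, 0), Word("120.00", 5, 1)]
invoice_b = [Word("Total:", 9, 3), Word("87.50", 9, 4)]

total_a_fixed = fixed_lookup(invoice_a, 5, 1)        # position tuned for A
total_b_fixed = fixed_lookup(invoice_b, 5, 1)        # same position fails on B
total_a_anchor = anchored_lookup(invoice_a, "Total:")
total_b_anchor = anchored_lookup(invoice_b, "Total:")
```

The position-based lookup works only for the layout it was configured for, while the label-based lookup tolerates the field moving between documents.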
===Components===
#MICR – [[Magnetic ink character recognition]]
OCR
ICR recognizes hand-printed American and [[European English (disambiguation)|European English]] characters using pre-defined character sets: uppercase, lowercase, [[mixed case]] alphabetic, digits, currency (including $ (dollar), ¢ (cent), € (Euro), £ (pound), ¥ (Yen)), arithmetic and punctuation characters (including period, comma, [[Quotation mark|single quote]], double quote, ! & ( ) ? @ { } \ # % * + – / : ; < = >).
===Prerequisites===
Though automated forms processing has many great advantages over manual data entry, it still comes with some limitations. To achieve the best accuracy, some prerequisites should be followed.
#Scan format: The file format, resolution (DPI), and color mode of the scanned file.
#Configuration: The layout of the scanned image needs to be configured for automated processing.
#Result/analysis: Any specific format required for presenting the captured data.
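The scan-format prerequisite can be checked before a file enters the recognition stage. The following is a minimal sketch under stated assumptions: the names <code>ScanInfo</code>, <code>ScanRequirements</code> and <code>check_scan</code> are hypothetical, and the accepted formats, minimum resolution and color modes are illustrative values, not requirements stated by any particular product.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ScanInfo:
    """Properties of one scanned file (assumed to be known up front)."""
    file_format: str
    dpi: int
    color_mode: str


@dataclass(frozen=True)
class ScanRequirements:
    """What the configured forms processing job expects from a scan."""
    file_formats: FrozenSet[str]
    min_dpi: int
    color_modes: FrozenSet[str]


def check_scan(scan: ScanInfo, req: ScanRequirements) -> List[str]:
    """Return a list of human-readable problems; empty means acceptable."""
    problems = []
    if scan.file_format.lower() not in req.file_formats:
        problems.append(f"unsupported file format: {scan.file_format}")
    if scan.dpi < req.min_dpi:
        problems.append(f"resolution {scan.dpi} DPI is below {req.min_dpi} DPI")
    if scan.color_mode.lower() not in req.color_modes:
        problems.append(f"unsupported color mode: {scan.color_mode}")
    return problems


# Illustrative values only; real requirements depend on the system in use.
requirements = ScanRequirements(
    file_formats=frozenset({"tiff", "pdf"}),
    min_dpi=300,
    color_modes=frozenset({"grayscale", "bitonal"}),
)
good = check_scan(ScanInfo("TIFF", 300, "grayscale"), requirements)
bad = check_scan(ScanInfo("jpeg", 150, "color"), requirements)
```

Returning a list of problems rather than a single pass/fail flag lets an operator see every reason a scan was rejected in one pass.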
One very important consideration is indexing, determining the [[metadata]] that will be used to describe the data contained within the documents. This attribute perhaps drives the forms processing solution more than any other.
==External links==
{{wikiquote}}
* [https://web.archive.org/web/20100529053053/http://www.aiim.org.uk/industrywatch/surveys.asp AIIM market intelligence reports]
==References==