2011/05/18 17:04:00

OCR - Optical Character Recognition

OCR (Optical Character Recognition) is a class of software that converts images of printed or handwritten text into machine-readable (machine-editable) text.

Optical recognition algorithms are built on methods drawn directly from fields such as computer vision and pattern recognition. However, most modern OCR programs improve recognition quality with built-in linguistic modules. These can include general-vocabulary dictionaries or specialized subject dictionaries (for example, dictionaries of first names and surnames, or of city names), as well as morphological rules for generating word forms or, if a word is missing from the dictionary entirely, for checking it against admissible word-formation rules. Languages for which such modules are implemented are called languages with dictionary support.
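The dictionary check described above can be illustrated with a minimal sketch. It assumes a tiny hypothetical city-name dictionary and uses approximate string matching from Python's standard `difflib` module as a stand-in for a real linguistic module. The names `CITY_DICTIONARY` and `correct_word` and the `cutoff` value are illustrative assumptions, not part of any particular OCR product, and morphological rules are not modeled here.

```python
import difflib

# Hypothetical specialized dictionary (city names); a real system
# would load a much larger lexicon.
CITY_DICTIONARY = ["Moscow", "Kazan", "Samara", "Omsk"]


def correct_word(word, dictionary, cutoff=0.75):
    """Return the closest dictionary entry for a recognized word.

    If the word is already in the dictionary, or no entry is similar
    enough (similarity below the assumed cutoff), it is returned unchanged.
    """
    if word in dictionary:
        return word
    # get_close_matches ranks candidates by SequenceMatcher similarity.
    matches = difflib.get_close_matches(word, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


if __name__ == "__main__":
    # "Mosc0w" simulates a typical misread of the letter "o" as the digit "0".
    for raw in ["Mosc0w", "Kazan", "Berlin"]:
        print(raw, "->", correct_word(raw, CITY_DICTIONARY))
```

A word with no close dictionary entry is left as recognized, which mirrors the idea that the dictionary supports recognition rather than overriding it.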

Beyond recognizing the characters themselves, most commercial OCR systems can reproduce the formatting of the source document, such as pictures, columns, typefaces, and other non-text components, as closely to the original as possible.

EDMS - Systems of stream recognition

Main article: EDMS - Systems of stream recognition