What is optical character recognition (OCR)?
Optical character recognition (OCR), also called text recognition, is the technology that converts images to text so that computers can extract text data from image files. OCR technology classifies optical patterns in digital images based on how they correspond to alphanumeric characters.
OCR can be a huge productivity shortcut for students, researchers and entrepreneurs who deal with a lot of documents. Once you process a document with OCR technology, you can easily edit, search, index and retrieve the text data. You can also compress the document into zip files, highlight keywords or incorporate it into a website.
How does optical character recognition work?
OCR works by examining a physical document and translating the characters into code that can be used for data processing. The basic steps are image acquisition, preprocessing, segmentation, feature extraction, classification and post-processing.
You’ll need to preprocess the training data thoroughly before feeding it into the model. Preprocessing tasks include thresholding (converting a color or grayscale image into a binary image), normalization and noise reduction. You can also use techniques such as morphological operations to connect disconnected pixels, remove isolated pixels and smooth pixel boundaries.
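As a rough illustration of two of these preprocessing tasks, the sketch below binarizes a tiny grayscale image with a fixed cutoff and then clears dark pixels that have no dark neighbors. The function names, the cutoff of 128 and the sample image are assumptions made for this example. Production OCR tools typically rely on an image library and adaptive thresholds rather than hand-written loops.

```python
from typing import List

# Grayscale image as rows of pixel values, 0 (black) to 255 (white).
Image = List[List[int]]


def threshold(gray: Image, cutoff: int = 128) -> Image:
    """Binarize: 1 marks a dark (ink) pixel, 0 marks a light (background) pixel."""
    return [[1 if px < cutoff else 0 for px in row] for row in gray]


def remove_isolated_pixels(binary: Image) -> Image:
    """Clear any ink pixel that has no ink among its 8 surrounding pixels."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0:
                continue
            # Count ink pixels in the 3x3 window around (y, x), excluding itself.
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned


# Hypothetical 4x5 scan: a small stroke plus one dark speck at the bottom right.
gray = [
    [250, 250, 250, 250, 250],
    [250,  30,  40, 250, 250],
    [250,  35, 250, 250, 250],
    [250, 250, 250, 250,  20],
]
binary = threshold(gray)
cleaned = remove_isolated_pixels(binary)
```

Here the three connected stroke pixels survive the cleanup, while the lone speck in the last row is treated as noise and cleared.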
At the beginning of an OCR project, you’ll scan and copy the physical documents and have the OCR software convert them to a binary version. Then, the computer analyzes the scanned images for light and dark areas. It’ll identify light areas as background and dark areas as written characters that need to be recognized.
Next, the computer processes the dark areas to find alphabetic letters, numeric digits and symbols. There are various techniques for OCR programs, but most involve targeting one character, word or block of text at a time.
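One simple way to target a single character at a time is to look for blank columns between runs of dark pixels. The sketch below assumes a binary image where 1 is a dark pixel and 0 is background; the `segment_columns` name and the sample page are made up for illustration. Real OCR programs use more robust segmentation, since touching or slanted characters don't leave clean gaps.

```python
from typing import List, Tuple


def segment_columns(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, for each run of columns with ink."""
    width = len(binary[0])
    # Vertical projection: how many dark pixels each column contains.
    ink_per_column = [sum(row[x] for row in binary) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(ink_per_column):
        if ink > 0 and start is None:
            start = x                      # a new character begins
        elif ink == 0 and start is not None:
            segments.append((start, x))    # a blank column ends the character
            start = None
    if start is not None:
        segments.append((start, width))    # character runs to the right edge
    return segments


# Hypothetical binary page holding two shapes separated by blank columns.
page = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 1],
]
segments = segment_columns(page)
```

Each returned range can then be cropped out and passed to the recognition step on its own.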
How are optical character recognition systems trained?
You can train some OCR programs with pattern recognition. These models are trained with examples of texts in various fonts and formats that are then used to compare and recognize characters in the scanned document. Other OCR systems use feature detection, where the OCR program applies rules regarding the features of a specific letter, number or symbol, in order to recognize characters in the scanned image. For example, some common features could be the number of angled lines, cross lines or curves in a written character. Your OCR model might store the capital letter “A” as having two diagonal lines that meet with a horizontal line across the middle.
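To make the pattern recognition approach concrete, here is a minimal sketch that compares an unknown glyph against stored example glyphs and picks the closest one. The 3x5 templates, the `similarity` score (fraction of matching pixels) and the function names are all assumptions for this example, not how any particular OCR product works. A feature detection system would compare derived features, such as line and curve counts, instead of raw pixels.

```python
from typing import Dict, List

# A glyph is a list of equal-length rows; "#" is ink and "." is background.
Glyph = List[str]

# Hypothetical stored examples, one per character.
TEMPLATES: Dict[str, Glyph] = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def similarity(a: Glyph, b: Glyph) -> float:
    """Fraction of pixel positions where the two glyphs agree."""
    total = sum(len(row) for row in a)
    matches = sum(
        pa == pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )
    return matches / total


def recognize(glyph: Glyph) -> str:
    """Return the template character whose pixels best match the glyph."""
    return max(TEMPLATES, key=lambda name: similarity(glyph, TEMPLATES[name]))


# A "T" with one stray ink pixel in the fourth row.
noisy_t = ["###", ".#.", ".#.", "##.", ".#."]
best = recognize(noisy_t)
```

Even with the stray pixel, the noisy glyph agrees with the "T" template at 14 of 15 positions, more than with any other template, so it is still recognized as "T".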
Finally, when your model identifies a written character or number, it converts the result into an ASCII (American Standard Code for Information Interchange) code. ASCII is a widely used character encoding for text files on computers and the internet, in which each character or number is represented by a 7-bit binary number.
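The short sketch below shows what that final conversion looks like for the recognized text "A1". It uses Python's built-in `str.encode` and `format`; the `to_ascii_bits` helper name is an assumption for this example.

```python
from typing import List


def to_ascii_bits(text: str) -> List[str]:
    """Return the 7-bit binary ASCII code for each character in text."""
    # encode("ascii") raises UnicodeEncodeError for characters outside ASCII.
    codes = text.encode("ascii")
    return [format(code, "07b") for code in codes]


codes = list("A1".encode("ascii"))  # [65, 49]
bits = to_ascii_bits("A1")          # ['1000001', '0110001']
```

The capital letter "A" is stored as code 65 and the digit "1" as code 49, each of which fits in seven bits.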
What is optical character recognition used for?
You can use OCR for a variety of data entry and data categorization tasks. Here are a few examples.
- Data Entry: OCR can automate data entry tasks for business documents. You can use OCR software to turn hard copies of legal or historical documents into PDF files. This way, you can edit, format and search as if you created the document with a word processor.
- Data Categorization: You can use OCR for a wide range of data categorization tasks. For example, you can automate sorting letters for mail delivery, or deposit checks electronically without the need for a bank teller. Other use cases include adding certified legal documents to an electronic database and indexing print material for search engines. You can also use OCR to convert documents into text, which you can then convert to audio for visually impaired users. More examples of OCR-powered technology include translation apps, online databases like Google Books and security cameras that recognize license plates.