ID Verification and Optical Character Recognition

The identity verification industry is undergoing a major shift toward do-it-yourself (DIY) online identification, with companies combining OCR technology, facial recognition software, and low-cost manual review teams. On its face, DIY online identity verification makes sense, but it’s vital to understand the major limitations of these solutions. Optical character recognition (OCR) is a commonly used method for online identity verification. OCR extracts important data from ID documents such as a driver’s license or passport, generally including a person’s name, address, date of birth, and ID number. The extraction is fast and removes the need for manual data entry.

OCR is a useful method for verifying customer ID, but it has its fair share of challenges. OCR technology was originally intended for reading black text against a white background using a flatbed scanner, not for extracting key data fields from ID documents that use small fonts and colored backgrounds and that may include holograms, watermarks, and printing on glossy surfaces.

Common Limitations of OCR

1. Structuring Data Involves More than Just OCR

Whenever users take a picture of their ID document with a smartphone or webcam, several steps are required to extract the information. The first is to recognize the type of ID document that the user is submitting. This allows the technology to properly structure the OCR output, which means working out which text is the first name, last name, date of birth (DOB), and other important fields. Plain OCR without AI or technology built specifically to recognize ID types will lack the accuracy you need to fight fraud and deliver a good user experience.
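To make the idea concrete, here is a minimal Python sketch of structuring raw OCR text once the document type is known. The document type name "sample_license", the field labels (LN, FN, DOB, DL), and the function structure_ocr_text are all hypothetical and exist only for illustration; real ID layouts differ widely and a production system would likely need far more than regular expressions.

```python
import re

# Hypothetical per-document-type templates (an assumption for illustration).
# Each template maps a field name to a regex that captures the field's value
# from one line of raw OCR text.
TEMPLATES = {
    "sample_license": {
        "last_name": re.compile(r"^LN[: ]+(.+)$", re.MULTILINE),
        "first_name": re.compile(r"^FN[: ]+(.+)$", re.MULTILINE),
        "date_of_birth": re.compile(r"^DOB[: ]+(\d{2}/\d{2}/\d{4})$", re.MULTILINE),
        "id_number": re.compile(r"^DL[: ]+([A-Z0-9]+)$", re.MULTILINE),
    },
}


def structure_ocr_text(doc_type, raw_text):
    """Map raw OCR text to named fields using the template for doc_type.

    Fields that cannot be found are returned as None so that a caller can
    route the document to a retry or a manual review.
    """
    template = TEMPLATES.get(doc_type)
    if template is None:
        raise ValueError(f"no template for document type: {doc_type!r}")
    fields = {}
    for name, pattern in template.items():
        match = pattern.search(raw_text)
        fields[name] = match.group(1).strip() if match else None
    return fields


raw = "DL: D1234567\nLN: DOE\nFN: JANE\nDOB: 01/02/1990\n"
print(structure_ocr_text("sample_license", raw))
```

The point of the sketch is that the same OCR text is meaningless until a document-specific template says which line is which field, and an unrecognized document type fails loudly instead of producing guessed values.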

2. OCR Has to Be Combined with Image Rectification

When a user takes a photo of their ID document, the image needs to be de-skewed if it wasn’t aligned properly and reoriented so that the OCR technology can correctly read and authenticate the ID data for online verification.
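As a rough illustration of de-skewing, the Python sketch below estimates the rotation of an ID card from the two corners of its top edge and rotates the corner coordinates back to level. It assumes the four corners have already been detected by some earlier step, and it corrects in-plane rotation only, not perspective distortion. The function names are hypothetical.

```python
import math


def skew_angle_degrees(top_left, top_right):
    """Angle of the card's top edge relative to the horizontal axis."""
    dx = top_right[0] - top_left[0]
    dy = top_right[1] - top_left[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, center, angle_degrees):
    """Rotate a point around center by angle_degrees."""
    theta = math.radians(angle_degrees)
    x = point[0] - center[0]
    y = point[1] - center[1]
    return (
        center[0] + x * math.cos(theta) - y * math.sin(theta),
        center[1] + x * math.sin(theta) + y * math.cos(theta),
    )


def deskew_corners(corners):
    """Level a card given as [top_left, top_right, bottom_right, bottom_left].

    The corners are rotated around their centroid by the negative of the
    measured skew angle, so the top edge ends up horizontal.
    """
    angle = skew_angle_degrees(corners[0], corners[1])
    center = (
        sum(p[0] for p in corners) / len(corners),
        sum(p[1] for p in corners) / len(corners),
    )
    return [rotate_point(p, center, -angle) for p in corners]
```

In a real pipeline the same angle would be applied to every pixel of the image with an imaging library rather than to four points, but the geometry is the same.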

3. Colored Background ID Documents Can Be Challenging for OCR

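OCR usually takes color or grayscale photos and converts them to plain black and white to reduce blurred text and better separate the text from its background. On IDs with colored or patterned backgrounds, this conversion can leave parts of the background merged with the text, which makes the characters harder to read.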
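The conversion to black and white can be sketched in Python as follows. The article does not say how the cut-off between black and white is chosen; this sketch assumes Otsu's method, one common choice, which picks the threshold that best separates the pixel values into two groups. Pixels are plain Python lists here to keep the example self-contained.

```python
def to_grayscale(rgb_pixels):
    """Convert (r, g, b) tuples to 0-255 gray values (ITU-R BT.601 weights)."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in rgb_pixels]


def otsu_threshold(gray_pixels):
    """Return the threshold that maximizes between-class variance (Otsu)."""
    histogram = [0] * 256
    for value in gray_pixels:
        histogram[value] += 1

    total = len(gray_pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    sum_dark = 0
    weight_dark = 0
    best_threshold = 0
    best_variance = -1.0

    for level in range(256):
        weight_dark += histogram[level]
        if weight_dark == 0:
            continue  # no pixels at or below this level yet
        weight_light = total - weight_dark
        if weight_light == 0:
            break  # every pixel is already in the dark class
        sum_dark += level * histogram[level]
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = level
    return best_threshold


def binarize(gray_pixels, threshold):
    """Map pixels at or below the threshold to black (0), the rest to white (255)."""
    return [0 if value <= threshold else 255 for value in gray_pixels]
```

A single global threshold like this works well for dark text on a uniform light background. On a multicolored ID background, some background regions can fall on the same side of the threshold as the text, which is the difficulty this section describes.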

4. Glare & Blur Lead to Mistakes

It’s extremely common for customers to take photos of ID documents with glare and blur. Whenever there’s glare or blurriness in the ID image, the probability of data extraction and authentication mistakes becomes significantly higher.
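One way to reduce these mistakes is to check image quality before running OCR and ask the user for a new photo when the check fails. The Python sketch below is an assumption about how such a check might look, not a method described in this article: it scores sharpness with the variance of a Laplacian filter and estimates glare as the fraction of near-white pixels. The threshold values are hypothetical and would need tuning on real images.

```python
def laplacian_variance(gray):
    """Sharpness score for a 2D list of gray values (at least 3x3).

    Sharp images have strong local intensity changes, so the Laplacian
    responses vary a lot. Blurry images give a low variance.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def glare_fraction(gray, saturation_level=250):
    """Fraction of pixels at or above saturation_level (near-white)."""
    pixels = [value for row in gray for value in row]
    return sum(1 for value in pixels if value >= saturation_level) / len(pixels)


def needs_retake(gray, min_sharpness=100.0, max_glare=0.05):
    """True if the image looks too blurry or has too much glare.

    min_sharpness and max_glare are illustrative defaults, not tuned values.
    """
    too_blurry = laplacian_variance(gray) < min_sharpness
    too_much_glare = glare_fraction(gray) > max_glare
    return too_blurry or too_much_glare
```

Rejecting a poor image at capture time is usually cheaper than letting OCR produce a wrong name or ID number that has to be caught later.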

5. Webcams Are a Challenge for Traditional OCR

Webcams pose another challenge for businesses in the financial industry that want to offer an omnichannel experience by letting customers photograph ID documents on more than one type of device. While most smartphone cameras now offer high picture quality, the same can’t be said for webcams built into laptops and tablets. If a business lets customers use webcams to photograph documents during onboarding, the lower picture quality can increase the risk of OCR mistakes.

6. OCR May Be Challenged by Some ID Subtypes

OCR for ID documents relies on extensive learning of the patterns that characterize a specific ID type, which makes it challenging for a solution to support numerous ID subtypes. OCR is only useful if the ID data is collected and authenticated correctly, which requires the software to understand the nuances and minor features of different ID types around the globe.