Docsumo’s auto-classification feature makes processing of non-uniform utility bills smooth & accurate.
“We’re processing utility bills from 6 different service providers for portfolio management. The challenge was to have just one solution to process all different versions of bills to save us the hassle of retraining & switching amongst multiple solutions. Docsumo has been able to deliver just that - one solution for all different variations.”
About the customer
The case study: In a nutshell
Process 6 types of utility bills from 6 different vendors
- Westland processes 2,000+ utility bills in 6 different layouts from 6 different vendors.
- The team of data entry operators was stretched thin, manually capturing data from bills for analysis.
Extract accurate data from bills
- Westland had to scan and extract accurate data from unstructured utility bills and feed it into Yardi.
Bills had varying structures
- Incoming utility bills varied in structure, and the fields to capture differed from one vendor to the next.
No in-built validation procedures
- The manual extraction lacked a logical validation of data captured.
The Docsumo Solution
Ingesting utility bills
- An API-based direct integration seamlessly ingested utility bills into Docsumo.
Pre-processing and getting ready for data extraction
- Built-in document pre-processors identified the bill formats (JPG, PDF, PNG, etc.) and queued the bills for data extraction.
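
As a rough illustration of this step, the sketch below identifies a file's format from its leading bytes and queues recognized files for extraction. It is not Docsumo's pre-processor: the names `detect_format`, `enqueue_bills`, and `extraction_queue` are hypothetical, and the sample files are made up for the example.

```python
from collections import deque

# Standard leading "magic" bytes for the three formats named above.
MAGIC_BYTES = {
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPG",
}


def detect_format(data: bytes):
    """Return the format name for the file content, or None if unrecognized."""
    for signature, name in MAGIC_BYTES.items():
        if data.startswith(signature):
            return name
    return None


def enqueue_bills(files, queue):
    """Queue recognized files for extraction; return the names of rejected files.

    `files` is an iterable of (filename, content_bytes) pairs (an assumption).
    """
    rejected = []
    for filename, data in files:
        fmt = detect_format(data)
        if fmt is None:
            rejected.append(filename)
        else:
            queue.append({"filename": filename, "format": fmt})
    return rejected


# Hypothetical sample input: two bills and one unsupported file.
extraction_queue = deque()
rejected = enqueue_bills(
    [
        ("water_march.pdf", b"%PDF-1.7 sample"),
        ("electric_march.png", b"\x89PNG\r\n\x1a\nsample"),
        ("notes.txt", b"hello"),
    ],
    extraction_queue,
)
```

Checking the content rather than the file extension means a mislabeled upload is still routed correctly, or rejected before it reaches extraction.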
Data extraction from unstructured text
- Docsumo's OCR module used vectorized position references within each bill to extract data.
- The OCR not only parsed bills with varying fonts, layouts, image quality, and resolution but also extracted data from tables with 95%+ accuracy.
Intelligent categorization of key-value pairs
- Our proprietary NLP-based classification framework learned rapidly from the utility bill templates. It was trained to categorize key-value pairs and line items.
- Another algorithm made intelligent predictions to identify the data within a bill.
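
To make the idea of categorizing key-value pairs concrete, here is a deliberately simplified stand-in. It uses a hand-written alias table rather than a trained NLP model, so it only shows the goal (mapping each vendor's label wording onto one canonical field), not how the framework described above works. The field names and aliases are assumptions made for the example.

```python
import re

# Hypothetical canonical fields and the label wordings that map to them.
FIELD_ALIASES = {
    "total_due": {"amount due", "total due", "total amount due", "balance due"},
    "due_date": {"due date", "payment due", "pay by"},
    "account_number": {"account number", "account no", "acct no"},
}


def normalize_label(label):
    """Lowercase the label, drop punctuation and digits, collapse whitespace."""
    cleaned = re.sub(r"[^a-z ]", "", label.lower())
    return " ".join(cleaned.split())


def categorize(pairs):
    """Map (label, value) pairs to canonical fields.

    Returns (fields, unmatched_labels); unmatched labels would need review.
    """
    fields = {}
    unmatched = []
    for label, value in pairs:
        key = normalize_label(label)
        for field, aliases in FIELD_ALIASES.items():
            if key in aliases:
                fields[field] = value
                break
        else:
            unmatched.append(label)
    return fields, unmatched


fields, unmatched = categorize(
    [("Acct. No.:", "12345"), ("TOTAL AMOUNT DUE", "$84.50"), ("Meter #", "A1")]
)
```

An alias table like this needs manual upkeep for every new layout, which is the maintenance burden a learned classifier is meant to avoid.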
Rule-based data validation
- Once the data was extracted, a rule-based validation engine applied contextual data validation and correction algorithms.
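
A minimal sketch of what rule-based validation can look like is shown below. The rules themselves (line items must sum to the total, dates must be in a sensible order) and the field names are assumptions chosen for illustration; the actual rules and any correction logic in the validation engine are not described here.

```python
from datetime import date
from decimal import Decimal


def validate_bill(bill):
    """Return a list of rule violations; an empty list means the bill passed."""
    errors = []

    # Rule 1: the line items should add up to the stated total.
    line_total = sum((item["amount"] for item in bill["line_items"]), Decimal("0"))
    if line_total != bill["total_due"]:
        errors.append(
            f"line items sum to {line_total}, but total_due is {bill['total_due']}"
        )

    # Rule 2: the billing period should not end before it starts.
    if bill["period_start"] > bill["period_end"]:
        errors.append("billing period starts after it ends")

    # Rule 3: payment should not be due before the bill was issued.
    if bill["due_date"] < bill["bill_date"]:
        errors.append("due date is earlier than the bill date")

    return errors


# Hypothetical extracted bill used to exercise the rules.
good_bill = {
    "period_start": date(2024, 3, 1),
    "period_end": date(2024, 3, 31),
    "bill_date": date(2024, 4, 1),
    "due_date": date(2024, 4, 15),
    "total_due": Decimal("84.50"),
    "line_items": [
        {"description": "Water usage", "amount": Decimal("60.00")},
        {"description": "Service charge", "amount": Decimal("24.50")},
    ],
}
bad_bill = dict(good_bill, total_due=Decimal("48.50"))  # transposed digits
```

Here `validate_bill(good_bill)` returns an empty list, while `validate_bill(bad_bill)` flags the mismatch between the line items and the total, the kind of logical check that manual data entry lacked.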
Integration with Yardi
- The extracted data was delivered in JSON format, which was easily integrated into Yardi via APIs and an iframe.
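
For a sense of what a JSON hand-off might look like, the sketch below serializes one extracted bill. The field names and values are illustrative assumptions, not Docsumo's output schema or Yardi's import format, and the helper `to_json_payload` is hypothetical.

```python
import json
from datetime import date
from decimal import Decimal


def to_json_payload(bill):
    """Serialize an extracted bill to JSON, encoding amounts and dates as strings."""

    def encode(value):
        if isinstance(value, Decimal):
            return str(value)  # keep exact cents; avoid float rounding
        if isinstance(value, date):
            return value.isoformat()
        raise TypeError(f"cannot serialize {type(value).__name__}")

    return json.dumps(bill, default=encode, indent=2)


# Hypothetical extracted bill.
bill = {
    "vendor": "Example Water Co.",
    "account_number": "000-EXAMPLE",
    "due_date": date(2024, 4, 15),
    "total_due": Decimal("84.50"),
    "line_items": [
        {"description": "Water usage", "amount": Decimal("60.00")},
        {"description": "Service charge", "amount": Decimal("24.50")},
    ],
}
payload = to_json_payload(bill)
```

Encoding amounts as strings is one common way to keep currency values exact across systems; the receiving side would parse them back into a decimal type.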
Result: 99%+ Data extraction accuracy