Docsumo is your go-to solution for flexibly capturing data from unstructured documents
“Docsumo does a very good job when it comes to our specific use-case. Debt settlement letters vary a lot from each other, but Docsumo manages to capture data accurately almost every single time at the processing speed which is unprecedented. We’re witnessing a straight through processing rate of over 95% with Docsumo.”
About the customer
The case study: In a nutshell
Process large amounts of debt settlement letters
- National Debt Relief (NDR) needed to process 350K debt settlement letters received from creditors annually.
- The team of 50 agents was stretched thin trying to manually reconcile each letter with the negotiated deal.
Extract accurate data from letters
- NDR had to scan and extract accurate data from unstructured settlement letters and feed it into Salesforce.
- Data included names, account numbers, settlement amounts, payment terms, etc.
Letters had varying structures and mostly had running text
- Not only did the structures vary for different debt collectors, but the payment schedule was often written as running text.
- Some letters presented the payment schedule in tabular form instead.
No in-built validation procedures
- The manual extraction lacked a logical validation of debt amounts or instalments.
The Docsumo Solution
Ingesting debt settlement letters
- An API-based direct integration seamlessly ingested debt settlement letters into Docsumo.
Pre-processing and getting ready for data extraction
- Inbuilt document pre-processors identified the file format of each letter (JPG, PDF, PNG, etc.) and queued it for data extraction.
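As a rough illustration of the format-identification step, the sketch below inspects the leading "magic" bytes of an uploaded file. This is not Docsumo's actual pre-processor; the function name `detect_format` and the signature table are assumptions made for the example, covering only the three formats named above.

```python
from typing import Optional

# Well-known leading byte signatures for the formats named above.
# Hypothetical table for illustration; a real pre-processor would cover more formats.
MAGIC_BYTES = {
    b"%PDF-": "PDF",
    b"\xff\xd8\xff": "JPG",
    b"\x89PNG\r\n\x1a\n": "PNG",
}


def detect_format(data: bytes) -> Optional[str]:
    """Return the format name if the file starts with a known signature, else None."""
    for signature, name in MAGIC_BYTES.items():
        if data.startswith(signature):
            return name
    return None
```

Checking the file content rather than the file extension means a mislabeled upload can still be routed to the right extraction queue.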
Data extraction from unstructured text
- Docsumo's OCR module used the vectorized position reference in a letter to extract data.
- The OCR not only parsed letters with varying fonts, layouts, image quality, and resolution, but also extracted data from tables with 95%+ accuracy.
Intelligent categorization of key value pairs
- Our proprietary NLP-based classification framework started rapidly learning from the debt settlement letter templates. It was trained to categorize key value pairs and line items.
- Another algorithm started making intelligent predictions to identify the data within a letter.
Rule-based data validation
- Once the data was extracted, a rule-based validation engine applied contextual data validation and correction algorithms.
- For example, the validation ensured 12 instalments of $50 each amounted to $600 within the letter.
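The instalment check in the example above can be sketched in a few lines. This is a minimal illustration, not Docsumo's validation engine; the function name `instalments_match_total` and its parameters are assumptions introduced here.

```python
from decimal import Decimal


def instalments_match_total(count: int, instalment: Decimal, total: Decimal) -> bool:
    """Check that `count` equal instalments add up exactly to the stated total."""
    # Decimal avoids the rounding surprises of binary floats with currency amounts.
    return count * instalment == total


# The example from the text: 12 instalments of $50 against a $600 settlement.
is_consistent = instalments_match_total(12, Decimal("50"), Decimal("600"))
```

A letter that fails a check like this would be a natural candidate for correction or manual review rather than straight-through processing.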
Integration with Salesforce
- The extracted data was exported as JSON and integrated into NDR's Salesforce instance via APIs and an iframe.
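To give a sense of what such a JSON record could look like, here is a small sketch that serializes the kinds of fields mentioned earlier (name, account number, settlement amount, payment terms). The field names, the `to_payload` helper, and the sample values are all hypothetical; the actual Docsumo output schema and the Salesforce mapping are not described in this case study.

```python
import json


def to_payload(name: str, account_number: str,
               settlement_amount: str, payment_terms: str) -> str:
    """Serialize one extracted letter as a JSON string (hypothetical schema)."""
    record = {
        "name": name,
        "account_number": account_number,
        "settlement_amount": settlement_amount,  # kept as a string to preserve exact cents
        "payment_terms": payment_terms,
    }
    return json.dumps(record, indent=2)


# Made-up sample values, for illustration only.
payload = to_payload("Jane Doe", "ACC-0001", "600.00", "12 instalments of 50.00")
```

A downstream integration would parse this string and map each key to the corresponding field in the CRM.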
Result: 99%+ data extraction accuracy