Using Docsumo turned out to be a real game changer for us.
“Bringing down the invoice processing time from a few hours to less than 5 minutes with 99% accuracy has been a real game changer for us. With Docsumo’s help, we have been able to automate invoice processing, resulting in lower turnaround time and a better customer experience.”
About the customer
The case study: In a nutshell
Process unstructured invoices
- Valtatech collects data from varying invoices received from 60+ enterprise customers.
Identify & classify invoices
- Valtatech needs to classify and categorize data from different types of invoices.
- Data to extract includes transaction details and key-value pairs consisting of company details.
Capture data from invoices with 60+ layouts from 60+ enterprises
- Not only did the structure vary across invoices, but the position of the data to capture varied as well.
- Some of them have nested tables as part of the transaction details.
Categorize & derive attributes from extracted data
- The manual extraction lacked a logical validation of payment and transaction details.
The Docsumo Solution
Ingesting invoices to the API
- API-based direct integration that seamlessly ingests invoices onto Docsumo.
Pre-processing and getting ready for data extraction
- Inbuilt document pre-processors identified the file formats (JPG, PDF, PNG, etc.) and queued the documents for data extraction.
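To make the pre-processing step concrete, here is a minimal Python sketch of format detection followed by queueing. It is not Docsumo's implementation: it assumes formats are recognized from the leading "magic" bytes of each file, and the names `detect_format` and `enqueue_for_extraction` are hypothetical.

```python
from collections import deque

# Leading "magic" bytes for the three formats named above.
SIGNATURES = {
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPG",
}


def detect_format(data: bytes):
    """Return the format name matching the file's leading bytes, or None."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def enqueue_for_extraction(files, queue):
    """Queue recognized files; return the names of files that were rejected.

    `files` is an iterable of (filename, raw_bytes) pairs (hypothetical shape).
    """
    rejected = []
    for filename, data in files:
        fmt = detect_format(data)
        if fmt is None:
            rejected.append(filename)
        else:
            queue.append((filename, fmt))
    return rejected
```

Checking the bytes rather than the file extension means a mislabeled upload is still routed correctly, and anything unrecognized can be rejected before it reaches extraction.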
Data extraction from unstructured text
- Docsumo's OCR module used the vectorized position reference in an invoice to extract data.
- The OCR not only parsed invoices with varying fonts, layouts, image quality, and resolution; it also extracted data from tables with 95%+ accuracy.
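As a rough illustration of position-based extraction, the Python sketch below looks up the value printed to the right of a label using word coordinates. It is only an assumption about how positions can be used, not a description of Docsumo's OCR module; the `Token` type and `value_right_of` function are hypothetical, and the coordinates are assumed to come from any OCR engine that reports word positions.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    x: float  # left edge of the word on the page
    y: float  # vertical centre of the word on the page


def value_right_of(tokens, label, line_tolerance=5.0):
    """Return the text of the nearest token to the right of `label` on the same line."""
    anchor = next((t for t in tokens if t.text == label), None)
    if anchor is None:
        return None
    # Keep only tokens on roughly the same line and strictly to the right.
    same_line = [
        t
        for t in tokens
        if t is not anchor
        and abs(t.y - anchor.y) <= line_tolerance
        and t.x > anchor.x
    ]
    if not same_line:
        return None
    return min(same_line, key=lambda t: t.x - anchor.x).text
```

Because the lookup depends on where a label sits rather than on a fixed template, the same rule can work across layouts in which the label appears in different places on the page.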
Intelligent categorization of key-value pairs
- Our proprietary NLP-based classification framework started rapidly learning from all the documents. It was trained to categorize key-value pairs and line items.
- Another algorithm started making intelligent predictions to identify the data within an invoice.
Rule-based data validation
- Once the data was extracted, a rule-based validation engine applied contextual data validation and correction algorithms.
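A rule-based check of this kind might look like the following Python sketch. The two rules (each line amount equals quantity times unit price, and line items plus tax equal the total) and the field names are assumptions chosen for illustration, not Docsumo's actual rule set, and the sketch covers validation only, not correction.

```python
from decimal import Decimal


def validate_invoice(invoice):
    """Return a list of rule violations; the list is empty when every rule passes."""
    errors = []
    line_sum = Decimal("0")
    for i, item in enumerate(invoice["line_items"]):
        # Rule 1: each line amount must equal quantity x unit price.
        expected = Decimal(item["quantity"]) * Decimal(item["unit_price"])
        amount = Decimal(item["amount"])
        if expected != amount:
            errors.append(
                f"line {i}: quantity x unit_price is {expected}, amount is {amount}"
            )
        line_sum += amount
    # Rule 2: line items plus tax must equal the invoice total.
    if line_sum + Decimal(invoice["tax"]) != Decimal(invoice["total"]):
        errors.append("line items plus tax do not equal total")
    return errors
```

Amounts are parsed as `Decimal` from strings so that currency arithmetic is exact, which avoids false mismatches caused by binary floating point.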
Integration with downstream software
- The data was extracted in JSON format that was easily integrated into downstream bill payment software via APIs and an iframe.
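For a sense of what a JSON hand-off can look like, the Python sketch below serializes an extracted invoice and reads it back the way a downstream payment system might. The field names and values are invented for illustration and do not reflect Docsumo's actual output schema.

```python
import json

# Hypothetical extraction result: company details as key-value pairs,
# transaction details as line items. All values are made up.
extracted = {
    "vendor": {"name": "Example Pty Ltd", "tax_id": "00 000 000 000"},
    "invoice_number": "INV-0001",
    "line_items": [
        {"description": "Widget", "quantity": 2, "amount": "20.00"},
    ],
    "total": "20.00",
}

# What would be sent to the downstream system.
payload = json.dumps(extracted, indent=2)


def total_due(payload_text):
    """Parse the payload and return (invoice number, total) for payment."""
    record = json.loads(payload_text)
    return record["invoice_number"], record["total"]
```

A plain JSON structure like this can be consumed by most bill payment software without a custom parser, which is what makes the integration step straightforward.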
Result: 99%+ Data extraction accuracy