With Docsumo, we are now able to save more than 200 hours per month.
“As a white label ATM provider, we were completely overburdened with monthly reconciliation from bank statements sent by our ATM operators. Manually processing them was just not cutting it for us with our growing volume. With Docsumo, we are able to process bank statements in less than 30 mins with an accuracy rate over 99%. Docsumo’s UI makes it so smooth to review and validate extracted line-items, review the categorization & then reconcile the data with payouts. It has automated the whole process for us and we are really happy with their product & service.”
About the customer
The case study: In a nutshell
Process unstructured bank statements
- Hitachi collects 3,000+ digital and scanned bank statement copies per month to reconcile payouts to ATM operators with withdrawals from the ATM.
Identify & classify bank statements
- Whether a bank statement is clean or contaminated is determined by the transactions recorded in it, so each transaction needs to be categorized before the statement can be classified.
- Data to extract includes transaction details and category.
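As an illustration of how per-transaction categories could roll up into a statement-level label, here is a minimal sketch. The category names, the keyword rules, and the rule that any transaction outside the allowed categories makes a statement contaminated are all assumptions made for this example; the case study does not spell out the actual criteria.

```python
from dataclasses import dataclass

# Hypothetical keyword rules; the real categories are not given in the case study.
CATEGORY_KEYWORDS = {
    "atm_settlement": ("ATM", "CASH WDL"),
    "bank_charge": ("CHARGE", "FEE"),
}
# Assumption: a statement is clean only if it contains nothing but these categories.
ALLOWED_CATEGORIES = {"atm_settlement", "bank_charge"}


@dataclass
class Transaction:
    description: str
    amount: float


def categorize(txn: Transaction) -> str:
    """Return the first category whose keyword appears in the description."""
    text = txn.description.upper()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "other"


def classify_statement(transactions: list[Transaction]) -> str:
    """Label a statement 'clean' only if every transaction is in an allowed category."""
    categories = {categorize(t) for t in transactions}
    return "clean" if categories <= ALLOWED_CATEGORIES else "contaminated"
```

In this sketch a single uncategorized transaction is enough to mark the whole statement as contaminated, which is why every line item has to be categorized first.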
Capture data from bank statements with 50+ layouts
- Not only did the structure vary across bank statements, but the position of the data to capture also varied from document to document.
- Some of them were in tabular formats.
Categorize & derive attributes from extracted data
- The manual extraction lacked a logical validation of payment and transaction details.
The Docsumo Solution
Ingesting bank statements
- An API-based direct integration seamlessly ingested bank statements into Docsumo.
Pre-processing and getting ready for data extraction
- Inbuilt document pre-processors identified the file formats (JPG, PDF, PNG, etc.) and queued the documents for data extraction.
Data extraction from unstructured text
- Docsumo's OCR module used the vectorized position reference in a bank statement to extract data.
- The OCR not only parsed bank statements with varying fonts, layouts, image quality, and resolution; it also extracted data from tables with 95%+ accuracy.
Intelligent categorization of key value pairs
- Our proprietary NLP-based classification framework started rapidly learning from all the documents. It was trained to categorize key value pairs and line items.
- Another algorithm started making intelligent predictions to identify the data within a bank statement.
Rule-based data validation
- Once the data was extracted, a rule-based validation engine applied contextual data validation and correction algorithms.
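To make the idea of rule-based validation concrete, here is a minimal sketch of one plausible rule: each row's stated balance should equal the previous balance plus credits minus debits. The case study does not list the actual rules Docsumo applies, so this rule, the function name, and the `debit`/`credit`/`balance` field names are assumptions for illustration only.

```python
from decimal import Decimal


def find_balance_mismatches(opening_balance: Decimal, rows: list[dict]) -> list[int]:
    """Return indexes of rows whose stated balance disagrees with the running total.

    Each row is a dict with hypothetical keys 'debit', 'credit', and 'balance',
    all holding Decimal values.
    """
    mismatches = []
    running = opening_balance
    for index, row in enumerate(rows):
        running += row["credit"] - row["debit"]
        if running != row["balance"]:
            mismatches.append(index)
            # Resynchronize on the stated balance so one bad row
            # does not flag every later row as well.
            running = row["balance"]
    return mismatches
```

Rows flagged this way would be natural candidates for manual review, since a mismatch usually points to a misread amount or a skipped line item.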
Integration with downstream software
- The data was extracted in a JSON format that was easily integrated into the client's database via APIs and an iframe.
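A downstream system might consume such JSON output roughly as follows. This is a sketch under stated assumptions: the payload shape, the `line_items`, `category`, and `amount` field names, and the sample values are all hypothetical, since the real response schema is not shown in the case study.

```python
import json
from decimal import Decimal

# Hypothetical payload; the real extraction schema is not shown in the case study.
SAMPLE_PAYLOAD = """
{
  "statement_id": "stmt-001",
  "line_items": [
    {"description": "ATM WDL", "category": "withdrawal", "amount": "2000.00"},
    {"description": "ATM WDL", "category": "withdrawal", "amount": "1500.00"},
    {"description": "SERVICE FEE", "category": "bank_charge", "amount": "25.00"}
  ]
}
"""


def total_by_category(payload: str) -> dict[str, Decimal]:
    """Sum line-item amounts per category from an extraction payload."""
    data = json.loads(payload)
    totals: dict[str, Decimal] = {}
    for item in data["line_items"]:
        category = item["category"]
        # Decimal avoids floating-point rounding when adding currency amounts.
        totals[category] = totals.get(category, Decimal("0")) + Decimal(item["amount"])
    return totals
```

Per-category totals like these could then be compared against the payouts recorded for each ATM operator during reconciliation.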
Result: 99%+ Data extraction accuracy