How to improve data quality and work independently with new tools

The Customer

The Customer is an innovative bank that provides high-tech financial services to private clients and businesses.

Project implementation period: 18 months.

The bank has several dozen information systems and a corporate data warehouse. Over time, the Customer became concerned about the quality of the data contained in the repository.

In addition, only a limited number of users had access to the existing repository and the primary data sources. As a rule, these were technical specialists of the service organization (the contractor). Business users had to go through intermediaries to obtain any given data set, and the large number of intermediate steps added labor costs.

Problems faced by the Customer:

  • Business users couldn’t work directly with the required data set.
  • The quality of the data in the existing repository didn’t meet the requirements of business users.
  • Specialists didn’t know where certain information was located, and there was no documentation on how to work with the information systems.
  • A third party had to be involved to work with the data.
  • There was a limited number of reports and difficulty in creating new reports.

The Customer set a number of strategic goals:

  • Ensure the collection of data from various sources and provide business users with access to a single data warehouse.
  • Measure the existing level of data quality.
  • Determine strategies and methods for improving the level of data quality.
  • Build a knowledge base about stored data and repository objects.
  • Provide a set of tools and technologies for creating BI reporting, as well as developing new analytical reports for several departments of the bank.

The tasks:

  • Prepare a Data Lake and fill it with business data from as many primary systems as possible.
  • Create a data and information system that allows the user to determine where the necessary business data is located.
  • Create a tool to measure and manage data quality.
  • Create a "Golden Record" of the client.
  • Install and configure the necessary tool for BI reporting and start creating a database of reports.
  • Train users on how to use the new tools.

The solution

The implementation consisted of three projects.

In the first project, Invento Labs built a corporate data warehouse on the Greenplum MPP database and set up automatic collection of business-relevant data from the primary sources.

Our specialists then created a data and information system: a knowledge base that describes what the data means and lets users determine where it is located.
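
As a rough illustration of the idea, such a knowledge base can be thought of as a searchable mapping from business terms to physical locations in the warehouse. The sketch below is not the system Invento Labs built; the `CatalogEntry` type, the `find_locations` helper, and the table names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    term: str         # business name of the data element
    description: str  # what the data means
    location: str     # hypothetical "schema.table.column" path


# Hypothetical entries; real names would come from the bank's warehouse.
CATALOG = [
    CatalogEntry("customer birth date",
                 "Date of birth of a private client",
                 "dwh.customer.birth_date"),
    CatalogEntry("account balance",
                 "End-of-day balance of an account",
                 "dwh.account_daily.balance"),
]


def find_locations(query, catalog=CATALOG):
    """Return locations whose term or description contains the query."""
    q = query.lower()
    return [
        entry.location
        for entry in catalog
        if q in entry.term.lower() or q in entry.description.lower()
    ]


balance_locations = find_locations("balance")
```

A business user searching for "balance" would get back the location of the matching column without asking a technical specialist.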

As part of the second project, Invento Labs built an MDM (master data management) system, a toolkit for managing and monitoring data quality. Its task was to resolve missing information, duplicates, and erroneous data. The output of this system was the client’s “Golden Record”.

A “Golden Record” is the most reliable, consistent, and complete view of a company's data object (customer, product, counterparty, etc.). For a client, it contains all the attributes needed to describe that client, and employees can access it whenever they need the relevant information.
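
To make the idea concrete, here is a minimal sketch of how several source records for one client could be merged into a single record. It assumes a simple survivorship rule (the newest non-empty value wins for each attribute), which is only one of many possible rules and is not necessarily the one used in this project. The `SourceRecord` type, the source names, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SourceRecord:
    source: str       # hypothetical name of the primary system
    updated: date     # when the record was last changed
    attributes: dict  # attribute name -> value (None or "" if missing)


def build_golden_record(records):
    """Merge one client's records: newest non-empty value wins."""
    golden = {}
    # Process oldest first so newer non-empty values overwrite older ones.
    for record in sorted(records, key=lambda r: r.updated):
        for name, value in record.attributes.items():
            if value not in (None, ""):
                golden[name] = value
    return golden


records = [
    SourceRecord("crm", date(2020, 1, 10),
                 {"name": "Ann Lee", "phone": "555-0100", "email": None}),
    SourceRecord("core_banking", date(2021, 6, 1),
                 {"name": "Ann Lee", "phone": "", "email": "ann@example.com"}),
]
golden = build_golden_record(records)
```

Here the phone number survives from the older record because the newer one left it empty, while the email comes from the newer record. A production MDM system would also have to match records that belong to the same client, which this sketch takes as given.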


Measuring data quality in the primary systems lets specialists identify problem areas in the data sources and eliminate them. To monitor data quality, Invento Labs specialists defined and documented a methodology for calculating a system of indicators, which was then implemented in code and calculated daily.
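
The indicators themselves are not listed in this case study. As a hedged illustration, the sketch below computes three common data quality measures that correspond to the problems named above: completeness (missing information), duplicate rate (duplicates), and validity (erroneous data). The function names, field names, and the 10-digit format rule are assumptions made for the example.

```python
import re


def completeness(rows, field):
    """Share of rows in which the field is filled in."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) not in (None, ""))
    return filled / len(rows)


def duplicate_rate(rows, key):
    """Share of rows whose key value repeats an earlier row."""
    if not rows:
        return 0.0
    seen = set()
    duplicates = 0
    for row in rows:
        value = row.get(key)
        if value in seen:
            duplicates += 1
        else:
            seen.add(value)
    return duplicates / len(rows)


def validity(rows, field, pattern):
    """Share of filled-in values that fully match the regex pattern."""
    values = [row[field] for row in rows if row.get(field) not in (None, "")]
    if not values:
        return 0.0
    valid = sum(1 for value in values if re.fullmatch(pattern, value))
    return valid / len(values)


# Hypothetical sample rows; the 10-digit tax_id rule is an assumption.
rows = [
    {"tax_id": "1234567890", "email": "a@example.com"},
    {"tax_id": "1234567890", "email": ""},
    {"tax_id": "12AB", "email": "b@example.com"},
    {"tax_id": "9876543210", "email": None},
]
indicators = {
    "email_completeness": completeness(rows, "email"),
    "tax_id_duplicate_rate": duplicate_rate(rows, "tax_id"),
    "tax_id_validity": validity(rows, "tax_id", r"\d{10}"),
}
```

Running a calculation like this on a daily schedule and storing the results would give a time series per indicator, which makes it possible to see whether quality in a given source is improving or degrading.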

The third project covered implementing the Tableau BI analytics tool and building analytical reports.

The BI system presented business-relevant information as interactive reports, which let not only analysts but also managers at various levels make management decisions in real time.

The project used 16 data sources from the corporate warehouse. The planned data volume at the start of the project was more than 50 TB.

The result

Benefits received by the Customer:

  • Creation of a Data Lake, consisting of primary data.
  • Access to a single data warehouse for the bank’s business users.
  • Formation of the client’s "Golden Record", consisting of more than 250 attributes.
  • Ability to self-measure and manage data quality.
  • Creation of a knowledge base of stored data and storage objects.
  • Support for management decisions through BI reporting, plus the ability for bank employees to generate analytical reports independently.