
Key process steps in data management

A couple of weeks ago I wrote about data management and exchange in the investment management industry (see previous article here). I argued that it is necessary to bridge the gap between manual, hard-to-automate data management processes on the one hand, and formal data management or systems of record on the other. Standards and enterprise data management systems are not helping solve these real-world data management problems.

To bridge that gap we need to look at the nature of these processes and what common issues need to be solved.

Before going into common issues and solutions, I want to set the stage by describing the generic view that we, as a company, have on data management and data exchange processes. I will discuss the key underlying business requirements and challenges of each of the following steps:

[Figure: Key steps in data management]

Collection

Gathering data from different internal or external suppliers (e.g. back office teams, external asset managers, actuaries, accountants, service providers, etc.) involves receiving the actual data sets, tracking their delivery, sending reminders and escalating if data does not arrive on time. On receipt, data must be stored in a central location so that the team or internal systems can locate it and process it further. Delivery tracking should ideally be tied to an entity within the business context (e.g. external manager, fund, client, portfolio, data vendor) so that reporting on progress is easy and accessible.

Data can be received from internal as well as external sources. In an internal scenario a team might have more influence on how data is delivered to them; when dealing with external entities, data delivery agreements can be more difficult to enforce. Finding the right level of leverage over data suppliers is a balancing act between providing the best technical solution (i.e. making delivery very easy) and taking a more contractual/transactional approach.
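To make the tracking idea concrete, here is a minimal Python sketch that keeps a schedule of expected deliveries tied to a business entity and flags the ones that need a reminder. The entity names, data set labels and dates are hypothetical; a real setup would persist this schedule and drive reminders and escalations from it.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass
class ExpectedDelivery:
    """One expected data set, tied to a business-context entity (manager, fund, vendor)."""
    entity: str                        # e.g. the external manager or fund the delivery relates to
    dataset: str                       # e.g. "monthly holdings"
    due: date                          # agreed delivery deadline
    received_on: Optional[date] = None # set when the data actually arrives

    def status(self, today: date) -> str:
        if self.received_on is not None:
            return "received"
        return "overdue" if today > self.due else "pending"

def overdue(schedule: List[ExpectedDelivery], today: date) -> List[ExpectedDelivery]:
    """Deliveries that need a reminder or an escalation."""
    return [d for d in schedule if d.status(today) == "overdue"]

# Hypothetical schedule: one delivery is late, one arrived early.
schedule = [
    ExpectedDelivery("External Manager A", "monthly holdings", date(2024, 4, 5)),
    ExpectedDelivery("Fund B", "NAV file", date(2024, 4, 3), received_on=date(2024, 4, 2)),
]
for d in overdue(schedule, today=date(2024, 4, 7)):
    print(f"Reminder: {d.entity} has not delivered '{d.dataset}' (due {d.due})")
```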

Validation

Received data should ideally pass through a series of technical and functional checks and controls to make sure that it conforms to agreed quality standards. An exception-driven process should enable specialists in the business teams to intervene in the data processing, make changes where applicable, or escalate back to the data supplier with clear and structured feedback.

The challenge is to facilitate and structure the communication around expectations and actual data quality, so that quality improves and a virtuous cycle is created. More concretely, specific feedback should enable data suppliers to improve on an ongoing basis, rather than recipients continuing to fix data issues on their behalf.
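As an illustration of what an exception-driven check could look like, the sketch below runs a couple of hypothetical rules (a simplified ISIN shape check and a numeric market value check) over supplier records and collects structured exceptions that could be routed back as feedback. The field names and rules are assumptions, not a prescribed rule set.

```python
from typing import Callable, Dict, List, Optional

# A rule returns an error message for a failing record, or None if the record passes.
Rule = Callable[[Dict], Optional[str]]

def check_isin(record: Dict) -> Optional[str]:
    # Simplified shape check: 12 characters starting with a two-letter country code.
    isin = record.get("isin", "")
    ok = len(isin) == 12 and isin[:2].isalpha()
    return None if ok else f"invalid ISIN '{isin}'"

def check_market_value(record: Dict) -> Optional[str]:
    try:
        float(record["market_value"])
        return None
    except (KeyError, ValueError):
        return "market_value missing or not numeric"

def validate(records: List[Dict], rules: List[Rule]) -> List[Dict]:
    """Run every rule over every record; collect exceptions as structured supplier feedback."""
    exceptions = []
    for row, record in enumerate(records):
        for rule in rules:
            message = rule(record)
            if message:
                exceptions.append({"row": row, "issue": message, "record": record})
    return exceptions

feedback = validate(
    [{"isin": "US0378331005", "market_value": "1000.50"},
     {"isin": "XYZ", "market_value": "n/a"}],
    rules=[check_isin, check_market_value],
)
for item in feedback:
    print(f"row {item['row']}: {item['issue']}")
```

The point is not the individual rules but the structured output: each exception identifies the record and the issue, which is exactly what a supplier needs in order to fix the problem at the source.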

Preparation

Teams request data for one or more specific purposes. This step is about shaping data so that it is fit for further consumption by other processes, applications and/or systems. For example, a spreadsheet tool requires input from various other departments, a data warehouse relies on manual inputs for a number of its tables, or an accounting system requires weekly or monthly valuations to be processed.

Preparing data is not simply a matter of renaming columns, moving them into the right order or restructuring data (e.g. denormalising, disaggregating, etc.). It is also about mapping data to internal structures and conventions: converting data formats into something applicable (e.g. dates, numbers), cross-referencing internal and external identifiers (e.g. ISIN, LEI), translating classification/rating schemes (e.g. GICS, Moody's, S&P), enrichment, and so on. This requires a lot of reference data and translation tables, plus an efficient way to maintain all of it.
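A minimal sketch of this kind of mapping, assuming small hypothetical translation tables for identifiers and rating schemes and a supplier date format of DD/MM/YYYY; in practice these tables are maintained reference data sets rather than hard-coded dictionaries.

```python
from datetime import datetime

# Hypothetical translation tables; in reality these are maintained reference data sets.
ISIN_TO_INTERNAL_ID = {"US0378331005": "SEC-000123"}
VENDOR_RATING_TO_INTERNAL = {"Aaa": "AAA", "Aa1": "AA+", "Baa2": "BBB"}

def prepare(record: dict) -> dict:
    """Reshape one supplier record into the internal target layout."""
    return {
        # Cross-reference the external identifier to the internal one.
        "security_id": ISIN_TO_INTERNAL_ID.get(record["isin"], "UNMAPPED"),
        # Translate the vendor rating scheme into the internal convention.
        "rating": VENDOR_RATING_TO_INTERNAL.get(record["rating"], "UNMAPPED"),
        # Supplier delivers dates as DD/MM/YYYY; internal systems expect ISO 8601.
        "valuation_date": datetime.strptime(record["date"], "%d/%m/%Y").date().isoformat(),
        # Strip thousands separators and coerce to a number.
        "market_value": round(float(record["mv"].replace(",", "")), 2),
    }

print(prepare({"isin": "US0378331005", "rating": "Aaa",
               "date": "31/03/2023", "mv": "1,000.50"}))
```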

Delivery

Data delivery is the final piece in any process: validated and prepared data is moved on to the next step (e.g. processing by a different team, application, etc.). Here teams want to track delivery and confirm that data was successfully consumed by the intended recipient. Delivery can happen in different shapes or forms: certain recipients might want to receive an email with an attachment, others have structured systems in place that allow delivery onto a SharePoint site or through SFTP, web services or APIs. The trick here is to minimize friction for the data recipients and set them up for an efficient experience.
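As a sketch of the API flavour of delivery, assuming the recipient exposes a hypothetical HTTPS upload endpoint with bearer-token authentication (the endpoint, token and field names are illustrative), the key point is the delivery receipt that makes consumption trackable.

```python
from datetime import datetime, timezone
from pathlib import Path
import requests  # assumed available; any HTTP client would do

def deliver_via_api(file_path: Path, endpoint: str, token: str) -> dict:
    """Upload a prepared file to a recipient's API and record a delivery receipt."""
    with file_path.open("rb") as handle:
        response = requests.post(
            endpoint,
            headers={"Authorization": f"Bearer {token}"},
            files={"file": (file_path.name, handle)},
            timeout=30,
        )
    response.raise_for_status()  # surface failed deliveries immediately
    # The receipt is what lets the team track that the data was actually received.
    return {
        "file": file_path.name,
        "recipient": endpoint,
        "delivered_at": datetime.now(timezone.utc).isoformat(),
        "status_code": response.status_code,
    }

# Hypothetical usage:
# receipt = deliver_via_api(Path("valuations_2024-03.csv"),
#                           "https://recipient.example.com/upload", token="...")
```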

As data can be delivered to both internal and external stakeholders, security is a big consideration when setting up these workflows. Striking a balance between usability and a secure delivery method is an interesting challenge for many organisations.

Next up, practical solutions

Again, when dealing with highly structured, organised and predictable data flows, the issues above might prove less urgent and traditional enterprise tools (EDM, ETL, etc.) could fit the bill. However, this approach very quickly falls down if data frequently changes, data arrives at unpredictable intervals, progress needs to be tracked, visibility/audit of the process is necessary, or structured communication/feedback on data is required. These problems are compounded by the typical tools used for this today: client portals, internal SharePoint sites, Excel and email. These tools don't integrate to provide a coherent, user-friendly view of the process described above.

It is clear that each step in the process has different challenges and that point solutions have been created as coping mechanisms. Stay tuned: in the weeks to come I will discuss each step further and present practical, better solutions.

Mesoica’s data quality platform is designed to meet the evolving needs of today's organizations. By using our platform, you can continuously monitor data, identify trends, flag regressions, and foster communication and collaboration around data. Our platform is built to scale with your organization's growing data quality maturity needs and provide peace of mind. Start your journey towards becoming a truly data-driven organization today. Visit our website or contact us to learn more about how Mesoica can empower your organization to anticipate, prevent, and continuously improve data quality.