The Art and Science of Data Curation (By Mark Evans, Confluence)
There’s an interesting shift happening in the asset management industry. Five years ago, managers were content to have a cluster of vendors that could each produce some data on how the manager ran its business. Each of these vendors held one chapter of the manager’s complete book of knowledge. If the firm needed to find a piece of information, it knew which chapter to search. That style of data management is quickly disappearing.
Most managers I speak with these days believe that it’s simply no longer acceptable to have standalone islands of data that drive analytical decisions or regulatory and investor reporting. Today’s asset manager has to compile and disseminate data far too quickly to allow it to live in silos. And it is this need for a more fluid data management model that is sparking a revolution in the asset management industry.
Managers are beginning to question how data curation, not just management, could impact their business. They have done the math and recognize that solving today’s data challenges can genuinely contribute to their competitiveness and improve their cost structures. But the truth is that most firms are still a long way from unlocking the full potential of their data. So how do they get to where they need to be? The first step is to better understand how data needs to be treated in order to be effective.
If data could speak, it would tell us that it wants to inform, to provide context and meaning. It would tell us that it doesn’t want to be moved around and re-examined every time there’s a new use case. We need to be able to tend to our data in a way that renders it applicable to problem after problem, again and again, without its having to be picked up and moved into system after system. Data can live in a state where it is relied upon and leveraged frequently. When data reaches this state, it has been transformed into information, and that transformation is arguably the most important thing managers can accomplish with their data.
Data also wants to be used—and reused. It wants to deliver the same intelligence every time it is used. It doesn’t want to be dependent on where or how it is sourced, validated and delivered. It doesn’t want to be resynthesized or reformulated every time it is queried. Whether data is living in the cloud or within a system or database, data wants to be repeatedly reliable. Its location or journey should be irrelevant.
Finally, data wants to be ubiquitous; it wants to be shared. The truth is that asset managers sit on a mountain of data that could be more powerful shared than locked away. Imagine an asset management industry where we share information among peers, enabling us to improve risk management capabilities, reduce operating costs, and ultimately deliver greater transparency and control to investors. We’ve already seen this happen with the energy industry, which has developed an opt-in data exchange where firms can share anonymized information about their business that helps to guide investing, trading and pipeline management decisions.
Imagine if we were able to do this in the asset management industry. Just think of the insights asset managers would gain by sharing information about how they operate with their peers. Those benefits would be felt by managers and investors alike. We could see the emergence of new benchmark data for optimizing regulatory reporting processes and reducing operating costs. Fund expense information could be available, up to date, and easily consumed.
Imagine a future where all data is held in one platform, available to be used and reused whenever we choose and for whatever purpose we need. The insights we could glean as an industry would be revolutionary. But unless we move beyond simply managing our data and begin to curate it, those insights won’t be as frequent or as meaningful as they could be.
Is it time to rethink how you are treating your data?
Mark Evans is Chairman, CEO and Founder of Confluence