03.15.2013
By Terry Flanagan

Big Data Gets Cut Down to Size

The explosion of Big Data has affected all industries, but capital markets have their own unique set of issues, such as the need to capture time-series data and merge it with real-time event processing systems.

“People don’t realize how big ‘Big Data’ really is,” said Simon Garland, chief strategist at Kx Systems, whose flagship product, kdb+, is used by many top tier banks for high-performance database and time series analysis. “When you’re dealing with hundreds of gigabytes a day, it might take weeks to load a large historical data set, so you need to know what you’re going to do with it before you load it in.”

Specialized hardware is required for the kinds of number crunching that go with such huge amounts of data. “You need machines with optimized CPUs and memory, such as SSD for processing data in storage,” he said.

A solid-state drive (SSD) is a storage device that uses integrated-circuit assemblies to store data persistently. SSDs have no moving mechanical components, which distinguishes them from traditional hard drives: they are less susceptible to physical shock and offer lower access times and latency.

Kdb+ detects and uses processor-specific instructions available at run time, with significant speed increases when calculations run on Intel’s Streaming SIMD Extensions (SSE) and Advanced Vector Extensions (AVX), the latter available on Intel’s Sandy Bridge generation of processors.

“The growing volumes of derivatives and trading volumes in FX and equity markets, as well as regulatory requirements, all result in institutions having to store and analyze vast quantities of data,” Garland said. “The simplified storage in kdb+ makes the design and implementation of large systems much less complex.”

Capital markets also stand to benefit from an infusion of advanced data center technology that’s now being deployed by infrastructure providers.

The Clifton, N.J., campus of data center provider Telx is being outfitted for ultra-low latency as well as airtight security to meet the needs of market participants now and in the future.

“The amount of information that’s being digitized continues to expand in the financial sector,” said Shawn Kaplan, general manager of financial services at Telx. “That’s tempered somewhat by Moore’s Law, whereby compute power continues to increase per footprint. This facility is designed to handle increases in data loads as compute needs intensify.”

Kdb+ is a column-oriented database, meaning that it stores its content by column rather than by row. This has advantages for heavy-duty number-crunching applications that involve complex queries.

Columnar databases are faster at processing time-series data than relational databases, in which data is stored in row-oriented tables that are typically related to one another across an application.

Kx offers a unified approach to real-time and historical data analysis with its high-performance kdb+, said Garland.

“While there are excellent products on the market for streaming queries, complex event processing and in-memory and historical databases, they are all different products, with different dialects of SQL,” he said. “Problems arise when one vendor upgrades, and all these things that are supposed to be talking to each other are now incompatible, so the system degrades into a maintenance nightmare.”
