03.20.2012

A Needle in a Haystack

Capital markets are looking to advanced technologies to make sense out of Big Data.

While data might be the lifeblood of organizations, too much of it can threaten their livelihood unless it’s controlled.

“Large amounts of data can be a challenge,” said Rodney Comegys, head of the Index Analysis and ETF Trading teams in Vanguard Equity Investment Group. “You have to be careful not to overwhelm the team with unneeded or unwanted information where they might miss something of importance because it is buried in a haystack of additional data.”

Capital markets firms are harnessing new technology such as in-memory and NoSQL databases to make sense of high-volume, real-time event streams and perform analytics.

“These systems leverage massively parallel processing architectures, in-memory processing and appliance technologies for predictive analytics on structured information, Hadoop appliances for unstructured information and purpose-built appliances for simulations,” said Matt Benati, vice president of marketing at Attunity, a provider of real-time data integration software.

Technologies that could have an impact on Big Data in capital markets, such as massively parallel processing databases, MapReduce frameworks such as Apache Hadoop, and NoSQL databases, are in the early stages of implementation.

MapReduce, introduced by Google in 2004, is a framework for processing huge datasets using a large number of computers.
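
As a rough illustration of the idea, the sketch below runs the three MapReduce steps (map, shuffle, reduce) in a single Python process. It is a toy under stated assumptions, not Hadoop's API: the tick records and the function names `map_phase`, `shuffle`, `reduce_phase` and `run_job` are hypothetical, and a real framework would spread each step across many machines.

```python
from collections import defaultdict

# Hypothetical tick records as (symbol, traded volume); illustrative data only.
TICKS = [("IBM", 100), ("MSFT", 250), ("IBM", 300), ("GOOG", 50), ("MSFT", 150)]


def map_phase(record):
    """Map step: emit one (key, value) pair per input record."""
    symbol, volume = record
    yield symbol, volume


def shuffle(pairs):
    """Shuffle step: group every emitted value under its key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence, in a single process."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


totals = run_job(TICKS)
print(totals)
```

Because each map call looks at one record and each reduce call looks at one key, the work can in principle be split across machines without the pieces needing to talk to each other, which is what lets the approach scale to very large data sets.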

“The capital markets industry needs to make sense out of huge amounts of data, and is starting to look at technologies which form the web,” said Emmanuel Carjat, managing director of TMX Atrium, a firm that connects trading platforms to institutions. “The industry is looking at how Google makes sense of billions of web pages.”

“These technologies are generally reserved for more complex implementations with extremely large data sets, involving petabytes of data,” said Matt Blakely, business intelligence technology practice lead at SWI, a software consulting company. “While there are certainly applications within capital markets, most firms are only just beginning to understand and utilize Big Data.”

LinkedIn uses a Hadoop cluster to power features such as “People you may know” and Twitter (which generates over one terabyte of tweets every single day) uses it for both storage and analytics, said Blakely.

“Using MapReduce, it’s possible to effectively store and analyze petabytes of data, something which is generally not possible with standard databases,” he said.

NoSQL databases are another method of handling Big Data. “NoSQL databases, as the name implies, do not use SQL as their primary query language,” said Blakely. “They are highly optimized for retrieve and append operations, but offer little functionality beyond key-value record storing.”

NoSQL is typically used when extreme performance and real-time data appending and retrieval are more important than consistent results and query flexibility, Blakely said.
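
The trade-off Blakely describes can be pictured with a minimal sketch of a key-value store that offers only append and retrieve. This is a toy in-memory class, not the interface of any real NoSQL product; the name `KeyValueStore`, its methods and the sample ticks are all hypothetical.

```python
from collections import defaultdict


class KeyValueStore:
    """Toy in-memory store that supports only append and retrieve by key."""

    def __init__(self):
        self._data = defaultdict(list)

    def append(self, key, value):
        """Append a value to the list stored under key."""
        self._data[key].append(value)

    def get(self, key):
        """Return a copy of every value stored under key (empty if none)."""
        return list(self._data.get(key, []))

    def keys(self):
        """Return every key currently held in the store."""
        return list(self._data)


store = KeyValueStore()
store.append("IBM", (1, 200.10))   # (sequence number, price), illustrative
store.append("IBM", (2, 200.15))
store.append("MSFT", (1, 32.05))

ibm_ticks = store.get("IBM")

# There is no query language: a question that cuts across keys, such as
# "which symbols have more than one tick?", means scanning every key.
busy_symbols = [key for key in store.keys() if len(store.get(key)) > 1]
```

Appending and fetching by key are cheap, single-step operations, while anything resembling an ad hoc query has to be written by hand in application code. That is the flexibility being given up in exchange for speed.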

Open-source data storage systems such as Hadoop and Cassandra are ideal for capital markets apps because they can process, store and trigger actions based on a high-volume real-time event stream, perform analytics on historical data, and update models directly into the application.

“A number of our customers are running projects to evaluate and test new tools such as Hadoop and Cassandra,” said Roji Oommen, senior director, business development for financial services at Savvis, a web hosting and service provider.

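The Cassandra data model is designed for distributed data on a very large scale. In a relational database, data is stored in tables and the tables comprising an application are typically related to each other. Cassandra, by contrast, does not support joins between tables, so data is typically denormalized and spread across the nodes of a cluster.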

Cassandra is a column-oriented database, meaning that it stores its content by column rather than by row. This has advantages for heavy-duty number crunching apps that involve complex queries.

“Columnar databases are faster for processing time-series data than relational databases,” said Oommen. “Cassandra is an open-source columnar database, and firms are testing its applicability to tick data management.”
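
To make the row-versus-column distinction concrete, the sketch below holds the same hypothetical ticks in both layouts and computes an average price from each. It illustrates the general idea of column-oriented storage only; it is not a model of how Cassandra or any other product lays out data on disk, and the field names and values are invented for the example.

```python
# Hypothetical tick data; the field names and values are illustrative only.
rows = [
    {"symbol": "IBM", "price": 200.0, "size": 100},
    {"symbol": "IBM", "price": 201.0, "size": 300},
    {"symbol": "IBM", "price": 202.0, "size": 200},
]

# Row layout: each record keeps all of its fields together.
# Column layout: each field's values are kept together in one sequence.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def average_price_rows(records):
    """Visits every record, even though only one field is needed."""
    return sum(record["price"] for record in records) / len(records)


def average_price_columns(cols):
    """Reads the single 'price' sequence and ignores the other fields."""
    prices = cols["price"]
    return sum(prices) / len(prices)


row_avg = average_price_rows(rows)
col_avg = average_price_columns(columns)
```

Both functions return the same figure, but the column version touches only the values it needs. On a long time series with many fields per tick, that difference in how much data must be read is the usual argument for columnar storage in analytics.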

Hadoop is an open-source framework that allows for distributed processing of large data sets across clusters of computers. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.

“Hadoop is a distributed computing framework developed by Yahoo,” said Oommen. “Hadoop distributes data and workload to commodity servers and can scale arbitrarily large, up to exabytes.”

SAP, the business software company that acquired Sybase in 2010, has created its own Big Data solution, called High-Performance Analytic Appliance, or HANA.

Sybase is planning to incorporate design elements of HANA into its own in-memory analytics product, Sybase RAP.

“Analytics can run 1,000 times faster in memory,” said Neil McGovern, senior director of strategy and financial services at enterprise software maker Sybase. “In-memory makes the challenge of analyzing huge amounts of data easier.”

HANA is designed to capture massive amounts of transactional data in memory, and to provide flexible views of analytic information in seconds.

SAP, which also owns Business Objects, a leading business intelligence software company, created HANA to tackle the most data-intensive applications encountered by the customers of Business Objects. “Business intelligence is a key market segment for SAP,” said McGovern. “They needed to speed up analytics, and came up with HANA.”

Like HANA, RAP employs a columnar-type database that can outperform traditional databases for complex analytics, including time series and event-stream processing.
