02.09.2012
By Terry Flanagan

Open-Source Software Addresses Big Data

Cassandra and Hadoop promise orders of magnitude gains over relational databases.

Capital markets firms are experimenting with open-source database technology capable of capturing, storing, and analyzing enormous amounts of data.

Open-source data storage systems such as Hadoop and Cassandra are ideal for capital markets apps because they can process, store and trigger actions based on a high-volume real-time event stream, perform analytics on historical data, and update models directly into the application.

“A number of our customers are running projects to evaluate and test new tools such as Hadoop and Cassandra,” Roji Oommen, senior director, business development for financial services at Savvis, told Markets Media.

The explosion of Big Data has affected all industries, but the capital markets has its own unique set of issues, such as the need to capture time-series data and merge it with real-time event processing systems.

“As electronic trading becomes pervasive, and you’re collecting full depth tick data feeds, it’s a staggering amount of data,” said Oommen. “The data management issues associated with storing and transforming information are complex.”

Cassandra is an open-source distributed database management system designed to store large amounts of data and allow very low-latency access to it.

The Cassandra data model is designed for distributed data on a very large scale.
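A rough sense of that model can be sketched in plain Python. This is an illustrative simulation, not Cassandra's actual implementation: data is spread across nodes by hashing a partition key (standing in for Cassandra's token ring), and ticks within a partition stay sorted by time. The symbol and price values are made up for the example.

```python
import bisect
from collections import defaultdict

NUM_NODES = 4  # stand-in for a small cluster

def node_for(partition_key):
    # Cassandra hashes the partition key to place data on a node;
    # a plain hash stands in for its consistent-hashing token ring.
    return hash(partition_key) % NUM_NODES

# node -> partition key -> sorted list of (timestamp, price)
cluster = defaultdict(lambda: defaultdict(list))

def insert(symbol, timestamp, price):
    partition = cluster[node_for(symbol)][symbol]
    bisect.insort(partition, (timestamp, price))  # keep ticks time-ordered

def read_range(symbol, start, end):
    # A slice query touches one node and reads a contiguous, pre-sorted span.
    partition = cluster[node_for(symbol)][symbol]
    lo = bisect.bisect_left(partition, (start,))
    hi = bisect.bisect_right(partition, (end, float("inf")))
    return partition[lo:hi]

insert("IBM", 1, 192.10)
insert("IBM", 3, 192.30)
insert("IBM", 2, 192.20)
print(read_range("IBM", 1, 2))  # ticks come back in time order
```

Because all of a symbol's ticks live in one partition in time order, a time-range query is a single contiguous read on a single node, which is the property that makes the model attractive for tick data.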

In a relational database, data is stored in tables and the tables comprising an application are typically related to each other.
Cassandra is a column-oriented database, meaning that it stores its content by column rather than by row. This has advantages for heavy-duty number-crunching applications that involve complex queries.

“Columnar databases are faster for processing time-series data than relational databases,” said Oommen. “Cassandra is an open-source columnar database, and firms are testing its applicability to tick data management.”
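The row-versus-column distinction can be shown with a small, purely illustrative sketch (not any database's real storage engine): computing a statistic over one field of many ticks. The field and symbol names are invented for the example.

```python
# A few tick records, as a row-oriented store would hold them.
ticks = [
    {"time": 1, "symbol": "IBM", "bid": 192.10, "ask": 192.12, "size": 300},
    {"time": 2, "symbol": "IBM", "bid": 192.20, "ask": 192.22, "size": 500},
    {"time": 3, "symbol": "IBM", "bid": 192.15, "ask": 192.18, "size": 200},
]

# Row-oriented: every full record is touched just to reach one field.
row_avg_bid = sum(row["bid"] for row in ticks) / len(ticks)

# Column-oriented: each field is stored contiguously, so a time-series
# aggregate scans only the one column it needs.
columns = {key: [row[key] for row in ticks] for key in ticks[0]}
col_avg_bid = sum(columns["bid"]) / len(columns["bid"])

assert row_avg_bid == col_avg_bid
print(round(col_avg_bid, 2))  # 192.15
```

The answers are identical; the difference is how much data each layout must read to produce them, which is why columnar layouts tend to win on time-series aggregates.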

Hadoop is an open-source framework that allows for distributed processing of large data sets across clusters of computers. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.

“Hadoop is a distributed computing framework developed at Yahoo,” said Oommen. “Hadoop distributes data and workload to commodity servers and can scale arbitrarily large, up to exabytes.”
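The processing pattern Hadoop distributes is map, shuffle, and reduce. The single-process sketch below shows those three phases on hypothetical tick data (summing traded volume per symbol); a real Hadoop cluster runs the same phases in parallel across many machines.

```python
from collections import defaultdict

def map_phase(records):
    # Map: emit (key, value) pairs; here, (symbol, volume) per tick.
    for symbol, volume in records:
        yield symbol, volume

def shuffle(pairs):
    # Shuffle: group values by key, as Hadoop does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values into total volume per symbol.
    return {key: sum(values) for key, values in groups.items()}

ticks = [("IBM", 300), ("MSFT", 500), ("IBM", 200)]
totals = reduce_phase(shuffle(map_phase(ticks)))
print(totals)  # {'IBM': 500, 'MSFT': 500}
```

Because each phase operates on independent keys, the work can be split across commodity servers and recombined, which is the scaling property Oommen describes.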
