07.24.2014
By Terry Flanagan

Squeezing Alpha Out of Operations

With the race to zero latency diminishing in importance, the gating factor isn’t necessarily the distance between cages in a data center, but rather the number of computing cycles spent on trade execution.

Operational latency “is almost always where the majority of the time is spent,” said Jake Loveless, CEO of Lucera Financial Infrastructures. “Most of these systems are already co-located. You’re talking about microsecond connectivity from one cage to another. Measuring how an application acts under load, that’s where the big wins are.”

Lucera, a financial IT company, has built its platform on the open-source SmartOS operating system. SmartOS’s DTrace facility is designed to detect microscopic inefficiencies in complex applications that, in the aggregate, slow down throughput and thereby negate any latency improvements a company expects to gain from networking and co-location.

“What DTrace allows you to do is dynamically instrument a system in production, so you can actually go measure all of these different metrics inside of a running application in production,” said Loveless.

According to Lucera, DTrace, short for Dynamic Tracing, enables a firm to peer into all parts of its system — kernel, device drivers, libraries, services, web servers, applications, databases. “DTrace is wonderful for peeking into an application while it is running and seeing where the time is spent and what is going on,” said Loveless. “We use it frequently ourselves to improve our systems and infrastructure.”
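To make this concrete, here is a minimal sketch of the kind of D script DTrace runs, aggregating the time an application spends inside the `write(2)` system call. It is illustrative only: the process name `myapp` is a hypothetical stand-in, and the script assumes a DTrace-enabled system (such as SmartOS or illumos) and root privileges.

```d
/* writetime.d -- distribution of time spent in write(2) by one process.
   Illustrative sketch; "myapp" is a hypothetical process name. */

syscall::write:entry
/execname == "myapp"/
{
    /* Record entry time (nanoseconds) in a thread-local variable. */
    self->ts = timestamp;
}

syscall::write:return
/self->ts/
{
    /* Aggregate elapsed time into a power-of-two histogram. */
    @latency["ns in write(2)"] = quantize(timestamp - self->ts);
    self->ts = 0;
}
```

Run with `dtrace -s writetime.d`; on exit, DTrace prints the latency histogram. Because the probes are enabled dynamically in the running kernel, the target application needs no recompilation or restart, which is what makes this kind of measurement safe to do in production.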

Often, the topic of optimization revolves around faster servers or network connections, but these only provide incremental improvements. “Companies often naively focus on making the hardware or the network faster to achieve a minimal improvement in performance,” Loveless said. “What they don’t realize or find too challenging to optimize are the middle layers where there is the most latency to be gained – the operating system and the application layer.”

Firms need to measure latency in these complex layers both during development and in production. “You need to be able to go in and measure an application in production, because that’s where performance matters most,” Loveless said.

Loveless spoke of one latency-sensitive company that had spent several million dollars on an upgraded link from New York to Chicago to shave about 150 microseconds off latency. After using DTrace to instrument the system in production, the company was able to eliminate more than 200 microseconds of latency inside the application itself, just by changing a single line of code.

Another company had a large trade-capture database, and was looking for ways to scale up performance. Using DTrace, it discovered that the system was making an excessive number of ‘writes’, Loveless said, thereby degrading performance. “It was a bug that had been sitting in that application for years that they had never seen before, because they didn’t have the ability to instrument the system,” Loveless said. “They went and made that fix, and the system was able to do 8 times more capacity using the same hardware footprint, just because they were doing things more efficiently.”
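A bug of this kind could be surfaced with a one-page D script that counts `write(2)` calls per process and reports them on a timer. This is a hedged sketch of the general approach, not the script the firm actually used:

```d
/* countwrites.d -- tally write(2) calls per process, report every 5s.
   Illustrative sketch only; run as root on a DTrace-enabled system. */

syscall::write:entry
{
    @writes[execname] = count();
}

/* Print and reset the per-process totals every five seconds. */
tick-5s
{
    printa(@writes);
    clear(@writes);
}
```

A process issuing orders of magnitude more writes than its workload justifies stands out immediately in the periodic report, and adding `ustack()` as an aggregation key would point at the exact code path responsible.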
