Sentiment Analysis: Where Next?
by Nigel Farmer, Software AG
Over the last few years, the use of news analysis algorithms has become fairly commonplace in the world of electronic trading. There are many market data vendors and news agencies that can provide information such as corporate earnings, Treasury announcements and a wide range of macro-economic indicators, in machine-readable format, at high speed.
Because these news feeds come (for the most part) from highly trusted sources and are distributed by well-established vendors, market participants around the world have been quick to incorporate the data into their automated trading platforms.
Social Media & Breaking News
Increasingly, however, firms are trying to gain a competitive edge by accessing news from less well-established sources – such as social media platforms – and performing not just news analysis, but also “sentiment analysis” to determine where a particular asset should be trading.
Sentiment analysis is an inexact science that typically involves the mining of unstructured data such as text – invariably sourced from social media or elsewhere on the Web – to determine whether sentiment on a particular topic is positive, negative or neutral.
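As a rough illustration of the positive/negative/neutral classification described above, here is a minimal lexicon-based sketch in Python. The word lists and the classify_sentiment function are hypothetical and chosen purely for illustration; real sentiment engines rely on much larger lexicons or trained models.

```python
import re

# Hypothetical, deliberately tiny word lists; real lexicons are far larger.
POSITIVE_WORDS = {"beat", "beats", "surge", "strong", "upgrade", "record", "growth"}
NEGATIVE_WORDS = {"miss", "misses", "fraud", "fired", "weak", "downgrade", "loss"}


def classify_sentiment(text: str) -> str:
    """Label text as 'positive', 'negative' or 'neutral' by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    score = positive_hits - negative_hits
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A simple count like this ignores negation, sarcasm and context, which is part of why sentiment analysis remains an inexact science.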
The first high-profile usage of social media sentiment analysis in the financial markets was the launch of Derwent Capital’s Absolute Return Fund – the so-called “Twitter Fund” – in 2010. This fund was established on the basis of an academic study that categorized tweets from Twitter into one of six mood states and correlated them with the DJIA, with an 87.6% accuracy rate in predicting up/down movements in the index. The fund didn’t last, but it was a sign of things to come.
These days, it has become common for trading firms to use social media as a source for breaking news. Because events are often picked up on sites like Twitter before they hit the more traditional media sources, this can create some fleeting trading opportunities as the sentiment of the breaking news is rapidly analysed.
In May 2015, for example, financial data firm Selerity tweeted that Twitter was going to miss its Q1 revenue estimates. The tweet was published one hour before Twitter’s earnings call, leading to an immediate drop of 5% in the company’s share price before trading was halted. When trading resumed, the shares fell 20% in minutes. As it turned out, this was not rumour or fiction: the data actually came from a reliable source, the investor relations section of Twitter’s own website. Because the tweet was accurate, those who were able to sell early gained a huge advantage over the rest of the market.
The news is not all good, however. In 2014, the G4S security firm suffered from a bogus website set up to mirror the company’s own site. A statement was sent to journalists – linking to the fake site – warning of a £400m accounting cover-up at the company and announcing that the CFO had been fired. G4S shares fell amid heavy trading, with over 500,000 trades executed in less than a minute, before recovering to their previous levels once the story was established as false.
Then in July this year, a news story appeared on a fake Bloomberg site, reporting that Twitter had received a $31 billion takeover offer. The shares jumped 8% within eight minutes before the real Bloomberg tweeted that the story was fake. Again, shares then returned to their previous levels.
It is not just bogus sites that can send out false information, however. In 2013, someone hacked into the official Associated Press Twitter account and published a tweet about two explosions in the White House, injuring President Obama. Almost immediately, the Dow Jones dropped 140 points – wiping out £130 billion in stock value – before the market recovered once the story was refuted.
All of the examples above – and there are many more – demonstrate how news stories from unreliable sources are increasingly triggering electronic trading platforms. So how can firms ensure that the news they are trading on is accurate? Not only that, but how can they make better use of sentiment analysis tools to generate more positive returns?
News and sentiment-based trading strategies can be made more intelligent by detecting the authenticity of a website, looking at historical data to gauge the validity and reliability of information, correlating against larger numbers of data sources, and then using predictive analytics to determine how much the price or volatility of a stock might be affected by news and changing sentiment.
In order to do this however, firms need to be able to stream large amounts of data from a variety of sources, analyse that data, and act on it, all in real time with negligible impact on latency.
The incoming data is likely to come in both structured and unstructured forms, and some of the data sources will be more trustworthy than others. So it is essential to aggregate and correlate the data across multiple sources and platforms in order to validate news, assess sentiment and provide additional insight.
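The aggregation and correlation step might be sketched along the following lines. This is an illustrative assumption rather than any vendor's method: the Report type, the trust weights and the aggregate function are all hypothetical, and the idea is simply that a story counts as corroborated only when at least two distinct sources carry it, with sentiment averaged by trust.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Report:
    source: str     # name of the feed or account that carried the story
    trust: float    # assumed trust weight between 0.0 and 1.0
    sentiment: int  # +1 positive, 0 neutral, -1 negative


def aggregate(reports: List[Report], min_sources: int = 2) -> Tuple[bool, float]:
    """Return (corroborated, trust-weighted sentiment) for a single story."""
    distinct_sources = {report.source for report in reports}
    corroborated = len(distinct_sources) >= min_sources
    total_trust = sum(report.trust for report in reports)
    if total_trust == 0:
        # No reports, or only zero-trust reports: treat sentiment as neutral.
        return corroborated, 0.0
    weighted = sum(report.trust * report.sentiment for report in reports) / total_trust
    return corroborated, weighted
```

Under this sketch, a story carried only by a single low-trust account is returned as uncorroborated, however strong its apparent sentiment, so a strategy could choose to hold back until other sources confirm it.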
For example, in the cases of both G4S and Bloomberg above, the fake websites had been set up less than two weeks before the stories were released, information that is readily accessible from various trusted sources (such as the site WhoIs.net). The AP story about the White House explosion was not validated by any other news or social media sources at the time, a situation that – with the right analysis tools – should have flagged a potential hack.
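A domain-age check of the kind described here could look something like the sketch below. It assumes the registration date has already been obtained from a WHOIS lookup (not shown); the function name and the 30-day threshold are hypothetical choices for illustration.

```python
from datetime import date


def is_recently_registered(registered_on: date, story_date: date, min_age_days: int = 30) -> bool:
    """Return True if the domain was registered fewer than min_age_days before the story."""
    age_days = (story_date - registered_on).days
    return age_days < min_age_days
```

For instance, with made-up dates, a site registered ten days before a story appears would be flagged, while one registered a decade earlier would not. Such a flag is only one signal and would normally be combined with other checks before a trade is blocked or allowed.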
The ability to validate incoming, streaming data against other sources is key here. It is all very well pulling in data to generate trading signals, but if the source of that data – or the history of the source – is questionable, then you could very well be acting on wrong information, as has been demonstrated time and again. And determining whether your algos could be trading on information from fake websites is actually a fairly basic thing to check.
This ability to work with streaming data and validate it against other sources, before performing complex analysis on it and triggering appropriate actions, is where tools like APAMA can really help, particularly as APAMA now has a Social Media Framework that includes adapters to a range of social media sources and a number of specific analytical capabilities around sentiment analysis.
The direction in which sentiment analysis in trading is heading is a single platform that can pull in structured and unstructured data (including market data, news, economic indicators, social media data, static data, historical data, etc.) from a variety of sources; verify, validate and score/rank that data; perform complex analysis on it; and then take action, such as firing orders into the market.