InfoSphere Streams for Real Time Analytics in Financial Services Industry
Krishna Mamidipaka
Roger Rea
Dec 23, 2015
Housekeeping
• We value your feedback - don't forget to complete your evaluation for each session you attend and hand it to the room monitors at the end of each session
• Overall Conference Evaluation will be provided at the General Session on Friday
• Visit the Expo Solutions Centre
• Please remember this is a 'non-smoking' venue!
• Please switch off your mobile phones
• Please remember to wear your badge at all times
Disclaimer
The Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Agenda
• Financial Markets Business Challenges
• Industry Technical Challenges
• InfoSphere Streams
• Trend Calculator
• Financial Toolkit
• Data Mining in Real Time
• InfoSphere Streams Directions
Firms Must Capitalize on Drivers of Change
Drivers
Markets becoming electronic
Implications
Speed as source of Alpha
Transparency is required
Volume is a barrier
Information availability
Real-time data pressures
Actions
Accelerate the end-to-end marketplace connectivity and execution
Store, retrieve and distribute comprehensive time series data in a timely manner
Increase capacity to handle current and forecasted volumes
Detailed analysis of trading process
Transaction costs pressures
Access to broader markets by accessing multiple markets
For a US equity electronic trading brokerage, 1 millisecond = $4M in annual revenue
Source: Tabb Group
We are in a technology arms race
Latency reductions with a clear business value or cost associated
Exponential increases in volumes
Real time data pressures
The Volume, Complexity & Semantic Depth of data to be analysed will increase significantly
Today (Structured data): Market Data, Risk Analytics Data, Historical Trade Data feed Analytics & Insight
Tomorrow? (Structured & Unstructured data): Market Data, Risk Analytics Data, Historical Trade Data, Video News Feeds, Corporate Press Reports, RSS Feeds, Web Pages, Weather Data, Government Statistics, Internal Message Bus, Blogs & Commentary, Real World Sensors, + Other Feeds feed Analytics & Insight
Information overload
The Transaction Life Cycle or latency loop – end to end latency is the key to success and there are no prizes for coming second
Diagram labels (transaction life cycle): Investment / trading goals; Market Data; Trading Decision (What to Buy/Sell); Execution Algorithm (VWAP, etc.); Order Routing Decision; Matching; Transaction Cost Analysis
Diagram labels (infrastructure): WAN Connectivity; Middleware; CEP Engines; OMS/EMS; Exchanges
Latency measurement is a competitive advantage to deliver Alpha
End-to-end latency knowledge and a continuous performance road map is required
Speed
Current approaches reaching limits, based on x86 and networking technologies
Yesterday: Single Core, Single Thread, 100% Serial Programming (CPU, RAM, DSK, I/O)
Today: Multicore (2-16), Multithread (10s), 80/20 Serial/Parallel Programming (Core and RAM pairs; DSK, I/O, NET)
Tomorrow: Manycore (32-100s), 20/80 Serial/Parallel Programming (Core and RAM pairs; DSK, I/O, NET). Threading model breaks as complexity exceeds programmer capability
The Manycore programming challenge
Programmers cannot cope with thousands of threads and complex data flows using existing programming models
Options for exposing parallelism in a programming model
Parallelism Fully Exposed
• Full exposure of machine details
• Only usable by experts
• High performance, low productivity
Partial Exposure
• Limits exposure to machine details
• Expands programmer community
• High performance, higher productivity for C/C++ class programmers
– Bounds checks, pointer checks, strong typing, etc.
Parallelism Implicit
• No exposure of machine details, e.g., Hadoop/MapReduce, IBM Streams Processing Language
• Usable by larger number of programmers
• High performance, high productivity
Time is ripe for a new era of computing
• Emerging trends create need for new languages
– Scientific programming: Fortran
– Business programming: Cobol
– Systems programming at higher level: C
– Increased productivity: C++
– Web programming: Java
• Streaming data sources and multicore architectures
– Streams Processing Language
Delivering ‘Continuous Intelligence’ with Powerful Analytics
Automated Options Market Making:
– Peak throughput of 10 million messages per second
– Mean latency under 100 microseconds across 28 dual quad core x86 blades
• Millions of events per second
• Microsecond Latency
• Traditional / Non-traditional data sources
• Real time delivery
• Powerful Analytics
IBM InfoSphere Streams v1.2
Development Environment: Eclipse IDE, StreamSight, Stream Debugger
Runtime Environment: RHEL v5.3 or v5.4, x86 multicore hardware, InfiniBand support, up to 125 servers
Toolkits & Adapters: Connectors to data sources, Operator Library, Financial Toolkit, Mining Toolkit
Front Office 3.0
Scalable stream processing
• InfoSphere Streams provides
– A programming model and IDE for defining data sources and software analytic modules called operators that are fused into process execution units (PEs)
– Infrastructure to support the composition of scalable stream processing applications from these components
– Deployment and operation of these applications across distributed x86 processing nodes, when scaled processing is required
– Stream connectivity between data sources and PEs of a stream processing application
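The operator model described above can be pictured as a small dataflow pipeline. The following is a minimal Python sketch, not Streams Processing Language: the names `source`, `filter_op`, `map_op` and `run_pipeline`, and the tuple fields `symbol`, `price` and `volume`, are hypothetical. Chaining the generators inside one process loosely mirrors the idea of fusing several operators into a single PE.

```python
from typing import Callable, Iterable, Iterator, List


def source(records: Iterable[dict]) -> Iterator[dict]:
    """Source operator: emits stream tuples (modelled here as dicts) one at a time."""
    for record in records:
        yield record


def filter_op(stream: Iterator[dict], predicate: Callable[[dict], bool]) -> Iterator[dict]:
    """Filter operator: forwards only the tuples that satisfy the predicate."""
    for item in stream:
        if predicate(item):
            yield item


def map_op(stream: Iterator[dict], fn: Callable[[dict], dict]) -> Iterator[dict]:
    """Map operator: transforms each tuple into a new tuple."""
    for item in stream:
        yield fn(item)


def run_pipeline(records: Iterable[dict]) -> List[dict]:
    """Connect source -> filter -> map; all three run in one process."""
    trades = source(records)
    large_trades = filter_op(trades, lambda t: t["volume"] >= 100)
    enriched = map_op(
        large_trades,
        lambda t: {**t, "notional": t["price"] * t["volume"]},
    )
    return list(enriched)


if __name__ == "__main__":
    sample = [
        {"symbol": "IBM", "price": 10.0, "volume": 50},
        {"symbol": "IBM", "price": 10.0, "volume": 200},
    ]
    for result in run_pipeline(sample):
        print(result)
```

In a real deployment each operator could instead run in its own process or on its own machine; this sketch only shows the single-process case.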
Trend Calculator Example
Inputs: Trend File 1 playback; Trend File 2 playback; Trend File 3 playback; Symbols to be output; Algo Parameters Per Symbol
Output: Up/down trend for Requested symbols
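The slide does not say how the trend is computed, so the following Python sketch is only one plausible reading of the diagram. It assumes the per-symbol algorithm parameter is a sliding window size and that a tick is "up" or "down" when its price is above or below the mean of the preceding window. The class `TrendCalculator` and all of its parameters are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Optional


class TrendCalculator:
    """Emits an up/down/flat trend per tick, but only for requested symbols."""

    def __init__(
        self,
        requested_symbols: Iterable[str],
        window_per_symbol: Dict[str, int],
        default_window: int = 3,
    ) -> None:
        self.requested = set(requested_symbols)
        self.window_per_symbol = dict(window_per_symbol)  # assumed algo parameter
        self.default_window = default_window
        self.prices: Dict[str, deque] = {}

    def on_tick(self, symbol: str, price: float) -> Optional[str]:
        """Return 'up', 'down' or 'flat', or None if filtered out or warming up."""
        if symbol not in self.requested:
            return None
        window = self.prices.get(symbol)
        if window is None:
            size = self.window_per_symbol.get(symbol, self.default_window)
            window = deque(maxlen=size)
            self.prices[symbol] = window
        if len(window) < window.maxlen:
            window.append(price)  # still filling the window
            return None
        mean = sum(window) / len(window)
        window.append(price)  # oldest price drops out automatically
        if price > mean:
            return "up"
        if price < mean:
            return "down"
        return "flat"


if __name__ == "__main__":
    calc = TrendCalculator({"IBM"}, {"IBM": 2})
    for px in (10.0, 11.0, 12.0, 9.0):
        print(px, calc.on_tick("IBM", px))
```

Keeping the state in a per-symbol dictionary means each symbol could later be handled by a separate operator instance without changing the logic.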
Streams offers tremendous deployment flexibility
With only a simple re-compile of the application:
All on one machine fused into one multi-threaded process
All on one machine; each operator in its own process
Each operator in its own process, each process on its own machine
Trend Calculator Example
Financial Services Toolkit
• Adapters layer used by top two layers and user-written apps
• Functions layer used by top layer and user-written apps
• Solution Frameworks are “starter” applications that target a particular use case
Speeds development of Streams financial domain applications
Adapters, Functions, Utilities
• Financial Information Exchange (FIX) Adapters
– fixInitiator Operator, fixAcceptor Operator, FixMessageToStream Operator, StreamToFixMessage Operator
• WebSphere Front Office for Financial Markets (WFO) Adapters
– WFOSource Operator, WFOSink Operator
• WebSphere MQ Low-Latency Messaging (LLM) Adapters
– MQRmmSink Operator
• Functions:
– Coefficient of Correlation
– “The Greeks” (Put/Call values, Delta, Theta, Rho, Charm, DualDelta, etc.)
• Operators:
– Wrap the QuantLib open source financial analytics package
– Provide operators to compute the theoretical value of an option:
• EuropeanOptionValue Operator: 11 different analytic pricing engines, e.g. Black Scholes, Integral, Finite Differences, Binomial, Monte Carlo, etc.
• AmericanOptionValue Operator: 11 different analytic pricing engines, e.g. Barone Adesi Whaley, Bjerksund Stensland, Additive Equiprobabilities, etc.
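As an illustration of the "Coefficient of Correlation" function listed above, here is a minimal Python sketch. It assumes the toolkit function corresponds to the standard Pearson correlation coefficient, which the slide does not confirm; the function name `correlation` is hypothetical and is not the toolkit's API.

```python
import math
from typing import Sequence


def correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long price series."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    print(correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(correlation([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

A streaming version would typically keep running sums over a window rather than recomputing from full lists; the batch form is shown here for clarity.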
Equities Trading “Starter Application”
Modular design
Components are plug-replaceable – extend these or substitute your own
Demonstrates how trading strategies may be swapped out at runtime, without stopping the rest of the application
TradingStrategy module looks for opportunities that have specific quality values and trends
OpportunityFinder module looks for opportunities and computes quality metrics
SimpleVWAPCalculator module computes a running volume-weighted average price metric
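The running volume-weighted average price mentioned for SimpleVWAPCalculator can be sketched as follows. This is an illustrative Python version under the usual VWAP definition (sum of price times volume divided by total volume); the class `RunningVWAP` is hypothetical and is not the toolkit's implementation.

```python
from typing import Optional


class RunningVWAP:
    """Maintains VWAP = sum(price * volume) / sum(volume) over all trades seen."""

    def __init__(self) -> None:
        self.price_volume_sum = 0.0
        self.volume_sum = 0.0

    def update(self, price: float, volume: float) -> Optional[float]:
        """Fold one trade into the running totals and return the new VWAP."""
        if volume < 0:
            raise ValueError("volume must be non-negative")
        self.price_volume_sum += price * volume
        self.volume_sum += volume
        return self.value

    @property
    def value(self) -> Optional[float]:
        """Current VWAP, or None until some volume has traded."""
        if self.volume_sum == 0:
            return None
        return self.price_volume_sum / self.volume_sum


if __name__ == "__main__":
    vwap = RunningVWAP()
    print(vwap.update(10.0, 100))
    print(vwap.update(20.0, 300))
```

Because only two running totals are kept, each update is constant time, which suits a per-tick streaming operator.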
Options Trading “Starter Application”
Diagram labels: DataSources (OptionsPriceFeedData, Option Price, Stock Price, Stock Information, Risk Free Rate); Data Filtering and Preparation; Pricing (Theoretical Price Computation, OptionsValue); Decision (Identification of Buying Opportunities); DataSinks
DataSources module consumes incoming data; formats and maps for later use
Pricing module computes theoretical put and call values
Decision module matches theoretical values against incoming market values to identify buying opportunities
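The Pricing and Decision steps can be illustrated with a short Python sketch. It assumes European options priced with the closed-form Black-Scholes formula (one of the engines named for the EuropeanOptionValue operator) and a simple rule that flags an opportunity when the theoretical value exceeds the market ask by more than a threshold. The function names, parameters and threshold rule are hypothetical; the slide does not specify the actual decision logic.

```python
import math
from typing import Tuple


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(
    spot: float, strike: float, rate: float, vol: float, years: float
) -> Tuple[float, float]:
    """Return (call, put) theoretical values for a non-dividend European option."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discounted_strike = strike * math.exp(-rate * years)
    call = spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    put = discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


def is_buying_opportunity(
    theoretical: float, market_ask: float, threshold: float = 0.0
) -> bool:
    """Assumed rule: buy when the option trades below its theoretical value."""
    return theoretical - market_ask > threshold


if __name__ == "__main__":
    call, put = black_scholes(spot=100.0, strike=100.0, rate=0.05, vol=0.2, years=1.0)
    print(round(call, 4), round(put, 4))
    print(is_buying_opportunity(call, market_ask=9.80, threshold=0.25))
```

In the starter application the stock price, risk-free rate and option quotes would arrive as separate streams and be joined before this calculation; here they are passed as plain arguments.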
Multinational Mutual Funds Manager and Broker
• High speed market trend calculation system that can provide instant insights into the market behavior
• Improved development time from days to hours to add new features to the trend calculation system using the Streams programming model
• Customizable to run on one server or distributed across many servers to garner more compute power
• Visualization tools for effective live trade monitoring and risk assessment
Notional Information Supply Chain for Decision-making
Transforming the Information Supply Chain to reduce the time to action!
Typical information supply chain
Diagram labels: SOURCES; DATA INTEGRATION; OPERATIONAL DATA STORES; WAREHOUSE; DATAMARTS; Reports; Ad-hoc Queries; Bus Process & Event Mgmt; Operational Reports; Dashboards; Planning; Scorecarding; Analytical Modeling & Information; Elapsed Time to Action
Time to Action
Diagram labels: SOURCES; DATA INTEGRATION; OPERATIONAL DATA STORES; WAREHOUSE; DATAMARTS; Reports; Ad-hoc Queries; Bus Process & Event Mgmt; Operational Reports; Dashboards; Planning; Scorecarding; Analytical Modeling & Information; Stream Computing: Analytical Modeling & Information
More context
Reduces Time to Action
Widens the aperture
Reduces costs
Market Surveillance & Fraud applications
Diagram labels: Market Feeds and Trade Data; Historical; Rule Parameters; Real time analysis processing (Enrichment, Existing business rules, PMML Model Scoring, Additional sophisticated analytics); Alerts; Collected results; Solution User Interface
What are key advantages of Streams?
Compiling groups of operators into single processes enables:
• Efficient use of cores
• Distributed execution
• Very fast data exchange
• Can be automatic or tuned
• Can be scaled with the push of a button
Language built for Streaming applications:
• Reusable operators
• Rapid application development
• Continuous “pipeline” processing
Extremely flexible and high performance transport:
• Very low latency
• High data rates
Easy to extend:
• Built in adaptors
• Extend with C++ and Java
• Extend running applications
Use the data that gives you a competitive advantage:
• Can handle virtually any data type
• Use data that is too expensive and time sensitive for other approaches
IBM InfoSphere Streams directions
Tools: Streams Studio enhancements, Video/audio analytics, Text/unstructured analytics, Streams Processing Language improvements, Native XML support
Runtime: High Availability, Expanded platform support, Performance improvements
Adapters: WebSphere MQ, RSS feeds, Mashup Hub, WebSphere Business Events, Oracle, SQL Server, MySQL
Diagram labels: WebSphere Business Events; Existing business information; Data in motion; InfoSphere Warehouse; IBM Mashup Hub; Cognos 8 BI; Front Office; Millions of events per second; Millisecond Latency
All statements regarding IBM's plans, directions, and intent are subject to change or withdrawal without notice. Any reliance on these statements is at the relying party's sole risk and will not create any liability or obligation for IBM.
InfoSphere Streams sessions
Time | Session | Title | Location
Thursday May 20, 10:45 AM - 11:35 AM | 3666A | InfoSphere Streams for Real Time Analytics in Financial Services Industry | Marriott Park Hotel, Room 14
Friday May 21, 09:00 AM - 09:50 AM | 3661A | InfoSphere Streams helps Stockholm build Ver 2.0 Traffic Control System | Marriott Park Hotel, Room 13
Friday May 21, 11:30 AM - 12:30 PM | 3692A | InfoSphere Streams at Marine Institute of Ireland: Deep Dive | Marriott Park Hotel, IOD Mini Theatre 3
Wednesday 10 AM - 6 PM; Thursday 10 AM - 5 PM; Friday 9 AM - 2 PM | Demo Room | InfoSphere Streams Demonstrations | Marriott Park Hotel, IOD Demo Room Station 19
Wednesday 10:30 - 11:30; Thursday 12:30 - 13:00; Thursday 16:30 - 17:00 | Mini Theater on Expo Floor | InfoSphere Streams in Telco; InfoSphere Streams Business Insight; Leverage Warehouse, SPSS with Streams | Marriott Park Hotel, InfoSphere Mini Theater Expo Floor