Big Data, Spark Streaming, Oil and Gas
Kyiv, Ukraine. 07 June 2016
Oil production sensors data monitoring
Yaroslav Nedashkovsky, System Architect
Apr 14, 2017
Shell data project
“Next year, BP will connect 650 wells to the Industrial Internet. If all goes according to plan, the companies will expand the scope to 4,000 BP subsea wells around the world”
Digital Oil Field
How could we handle such a huge data flow?
What kind of streaming technology could we use?
Stream processing systems
- Apache Spark Streaming
- Apache Storm
- Apache Samza
- Azure Stream Analytics
- Google Dataflow
- Heron
- Apache Flink
What do we need from a streaming system?
- scalable
- fault-tolerant
- low latency
- data distribution
- distributed computations
- good API
- “exactly-once” guarantees, or maybe “at most once” or “at least once” will be enough?
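The delivery-guarantee question can be made concrete with a small sketch. It is plain Python, not tied to any of the systems listed, and it assumes every sensor message carries a unique ID (the `Reading` type, its fields, and the well names are hypothetical). Under that assumption, “at least once” delivery combined with an idempotent sink gives the same totals as “exactly once” would:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Reading:
    """Hypothetical sensor message; msg_id is assumed unique per reading."""
    msg_id: str
    well_id: str
    pressure: float


@dataclass
class IdempotentSink:
    """Applies each reading at most once, keyed by msg_id."""
    seen: set = field(default_factory=set)
    totals: dict = field(default_factory=dict)

    def process(self, reading: Reading) -> bool:
        # A redelivered message is ignored, so retries do not double count.
        if reading.msg_id in self.seen:
            return False
        self.seen.add(reading.msg_id)
        self.totals[reading.well_id] = (
            self.totals.get(reading.well_id, 0.0) + reading.pressure
        )
        return True


def deliver_at_least_once(readings, sink, redeliver_ids):
    """Simulate at-least-once delivery: selected messages arrive twice."""
    applied = 0
    for reading in readings:
        attempts = 2 if reading.msg_id in redeliver_ids else 1
        for _ in range(attempts):
            if sink.process(reading):
                applied += 1
    return applied


readings = [
    Reading("m1", "well-A", 10.0),
    Reading("m2", "well-A", 12.0),
    Reading("m3", "well-B", 7.5),
]
sink = IdempotentSink()
applied = deliver_at_least_once(readings, sink, redeliver_ids={"m2"})
print(applied, sink.totals)
```

Without the `seen` check, the duplicate of `m2` would be added twice; with “at most once” delivery it could be lost instead. Which trade-off is acceptable depends on whether the monitor can tolerate a missing or a repeated reading.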
(near real time, but not real time)
- high-level API (windows, joins, etc.)
- exactly-once semantics (?!), fault tolerant, scalable
- integration with SQL, DataFrames, MLlib, GraphX
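Assuming the points above describe Spark Streaming, “near real time” follows from its micro-batch model: the stream is cut into small batches per time interval, and window operations run over the last few batches. A minimal sketch of that idea in plain Python (this is not the Spark API; the function names and sample values are made up for illustration):

```python
from collections import deque


def micro_batches(events, batch_seconds):
    """Group (timestamp, value) events into fixed-interval batches."""
    batches = {}
    for ts, value in events:
        batches.setdefault(int(ts // batch_seconds), []).append(value)
    if not batches:
        return []
    first, last = min(batches), max(batches)
    # Intervals with no events still yield an (empty) batch.
    return [batches.get(i, []) for i in range(first, last + 1)]


def windowed_mean(batches, window_batches):
    """Mean over the last `window_batches` batches, emitted once per batch."""
    window = deque(maxlen=window_batches)
    results = []
    for batch in batches:
        window.append(batch)
        values = [v for b in window for v in b]
        results.append(sum(values) / len(values) if values else None)
    return results


# (timestamp in seconds, sensor value)
events = [(0.2, 10.0), (0.7, 14.0), (1.1, 20.0), (2.5, 30.0), (2.9, 34.0)]
batches = micro_batches(events, batch_seconds=1)
means = windowed_mean(batches, window_batches=2)
print(batches)
print(means)
```

A result is only available once a batch closes, so latency is bounded below by the batch interval. That is the “near real time, but not real time” trade-off.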
How does this work?
Spark 2.0: Structured Streaming
Oil location data flow monitor
“Christmas tree”
IoT (MQTT) + Spark Streaming + Visualization
Let's look at the “monitor” implementation and see how it works
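A minimal sketch of the kind of check such a monitor might run on each batch of MQTT messages. Everything specific here is an assumption for illustration, not the implementation from the talk: the topic layout `wells/<well_id>/<sensor>`, the JSON fields `ts` and `value`, and the limits in `LIMITS`.

```python
import json

# Hypothetical operating limits per sensor type (illustrative values only).
LIMITS = {"pressure": (50.0, 300.0), "temperature": (0.0, 120.0)}


def parse_message(topic, payload):
    """Parse one message; the topic layout 'wells/<well_id>/<sensor>' is assumed."""
    _, well_id, sensor = topic.split("/")
    body = json.loads(payload)
    return {
        "well": well_id,
        "sensor": sensor,
        "ts": body["ts"],
        "value": float(body["value"]),
    }


def check(reading):
    """Return an alert string if the value is outside its limits, else None."""
    low, high = LIMITS[reading["sensor"]]
    if reading["value"] < low or reading["value"] > high:
        return f"{reading['well']}/{reading['sensor']} out of range: {reading['value']}"
    return None


def monitor(batch):
    """Process one batch of (topic, payload) pairs and collect alerts."""
    alerts = []
    for topic, payload in batch:
        alert = check(parse_message(topic, payload))
        if alert is not None:
            alerts.append(alert)
    return alerts


batch = [
    ("wells/w1/pressure", '{"ts": 1, "value": 120.5}'),
    ("wells/w1/temperature", '{"ts": 1, "value": 135.0}'),
    ("wells/w2/pressure", '{"ts": 2, "value": 42}'),
]
alerts = monitor(batch)
print(alerts)
```

In a streaming job, `monitor` would be applied to every micro-batch, and the resulting alerts pushed to the visualization layer.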
Contacts:
email: [email protected]