Event Driven Architectures with Apache Kafka on Heroku

Post on 15-Apr-2017

Category:

Software

Event Driven Architectures with Apache Kafka on Heroku

Chris Castle, Developer Advocate

Rand Fitzpatrick, Director of Product

November 3, 2016

What problems does Apache Kafka solve?

What are the core concepts of Kafka?

Why Apache Kafka on Heroku?

Forward-Looking Statements

Statement under the Private Securities Litigation Reform Act of 1995:

This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.

The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of any litigation, risks associated with completed and any possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-K for the most recent fiscal year and in our quarterly report on Form 10-Q for the most recent fiscal quarter. These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site.

Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.

What problems does Apache Kafka solve?

Event-Driven Architecture

Event-driven architecture (EDA), also known as message-driven architecture, is a software architecture pattern promoting the production, detection, consumption of, and reaction to events.

Source: Wikipedia

What Are Events?

Context

When was the event? (event time, process time)

What produced the event? (causal history, device, etc.)

Where did the event occur? (system location, geo location)

Operation

What function was applied? (create, update, delete, etc.)

What are the characteristics of the function?

State

What is the data involved in the event?

How is that data identified?

"Contextualized operation on state"

Event Examples

Product views
Completed sales
Page visits
Site logins
Shipping notifications
Inventory received
IoT sensor values
Weather data
Traffic data
Tweets
Election polling data!

Completed sale

Context
2016-11-03T15:13:27Z
Retail www site
referrer Google search

Operation
Inventory item purchased

State
Amazon Echo, Black
$179.99
ID B00X4WHP5E
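The "contextualized operation on state" structure of the completed-sale example can be sketched as a small record type. This is only an illustration: the class and field names (Context, Operation, State, event_time, source, referrer, item_id, price_usd) are hypothetical and do not come from the slides or from any Kafka API; only the values are taken from the example above.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class Context:
    event_time: datetime   # when the event happened
    source: str            # what produced the event
    referrer: str          # where the event came from


@dataclass(frozen=True)
class Operation:
    name: str              # the function that was applied


@dataclass(frozen=True)
class State:
    item_id: str           # how the data is identified
    description: str
    price_usd: str         # kept as a string to avoid float rounding


@dataclass(frozen=True)
class Event:
    context: Context
    operation: Operation
    state: State

    def to_json(self) -> str:
        # Datetimes are not JSON-serializable by default, so emit ISO 8601.
        return json.dumps(asdict(self), default=lambda o: o.isoformat(), sort_keys=True)


completed_sale = Event(
    context=Context(
        event_time=datetime(2016, 11, 3, 15, 13, 27, tzinfo=timezone.utc),
        source="Retail www site",
        referrer="Google search",
    ),
    operation=Operation(name="Inventory item purchased"),
    state=State(item_id="B00X4WHP5E", description="Amazon Echo, Black", price_usd="179.99"),
)

print(completed_sale.to_json())
```

Freezing the dataclasses mirrors the idea that an event, once recorded, is a fact that does not change.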

Why Should I Care?

Inbound Streams

Scaling too slowly leads to dropped data
Overprovisioning leads to inefficient systems
Backpressure and other coordination is hard!

Data Pipelines

Dataflow between processing stages requires coordination
Parallel pipelines with the same data can be nontrivial
Provenance and auditability!?

Microservices

Service discovery must support current and future processes
Sequencing service availability is critical to system function
Possible loss of state when individual services fail

Why Should I Care?

Inbound Streams

Event streams in Kafka allow other resources to pull when ready
Resources can fail and reconnect without dropping events
Kafka provides elasticity, reducing the need for backpressure

Data Pipelines

Dataflow coordination is reduced via event stream structure
The immutability of data allows for trivial parallel processing
Tracking provenance and lineage of data becomes possible

Microservices

Services now only need to discover topics in Kafka
Service availability sequencing is relaxed
Inter-service communication is more robust
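The "pull when ready" and "fail and reconnect without dropping events" points can be illustrated with a toy append-only log that remembers a position per consumer. This is an in-memory sketch, not the Kafka client API: EventLog, poll, commit, and the consumer names are hypothetical stand-ins for a topic partition and committed consumer offsets.

```python
class EventLog:
    """Append-only, in-memory stand-in for a single topic partition."""

    def __init__(self):
        self._events = []
        self._committed = {}   # consumer name -> next offset to read

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1   # offset of the new event

    def poll(self, consumer, max_events=10):
        """Return up to max_events events the consumer has not committed yet."""
        start = self._committed.get(consumer, 0)
        return self._events[start:start + max_events]

    def commit(self, consumer, count):
        """Advance the consumer's position after it has processed count events."""
        self._committed[consumer] = self._committed.get(consumer, 0) + count


log = EventLog()
for n in range(5):
    log.append(f"page-visit-{n}")

# The dashboard processes two events, commits, then "crashes".
batch = log.poll("dashboard", max_events=2)
log.commit("dashboard", len(batch))

# More events arrive while the dashboard is down.
log.append("page-visit-5")

# On reconnect, the dashboard resumes from its committed position.
resumed = log.poll("dashboard")

# A second, independent consumer reads the same events from the start.
audit = log.poll("audit")

print(resumed)
print(len(audit))
```

Because the log is never mutated by reads, adding the audit consumer requires no change to the producer or to the dashboard.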

Use Cases: Heroku Platform Event Stream

Learn more at https://blog.heroku.com/powering-the-heroku-platform-api-a-distributed-systems-approach-using-streams-and-apache-kafka

Use Cases: Heroku Operational Experience: App Metrics

Use Cases: Heroku App Metrics

Learn more at https://engineering.heroku.com/blogs/2016-05-26-heroku-metrics-there-and-back-again/

Use Cases: Twitter Analytics Dashboard

Use Cases Generalized

Inbound Streams Data Pipelines Microservices

Platform Event Stream

App Metrics

Twitter Analytics

What are the core concepts of Kafka?

Apache Kafka Core Concepts

PRODUCERS CONSUMERS

Brokers

The instances running Kafka and managing streams of events in a cluster.

Producers + Consumers

Clients that write to or read from a Kafka cluster.

Topics

Streams of events that are replicated across the brokers. Configured with time-based retention or log compaction.

Partitions

Discrete subsets of topics, and important tuning points for parallelism and ordering.
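Why partitions matter for both parallelism and ordering can be sketched by routing keyed events to partitions with a hash. This is a simplified illustration under stated assumptions: the partition count, the helper names partition_for and route, and the sample events are hypothetical, and CRC32 is used only as a convenient stand-in (Kafka's own default partitioner for keyed records uses a different hash function).

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count for the topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # CRC32 is a stand-in; it is deterministic, so a key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events):
    """Append each (key, value) event to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, value in events:
        partitions[partition_for(key)].append((key, value))
    return partitions


events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "view-product"),
    ("user-2", "logout"),
    ("user-1", "purchase"),
]
partitions = route(events)

for index in sorted(partitions):
    print(index, partitions[index])
```

Events that share a key stay in one partition and keep their relative order, while different partitions can be read by different consumers in parallel. Ordering across partitions is not guaranteed, which is why the choice of key is a tuning decision.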

[Diagram labels: BROKER, TOPIC, PARTITION]

Example Producers (producer -> event)

Web server -> Product views
Payment processor -> Completed sales
Browser -> Page visits
Authentication service -> Site logins
Shipping provider -> Shipping notifications
Warehouse -> Inventory received
Motion sensor -> IoT data
Rain gauge -> Weather data
Vehicle sensor -> Traffic data
Twitter -> Tweets
Online/phone survey -> Election polling data!

Example Consumers (event -> consumer)

Product views -> Personalization engine
Completed sales -> Accounting system
Page visits -> Reporting dashboard
Site logins -> Security audit service
Shipping notifications -> Shipping provider
Inventory received -> Inventory database
IoT data -> Actuator
Weather data -> Climate model
Traffic data -> Traffic map
Tweets -> Analytics dashboard
Election polling data! -> Election forecast

Complex Architecture

Complex Controls

[Diagram labels: TOPIC, PARTITION]

Other Kafka primitives to provide structure to Kafka event streams

Retention

Log compaction

Replication factor

Delivery guarantees
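The difference between time-based retention and log compaction can be sketched on a small in-memory log. This is a conceptual model only: the function names, the (timestamp, key, value) record layout, and the sample inventory records are hypothetical, and a real broker applies these policies in the background rather than record by record on demand.

```python
def apply_time_retention(log, now, retention_seconds):
    """Keep only records that are no older than the retention window."""
    return [record for record in log if now - record[0] <= retention_seconds]


def compact(log):
    """Keep only the most recent record for each key, preserving log order."""
    last_index = {}
    for i, (_, key, _) in enumerate(log):
        last_index[key] = i
    return [record for i, record in enumerate(log) if last_index[record[1]] == i]


# Records are (timestamp_seconds, key, value) tuples.
log = [
    (100, "sku-1", "in stock: 5"),
    (200, "sku-2", "in stock: 9"),
    (300, "sku-1", "in stock: 4"),
    (400, "sku-1", "in stock: 3"),
]

recent = apply_time_retention(log, now=450, retention_seconds=200)
latest = compact(log)

print(recent)
print(latest)
```

Retention bounds the log by age, which suits streams such as page visits. Compaction bounds it by key, which suits streams where only the latest value per key matters, such as current inventory levels.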

Interacting with Kafka

and many more...

Kafka Connect

Some examples: HDFS, JDBC, Elasticsearch, Couchbase, Oracle, MS SQL Server, Cassandra, DynamoDB, Salesforce Streaming API, Splunk

Image credit: Confluent Kafka Connect announcement blog post

Why Apache Kafka on Heroku?

Without Heroku

Apache Kafka
The heart of the event management system, with a broad variety of configurations and options.

Apache ZooKeeper
The system’s consensus and coordination cluster is vital for Kafka’s operation.

OS + JVM Tuning
Tuning the cluster runtimes can be an art.

Instances + Networking
Physical or virtual, the infrastructure behind clusters must be well considered.

Myriad Moving Pieces

Apache Kafka on Heroku: Simple Configuration

Apache Kafka on Heroku: Automated Operations

Self-Healing
Current Version
No-Downtime Upgrades

Apache Kafka on Heroku: Experienced Staff

Heroku engineers have contributed patches to the core open source Kafka project.

Apache Kafka on Heroku: Global

US West
US East
Ireland
Germany
Japan
Sydney

Let's Review...

...and get you started with Kafka!

Apache Kafka is a valuable tool for building architectures to support inbound event streams, data processing pipelines, and microservices coordination.

The primitives provided by Kafka (topics, partitions, retention duration, log compaction, and replication) provide the tools to manage structured event streams.

Apache Kafka on Heroku simplifies operational complexity so that any developer can get started quickly and feel confident that their application is supported by a rock-solid production service.

Get started at hrku.co/use-kafka

Q&A

Rand Fitzpatrick, Director of Product

Chris Castle, Developer Advocate

But first, please take one minute to answer a few quick questions so we can make webinars like this even better for you.

Learn More

Apache Kafka on Heroku: https://www.heroku.com/kafka

Get Started: https://elements.heroku.com/addons/heroku-kafka

Documentation: https://devcenter.heroku.com/articles/kafka-on-heroku

Kafka Event Stream Modeling: https://devcenter.heroku.com/articles/kafka-event-stream-modeling

Podcast: Managed Kafka with Heroku Engineer Tom Crayford: http://softwareengineeringdaily.com/2016/10/25/managed-kafka-with-tom-crayford/

Thank you!
