Avro Tutorial - Records with Schema for Kafka and Hadoop

Jan 21, 2018

Jean-Paul Azar
Transcript
Page 1: Avro Tutorial - Records with Schema for Kafka and Hadoop

Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting

Avro

Apache Avro Data Serialization

Page 2: Avro Tutorial - Records with Schema for Kafka and Hadoop

Apache Avro

❖ Data serialization system

❖ Data structures

❖ Binary data format

❖ Container file format to store persistent data

❖ RPC capabilities

❖ Does not require code generation to use
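To make the "binary data format" bullet concrete, here is a minimal sketch of how a record could be encoded following the rules in the Avro specification: ints and longs are written as zigzag varints, strings as a length followed by UTF-8 bytes, and a record as its fields in schema order with no field names in the data. The `Employee` fields (`firstName`, `age`) and the helper names are assumptions for illustration only; this is not the Avro library's API and it covers only three types.

```python
import json


def encode_long(n: int) -> bytes:
    """Encode a 64-bit integer the way Avro encodes int/long: zigzag, then varint."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small negative numbers become small unsigned ones
    out = bytearray()
    while z & ~0x7F:          # more than 7 bits left: emit a continuation byte
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as its UTF-8 byte length (as a long) followed by the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


# Only the types needed for this sketch; real Avro has many more.
ENCODERS = {"int": encode_long, "long": encode_long, "string": encode_string}


def encode_record(schema: dict, datum: dict) -> bytes:
    """Concatenate field encodings in schema order; field names are not written."""
    return b"".join(ENCODERS[f["type"]](datum[f["name"]]) for f in schema["fields"])


# Hypothetical schema and record, used only for this example.
schema = {
    "type": "record",
    "name": "Employee",
    "fields": [
        {"name": "firstName", "type": "string"},
        {"name": "age", "type": "int"},
    ],
}
employee = {"firstName": "Bob", "age": 35}

avro_bytes = encode_record(schema, employee)
json_bytes = json.dumps(employee).encode("utf-8")
print(len(avro_bytes), "bytes as Avro binary vs", len(json_bytes), "bytes as JSON")
```

The binary form is smaller because the reader is expected to have the schema, so names and types never travel with each record.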

Page 3: Avro Tutorial - Records with Schema for Kafka and Hadoop

Avro Schemas

❖ Supports schemas for defining data structure

❖ Uses the schema when serializing and deserializing data

❖ File schema

❖ Avro files store data with its schema

❖ RPC Schema

❖ RPC protocol exchanges schemas as part of the handshake

❖ Schemas written in JSON
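Because schemas are plain JSON, any JSON parser can read them. The sketch below shows what a record schema might look like and how its fields can be listed. The `Employee` record, the `com.example` namespace, and the field names are assumptions for illustration; the original slide's schema is not preserved in this transcript.

```python
import json

# Hypothetical Avro record schema, written as JSON text as it would appear in an .avsc file.
EMPLOYEE_SCHEMA_JSON = """
{
  "type": "record",
  "name": "Employee",
  "namespace": "com.example",
  "fields": [
    {"name": "firstName", "type": "string"},
    {"name": "lastName", "type": "string"},
    {"name": "age", "type": "int"},
    {"name": "phoneNumber", "type": ["null", "string"], "default": null}
  ]
}
"""

schema = json.loads(EMPLOYEE_SCHEMA_JSON)


def describe_fields(record_schema: dict) -> list:
    """Return (name, type) pairs for every field of a record schema."""
    return [(f["name"], f["type"]) for f in record_schema["fields"]]


for name, avro_type in describe_fields(schema):
    print(name, "->", avro_type)
```

The `["null", "string"]` type is a union, which is the usual way to mark a field as optional.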

Page 4: Avro Tutorial - Records with Schema for Kafka and Hadoop

Avro compared to…

❖ Similar to Thrift, Protocol Buffers, JSON, etc.

❖ Does not require code generation

❖ Avro encodes less information in the data itself, since field names and types are stored in the schema

❖ It supports evolution of schemas.
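Schema evolution can be sketched as follows: data written with an older schema is read with a newer one, fields the reader does not know are dropped, and fields the writer did not have are filled from the reader's defaults. This is a simplified illustration that works on already-decoded dictionaries; the function name `resolve` and both schemas are assumptions, and real Avro schema resolution also handles type promotion, aliases, and nested types.

```python
def resolve(datum: dict, writer_schema: dict, reader_schema: dict) -> dict:
    """Project a record written with writer_schema onto reader_schema."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            result[name] = datum[name]        # present in both schemas: copy the value
        elif "default" in field:
            result[name] = field["default"]   # new in the reader: use its default
        else:
            raise ValueError(f"no value or default for field {name!r}")
    return result                             # writer-only fields are silently dropped


# Hypothetical version 1 of the schema, used by the writer.
writer_v1 = {
    "type": "record",
    "name": "Employee",
    "fields": [
        {"name": "firstName", "type": "string"},
        {"name": "age", "type": "int"},
    ],
}

# Hypothetical version 2, used by the reader: "age" removed, "email" added with a default.
reader_v2 = {
    "type": "record",
    "name": "Employee",
    "fields": [
        {"name": "firstName", "type": "string"},
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}

old_record = {"firstName": "Bob", "age": 35}
print(resolve(old_record, writer_v1, reader_v2))
```

Adding a field without a default is what breaks this kind of compatibility, which is why the sketch raises an error in that case.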

Page 5: Avro Tutorial - Records with Schema for Kafka and Hadoop

Avro Schema

Avro schema files are stored in src/main/avro by default when using the Avro Maven or Gradle plugin.

Page 6: Avro Tutorial - Records with Schema for Kafka and Hadoop

Code Generation

Page 7: Avro Tutorial - Records with Schema for Kafka and Hadoop

Employee Code Generation

Page 8: Avro Tutorial - Records with Schema for Kafka and Hadoop

Using Generated Avro class

Page 9: Avro Tutorial - Records with Schema for Kafka and Hadoop

Writing employees to an Avro File

Page 10: Avro Tutorial - Records with Schema for Kafka and Hadoop

Reading employees From a File

Page 11: Avro Tutorial - Records with Schema for Kafka and Hadoop

Using GenericRecord

Page 12: Avro Tutorial - Records with Schema for Kafka and Hadoop

Writing Generic Records

Page 13: Avro Tutorial - Records with Schema for Kafka and Hadoop

Reading using Generic Records

Page 14: Avro Tutorial - Records with Schema for Kafka and Hadoop

Avro Schema Validation

Page 15: Avro Tutorial - Records with Schema for Kafka and Hadoop

Avro supported types

❖ Records

❖ Arrays

❖ Enums

❖ Unions

❖ Maps

❖ Primitive types such as string, int, and boolean, plus logical types such as decimal, timestamp, and date
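As a rough illustration of how these types combine, the sketch below checks a Python value against a schema that uses a record, an enum, an array, a union, and a map. The `matches` function, the `Employee` schema, and its field names are assumptions made for this example; it is a simplified check, not the validation performed by an Avro library, and it leaves out logical types, named-type references, and numeric range checks.

```python
# Python types accepted for a few Avro primitive type names.
PRIMITIVES = {"null": type(None), "boolean": bool, "int": int, "long": int, "string": str}


def matches(schema, datum) -> bool:
    """Return True if datum conforms to the (simplified) Avro schema."""
    if isinstance(schema, str):                    # primitive type name
        if schema not in PRIMITIVES:
            raise ValueError(f"unsupported type: {schema}")
        if schema in ("int", "long") and isinstance(datum, bool):
            return False                           # bool is a subclass of int in Python
        return isinstance(datum, PRIMITIVES[schema])
    if isinstance(schema, list):                   # union: any branch may match
        return any(matches(branch, datum) for branch in schema)
    kind = schema["type"]
    if kind == "record":
        return isinstance(datum, dict) and all(
            f["name"] in datum and matches(f["type"], datum[f["name"]])
            for f in schema["fields"]
        )
    if kind == "array":
        return isinstance(datum, list) and all(matches(schema["items"], x) for x in datum)
    if kind == "map":                              # map keys are always strings
        return isinstance(datum, dict) and all(
            isinstance(k, str) and matches(schema["values"], v) for k, v in datum.items()
        )
    if kind == "enum":
        return datum in schema["symbols"]
    return matches(kind, datum)                    # e.g. {"type": "string"}


# Hypothetical schema that exercises each complex type once.
EMPLOYEE = {
    "type": "record",
    "name": "Employee",
    "fields": [
        {"name": "firstName", "type": "string"},
        {"name": "age", "type": "int"},
        {"name": "status", "type": {"type": "enum", "name": "Status",
                                    "symbols": ["SALARY", "HOURLY", "RETIRED"]}},
        {"name": "emails", "type": {"type": "array", "items": "string"}},
        {"name": "nickName", "type": ["null", "string"]},
        {"name": "attributes", "type": {"type": "map", "values": "string"}},
    ],
}

bob = {"firstName": "Bob", "age": 35, "status": "SALARY",
       "emails": ["bob@example.com"], "nickName": None, "attributes": {"team": "data"}}
print(matches(EMPLOYEE, bob))
```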

Page 16: Avro Tutorial - Records with Schema for Kafka and Hadoop

Fuller example Avro Schema

Page 17: Avro Tutorial - Records with Schema for Kafka and Hadoop

Avro

❖ Fast data serialization

❖ Supports data structures

❖ Supports Records, Maps, Arrays, and basic types

❖ You can use it directly or with code generation

❖ Read more

❖ Kafka Training

❖ Kafka Consulting