APACHE FLUME NG
Kai Voigt, Cloudera Inc
London, Hadoop User Group, 10 Oct 2012
Thursday, 11 October 2012

Apache Flume NG

Dec 04, 2014


Talk given by Kai Voigt, Cloudera Inc, at the Hadoop User Group UK meetup on 10 Oct 2012 in London
Transcript
Page 1: Apache Flume NG

APACHE FLUME NG
Kai Voigt, Cloudera Inc
London, Hadoop User Group, 10 Oct 2012

Thursday, 11 October 2012

Page 2: Apache Flume NG

“FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”

Page 3: Apache Flume NG

“FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”

Page 4: Apache Flume NG

[Diagram: httpd -> /var/log/htaccess -> Flume -> HDFS]

Page 5: Apache Flume NG


Page 6: Apache Flume NG

[Diagram: mysource -> mychannel -> mysink]

myagent.sources = mysource
myagent.sinks = mysink
myagent.channels = mychannel

Page 7: Apache Flume NG

myagent.sources.mysource.type = exec
myagent.sources.mysource.command = tail -F /var/log/htaccess
myagent.sources.mysource.channels = mychannel

[Diagram: mysource -> mychannel -> mysink]

Page 8: Apache Flume NG

myagent.sinks.mysink.type = hdfs
myagent.sinks.mysink.hdfs.path = /user/cloudera/htaccess
myagent.sinks.mysink.hdfs.fileType = DataStream
myagent.sinks.mysink.channel = mychannel

[Diagram: mysource -> mychannel -> mysink]

Page 9: Apache Flume NG

myagent.channels.mychannel.type = memory
myagent.channels.mychannel.capacity = 1000
myagent.channels.mychannel.transactionCapacity = 100

[Diagram: mysource -> mychannel -> mysink]
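Put together, the fragments from pages 6 to 9 form a single agent definition. The sketch below assumes they are saved in one properties file named simple.conf, the file name passed to flume-ng on page 10. The comments are added here for orientation and are not from the slides.

```properties
# simple.conf: one agent named "myagent" with one source, one channel, one sink
myagent.sources = mysource
myagent.sinks = mysink
myagent.channels = mychannel

# Source: run a command and turn each line of its output into an event
myagent.sources.mysource.type = exec
myagent.sources.mysource.command = tail -F /var/log/htaccess
myagent.sources.mysource.channels = mychannel

# Sink: write events to HDFS as a plain data stream instead of a SequenceFile
myagent.sinks.mysink.type = hdfs
myagent.sinks.mysink.hdfs.path = /user/cloudera/htaccess
myagent.sinks.mysink.hdfs.fileType = DataStream
myagent.sinks.mysink.channel = mychannel

# Channel: in-memory buffer of up to 1000 events, at most 100 per transaction
myagent.channels.mychannel.type = memory
myagent.channels.mychannel.capacity = 1000
myagent.channels.mychannel.transactionCapacity = 100
```

Note the asymmetry in the wiring keys: a source takes the plural `channels` because it can feed several channels, while a sink takes the singular `channel` because it drains exactly one.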

Page 10: Apache Flume NG

$ flume-ng agent --conf-file simple.conf --name myagent
$ hadoop fs -ls htaccess
-rw-r--r-- 1 cloudera cloudera 1001 2012-09-30 05:58 htaccess/FlumeData.1348999108529
-rw-r--r-- 1 cloudera cloudera 993 2012-09-30 05:58 htaccess/FlumeData.1348999108530
-rw-r--r-- 1 cloudera cloudera 997 2012-09-30 05:59 htaccess/FlumeData.1348999108531
-rw-r--r-- 1 cloudera cloudera 1009 2012-09-30 05:59 htaccess/FlumeData.1348999108532
...
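The listing shows many files of about 1 KB each. That is consistent with the HDFS sink's default roll settings (hdfs.rollInterval = 30 seconds, hdfs.rollSize = 1024 bytes, hdfs.rollCount = 10 events), although the slides do not say which trigger produced these files. A minimal sketch of how the roll behaviour could be changed to produce fewer, larger files; the values are assumptions chosen for illustration and do not come from the talk.

```properties
# Assumed values for illustration; the slides keep the defaults.
# Start a new HDFS file every 10 minutes...
myagent.sinks.mysink.hdfs.rollInterval = 600
# ...and switch off size-based and count-based rolling (0 disables a trigger)
myagent.sinks.mysink.hdfs.rollSize = 0
myagent.sinks.mysink.hdfs.rollCount = 0
```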

Page 11: Apache Flume NG

“FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”

Page 12: Apache Flume NG


MULTI HOP


Page 13: Apache Flume NG

myagent1.sinks = mysink
myagent1.sinks.mysink.type = avro
myagent1.sinks.mysink.hostname = 10.10.10.20
myagent1.sinks.mysink.port = 4141

myagent2.sources = mysource
myagent2.sources.mysource.type = avro
myagent2.sources.mysource.bind = 10.10.10.10
myagent2.sources.mysource.port = 4141

Page 14: Apache Flume NG


CONSOLIDATION

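The slide carries only the title. In Flume, consolidation typically means several first-tier agents forwarding over Avro to one collector agent, using the same sink and source types as the multi-hop fragments on page 13. A minimal sketch, assuming two hypothetical web-server agents (web1, web2) and one hypothetical collector agent reachable at 10.10.10.20:4141. All agent, sink, source, and channel names are illustrative, and the local sources and channel definitions are left out.

```properties
# On each web server: an Avro sink that points at the collector
web1.sinks = tocollector
web1.sinks.tocollector.type = avro
web1.sinks.tocollector.hostname = 10.10.10.20
web1.sinks.tocollector.port = 4141
web1.sinks.tocollector.channel = buffer

web2.sinks = tocollector
web2.sinks.tocollector.type = avro
web2.sinks.tocollector.hostname = 10.10.10.20
web2.sinks.tocollector.port = 4141
web2.sinks.tocollector.channel = buffer

# On the collector: one Avro source receives events from every upstream sink
collector.sources = fromweb
collector.sources.fromweb.type = avro
collector.sources.fromweb.bind = 0.0.0.0
collector.sources.fromweb.port = 4141
collector.sources.fromweb.channels = buffer
```

The collector would then drain its channel into a single HDFS sink, so only one tier of agents writes to the cluster.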

Page 15: Apache Flume NG


MULTIPLEXING

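The slide carries only the title. In Flume, fanning one source out to several channels is configured with a channel selector on the source. A sketch of the multiplexing selector, assuming a hypothetical event header named logtype and three illustrative channel names; none of these names appear in the talk, and the header would have to be set upstream, for example by the sending application or an interceptor.

```properties
# The source feeds three channels (names are illustrative)
myagent.channels = weblogs applogs other
myagent.sources.mysource.channels = weblogs applogs other

# Choose the channel from the value of the "logtype" event header
myagent.sources.mysource.selector.type = multiplexing
myagent.sources.mysource.selector.header = logtype
myagent.sources.mysource.selector.mapping.web = weblogs
myagent.sources.mysource.selector.mapping.app = applogs
# Events with any other value, or without the header, go here
myagent.sources.mysource.selector.default = other
```

Leaving the selector unset gives the default replicating behaviour, in which every event is copied to all channels listed for the source.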

Page 16: Apache Flume NG

Sources             Sinks    Channels
Avro                Avro     Memory
Exec                Logger   JDBC
NetCat              IRC      File
Sequence Generator  File
Syslog              HBase
Scribe
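Because sources and sinks refer to a channel only by name, swapping one component type from the table for another is a local change. As a sketch, the memory channel from the earlier example could be replaced by a file channel, which persists buffered events to disk so they survive an agent restart. The directory paths below are assumptions, not values from the talk.

```properties
# Durable alternative to the memory channel; directory paths are assumptions
myagent.channels.mychannel.type = file
myagent.channels.mychannel.checkpointDir = /var/lib/flume/checkpoint
myagent.channels.mychannel.dataDirs = /var/lib/flume/data
```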

Page 17: Apache Flume NG

DEMO

Page 18: Apache Flume NG

Thank you
[email protected]
http://www.cloudera.com/