The Log Stash Book

Jun 02, 2018

Mannu Sharma
  • 8/10/2019 The Log Stash Book

    1/220

    The Book

    Log management made easy

    James Turnbull

    logstash

    The Logstash Book

    James Turnbull

    January 26, 2014

    Version: v1.3.4 (5f439d1)

    Website: The Logstash Book

    http://www.logstashbook.com/

    Contents

    Foreword 1

    Who is this book for? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

    Credits and Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 1

    Technical Reviewers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

    Jan-Piet Mens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

    Paul Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

    Technical Illustrator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

    Author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    Conventions in the book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    Code and Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    Colophon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    Errata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    Version . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    Copyright . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    Chapter 1 Introduction or Why Should I Bother? 6

    Introducing Logstash . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    Logstash design and architecture . . . . . . . . . . . . . . . . . . . . . . . . 8

    What's in the book? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

    Logstash resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

    Getting help with Logstash . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

    A mild warning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

    Chapter 2 Getting Started with Logstash 12

    Installing Java . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

    On the Red Hat family . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

    On Debian & Ubuntu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

    Testing Java is installed . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

    Getting Logstash . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

    Starting Logstash . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

    Our sample configuration file . . . . . . . . . . . . . . . . . . . . . . . 15

    Running the Logstash agent . . . . . . . . . . . . . . . . . . . . . . . . . 16

    Testing the Logstash agent . . . . . . . . . . . . . . . . . . . . . . . . . . 18

    Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

    Chapter 3 Shipping Events 21

    Our Event Lifecycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

    Installing Logstash on our central server . . . . . . . . . . . . . . . . . . . 23

    Installing a broker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

    ElasticSearch for Search . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

    Creating a basic central configuration . . . . . . . . . . . . . . . . . . 34

    Running Logstash as a service . . . . . . . . . . . . . . . . . . . . . . . . 36

    Installing Logstash on our first agent . . . . . . . . . . . . . . . . . . . . 39

    Our agent configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

    Installing Logstash as a service . . . . . . . . . . . . . . . . . . . . . . . 43

    Sending our first events . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

    Checking ElasticSearch has received our events . . . . . . . . . . . . . 47

    The Logstash Kibana Console . . . . . . . . . . . . . . . . . . . . . . . . 49

    Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

    Chapter 4 Shipping Events without the Logstash agent 57

    Using Syslog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

    A quick introduction to Syslog . . . . . . . . . . . . . . . . . . . . . . . 58

    Configuring Logstash for Syslog . . . . . . . . . . . . . . . . . . . . . 59

    Configuring Syslog on remote agents . . . . . . . . . . . . . . . . . . 62

    Using the Logstash Forwarder . . . . . . . . . . . . . . . . . . . . . . . . . . 71

    Configure the Logstash Forwarder on our central server . . . . . . . 72

    Installing the Logstash Forwarder on the remote host . . . . . . . . . 77

    Other log shippers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

    Beaver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

    Woodchuck . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

    Others . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

    Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

    Chapter 5 Filtering Events with Logstash 88

    Apache Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

    Configuring Apache for Custom Logging . . . . . . . . . . . . . . . . 90

    Sending Apache events to Logstash . . . . . . . . . . . . . . . . . . . . 97

    Postfix Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

    Our first filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

    Adding our own filters . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

    Extracting from different events . . . . . . . . . . . . . . . . . . . . . 111

    Setting the timestamp . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

    Filtering Java application logs . . . . . . . . . . . . . . . . . . . . . . . . . 118

    Handling blank lines with drop . . . . . . . . . . . . . . . . . . . . . . . 119

    Handling multi-line log events . . . . . . . . . . . . . . . . . . . . . . . 121

    Grokking our Java events . . . . . . . . . . . . . . . . . . . . . . . . . . 124

    Parsing an in-house custom log format . . . . . . . . . . . . . . . . . . . . 127

    Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

    Chapter 6 Outputting Events from Logstash 137

    Send email alerts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

    Updating our multiline filter . . . . . . . . . . . . . . . . . . . . . . . 138

    Configuring the email output . . . . . . . . . . . . . . . . . . . . . . . 138

    Email output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

    Send instant messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

    Identifying the event to send . . . . . . . . . . . . . . . . . . . . . . . . 141

    Sending the instant message . . . . . . . . . . . . . . . . . . . . . . . . . 143

    Send alerts to Nagios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

    Nagios check types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

    Identifying the trigger event . . . . . . . . . . . . . . . . . . . . . . . . . 145

    The nagios output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

    The Nagios external command . . . . . . . . . . . . . . . . . . . . . . . 148

    The Nagios service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149

    Outputting metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

    Collecting metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

    StatsD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

    Setting the date correctly . . . . . . . . . . . . . . . . . . . . . . . . . . 153

    The StatsD output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

    Sending to a different StatsD server . . . . . . . . . . . . . . . . . . . 159

    Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

    Chapter 7 Scaling Logstash 161

    Scaling Redis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

    Installing new Redis instances . . . . . . . . . . . . . . . . . . . . . . . 164

    Test Redis is running . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166

    Configuring Redis output to send to multiple Redis servers . . . . . 166

    Configuring Logstash to receive from multiple Redis servers . . . . 167

    Testing our Redis failover . . . . . . . . . . . . . . . . . . . . . . . . . . 168

    Shutting down our existing Redis instance . . . . . . . . . . . . . . . . 170

    Scaling ElasticSearch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170

    Installing additional ElasticSearch hosts . . . . . . . . . . . . . . . . . 171

    Monitoring our ElasticSearch cluster . . . . . . . . . . . . . . . . . . . 175

    Managing ElasticSearch data retention . . . . . . . . . . . . . . . . . . 176

    More Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

    Scaling Logstash . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

    Creating a second indexer . . . . . . . . . . . . . . . . . . . . . . . . . . 180

    Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

    Chapter 8 Extending Logstash 183

    Anatomy of a plugin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

    Creating our own input plugin . . . . . . . . . . . . . . . . . . . . . . . . . 187

    Adding new plugins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

    Adding a plugin via directory . . . . . . . . . . . . . . . . . . . . . . . . 191

    Adding a plugin to the Logstash JAR file . . . . . . . . . . . . . . . . 193

    Writing a filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

    Writing an output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196

    Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199

    Index 200

    List of Figures

    1 Copyright . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    1.1 The Logstash Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . 9

    3.1 Our Event Lifecycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

    3.2 The Logstash web interface . . . . . . . . . . . . . . . . . . . . . . . . . 50

    3.3 The Logstash web interface's light theme . . . . . . . . . . . . . . . . . 51

    3.4 Query results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

    3.5 Specific events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

    3.6 Basic query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

    3.7 Advanced query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

    3.8 Customizing the dashboard . . . . . . . . . . . . . . . . . . . . . . . . . 55

    3.9 Adding a panel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

    3.10 The Dashboard control panel . . . . . . . . . . . . . . . . . . . . . . . 56

    4.1 Syslog shipping to Logstash . . . . . . . . . . . . . . . . . . . . . . . . . 63

    5.1 Apache log event . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

    5.2 Querying for 404 status codes . . . . . . . . . . . . . . . . . . . . . . . . 99

    5.3 Postfix log filtering workflow . . . . . . . . . . . . . . . . . . . . . . . 118

    5.4 Tomcat log event workflow . . . . . . . . . . . . . . . . . . . . . . . . 126

    5.5 The Grok debugger at work . . . . . . . . . . . . . . . . . . . . . . . . . 130

    6.1 Java exception email alert . . . . . . . . . . . . . . . . . . . . . . . . . . 140

    6.2 Jabber/XMPP alerts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

    6.3 Apache status and method graphs . . . . . . . . . . . . . . . . . . . . . 157

    6.4 Apache bytes counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

    6.5 Apache request duration timer . . . . . . . . . . . . . . . . . . . . . . . 159

    7.1 Logstash Scaled Architecture . . . . . . . . . . . . . . . . . . . . . . . . 162

    7.2 Logstash Redis failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164

    7.3 ElasticSearch scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171

    7.4 The Paramedic ElasticSearch plugin . . . . . . . . . . . . . . . . . . . . 176

    7.5 Logstash indexer scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . 180

    8.1 Cow said "testing" . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199

    Listings

    2.1 Installing Java on Red Hat . . . . . . . . . . . . . . . . . . . . . . . . . 13

    2.2 Installing Java on Debian and Ubuntu . . . . . . . . . . . . . . . . . . 13

    2.3 Testing Java is installed . . . . . . . . . . . . . . . . . . . . . . . . . . 13

    2.4 Downloading Logstash . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

    2.5 Sample Logstash configuration . . . . . . . . . . . . . . . . . . . . . 15

    2.6 Running the Logstash agent . . . . . . . . . . . . . . . . . . . . . . . . 16

    2.7 Logstash startup message . . . . . . . . . . . . . . . . . . . . . . . . . . 17

    2.8 Tuning the Logstash agent . . . . . . . . . . . . . . . . . . . . . . . . . 17

    2.9 Running Logstash interactively . . . . . . . . . . . . . . . . . . . . . . 18

    2.10 A Logstash JSON event . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

    2.11 A Logstash plain event . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

    3.1 Installing Java . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

    3.2 Creating the Logstash directory . . . . . . . . . . . . . . . . . . . . . . 24

    3.3 Downloading the Logstash JAR file . . . . . . . . . . . . . . . . . . 24

    3.4 Creating Logstash configuration directory . . . . . . . . . . . . . . 24

    3.5 Creating Logstash log directory . . . . . . . . . . . . . . . . . . . . . . 24

    3.6 Installing Redis on Debian . . . . . . . . . . . . . . . . . . . . . . . . . 25

    3.7 Installing EPEL on CentOS and RHEL . . . . . . . . . . . . . . . . . . 26

    3.8 Installing Redis on Red Hat . . . . . . . . . . . . . . . . . . . . . . . . 26

    3.9 Changing the Redis interface . . . . . . . . . . . . . . . . . . . . . . . 26

    3.10 Commented out interface . . . . . . . . . . . . . . . . . . . . . . . . . 27

    3.11 Binding Redis to a single interface . . . . . . . . . . . . . . . . . . . . 27

    3.12 Starting the Redis server . . . . . . . . . . . . . . . . . . . . . . . . . . 27

    3.13 Testing Redis is running . . . . . . . . . . . . . . . . . . . . . . . . . . 27

    3.14 Telneting to the Redis server . . . . . . . . . . . . . . . . . . . . . . . 28

    3.15 A Logstash index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

    3.16 Downloading ElasticSearch . . . . . . . . . . . . . . . . . . . . . . . . 31

    3.17 Installing ElasticSearch . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

    3.18 Starting ElasticSearch . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

    3.19 Initial cluster and node names . . . . . . . . . . . . . . . . . . . . . . 32

    3.20 New cluster and node names . . . . . . . . . . . . . . . . . . . . . . . 32

    3.21 Restarting ElasticSearch . . . . . . . . . . . . . . . . . . . . . . . . . . 33

    3.22 Checking ElasticSearch is running . . . . . . . . . . . . . . . . . . . . 33

    3.23 ElasticSearch status information . . . . . . . . . . . . . . . . . . . . . 33

    3.24 ElasticSearch status page . . . . . . . . . . . . . . . . . . . . . . . . . . 34

    3.25 Creating the central.conf file . . . . . . . . . . . . . . . . . . . . . . 34

    3.26 Initial central configuration . . . . . . . . . . . . . . . . . . . . . . . 35

    3.27 Copying the central init script . . . . . . . . . . . . . . . . . . . . . . . 37

    3.28 Starting the central Logstash server . . . . . . . . . . . . . . . . . . . 37

    3.29 Checking the Logstash server is running . . . . . . . . . . . . . . . . 38

    3.30 Checking the Logstash process . . . . . . . . . . . . . . . . . . . . . . 38

    3.31 Logstash log output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

    3.32 Create the Logstash directory . . . . . . . . . . . . . . . . . . . . . . . 39

    3.33 Download Logstash JAR file . . . . . . . . . . . . . . . . . . . . . . . 39

    3.34 Creating the remaining Logstash directories . . . . . . . . . . . . . . 40

    3.35 Creating the Logstash agent configuration . . . . . . . . . . . . . . 40

    3.36 Logstash event shipping configuration . . . . . . . . . . . . . . . . . 41

    3.37 File input globbing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

    3.38 File recursive globbing . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

    3.39 Configuring the Logstash agent init script . . . . . . . . . . . . . . . 44

    3.40 Starting the Logstash agent . . . . . . . . . . . . . . . . . . . . . . . . 44

    3.41 Checking the Logstash agent is running . . . . . . . . . . . . . . . . . 44

    3.42 Checking the Logstash process is running . . . . . . . . . . . . . . . . 45

    3.43 Logstash agent startup log output . . . . . . . . . . . . . . . . . . . . 45

    3.44 Watching the Logstash shipper.log file . . . . . . . . . . . . . . . . 46

    3.45 Watching the Logstash central.log file . . . . . . . . . . . . . . . . . 46

    3.46 Testing Redis is operational . . . . . . . . . . . . . . . . . . . . . . . . 46

    3.47 Connecting to Maurice via SSH . . . . . . . . . . . . . . . . . . . . . . 46

    3.48 A Logstash login event . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

    3.49 Querying the ElasticSearch server . . . . . . . . . . . . . . . . . . . . 48

    3.50 Launching the Logstash Kibana web interface . . . . . . . . . . . . . 49

    3.51 Logstash web interface address . . . . . . . . . . . . . . . . . . . . . . 50

    4.1 A Syslog message . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

    4.2 Adding the `syslog` input . . . . . . . . . . . . . . . . . . . . . . . . . . 60

    4.3 The `syslog` input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

    4.4 Restarting the Logstash server . . . . . . . . . . . . . . . . . . . . . . . 61

    4.5 Syslog input startup output . . . . . . . . . . . . . . . . . . . . . . . . 61

    4.6 Configuring RSyslog for Logstash . . . . . . . . . . . . . . . . . . . . 64

    4.7 Specifying RSyslog facilities or priorities . . . . . . . . . . . . . . . . 64

    4.8 Restarting RSyslog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

    4.9 Monitoring files with the imfile module . . . . . . . . . . . . . . . . 65

    4.10 SyslogNG s_src source statement . . . . . . . . . . . . . . . . . . . . . 67

    4.11 New SyslogNG destination . . . . . . . . . . . . . . . . . . . . . . . . 67

    4.12 New SyslogNG log action . . . . . . . . . . . . . . . . . . . . . . . . . 67

    4.13 Restarting SyslogNG . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

    4.14 Configuring Syslogd for Logstash . . . . . . . . . . . . . . . . . . . . 68

    4.15 Restarting Syslogd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

    4.16 Testing with logger . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

    4.17 Logstash log event from Syslog . . . . . . . . . . . . . . . . . . . . . . 71

    4.18 Checking for openssl . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

    4.19 Generating a private key . . . . . . . . . . . . . . . . . . . . . . . . . . 73

    4.20 Generating a CSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

    4.21 Signing our CSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

    4.22 Copying the key and certificate . . . . . . . . . . . . . . . . . . . . . 74

    4.23 Cleaning up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

    4.24 Adding the Lumberjack input . . . . . . . . . . . . . . . . . . . . . . . 75

    4.25 The Lumberjack input . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

    4.26 Restarting Logstash for Lumberjack . . . . . . . . . . . . . . . . . . . 76

    4.27 Checking Lumberjack has loaded . . . . . . . . . . . . . . . . . . . . . 77

    4.28 Downloading the Forwarder . . . . . . . . . . . . . . . . . . . . . . . . 77

    4.29 Installing the developer tools . . . . . . . . . . . . . . . . . . . . . . . 78

    4.30 Installing Go on Ubuntu . . . . . . . . . . . . . . . . . . . . . . . . . . 78

    4.31 Installing prerequisite Forwarder packages . . . . . . . . . . . . . . . 78

    4.32 Installing FPM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

    4.33 Creating a Forwarder DEB package . . . . . . . . . . . . . . . . . . . 78

    4.34 Forwarder make output . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

    4.35 Installing the Forwarder . . . . . . . . . . . . . . . . . . . . . . . . . . 79

    4.36 Creating the Forwarder configuration directory . . . . . . . . . . . 79

    4.37 Copying the Forwarder's SSL certificate . . . . . . . . . . . . . . . . 80

    4.38 Creating logstash-forwarder.conf . . . . . . . . . . . . . . . . . . . . . 80

    4.39 The logstash-forwarder.conf file . . . . . . . . . . . . . . . . . . . . 81

    4.40 Testing the Forwarder . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

    4.41 Test the Forwarder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

    4.42 The Forwarder connection output . . . . . . . . . . . . . . . . . . . . 83

    4.43 Forwarder events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

    4.44 Installing the Forwarder init script . . . . . . . . . . . . . . . . . . . . 84

    4.45 The Forwarder defaults file . . . . . . . . . . . . . . . . . . . . . . . 84

    4.46 Starting the Forwarder . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

    4.47 Checking the Forwarder process . . . . . . . . . . . . . . . . . . . . . 85

    4.48 Installing Beaver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

    5.1 An Apache log event . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

    5.2 The Apache LogFormat and CustomLog directives . . . . . . . . . . 91

    5.3 Apache VirtualHost logging configuration . . . . . . . . . . . . . . 91

    5.4 The Apache Common Log Format LogFormat directive . . . . . . . 92

    5.5 Apache custom JSON LogFormat . . . . . . . . . . . . . . . . . . . . . 93

    5.6 Adding the CustomLog directive . . . . . . . . . . . . . . . . . . . . . 94

    5.7 Restarting Apache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

    5.8 A JSON format event from Apache . . . . . . . . . . . . . . . . . . . . 96

    5.9 Apache logs via the file input . . . . . . . . . . . . . . . . . . . . . . 97

    5.10 Apache events via the Logstash Forwarder . . . . . . . . . . . . . . . 97

    5.11 A Postfix log entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

    5.12 Unfiltered Postfix event . . . . . . . . . . . . . . . . . . . . . . . . . . 101

    5.13 File input for Postfix logs . . . . . . . . . . . . . . . . . . . . . . . . . 102

    5.14 Postfix grok filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

    5.15 The grok pattern for Postfix logs . . . . . . . . . . . . . . . . . . . . 103

    5.16 The syntax and the semantic . . . . . . . . . . . . . . . . . . . . . . . 103

    5.17 The SYSLOGBASE pattern . . . . . . . . . . . . . . . . . . . . . . . . . 103

    5.18 The SYSLOGPROG pattern . . . . . . . . . . . . . . . . . . . . . . . . . 104

    5.19 The PROG pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

    5.20 Postfix date matching . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

    5.21 Converting semantic data . . . . . . . . . . . . . . . . . . . . . . . . . 105

    5.22 The Postfix event's fields . . . . . . . . . . . . . . . . . . . . . . . . 105

    5.23 A fully grokked Postfix event . . . . . . . . . . . . . . . . . . . . . . 106

    5.24 Partial Postfix event . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

    5.25 Creating the patterns directory . . . . . . . . . . . . . . . . . . . . . . 107

    5.26 Creating new patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

    5.27 Adding new patterns to grok filter . . . . . . . . . . . . . . . . . . . 108

    5.28 Postfix event grokked with external patterns . . . . . . . . . . . . . 109

    5.29 A named capture for Postfix's queue ID . . . . . . . . . . . . . . . . 110

    5.30 Adding new named captures to the grok filter . . . . . . . . . . . . 110

    5.31 Postfix event filtered with named captures . . . . . . . . . . . . . . 111

    5.32 Postfix event . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

    5.33 Updated grok filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

    5.34 Postfix component tagged events . . . . . . . . . . . . . . . . . . . . 112

    5.35 Nested field syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

    5.36 A grok filter for qmgr events . . . . . . . . . . . . . . . . . . . . . . 113

    5.37 The /etc/logstash/patterns/postfix file . . . . . . . . . . . . . . . . 114

    5.38 A partial filtered Postfix event . . . . . . . . . . . . . . . . . . . . . 115

    5.39 The date filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

    5.40 Postfix event timestamps . . . . . . . . . . . . . . . . . . . . . . . . . 117

    5.41 File input for Tomcat logs . . . . . . . . . . . . . . . . . . . . . . . . . 119

    5.42 A Tomcat log entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

    5.43 A drop filter for blank lines . . . . . . . . . . . . . . . . . . . . . . . 120

    5.44 Examples of the conditional syntax . . . . . . . . . . . . . . . . . . . . 120

    5.45 Conditional inclusion syntax . . . . . . . . . . . . . . . . . . . . . . . . 121

    5.46 Using the multiline codec for Java exceptions . . . . . . . . . . . . . 122

    5.47 A Java exception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

    5.48 Another Java exception . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

    5.49 A multiline merged event . . . . . . . . . . . . . . . . . . . . . . . . . 124

    5.50 A grok filter for Java exception events . . . . . . . . . . . . . . . . 124

    5.51 Our Java exception message . . . . . . . . . . . . . . . . . . . . . . . . 125

    5.52 Grokked Java exception . . . . . . . . . . . . . . . . . . . . . . . . . . 125

    5.53 Alpha log entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

    5.54 File input for our Alpha logs . . . . . . . . . . . . . . . . . . . . . . . . 128

    5.55 Single Alpha log entry . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

    5.56 A Grok regular expression for Alpha . . . . . . . . . . . . . . . . . . . 129

    5.57 Alpha grok filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

    5.58 Alpha date filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

    5.59 Alpha environment field . . . . . . . . . . . . . . . . . . . . . . . . . 133

    5.60 Setting the line field to an integer . . . . . . . . . . . . . . . . . . . 134

    5.61 A filtered Alpha event . . . . . . . . . . . . . . . . . . . . . . . . . . 135

    6.1 The Tomcat multiline file input and codec . . . . . . . . . . . . . . 138

    6.2 The email output plugin . . . . . . . . . . . . . . . . . . . . . . . . . . 139

    6.3 The content of our email . . . . . . . . . . . . . . . . . . . . . . . . . . 139

    6.4 The file input for /var/log/secure . . . . . . . . . . . . . . . . . . . 141

    6.5 Failed SSH authentication log entry . . . . . . . . . . . . . . . . . . . 141

    6.6 Failed SSH authentication grok filter . . . . . . . . . . . . . . . . . 142

    6.7 Failed SSH authentication Logstash event . . . . . . . . . . . . . . . . 143

    6.8 The xmpp output plugin . . . . . . . . . . . . . . . . . . . . . . . . . . 144

    6.9 A STONITH cluster fencing log event . . . . . . . . . . . . . . . . . . 146

    6.10 Identify Nagios passive check results . . . . . . . . . . . . . . . . . . 146

    6.11 The grokked STONITH event . . . . . . . . . . . . . . . . . . . . . . . 147

    6.12 The Nagios output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

    6.13 The Nagios output with a custom command file . . . . . . . . . . . 148

    6.14 A Nagios external command . . . . . . . . . . . . . . . . . . . . . . . . 149

    6.15 A Nagios service for cluster status . . . . . . . . . . . . . . . . . . . . 150

    6.16 JSON format event from Apache . . . . . . . . . . . . . . . . . . . . . 152

    6.17 The Apache event timestamp field . . . . . . . . . . . . . . . . . . . 153

    6.18 Getting the date right for our metrics . . . . . . . . . . . . . . . . . . 154

    6.19 The statsd output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

    6.20 Incremental counters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

    6.21 Apache status metrics in Graphite . . . . . . . . . . . . . . . . . . . . 156

    6.22 Apache method metrics in Graphite . . . . . . . . . . . . . . . . . . . 156

    6.23 The apache.bytes counter . . . . . . . . . . . . . . . . . . . . . . . . . 157

    6.24 The apache.duration timer . . . . . . . . . . . . . . . . . . . . . . . . . 158

    6.25 The StatsD output with a custom host and port . . . . . . . . . . . . 159

    7.1 Installing Redis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

    7.2 Binding Redis to the external interface . . . . . . . . . . . . . . . . . 165

    7.3 Start the Redis instances . . . . . . . . . . . . . . . . . . . . . . . . . . 165

    7.4 Test Redis is running . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166

    7.5 Multi instance Redis output configuration . . . . . . . . . . . . . . 166


    7.6 Restarting the Logstash agent for Redis
    7.7 Multiple Redis instances
    7.8 Restart the Logstash agent
    7.9 Stopping a Redis instance
    7.10 Redis connection refused exception
    7.11 Stopping a second Redis instance
    7.12 Remote agent event sending failures
    7.13 Shut down Redis
    7.14 Stop Redis starting
    7.15 Installing Java for ElasticSearch
    7.16 Download ElasticSearch
    7.17 Install ElasticSearch
    7.18 ElasticSearch cluster and node names
    7.19 Grinner cluster and node names
    7.20 Sinner cluster and node names
    7.21 Restarting ElasticSearch to reconfigure
    7.22 Checking the cluster status.
    7.23 Installing Paramedic
    7.24 The Paramedic URL
    7.25 Deleting indexes
    7.26 Optimizing indexes
    7.27 Optimizing all indexes
    7.28 Getting the size of an index
    7.29 Setting up a second indexer
    7.30 Starting second Logstash instance
    8.1 The stdin input plugin
    8.2 Requiring the Logstash module
    8.3 Requiring the LogStash::Inputs::Base class
    8.4 The plugin class
    8.5 The namedpipe framework
    8.6 The namedpipe framework plugin options
    8.7 The namedpipe input configuration
    8.8 The namedpipe input
    8.9 Creating plugins directories
    8.10 Adding the namedpipe input


    8.11 Running Logstash with plugin support
    8.12 Registering the namedpipe input
    8.13 Creating directory structure
    8.14 Copying the plugin
    8.15 Adding a plugin to the JAR file
    8.16 Checking the plugin has been added
    8.17 Our suffix filter
    8.18 Configuring the addsuffix filter
    8.19 An event with the ALERT suffix
    8.20 Installing CowSay on Debian and Ubuntu
    8.21 Installing CowSay via a RubyGem
    8.22 The CowSay output
    8.23 Configuring the cowsay output


    Foreword

    Who is this book for?

    This book is designed for SysAdmins, operations staff, developers and DevOps who are interested in deploying a log management solution using the open source tool Logstash (http://www.logstash.net/).

    There is an expectation that the reader has basic Unix/Linux skills, and is familiar with the command line, editing files, installing packages, managing services, and basic networking.

    NOTE This book focuses on Logstash version 1.2.0 and later. It is not recommended for earlier versions of Logstash.

    Credits and Acknowledgments

    Jordan Sissel for writing Logstash and for all his assistance during the writing process.

    Rashid Khan for writing Kibana.

    Dean Wilson for his feedback on the book.

    Aaron Mildenstein for his Apache to JSON logging posts here (http://untergeek.com/2012/10/11/getting-apache-to-output-json-for-logstash/) and here (http://untergeek.com/2013/09/11/getting-apache-to-output-json-for-logstash-1-2-x/).

    R.I. Pienaar for his excellent documentation on message queuing.

    The fine folks in the Freenode #logstash channel for being so helpful as I peppered them with questions, and

    Ruth Brown for only saying "Another book? WTF?" once, proof reading the book, making the cover page and for being awesome.

    Technical Reviewers

    Jan-Piet Mens

    Jan-Piet Mens is an independent Unix/Linux consultant and sysadmin who's worked with Unix-systems since 1985. JP does odd bits of coding, and has architected infrastructure at major customers throughout Europe. One of his specialities is the Domain Name System and as such, he authored the book Alternative DNS Servers as well as a variety of other technical publications. He can be found at http://mens.de/ and http://twitter.com/jpmens.

    Paul Stack

    Paul Stack is a London based developer. He has a passion for continuous integration and continuous delivery and why they should be part of what developers do on a day to day basis. He believes that reliably delivering software is just as important as its development. He talks at conferences all over the world on this subject. Paul's passion for continuous delivery has led him to start working closer with operations staff and has led him to technologies like Logstash, Puppet and Chef.

    Technical Illustrator

    Royce Gilbert has over 30 years' experience in CAD design, computer support, network technologies, project management and business systems analysis for major Fortune 500 companies such as Enron, Compaq, Koch Industries and Amoco Corp. He is currently employed as a Systems/Business Analyst at Kansas State University in Manhattan, KS. In his spare time he does Freelance Art and Technical Illustration as sole proprietor of Royce Art. He and his wife of 38 years are living in and restoring a 127 year old stone house nestled in the Flint Hills of Kansas.

    Author

    James is an author and open source geek. James authored the two books about Puppet (Pro Puppet and the earlier book about Puppet). He is also the author of three other books including Pro Linux System Administration, Pro Nagios 2.0, and Hardening Linux.

    For a real job, James is VP of Engineering for Venmo. He was formerly VP of Technical Operations for Puppet Labs. He likes food, wine, books, photography and cats. He is not overly keen on long walks on the beach and holding hands.

    Conventions in the book

    This is an inline code statement.

    This is a code block:

    Listing 1: A sample code block

    This is a code block

    Long code strings are broken with .

    Code and Examples

    You can find all the code and examples from the book on the website (http://www.logstashbook.com/code/index.html) or you can check out the Git repo (https://github.com/jamtur01/logstashbook-code).

    Colophon

    This book was written in Markdown with a large dollop of LaTeX. It was then converted to PDF and other formats using Pandoc (with some help from scripts written by the excellent folks who wrote Backbone.js on Rails: http://learn.thoughtbot.com/products/1-backbone-js-on-rails).

    Errata

    Please email any Errata you find here.

    Trademarks

    Kibana and Logstash are trademarks of Elasticsearch BV. Elasticsearch is a registered trademark of Elasticsearch BV.

    Version

    This is version v1.3.4 (5f439d1) of The Logstash Book.

    Copyright

    Figure 1: Copyright

    Some rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical or photocopying, recording, or otherwise for commercial purposes without the prior permission of the publisher.

    This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/.

    Copyright 2014 - James Turnbull

    ISBN: 978-0-9888202-1-0

    Version: v1.3.4 (5f439d1)

    Chapter 1

    Introduction or Why Should I Bother?

    Log management is often considered both a painful exercise and a dark art. Indeed, understanding good log management tends to be a slow and evolutionary process. In response to issues and problems, new SysAdmins are told: "Go look at the logs." A combination of cat, tail and grep (and often sed, awk or perl too) become their tools of choice to diagnose and identify problems in log and event data. They quickly become experts at command line and regular expression kung-fu: searching, parsing, stripping, manipulating and extracting data from a humble log event. It's a powerful and practical set of skills that I strongly recommend all SysAdmins learn.

    Sadly, this solution does not scale. In most cases you have more than one host and multiple sources of log files. You may have tens, hundreds or even thousands of hosts. You run numerous, inter-connected applications and services across multiple locations and fabrics: physical, virtual and in the cloud. In this world it quickly becomes apparent that logs from any one application, service or host are not enough to diagnose complex multi-tier issues.

    To address this gap your log environment must evolve to become centralized. The tools of choice expand to include configuring applications to centrally log and services like rsyslog and syslog-ng to centrally deliver Syslog output. Events start flowing in and log servers to hold this data are built, consuming larger and larger amounts of storage.

    But we're not done yet. The problem then turns from one of too little information to one of too much information and too little context. You have millions or billions of lines of logs to sift through. Those logs are produced in different timezones, formats and sometimes even in different languages. It becomes increasingly hard to sort through the growing streams of log data to find the data you need and harder again to correlate that data with other relevant events. Your growing collection of log events then becomes more of a burden than a benefit.

    To solve this new issue you have to extend and expand your log management solution to include better parsing of logs, more elegant storage of logs (as flat files just don't cut it) and the addition of searching and indexing technology. What started as a simple grep through log files has become a major project in its own right. A project that has seen multiple investment iterations in several solutions (or multiple solutions and their integration) with a commensurate cost in effort and expense.

    There is a better way.

    Introducing Logstash

    Instead of walking this path, with the high cost of investment and the potential of evolutionary dead ends, you can start with Logstash (http://logstash.net/). Logstash provides an integrated framework for log collection, centralization, parsing, storage and search.

    Logstash is free and open source (Apache 2.0 licensed: http://www.apache.org/licenses/LICENSE-2.0.html) and developed by American developer and Logging Czar at Dreamhost, Jordan Sissel. It's easy to set up, performant, scalable and easy to extend.

    Logstash has a wide variety of input mechanisms: it can take inputs from TCP/UDP, files, Syslog, Microsoft Windows EventLogs, STDIN and a variety of other sources. As a result there's likely very little in your environment that you can't extract logs from and send to Logstash.

    When those logs hit the Logstash server, there is a large collection of filters that allow you to modify, manipulate and transform those events. You can extract the information you need from log events to give them context. Logstash makes it simple to query those events. It makes it easier to draw conclusions and make good decisions using your log data.

    Finally, when outputting data, Logstash supports a huge range of destinations, including TCP/UDP, email, files, HTTP, Nagios and a wide variety of network and online services. You can integrate Logstash with metrics engines, alerting tools, graphing suites, storage destinations or easily build your own integration to destinations in your environment.

    NOTE We'll look at how to develop practical examples of each of these input, filter and output plugins in Chapter 8.

    Logstash design and architecture

    Logstash is written in JRuby and runs in a Java Virtual Machine (JVM). Its architecture is message-based and very simple. Rather than separate agents or servers, Logstash has a single agent that is configured to perform different functions in combination with other open source components.

    In the Logstash ecosystem there are four components:

    Shipper: Sends events to Logstash. Your remote agents will generally only

    run this component.

    Broker and Indexer: Receives and indexes the events.

    Search and Storage: Allows you to search and store events.

    Web Interface: A Web-based interface to Logstash called Kibana.

    Logstash servers run one or more of these components independently, which allows us to separate components and scale Logstash.

    In most cases you will be running two broad classes of Logstash host:

    Hosts running the Logstash agent as an event "shipper" that send your application, service and host logs to a central Logstash server. These hosts will only need the Logstash agent.

    Central Logstash hosts running some combination of the Broker, Indexer, Search and Storage and Web Interface which receive, process and store your logs.

    Figure 1.1: The Logstash Architecture

    NOTE We'll look at scaling Logstash by running the Broker, Indexer, Search and Storage and Web Interface in a scalable architecture in Chapter 7 of this book.
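    To make the division of labour between the components a little more concrete, here is a minimal sketch in Python. It is an analogy only and not Logstash code: the names ship, index_pending and search, and the in-memory queue and list, are assumptions standing in for the Shipper, the Broker and Indexer, and Search and Storage respectively.

```python
from collections import deque

# Hypothetical stand-ins for the components. A real deployment uses separate
# services for the broker and for search and storage.
broker = deque()   # Broker: buffers events between shippers and the indexer
storage = []       # Search and Storage: holds the indexed events


def ship(host, message):
    """Shipper: wrap a log line in an event and hand it to the broker."""
    broker.append({"host": host, "message": message})


def index_pending():
    """Indexer: drain the broker and store each event. Returns the count."""
    count = 0
    while broker:
        storage.append(broker.popleft())
        count += 1
    return count


def search(term):
    """Search: return stored events whose message contains the term."""
    return [event for event in storage if term in event["message"]]


ship("web1.example.com", "GET /index.html 200")
ship("db1.example.com", "connection refused")
indexed = index_pending()
matches = search("refused")
print(indexed, [event["host"] for event in matches])
```

    Because the shipper only ever talks to the broker, the indexing and storage side can be changed or scaled without touching the hosts that send events, which is the property the component split is meant to provide.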

    What's in the book?

    In this book I will walk you through installing, deploying, managing and extending Logstash. We're going to do that by introducing you to Example.com, where you're going to start a new job as one of its SysAdmins. The first project you'll be in charge of is developing its new log management solution.

    We'll teach you how to:

    Install and deploy Logstash.

    Ship events from a Logstash Shipper to a central Logstash server.

    Filter incoming events using a variety of techniques.

    Output those events to a selection of useful destinations.

    Use Logstash's Kibana web interface (http://kibana.org/).

    Scale out your Logstash implementation as your environment grows.

    Quickly and easily extend Logstash to deliver additional functionality you might need.

    By the end of the book you should have a functional and effective log management solution that you can deploy into your own environment.

    NOTE This book focuses on Logstash v1.2.0 and later. This was a major, somewhat backwards-incompatible release for Logstash. A number of options and schema changes were made between v1.2.0 and earlier versions. If you are running an earlier version of Logstash I strongly recommend you upgrade.

    Logstash resources

    The Logstash site (Logstash's home page): http://www.logstash.net/

    The Logstash cookbook (a collection of useful Logstash recipes): http://cookbook.logstash.net/

    The Logstash source code on GitHub: https://github.com/logstash/logstash/

    Logstash's author Jordan Sissel's home page (http://www.semicomplete.com/), Twitter (https://twitter.com/jordansissel) and GitHub account (https://github.com/jordansissel).

    Getting help with Logstash

    Logstash's developer, Jordan Sissel, has a maxim that makes getting help pretty easy: "If a newbie has a bad time, it's a bug in Logstash." So if you're having trouble reach out via the mailing list or IRC and ask for help! You'll find the Logstash community both helpful and friendly!

    The Logstash documentation: http://logstash.net/docs/latest/

    The Logstash cookbook: http://cookbook.logstash.net/

    The Logstash users mailing list: https://groups.google.com/forum/?fromgroups#!forum/logstash-users

    The Logstash bug tracker: https://logstash.jira.com/secure/Dashboard.jspa

    The #logstash IRC channel on Freenode.

    A mild warning

    Logstash is a young product and under regular development. Features are changed, added, updated and deprecated regularly. I recommend you follow development at the Jira support site (https://logstash.jira.com/secure/Dashboard.jspa), on GitHub (https://github.com/logstash/logstash/) and review the change logs for each release to get a good idea of what has changed. Logstash is usually solidly backwards compatible but issues can emerge and being informed can often save you unnecessary troubleshooting effort.

    Chapter 2

    Getting Started with Logstash

    Logstash is easy to set up and deploy. We're going to go through the basic steps of installing and configuring it. Then we'll try it out so we can see it at work. That'll provide us with an overview of its basic set up, architecture, and importantly the pluggable model that Logstash uses to input, process and output events.

    Installing Java

    Logstash's principal prerequisite is Java and Logstash itself runs in a Java Virtual

    Machine or JVM. So let's start by installing Java. The fastest way to do this is via

    our distribution's packaging system, for example Yum in the Red Hat family or

    Debian and Ubuntu's Apt-Get.

    TIP I recommend installing OpenJDK Java on your distribution. If you're running OSX the natively installed Java will work fine (on Mountain Lion and later you'll need to install Java from Apple).

    On the Red Hat family

    We install Java via the yum command:

    Listing 2.1: Installing Java on Red Hat

    $ sudo yum install java-1.7.0-openjdk

    On Debian & Ubuntu

    We install Java via the apt-get command:

    Listing 2.2: Installing Java on Debian and Ubuntu

    $ sudo apt-get -y install openjdk-7-jdk

    Testing Java is installed

    We can then test that Java is installed via the java binary:

    Listing 2.3: Testing Java is installed

    $ java -version

    java version "1.7.0_09"

    OpenJDK Runtime Environment (IcedTea7 2.3.3) (7u9-2.3.3-0ubuntu1~12.04.1)

    OpenJDK Client VM (build 23.2-b09, mixed mode, sharing)


    Getting Logstash

    Once we have Java installed we can grab the Logstash package. Although Logstash is written in JRuby, its developer releases a standalone jar file containing all of the required dependencies. This means we don't need to install JRuby or any other packages.

    At this stage no distributions ship Logstash packages but you can easily build your own using a tool like FPM (https://github.com/jordansissel/fpm) or from examples for RPM (https://github.com/mhorbul/logstash-rpm/ or https://github.com/NumberFour/logstash-rpms) or DEB (https://github.com/jbraeuer/logstash-debs or https://github.com/baseblack/logstash-deb).

    TIP If we're distributing a lot of Logstash agents then it's probably a good idea to package Logstash.

    We can download the jar file and rename it as logstash.jar:

    Listing 2.4: Downloading Logstash

    $ wget https://download.elasticsearch.org/logstash/logstash/logstash-1.3.3-flatjar.jar -O logstash.jar

    NOTE At the time of writing the latest version of Logstash is 1.3.3.

    Starting Logstash

    Once we have the jar file we can launch it with the java binary and a simple, sample configuration file. We're going to do this to demonstrate Logstash working interactively and do a little bit of testing to see how Logstash works at its most basic.

    Our sample configuration file

    Firstly, let's create our sample configuration file. We're going to call ours sample.conf and you can see it here:

    Listing 2.5: Sample Logstash configuration

    input {
      stdin { }
    }

    output {
      stdout {
        debug => true
      }
    }

    Our sample.conf file contains two configuration blocks: one called input and one called output. These are two of three types of plugin components in Logstash that we can configure. The last type is filter that we're going to see in later chapters. Each type configures a different portion of the Logstash agent:

    inputs - How events get into Logstash.

    filters - How you can manipulate events in Logstash.

    outputs - How you can output events from Logstash.

    In the Logstash world events enter via inputs, they are manipulated, mutated or changed in filters and then exit Logstash via outputs.

    Inside each component's block you can specify and configure plugins. For example, in the input block above we've defined the stdin plugin which controls event input from STDIN. In the output block we've configured its opposite: the stdout plugin, which outputs events to STDOUT. For this plugin we've added a configuration option: debug. This outputs each event as a JSON hash.
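    The input, filter and output flow described above can be pictured with a small Python sketch. This is only an analogy and says nothing about how Logstash is implemented internally: run_pipeline and the add_length filter are made-up names, and a plain list stands in for STDIN so the example can run unattended.

```python
import json


def add_length(event):
    """A made-up filter: record the length of the message in a new field."""
    event["length"] = len(event["message"])
    return event


def run_pipeline(lines, filters, output):
    """Input -> filters -> output, one event per input line."""
    for line in lines:                    # input: each line becomes an event
        event = {"message": line.rstrip("\n")}
        for apply_filter in filters:      # filters: change the event in order
            event = apply_filter(event)
        output(event)                     # output: send the event onwards


emitted = []
run_pipeline(["testing\n"], [add_length], emitted.append)
print(json.dumps(emitted[0], sort_keys=True))
```

    Passing sys.stdin as the lines and print as the output would give a rough equivalent of pairing the stdin and stdout plugins, with an empty filter list matching a configuration that has no filter block.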


    NOTE STDIN and STDOUT are the standard streams (http://en.wikipedia.org/wiki/Standard_streams) of I/O in most applications and importantly in this case in your terminal.

    Running the Logstash agent

    Now we've got a configuration file let's run Logstash for ourselves:

    Listing 2.6: Running the Logstash agent

    $ java -jar logstash.jar agent -v -f sample.conf

    NOTE Every time you change your Logstash configuration you will need to restart Logstash so it can pick up the new configuration.

    We've used the java binary and specified our downloaded jar file using the -jar option. We've also specified three command line flags: agent, which tells Logstash to run as the basic agent, -v, which turns on verbose logging, and -f, which specifies the configuration file Logstash should start with.

    TIP You can use the -vv flag for even more verbose output.

    Logstash should now start to generate some startup messages telling you it is enabling the plugins we've specified and finally emit:

    Listing 2.7: Logstash startup message

    Pipeline started {:level=>:info}

    This indicates Logstash is ready to start processing logs!

    TIP You can see a full list of the other command line flags Logstash accepts here: http://logstash.net/docs/latest/flags

    Logstash can be a mite slow to start (see http://cookbook.logstash.net/recipes/faster-startup-time/) and will take a few moments before it is ready for input. If it's really slow you can tweak Java's minimum and maximum heap size to throw more memory at the process. You can do this with the -Xms and -Xmx flags. The -Xms argument sets the initial heap memory size for the JVM. This means that when you start your program the JVM will allocate this amount of memory automatically. The -Xmx argument defines the maximum memory size that the heap can reach for the JVM. You can set them like so:

    Listing 2.8: Tuning the Logstash agent

    $ java -Xms384m -Xmx384m -jar logstash.jar agent -v -f sample.conf

    This sets the minimum and maximum heap size to 384M of memory. Giving Logstash more heap memory should speed it up but can also be unpredictable. Indeed, Java and JVM tuning can sometimes have a steep learning curve. You should do some benchmarking of Logstash in your environment. Keep in mind that requirements for agents versus indexers versus other components will also differ.

    TIP There are some resources online that can help with JVM tuning here (http://www.caucho.com/resin-3.0/performance/jvm-tuning.xtp), here (http://stackoverflow.com/questions/564039/jvm-performance-tuning-for-large-applications), here (http://www.semicomplete.com/blog/geekery/debugging-java-performance.html) and here (http://www.quora.com/JVM/What-are-some-useful-tips-for-tuning-programs-running-on-the-JVM).

    Testing the Logstash agent

    Now Logstash is running, remember that we enabled the stdin plugin? Logstash

    is now waiting for us to input something on STDIN. So I am going to type "testing"

    and hit Enter to see what happens.

    Listing 2.9: Running Logstash interactively

    $ java -jar logstash.jar agent -v -f sample.conf
    output received {:event=>#"testing", "@timestamp"=>"2013-08-25T17:27:50.027Z", "@version"=>"1", "host"=>"maurice.example.com"}>, :level=>:info}
    {
      "message" => "testing",
      "@timestamp" => "2013-08-25T17:27:50.027Z",
      "@version" => "1",
      "host" => "maurice.example.com"
    }

    You can see that our input has resulted in some output: an info level log message from Logstash itself and an event in JSON format (remember we specified the debug option for the stdout plugin). Let's examine the event in more detail.

    Listing 2.10: A Logstash JSON event

    {
      "message" => "testing",
      "@timestamp" => "2013-08-25T17:27:50.027Z",
      "@version" => "1",
      "host" => "maurice.example.com"
    }

    We can see our event is made up of a timestamp, the host that generated the

    event maurice.example.com and the message, in our case testing. You might

    notice that all these components are also contained in the log output in the @data

    hash.

    We can see our event has been printed as a hash. Indeed it's represented internally

    in Logstash as a JSON hash.

    If we had omitted the debug option from the stdout plugin we'd have gotten a plain event like so:

    Listing 2.11: A Logstash plain event

    2013-08-25T17:27:50.027Z maurice.example.com testing

    Logstash calls these formats codecs. There are a variety of codecs that Logstash supports. We're going to mostly see the plain and json codecs in the book.

    plain - Events are recorded as plain text and any parsing is done using filter plugins.

    json - Events are assumed to be JSON and Logstash tries to parse the event's contents into fields itself with that assumption.
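    The difference between the two codecs can be illustrated with a short Python sketch. It is not the codec implementation: encode_plain, encode_json and decode_json are hypothetical helpers, and the plain layout (timestamp, host, message) is simply copied from the plain event shown in Listing 2.11.

```python
import json


def encode_plain(event):
    """Render an event as a single line: timestamp, host, then message."""
    return "{} {} {}".format(event["@timestamp"], event["host"], event["message"])


def encode_json(event):
    """Render an event as a JSON document."""
    return json.dumps(event, sort_keys=True)


def decode_json(text):
    """Parse JSON text back into a dictionary of fields."""
    return json.loads(text)


event = {
    "message": "testing",
    "@timestamp": "2013-08-25T17:27:50.027Z",
    "@version": "1",
    "host": "maurice.example.com",
}

print(encode_plain(event))
print(encode_json(event))
```

    The plain line has to be picked apart again by a filter before its pieces can be used individually, whereas the JSON text decodes straight back into named fields.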

    We're going to focus on the json format in the book as it's the easiest way to work with Logstash events and show how they can be used. The format is made up of a number of elements. A basic event has only the following elements:

    @timestamp: An ISO8601 (http://en.wikipedia.org/wiki/ISO_8601) timestamp.

    message: The event's message. Here testing as that's what we put into STDIN.

    @version: The version of the event format. The current version is 1.

    Additionally, many of the plugins we'll use add further fields. For example, the stdin plugin we've just used adds a field called host which specifies the host which generated the event. Other plugins, for example the file input plugin which collects events from files, add fields like path which reports the path of the file being collected from. In the next chapters we'll also see some other elements like custom fields, tags and other context that we can add to events.
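    As a rough illustration of the event structure just described, the following Python sketch builds a dictionary with the three basic elements and an optional host field. The new_event helper is hypothetical and is not part of Logstash. It assumes the supplied time is already in UTC and formats it with millisecond precision to match the timestamps shown in the listings above.

```python
from datetime import datetime, timezone


def new_event(message, host=None, now=None):
    """Build a dictionary shaped like a basic event.

    host is optional because it is added by plugins rather than being one of
    the three basic elements. now is assumed to be a UTC datetime.
    """
    now = now or datetime.now(timezone.utc)
    # ISO8601 with millisecond precision, for example 2013-08-25T17:27:50.027Z
    millis = "{:03d}".format(now.microsecond // 1000)
    timestamp = now.strftime("%Y-%m-%dT%H:%M:%S.") + millis + "Z"
    event = {"message": message, "@timestamp": timestamp, "@version": "1"}
    if host is not None:
        event["host"] = host
    return event


fixed = datetime(2013, 8, 25, 17, 27, 50, 27000, tzinfo=timezone.utc)
event = new_event("testing", host="maurice.example.com", now=fixed)
print(event["@timestamp"])
```

    Calling new_event("testing") with no host returns only the three basic elements, while passing host mimics the extra field contributed by an input plugin.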

    TIP Running interactively we can stop Logstash using the Ctrl-C key combination.

    Summary

    That concludes our simple introduction to Logstash. In the next chapter we're going to introduce you to your new role at Example.com and see how you can use Logstash to make your log management project a success.

    Chapter 3

    Shipping Events

    It's your first day at Example.com and your new boss swings by your desk to tell
    you about the first project you're going to tackle: log management. Your job is
    to consolidate log output to a central location from a variety of sources. You've
    got a wide variety of log sources you need to consolidate but you've been asked
    to start with consolidating and managing some Syslog events.

    Later in the project we'll look at other log sources and by the end of the project all
    required events should be consolidated to a central server, indexed, stored, and
    then be searchable. In some cases you'll also need to configure some events to be
    sent on to new destinations, for example to alerting and metrics systems.

    To do the required work you've made the wise choice to select Logstash as your

    log management tool and you've built a basic plan to deploy it:

    1. Build a single central Logstash server (we'll cover scaling in Chapter 7).

    2. Configure your central server to receive events, index them and make them
    available to search.

    3. Install Logstash on a remote agent.

    4. Configure Logstash to send some selected log events from our remote agent
    to our central server.

    5. Install Logstash Kibana to act as a web console and front end for our logging

    infrastructure.

    We'll take you through each of these steps in this chapter and then in later chapters

    we'll expand on this implementation to add new capabilities and scale the solution.

    Our Event Lifecycle

    For our initial Logstash build we're going to have the following lifecycle:

    The Logstash agent on our remote agents collects and sends a log event to
    our central server.

    A Redis instance receives the log event on the central server and acts as a
    buffer.

    The Logstash agent draws the log event from our Redis instance and indexes
    it.

    The Logstash agent sends the indexed event to ElasticSearch.

    ElasticSearch stores and renders the event searchable.

    The Logstash web interface queries the event from ElasticSearch.

    Figure 3.1: Our Event Lifecycle

    Redis: http://redis.io/
    ElasticSearch: http://www.elasticsearch.org/

    Now let's set up Logstash to implement this lifecycle.

    Installing Logstash on our central server

    First we're going to install Logstash on our central server. We're going to build
    an Ubuntu box called smoker.example.com with an IP address of 10.0.0.1 as our
    central server.

    Central server

    Hostname: smoker.example.com

    IP Address: 10.0.0.1

    As this is our production infrastructure we're going to be a bit more systematic
    about setting up Logstash than we were in Chapter 2. To do this we're going to
    create a directory for our Logstash environment and proper service management
    to start and stop it.

    TIP There are other, more elegant, ways to install Logstash using tools like
    Puppet or Chef. Setting up either is beyond the scope of this book but there are
    several Puppet modules for Logstash on the Puppet Forge and a Chef cookbook. I
    strongly recommend you use this chapter as exposition and introduction on how
    Logstash is deployed and use some kind of configuration management to deploy
    in production.

    Let's install Java first.

    Listing 3.1: Installing Java

    $ sudo apt-get install openjdk-7-jdk

    Puppet: http://www.puppetlabs.com/
    Chef: http://www.opscode.com/chef/
    Puppet modules for Logstash on the Puppet Forge: http://forge.puppetlabs.com/modules?q=logstash
    Chef cookbook for Logstash: http://community.opscode.com/cookbooks/logstash

    Now let's create a directory to hold Logstash itself. We're going to use
    /opt/logstash:

    Listing 3.2: Creating the Logstash directory

    $ sudo mkdir /opt/logstash

    We'll now download the Logstash jar file to this directory and rename it to
    logstash.jar.

    Listing 3.3: Downloading the Logstash JAR file

    $ cd /opt/logstash

    $ sudo wget https://download.elasticsearch.org/logstash/logstash/logstash-1.3.3-flatjar.jar -O logstash.jar

    Now let's create a directory to hold our Logstash configuration:

    Listing 3.4: Creating Logstash configuration directory

    $ sudo mkdir /etc/logstash

    Finally, a directory to store Logstash's log output:

    Listing 3.5: Creating Logstash log directory

    $ sudo mkdir /var/log/logstash

    Now let's install some of the other required components for our new deployment
    and then come back to configuring Logstash.

    Installing a broker

    As this is our central server we're going to install a broker for Logstash. The
    broker receives events from our shippers and holds them briefly prior to Logstash
    indexing them. It essentially acts as a "buffer" between your Logstash agents and
    your central server. It is useful for two reasons:

    It is a way to enhance the performance of your Logstash environment by
    providing a caching buffer for log events.

    It provides some resiliency in our Logstash environment. If our Logstash
    indexing fails then our events will be queued in Redis rather than potentially
    lost.
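
The buffering behaviour described above can be pictured with a small Python sketch. It is only an in-memory stand-in and does not talk to Redis; the `ToyBroker` class and its methods are hypothetical names invented for illustration. It shows events accumulating while the indexer is unavailable and then being drained in arrival order once it returns.

```python
from collections import deque


class ToyBroker:
    """In-memory stand-in for the broker: a first-in, first-out list of events."""

    def __init__(self):
        self._queue = deque()

    def push(self, event):
        # A shipper appends an event to the tail of the list.
        self._queue.append(event)

    def pop(self):
        # The indexer takes the oldest event, or None if nothing is waiting.
        return self._queue.popleft() if self._queue else None

    def __len__(self):
        return len(self._queue)


broker = ToyBroker()

# The indexer is "down": shippers keep sending and events pile up in the broker.
for n in range(3):
    broker.push("event {}".format(n))
backlog = len(broker)

# The indexer comes back and drains the backlog in arrival order.
indexed = []
while True:
    event = broker.pop()
    if event is None:
        break
    indexed.append(event)

print(backlog, indexed)
```

Because the shippers only ever talk to the broker, a slow or stopped indexer delays events rather than dropping them, which is the resiliency the second reason describes.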

    We are going to use Redis as our broker. We could choose a variety of possible
    brokers, indeed other options include AMQP and 0MQ, but we're going with Redis
    because:

    It's very simple and very fast to set up.

    It's performant.

    It's well tested and widely used in the Logstash community.

    Redis is a neat open source key-value store. Importantly for us the keys can
    contain strings, hashes, lists, sets and sorted sets, making it a powerful store for a
    variety of data structures.

    Installing Redis

    We can either install Redis via our package manager or from source. I recommend
    installing it from a package as it's easier to manage and you'll get everything you
    need to manage it. However, you will need Redis version 2.0 or later. On our
    Debian and Ubuntu hosts we'd install it like so:

    Listing 3.6: Installing Redis on Debian

    $ sudo apt-get install redis-server

    Redis: http://redis.io/
    AMQP: http://www.amqp.org/
    0MQ: http://www.zeromq.org/

    On Red Hat-based platforms you will need to install the EPEL package repositories
    to get a recent version of Redis. For example, to install EPEL on CentOS and
    RHEL 6:

    Listing 3.7: Installing EPEL on CentOS and RHEL

    $ sudo rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm

    And now we can install Redis.

    Listing 3.8: Installing Redis on Red Hat

    $ sudo yum install redis

    NOTE If you want the source or the bleeding edge edition you can download
    Redis directly from its site, configure and install it.

    Changing the Redis interface

    Once Redis is installed we need to update its configuration so it listens on all
    interfaces. By default, Redis only listens on the 127.0.0.1 loopback interface. We
    need it to listen on an external interface so that it can receive events from our
    remote agents.

    To do this we need to edit the /etc/redis/redis.conf configuration file (it's
    /etc/redis.conf on Red Hat-based platforms) and comment out this line:

    Listing 3.9: Changing the Redis interface

    bind 127.0.0.1

    So it becomes:

    Listing 3.10: Commented out interface

    #bind 127.0.0.1

    Redis download page: http://redis.io/download

    We could also just bind it to a single interface, for example our host's external IP
    address 10.0.0.1 like so:

    Listing 3.11: Binding Redis to a single interface

    bind 10.0.0.1

    Now it's configured, we can start the Redis server:

    Listing 3.12: Starting the Redis server

    $ sudo /etc/init.d/redis-server start

    Test Redis is running

    We can test if the Redis server is running by using the redis-cli command.

    Listing 3.13: Testing Redis is running

    $ redis-cli -h 10.0.0.1

    redis 10.0.0.1:6379> PING

    PONG

    When the redis prompt appears, type PING and if the server is running then
    it should return a PONG.

    You should also be able to see the Redis server listening on port 6379. You will
    need to ensure any firewalls on the host or between the host and any agents allow
    traffic on port 6379. To test this is working you can telnet to that port and issue
    the same PING command.

    Listing 3.14: Telneting to the Redis server

    $ telnet 10.0.0.1 6379

    Trying 10.0.0.1...

    Connected to smoker.

    Escape character is '^]'.

    PING

    +PONG
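
The telnet check above can also be scripted. The following Python sketch sends the same PING over a plain TCP socket and looks for the +PONG reply shown in Listing 3.14. The function names `redis_ping` and `check_redis` are hypothetical, and the sketch assumes the whole reply arrives in a single read, which is reasonable for such a short response but not guaranteed.

```python
import socket


def redis_ping(sock):
    """Send PING over an already-connected socket; True if +PONG comes back."""
    sock.sendall(b"PING\r\n")
    reply = sock.recv(64)
    return reply.strip() == b"+PONG"


def check_redis(host, port=6379, timeout=2.0):
    """Connect to host:port and report whether a Redis server answers PING."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            return redis_ping(sock)
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `check_redis("10.0.0.1")` should return True once Redis is bound to the external interface and any firewalls allow traffic on port 6379, and False otherwise.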

    ElasticSearch for Search

    Next we're going to install ElasticSearch to provide our search capabilities.
    ElasticSearch is a powerful indexing and search tool. As the ElasticSearch team puts
    it: "ElasticSearch is a response to the claim: 'Search is hard.'". ElasticSearch is
    easy to set up, has search and index data available RESTfully as JSON over HTTP
    and is easy to scale and extend. It's released under the Apache 2.0 license and is
    built on top of Apache's Lucene project.

    When installing the ElasticSearch server you need to ensure you install a suitable
    version. The ElasticSearch server version needs to match the version of the
    ElasticSearch client that is bundled with Logstash. If the client version is 0.90.3 you
    should install version 0.90.3 of the ElasticSearch server. The current documentation
    will indicate which version of ElasticSearch to install to match the client.

    TIP Logstash also has a bundled ElasticSearch server inside it that we could use.
    To enable it see the embedded option of the elasticsearch plugin. For most purposes
    though I consider it more flexible and scalable to use an external ElasticSearch
    server.

    ElasticSearch: http://www.elasticsearch.org/
    Logstash elasticsearch output plugin documentation: http://logstash.net/docs/latest/outputs/elasticsearch

    Introduction to ElasticSearch

    So before we install it we should learn a little about ElasticSearch and how it
    works. A decent understanding is going to be useful later as we use and scale
    ElasticSearch. ElasticSearch is a text indexing search engine. The best metaphor
    is the index of a book. You flip to the back of the book [1], look up a word and
    then find the reference to a page. That means, rather than searching text strings
    directly, it creates an index from incoming text and performs searches on the index
    rather than the content. As a result it is very fast.
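
To make the book-index metaphor concrete, here is a toy inverted index in Python. It is a deliberately naive sketch: the function names `build_index` and `search` and the sample log lines are invented for illustration, and the index Lucene actually builds for ElasticSearch is far more sophisticated than a dictionary of word sets.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lower-cased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, word):
    """Look the word up in the index instead of scanning every document."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical log lines standing in for incoming text.
documents = {
    1: "connection refused by host",
    2: "user login accepted",
    3: "connection accepted from host",
}
index = build_index(documents)
print(search(index, "connection"))
```

The work of reading every document happens once, when the index is built. Each search afterwards is a single lookup by word, just as you go from a word in a book's index straight to its page numbers.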

    NOTE This is a simplified explanation. See the site for more information and
    exposition.

    Under the covers ElasticSearch uses Apache Lucene to create this index. Each
    index is a logical namespace. In Logstash's case the default indexes are named for
    the day the events are received, for example:

    Listing 3.15: A Logstash index

    logstash-2012.12.31
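
The naming scheme can be sketched in a few lines of Python. This is an illustration rather than Logstash's own code: the function name `index_for` is hypothetical, and the sketch assumes the day is taken from an ISO8601 UTC timestamp with fractional seconds, in the shape of the @timestamp values shown in the previous chapter.

```python
from datetime import datetime


def index_for(timestamp):
    """Return a daily index name in the style of Listing 3.15 for an ISO8601 UTC timestamp."""
    when = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return when.strftime("logstash-%Y.%m.%d")


print(index_for("2012-12-31T23:59:59.999Z"))
print(index_for("2013-08-25T17:27:50.027Z"))
```

One index per day keeps each index to a bounded size and makes it easy to reason about, or later remove, a whole day of events at a time.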

    Each Logstash event is made up of fields and these fields become a document
    inside that index. If we were comparing ElasticSearch to a relational database: an
    index is a table, a document is a table row and a field is a table column. Like a
    relational database you can define a schema too. ElasticSearch calls these schemas
    "mappings".

    NOTE It's important to note that you don't have to specify any mappings for
    operations; indeed many of the searches you'll use with Logstash don't need mappings,
    but they often make life much easier. You can see an example of an ElasticSearch
    mapping here. Since Logstash 1.3.2 a default mapping is applied to your
    ElasticSearch and you generally no longer need to worry about setting your own
    mapping.

    [1] Not the first Puppet book.

    Apache Lucene: http://lucene.apache.org/core/
    ElasticSearch guide: http://www.elasticsearch.org/guide/

    Like a schema, a mapping declares what data and data types the fields in documents
    contain, any constraints present, unique and primary keys and how to index and
    search each field. Unlike a schema you can also specify ElasticSearch settings.

    Indexes are stored in Lucene instances called "shards". There are two types of
    shards: primary and replica. Primary shards are where your documents are stored.
    Each new index automatically creates five primary shards. This is a default setting
    and you can increase or decrease the number of primary shards when the index
    is created but not AFTER it is created. Once you've created the index the number
    of primary shards cannot be changed.

    Replica shards are copies of the primary shards that exist for two purposes:

    To protect your data.

    To make your searches faster.

    Each primary shard will have one replica by default but can also have more if required.
    Unlike primary shards, this can be changed dynamically to scale out or make an
    index more resilient. ElasticSearch will cleverly distribute these shards across the
    available nodes and ensure primary and replica shards for an index are not present
    on the same node.

    Shards are stored on ElasticSearch "nodes". Each node is automatically part of an

    ElasticSearch cluster, even if it's a cluster of one. When new nodes are created

    they can use unicast or multicast to discover other nodes that share their cluster

    name and will try to join that cluster. ElasticSearch distributes shards amongst all

    nodes in the cluster. It can move shards automatically from one node to another

    in the case of node failure or when new nodes are added.
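
The shard arithmetic and the "never on the same node" rule can be illustrated with a toy Python model. It is a sketch under simplifying assumptions and not ElasticSearch's allocation algorithm. The names `total_shards` and `place_shards` are hypothetical, and placement is a plain round-robin that ignores disk space, load and rebalancing.

```python
def total_shards(primaries=5, replicas=1):
    """Each primary plus its replicas: primaries * (1 + replicas)."""
    return primaries * (1 + replicas)


def place_shards(nodes, primaries=5, replicas=1):
    """Toy round-robin placement that never puts two copies of a shard on one node.

    Returns (placement, unassigned): placement maps node -> list of
    (shard_number, role); unassigned lists copies with no eligible node.
    """
    placement = {node: [] for node in nodes}
    unassigned = []
    for shard in range(primaries):
        for copy in range(1 + replicas):
            role = "primary" if copy == 0 else "replica"
            if copy < len(nodes):
                # Offsetting by the copy number guarantees distinct nodes per shard.
                node = nodes[(shard + copy) % len(nodes)]
                placement[node].append((shard, role))
            else:
                unassigned.append((shard, role))
    return placement, unassigned


placement, unassigned = place_shards(["node1", "node2"])
print(total_shards(), placement, unassigned)
```

With the defaults of five primaries and one replica each, an index has ten shards in total. In this model a single-node cluster can hold only the five primaries, because a replica has nowhere to go that is not the node already holding its primary.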

    Example ElasticSearch mapping: http://untergeek.com/2012/11/05/my-current-templatemapping/

    Installing ElasticSearch

    ElasticSearch's only prerequisite is Java. As we installed a JDK earlier in this
    chapter we don't need to install anything additional for it. Unfortunately ElasticSearch
    is currently not well packaged in distributions but it is easy to download and
    create your own packages. Additionally the ElasticSearch team does provide some
    DEB packages for Ubuntu and Debian-based hosts. You can find the ElasticSearch
    download page here.

    As we're installing onto Ubuntu we can use the DEB packages provided:

    Listing 3.16: Downloading ElasticSearch

    $ wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.3.deb

    Now we install ElasticSearch. We need to tell ElasticSearch where to find our Java
    JDK installation by setting the JAVA_HOME environment variable. We can then run
    the dpkg command to install the DEB package.

    Listing 3.17: Installing ElasticSearch

    $ export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386/

    $ sudo dpkg -i elasticsearch-0.90.3.deb

    TIP You can also find tarballs for ElasticSearch from which you can install or
    create RPM packages. There is an example RPM SPEC file here.

    ElasticSearch download page: http://www.elasticsearch.org/download/
    Example RPM SPEC file: https://github.com/tavisto/elasticsearch-rpms

    Installing the package should also automatically start the ElasticSearch server but
    if it does not then you can manage it via its init script:

    Listing 3.18: Starting ElasticSearch

    $ sudo /etc/init.d/elasticsearch start

    Configuring our ElasticSearch cluster and node

    Next we need to configure our ElasticSearch cluster and node name. ElasticSearch
    is started with a default cluster name and a random, allegedly amusing, node
    name, for example "Franz Kafka" or "Spider-Ham". A new random node name is
    selected each time ElasticSearch is restarted. Remember that new ElasticSearch
    nodes join any cluster with the same cluster name they have defined. So we want
    to customize our cluster and node names to ensure we have unique names. To
    do this we need to edit the /etc/elasticsearch/elasticsearch.yml file. This is
    ElasticSearch's YAML-based configuration file. Look for the following entries in
    the file:

    Listing 3.19: Initial cluster and node names

    # cluster.name: elasticsearch

    # node.name: "Franz Kafka"

    We're going to uncomment and change both the cluster and node name. We're
    going to choose a cluster name of logstash and a node name matching our central

    serve