MAnagement of Security information and events in Service InFrastructures
MASSIF FP7-257475
D3.2.1 - Scenarios analysis and external languages specification
Activity: A3
Workpackage: WP3.2
Due Date: December 2010
Submission Date: 2011-02-04
Main Author(s): Hervé Debar (TSP)
Version: v1.0 (Rev: 92)
Status: Final
Dissemination Level: CO
Nature: R
Keywords: security languages, event languages, alert languages
Reviewers: Luigi Romano (CINI), Claudio Soriente (UPM)
Part of the Seventh Framework Programme
Funded by the EC - DG INFSO
MASSIF - FP7-257475 D3.2.1 - Scenarios analysis and external languages specification
Version history
Rev    Date        Author              Comments
V0.1   2011-01-14  Hervé Debar         First draft for review
V1.0   2011-02-03  Hervé Debar         Final version after 2nd review cycle
V1.0   2011-02-04  Elsa Prieto (Atos)  Final review and delivery
2011 by MASSIF Consortium 2 / 61
Glossary of Acronyms
Abbr    Meaning
BSCW    Be Smart - Cooperate Worldwide
CEF     Common Event Format
CLF     Common Log Format
CSS     Cascading Style Sheets
DoW     Description of Work
EC      European Commission
EU      European Union
FP7     Seventh Framework Programme
FTP     File Transfer Protocol
IETF    Internet Engineering Task Force
LEA     Log Extraction API
MASSIF  MAnagement of Security information and events in Service InFrastructures
MSS     Managed Security Service
MSSP    Managed Security Service Provider
OASIS   Organization for the Advancement of Structured Information Standards
ODBC    Open Database Connectivity
PU      Public Usage
R&D     Research & Development
RSS     Really Simple Syndication
SCP     Secure Copy
SFTP    Secure File Transfer Protocol
SIEM    Security Information and Event Management
SNMP    Simple Network Management Protocol
SSH     Secure Shell
WMI     Windows Management Instrumentation
W3C     World Wide Web Consortium
Executive Summary
Deliverable D3.2.1 is one of the first technical productions of the MASSIF project. The description of work specifies that this document is an analysis of input and output formats from use case scenarios, and a specification of common message formats for these data streams. This document therefore has two objectives: to enumerate data formats and models that have been used by the partners of the project in SIEM-related projects, and to provide a first glimpse at the use cases, from a data point of view, that will spread knowledge and understanding of these use cases among partners and provide a first evaluation of the importance of the aforementioned data formats. The document consists of two parts: Alert and Event Languages, describing security alerts and events, and use-case specific data streams, describing log formats specific to the proposed use cases. The document concludes with an analysis highlighting several characteristics shared between these languages and event formats, among which are the simplicity of the information representation, which must be easily readable, timestamping, and the modularity of the format structure.
Contents
1 Introduction
  1.1 Deliverable objectives
  1.2 MASSIF architecture sketch
2 Alert and Event Languages
  2.1 Languages selection rationale
    2.1.1 Analysis of Commercial SIEMs
    2.1.2 Presentation of log sources selection
  2.2 The Common Event Format (CEF)
    2.2.1 Reference
    2.2.2 Objectives
    2.2.3 Structure
      Structure overview
      Links with other data formats
      Relationship with MASSIF
    2.2.4 Critical assessment of the format
      Advantages
      Issues
      Uses
  2.3 The Common Log Format (CLF)
    2.3.1 Reference
    2.3.2 Objectives
    2.3.3 Structure
      Structure overview
      Links with other data formats
      Relationship with MASSIF
    2.3.4 Critical assessment of the format
      Advantages
      Issues
      Uses
  2.4 The Intrusion Detection Message Exchange Format (IDMEF)
    2.4.1 Reference
    2.4.2 Objectives
    2.4.3 Structure
      Structure overview
      Links with other data formats
      Relationship with MASSIF
    2.4.4 Critical assessment of the format
      Advantages
      Issues
      Uses
  2.5 InterFace to Metadata Access Point (IF-MAP)
    2.5.1 Reference
    2.5.2 Objectives
    2.5.3 Structure
      Structure overview
      Links with other data formats
      Relationship with MASSIF
    2.5.4 Critical assessment of the format
      Advantages
      Issues
      Uses
  2.6 Incident Object Description and Exchange Format (IODEF)
    2.6.1 Reference
    2.6.2 Objectives
    2.6.3 Structure
      Structure overview
      Links with other data formats
      Relationship with MASSIF
    2.6.4 Critical assessment of the format
      Advantages
      Issues
      Uses
  2.7 IP Flow Information Export (ipfix)
    2.7.1 Reference
    2.7.2 Objectives
    2.7.3 Structure
      Structure overview
      Links with other data formats
      Relationship with MASSIF
    2.7.4 Critical assessment of the format
      Advantages
      Issues
      Uses
  2.8 The Syslog Format
    2.8.1 Reference
    2.8.2 Objectives
    2.8.3 Structure
      Structure overview
      Links with other data formats
      Relationship with MASSIF
    2.8.4 Critical assessment of the format
      Advantages
      Issues
      Uses
  2.9 Windows Management Instrumentation (WMI)
    2.9.1 Reference
    2.9.2 Objectives
    2.9.3 Structure
      Structure overview
      Links with other data formats
      Relationship with MASSIF
    2.9.4 Critical assessment of the format
      Advantages
      Issues
      Uses
  2.10 WS-Eventing and WS-Notification
    2.10.1 Objectives
    2.10.2 Structure
      Architecture
      Function
      Delivery mode
      Filters
      Links with other data formats
      Relationship with MASSIF
    2.10.3 Advantages of the formats
3 Use-case specific data streams
  3.1 Olympic Games Scenario
    3.1.1 Motivation and description
    3.1.2 Novell Sentinel Interface: Syslog data format
      Description
      Advantages
      Drawbacks and issues
      Examples
    3.1.3 Novell Sentinel Interface: LEA API
      Description
      Drawbacks and issues
  3.2 Mobile Money Transfer Service scenario
    3.2.1 Motivation and description
    3.2.2 Mobile Money Service: proprietary data format
  3.3 Managed Enterprise Service Infrastructures scenario
    3.3.1 Motivation and description
    3.3.2 Tivoli TSOM interface: SNMP data format
    3.3.3 Tivoli TSOM interface: Syslog data format
  3.4 Critical Infrastructure Process Control (Dam) scenario
    3.4.1 Motivation and description
    3.4.2 Dam scenario: Modbus data format
      Structure overview (Modbus)
      Modbus Advantages
      Issues (Modbus)
      Modbus Example
    3.4.3 Dam scenario: WSN and CTP data formats
      WSN Example
      Advantages (WSN)
      Issues (WSN)
    3.4.4 Links with other data formats
4 Analysis and Conclusion
  4.1 Analysis of alert and event languages
  4.2 Analysis of use case specific data streams
List of Figures
1.1 MASSIF Blueprint Architecture (proposed)
2.1 An example metadata graph
2.2 Windows Management Infrastructure architecture data flow
3.1 Log example
3.2 General Modbus Frame
3.3 Modbus transaction (error free)
3.4 Modbus transaction (exception response)
List of Tables
2.1 RSA Envision collectors summary
2.2 Included log sources summary
2.3 Eliminated log sources summary
3.1 Money Transfer Message Elements
3.2 TIVOLI TSOM SNMP Trap content example
Chapter 1
Introduction
1.1 Deliverable objectives
This deliverable is one of the first technical productions of the MASSIF project. The description of work specifies that this document is an analysis of input and output formats from use case scenarios, and a specification of common message formats for these data streams. This document therefore has two objectives:
- enumerate data formats and models that have been used by the partners of the project in SIEM-related projects, in order to give a broad overview of the richness of the field and prepare the definition of the ontology (MASSIF Deliverable 3.2.2);
- provide a first glimpse at the use cases, from a data point of view, that will spread knowledge and understanding of these use cases among partners, and provide a first evaluation of the importance of the aforementioned data formats.
As one can see from these two items, data is at the core of the MASSIF project, since Security Information and Event Management is, at heart, about gathering data, analyzing it, and making informed decisions in the ICT security domain. With respect to data gathering, this document concentrates on the syntax and semantics of the information, regardless of location or actual transport mechanisms. Resilient event collection is handled in workpackage 3.1, scalable event processing engine. The only assumption of this document is that whatever format is chosen will be available without restrictions through WP3.1. With respect to data analysis, methods will be studied in WP3.3 (event collection, parsing and propagation) on the sensor side and WP3.4 (event filtering, aggregation, abstraction, and correlation) on the SIEM platform side. We will thus focus on the syntax and semantics of as many data formats as the project's partners felt pertinent.
In accordance with the objectives of the document, we have divided it into two main parts, as follows:
Alert and Event Languages: Chapter 2 gathers all formats and languages that represent transient information, that is, information that is time-driven and that has to be handled by the MASSIF SIEM system to manage the security status of the monitored system. In this area, we will focus on languages that are considered to have standard status, either through their publication mechanism or because of their widespread use.
Use-case specific data streams: Chapter 3 describes the data stream formats of the use cases. We are particularly interested in describing the specificities of the content of the data streams, such as the way they build syslog message contents, as most of the syntax should be covered in the previous chapters.
1.2 MASSIF architecture sketch
The work on data stream analysis has to be considered in relation to the definition of the MASSIF platform architecture. While the mandate of this document is not to specify an architecture for the MASSIF project, we do introduce it with thoughts on a very simple architecture sketch, shown in figure 1.1.
Figure 1.1: MASSIF Blueprint Architecture (proposed)
Figure 1.1 separates the world into two parts, the MASSIF SIEM Platform plane and the monitored business system plane. The former is under full control of the project and the latter should be left as undisturbed as possible, or at least the capabilities required by the MASSIF SIEM system in terms of monitoring and countermeasures should be fixed and acceptable to the business system owners.
Within the monitored system, we have separated three functions: intrusion detection sensors, business process components, and access control. The primary function of business process components is to service users; however, they also have auditing capabilities in the form of log files, and minimal policy enforcement capabilities such as startup and shutdown. The primary function of sensors is to detect and report sensitive events, either attacks or anomalies. Access control and identity management are security policy components, whose interaction with the MASSIF SIEM system will be the primary means of security response. In the current security literature, intrusion prevention systems should be considered as belonging to the last two categories.
Within the SIEM platform, we separate the operational decision support subsystem, handling the alerts in real time, and the model management subsystem, which evaluates and updates the decision support system according to its past performance, to the evolution of the monitored system, and to the evolution of global knowledge (vulnerabilities, etc.).
The most relevant part of this architecture for the present deliverable is the set of exchanges between the two planes, which we model as follows:
Events (push): This stream describes events being pushed by the monitored business system to the MASSIF SIEM platform. These events are typically alerts or logs driven by the interactions that the monitored business system has with the outside world (users, updates, etc.). The formats used in this data stream are described in section 4.1, alert and event languages.
Events (pull): This stream describes events being requested by the MASSIF SIEM platform from the monitored business system. This allows the business system to store data and only make it available to the MASSIF SIEM if necessary. It is a way for the SIEM platform to ask questions or verify information that it has on the monitored system. The formats used in this data stream should be similar to the ones described in section 4.1, alert and event languages.
Configurations (Commands): This stream describes modifications of the behaviour of the business system that are driven by the MASSIF SIEM system, mainly for update or response purposes. This stream is important for alert correlation, but is outside the scope of this document.
Audits: This stream represents the interaction of the model management subsystem with the monitored business system. While it is analytically a different data stream, it might be treated as the combination of both event streams (push and pull), and might be implemented in this way to simplify the plane interface management. This stream is particularly important for model acquisition and maintenance, but is outside the scope of this document.
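The four streams described above can be summarized as a small data structure. The following Python sketch is an illustration only: the class names, field names and the STREAMS list are our own assumptions rather than MASSIF interfaces, and the initiator and scope of each stream are read directly from the descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class Initiator(Enum):
    """Which plane starts the exchange (hypothetical naming)."""
    BUSINESS_SYSTEM = "monitored business system"
    SIEM_PLATFORM = "MASSIF SIEM platform"


@dataclass(frozen=True)
class Stream:
    name: str
    initiator: Initiator
    in_scope: bool  # True if the stream's formats are covered by this deliverable


# One entry per stream exchanged between the two planes.
STREAMS = [
    Stream("Events (push)", Initiator.BUSINESS_SYSTEM, True),
    Stream("Events (pull)", Initiator.SIEM_PLATFORM, True),
    Stream("Configurations (Commands)", Initiator.SIEM_PLATFORM, False),
    Stream("Audits", Initiator.SIEM_PLATFORM, False),
]


def streams_in_scope(streams):
    """Return the names of the streams whose formats this document covers."""
    return [s.name for s in streams if s.in_scope]


if __name__ == "__main__":
    print(streams_in_scope(STREAMS))
```

Only the two event streams are in scope, which is why the rest of the document concentrates on alert and event formats.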
The finer small blue arrows specify the data stream names in the case of sensors and should be treated as examples only for the purpose of this deliverable.
This blueprint architecture will further evolve as the specifications of the MASSIF SIEM prototype are developed.
Chapter 2
Alert and Event Languages
2.1 Languages selection rationale
Since it is impossible to produce a comprehensive list of all formats, we have specified selection criteria to include only a subset of the available data formats. One first needs to note that we are interested in formats, not in transport protocols. Unfortunately, data formats and transport protocols are very closely associated in several cases, which makes it difficult to understand exactly the motivations of developers and users. Another consideration is that we do not need to describe all formats, but we need to identify formats that are also generic representations of information.
The following elements are the foundations of our rationale:
SIEM-market supported: We have looked at the SIEM market, and specifically at the adapters that SIEM products provide. We have analyzed five major commercial SIEMs, OSSIM and Prelude (represented in the MASSIF project), as well as Envision from RSA Security, Novell Sentinel and ArcSight, to understand what kind of data formats they collect. This analysis is further detailed in section 2.1.1.
Standards-body driven: We are interested in using formats that are supported by open standards organizations, and that are freely available. In that group, we have selected standards defined by the Internet Engineering Task Force (IETF), the World Wide Web Consortium (W3C) and the Organization for the Advancement of Structured Information Standards (OASIS). Even though they may not be in production use today, they do provide an interesting and collective vision of the problem that we are addressing, and some of them have actually been used.
User-supported: Finally, we have also drawn from the collective experience and knowledge of the project's partners, particularly the commercial users and use case providers, to complement and confirm the first two criteria.
2.1.1 Analysis of Commercial SIEMs
Commercial SIEM vendors have a strong marketing incentive to collect information from as many data sources as possible, in order to market their products as a data warehouse for logging and compliance. They also have a strong technical incentive to limit the number of protocols they understand, in order to simplify not only development but also integration. Therefore, we expect the documentation of the capabilities of SIEM products to list many data sources but few protocols.
A summary of the list of connectors for the RSA EnVision SIEM is presented in table 2.1. It lists 186 products but only about 15 different connectors. Table 2.1 counts the number of times each connector type appears in the documentation [1]. This summary shows that a majority of log sources are connected via Syslog. The three other important mechanisms are acquisition of log files via FTP, ODBC and SNMP; however, the documentation does not even mention whether SNMP collection relies on traps, or which management information bases are involved. The other connectors are dedicated to a specific set of tools (e.g. Check Point's LEA or Windows WMI).
Connector identity                    Number of instances   Percentage
Syslog                                95                    51%
Log File FTP                          25                    13%
ODBC                                  25                    13%
SNMP                                  20                    11%
File Reader                           4                     2%
Agentless Windows                     4                     2%
Other connectors                      13                    7%
Total number of interfaced products   186                   100%
Table 2.1: RSA Envision collectors summary
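The percentages in table 2.1 can be rechecked from the instance counts. The short Python sketch below does this; the names CONNECTOR_COUNTS and connector_shares are illustrative choices of ours, and the counts are copied from the table.

```python
# Instance counts copied from table 2.1.
CONNECTOR_COUNTS = {
    "Syslog": 95,
    "Log File FTP": 25,
    "ODBC": 25,
    "SNMP": 20,
    "File Reader": 4,
    "Agentless Windows": 4,
    "Other connectors": 13,
}

TOTAL = sum(CONNECTOR_COUNTS.values())


def connector_shares(counts):
    """Return each connector's share of all products, rounded to a whole percent."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


if __name__ == "__main__":
    print("Total products:", TOTAL)
    for name, share in connector_shares(CONNECTOR_COUNTS).items():
        print(f"{name}: {share}%")
```

Because each share is rounded independently, the individual rows need not add up to exactly 100%.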
Novell Sentinel documents connectivity to at least 61 products, using 11 different connectors. The identification of the collectors is extremely similar to the one shown in table 2.1. Even though we do not have the same level of detail available, we surmise that the results would be quite similar.
One of the major issues when dealing with SIEM tools is the lack of separation between the data format and the transport protocol. In fact, the operation of these products requires an understanding not only of the protocol, but also of the content and semantics of the message. That is why ArcSight has published the interface specification of its SIEM, the Common Event Format, hoping for wide adoption by the community of security tool vendors. While this adoption has not materialized, the format provides an interesting and important viewpoint on the way SIEM vendors see their data providers today.
Finally, one needs to note that Prelude, one of the SIEM tools we are looking at in MASSIF, uses the IETF standard IDMEF for its data format, even though it does not use the companion IDXP protocol.
[1] Whenever a product listed several connectors, we selected the most represented one.
2.1.2 Presentation of log sources selection
One also needs to realize that this analysis gives us information about deployment in the field only in an approximate way, if at all. We have therefore added a third element, the experience of the partners in the field, to evaluate the data sources and reinforce our selection criteria. Table 2.2 presents our selection of log sources that are included in the alert and event languages description.
As one can see from table 2.2, our selection points us to 8 different alert and event languages. Beyond the ubiquitous syslog, we have included languages that are important either because of their standard status or because they will help us reach the goals of the project, even though they are not currently used in SIEM environments (to the best of our knowledge). When analyzing the existing SIEM environments, we have also eliminated the description of some log sources from this deliverable. The reasons for not selecting these sources are presented in table 2.3.
We will now proceed to the description of the alert and event languages, following as much as possible a homogeneous template. The description itself is kept short, as the reader is referred to already existing documentation. We have rather focused on our experience with these data sources, and their relationship to the project.
2.2 The Common Event Format (CEF)
2.2.1 Reference
The Common Event Format (CEF) [2] is specified and provided without charge by ArcSight Inc. [3], a SIEM vendor, as part of its strategy to foster interoperability between its SIEM and sensor vendors.
2.2.2 Objectives
The Common Event Format (CEF) is an open log management standard that improves the interoperability of security-related information from different security and network devices and applications. CEF has been designed to enable technology companies and customers to use a common event log format so that data can easily be collected and aggregated for analysis by an enterprise management system.
[2] http://www.arcsight.com/collateral/CEFstandards.pdf
[3] http://www.arcsight.com/
2.2.3 Structure
Structure overview
CEF is an extensible, text-based, high-performance format designed to support multiple device types from both security and non-security devices and applications in the simplest manner possible, unlike other standards that target a single component of the security infrastructure, are tied to a specific transport protocol, or are designed specifically for applications and cannot support today's high-performance, real-time security requirements.
To simplify integration, the syslog message format is used as a transport mechanism. However, if an event producer is unable to write syslog messages, it is still possible to write the events to a file.
The basic grammar of the format includes the self-explanatory
fields:
CEF:Version|Device Vendor|Device Product|Device Version|Signature ID|Name|Severity|Extension
An example of a CEF message taken from the documentation is:
Sep 19 08:26:10 zurich CEF:0|security|threatmanager|1.0|100|worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232
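To make the grammar concrete, the following is a minimal sketch of a CEF reader in Python. It is an illustration under stated assumptions, not ArcSight's reference implementation: the function name parse_cef and the dictionary keys are our own, escaping is only handled for the pipe character, and extension values are assumed to run until the next key=value pair.

```python
import re

# Names for the seven pipe-separated header fields (our own labels).
CEF_FIELDS = ["version", "device_vendor", "device_product", "device_version",
              "signature_id", "name", "severity"]

def parse_cef(line):
    """Parse one (syslog) line carrying a CEF payload into a dict."""
    start = line.find("CEF:")
    if start < 0:
        raise ValueError("no CEF payload found")
    payload = line[start + len("CEF:"):]
    # Split on unescaped pipes; everything after the 7th pipe is the extension.
    parts = re.split(r"(?<!\\)\|", payload, maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    event = dict(zip(CEF_FIELDS, parts[:7]))
    event["syslog_prefix"] = line[:start].strip()
    extension = parts[7] if len(parts) > 7 else ""
    # A value runs until the next " key=" or the end of the line.
    pairs = re.findall(r"(\w+)=(.*?)(?=\s+\w+=|$)", extension)
    event["extension"] = dict(pairs)
    return event

sample = ("Sep 19 08:26:10 zurich CEF:0|security|threatmanager|1.0|100|"
          "worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232")
event = parse_cef(sample)
print(event["name"], event["extension"])
```

Applied to the example message above, the header fields and the three extension keys (src, dst, spt) are recovered as strings; converting them to typed values would require the CEF extension dictionary.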
Links with other data formats
CEF is fairly close to syslog in spirit, and also shares
similarities with the Security Device Event Exchange (SDEE)
(footnote 4), a joint effort between Cisco and SourceFire to
standardize events coming out of network-based intrusion detection
sensors.
Relationship with MASSIF
This format should be considered in the light of competition.
Owning the base data format is a way to lock customers into a
specific SIEM platform, in this case ArcSight's, because of the
investment in developing translation agents for custom logs and in
deploying these agents in the field. It might be useful to have at
least import capabilities from CEF into MASSIF.
4 http://www.cisco.com/en/US/docs/security/ips/specs/CIDEE_Specification.htm
2.2.4 Critical assessment of the format
Advantages
The Common Event Format promotes interoperability between
various event (or log) generating devices. Although each vendor has
its own format for reporting event information, these event formats
often lack the key information necessary to integrate the events
from their devices.
The ArcSight standard attempts to improve the interoperability
of infrastructure devices by aligning the logging output from
various technology vendors.
The Extension Dictionary from the CEF provides a broad set of
predefined extension keys which cover most event log
requirements.
Issues
Custom extension keys are recommended for use only when no
reasonable mapping of the information can be established for a
predefined CEF key. While the custom extension key mechanism can be
used to safely send information to CEF consumers for persistence,
there are certain limitations as to when and how to access the data
mapped into them.
Data submitted to ArcSight Logger using custom key extensions is
retained in the system; however, it is not available for use in the
Logger reporting infrastructure.
Uses
Use of the CEF format is limited to ArcSight deployments,
despite the vendor's lobbying efforts.
2.3 The Common Log Format (CLF)
2.3.1 Reference
The Common Log Format (CLF) and its sibling the Extended Common
Log Format (ECLF) are specified by the W3C community (footnote 5)
and by the Apache developer community (footnote 6). This format
falls into the category of de-facto standards; while it is widely
adopted by web servers, there is no normative reference.
5 http://www.w3.org/Daemon/User/Config/Logging.html#common-logfile-format
6 http://httpd.apache.org/docs/2.2/logs.html#common
2.3.2 Objectives
The Common Log Format is used by web servers, in particular the
Apache web server, to trace all requests processed by the server. It
is generally shared by all log files (access.log, error.log, and
others). While the Apache web server offers the possibility to
customize the log format, users tend to keep the default
configuration, using either the simple CLF format, or its extension
the ECLF format, which shares the same initial description.
2.3.3 Structure
Structure overview
The CLF format stores the following information:
- IP address of the origin of the request as presented to the
  server. If the requesting browser is behind a proxy, the address of
  the proxy will show up in the logs.
- identd identity of the client as specified in RFC 1413 [8].
- userid of the requester as determined by HTTP authentication.
- Timestamp of the request.
- Request line presented by the client, including the method, the
  URI and the protocol.
- Status code that was returned to the client, indicating how the
  server was able to fulfil the request.
- Size of the object returned to the client.
The ECLF format includes in parentheses, after the information
provided by CLF, additional information provided by the client,
such as the referring URL and the user agent identifying the
client's browser.
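The seven CLF fields listed above can be extracted with a single regular expression. The sketch below is illustrative only: the function name parse_clf and the key names are our own choices, the sample line follows the layout shown in the Apache documentation, and any ECLF additions after the size field are simply ignored. Month names are parsed with %b, which assumes an English (C) locale.

```python
import re
from datetime import datetime

# One group per CLF field: host, identd, userid, [timestamp], "request", status, size.
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<identd>\S+) (?P<userid>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf(line):
    """Parse the CLF part of an access log line into a dict."""
    match = CLF_PATTERN.match(line)
    if match is None:
        raise ValueError("not a CLF line")
    rec = match.groupdict()
    rec["timestamp"] = datetime.strptime(rec["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    # A dash means that no object body was returned.
    rec["size"] = None if rec["size"] == "-" else int(rec["size"])
    # The request line is "METHOD URI PROTOCOL"; the URI may contain spaces.
    method, _, rest = rec["request"].partition(" ")
    uri, _, protocol = rest.rpartition(" ")
    rec.update(method=method, uri=uri, protocol=protocol)
    return rec

line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
rec = parse_clf(line)
print(rec["host"], rec["method"], rec["uri"], rec["status"], rec["size"])
```

A dash in the identd or userid position is kept as the literal string "-", which is how the format marks missing information.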
Links with other data formats
This format is similar in spirit to syslog (one line of
timestamped textual information), but is tailored for web
servers.
Relationship with MASSIF
We expect that all web servers providing information to the
MASSIF platform will use one of these formats.
2.3.4 Critical assessment of the format
Advantages
The CLF format is very easy to use and very informative. Even
though it limits itself to HTTP header information, it synthesizes
the important aspects of the activity of the web server from the
point of view of security: who asked what, when, and how the
server reacted. It is extremely compact and thus efficient in terms
of processing. Being widely adopted by web server developers and
proxy developers, it provides a solid basis for analysis and
detection of malicious activity aiming to subvert the web
server through the use of the HTTP protocol.
Issues
The CLF format does suffer from several issues that have an
impact on the detection and diagnosis of attacks:
Multiplicity of lines: Since the HTTP server may serve multiple
requests for a single page view, a complete diagnosis may require
the analysis of multiple lines which do not necessarily share an
identifying token.
Lack of payload information: The log file does not contain HTTP
payload information. This means that for methods such as POST, the
complete information is not available for diagnosis. This may be a
serious limitation for diagnosing injections such as XSS or SQL
injection, for example if content is pushed into comments in dynamic
web sites.
Lack of server-side information: The log file does not contain
information identifying the web server (such as the virtual server
accessed). This is a serious limitation in identifying the exact
target of the attacker.
Uses
The CLF format is very widely used by web servers.
2.4 The Intrusion Detection Message Exchange Format (IDMEF)
2.4.1 Reference
The Intrusion Detection Message Exchange Format (IDMEF) is
published by the Internet Engineering Task Force (IETF) as RFC
4765 [5].
2.4.2 Objectives
The Intrusion Detection Message Exchange Format (IDMEF) [13] is
intended to be a standard data format that automated intrusion
detection systems can use to report alerts about events that they
deem suspicious. The development of this standard format aims at
enabling interoperability among commercial, open source, and
research systems, allowing users to mix and match the deployment of
these systems according to their strong and weak points to obtain an
optimal implementation. It standardizes messages between a sensor
providing security analysis and detecting threats, and a manager
which receives and processes these messages. In the MASSIF context,
the manager should be either the SIEM platform itself or a gateway
to it.
2.4.3 Structure
Structure overview
IDMEF is built as a UML class diagram of components. The
standard defines two types of messages, Alert (for security
information) and Heartbeat (for management information). A message
is an aggregation of components, modeling various entities that
are part of an intrusion-detection sensor. At the top level, a
message requires a timestamp (CreateTime in IDMEF), a meaning
(Classification in IDMEF) and a generating sensor (Analyzer in
IDMEF). The two other major components are the target and the
source of the attack. Each of these blocks has a complex structure
that attempts to capture the various facets that characterize a
component of an information system. One example of the elementary
components that compose these larger blocks is the notion of Node,
which models a machine and is found in Analyzer, Source and
Target.
Links with other data formats
IDMEF per se does not have links with other formats. However,
several tools, including Prelude, provide mechanisms for parsing log
formats, for example syslog or CLF, and transforming these log
formats into IDMEF messages. This parsing requires knowledge
not only of the source format but also of its semantics, in order to
provide a meaningful conversion.
Relationship with MASSIF
The IDMEF format is the back-end format of Prelude. It is also
used by 6cure and Télécom SudParis for their research activities, to
represent and transmit alert information.
2.4.4 Critical assessment of the format
Advantages
Semantics: IDMEF is extremely conscious of the semantics of the
information it manipulates, and does much more than provide a
syntax. Furthermore, it provides rationales and explanations to
limit interpretation by developers and thus reduce ambiguity. IDMEF
also includes many constants that strongly type objects. While the
manner in which these constants are defined may not be the best, the
idea of strongly typing objects contributes significantly to strong
and clear semantics.
Modularity: IDMEF is built of a set of components and thus is
extremely modular. It also provides facilities for referencing
components instead of including them in the message, which
contributes to efficiency in transferring and sharing identical
information.
Extensibility: IDMEF provides facilities for including structured
information in a message, in the form of the AdditionalData blob.
This facility enables including original messages within IDMEF, or
information that becomes available at a later stage.
Issues
Dissemination: Even though IDMEF is an RFC, it is only an
experimental one and it has not been widely picked up by
security product developers, as sensor developers prefer simpler
and less constrained solutions, and as SIEM developers have
preferred to own their base formats.
XML: IDMEF is an XML format, thus it is quite verbose. While for
transport purposes it compresses quite well, it should not be used
for storing information, nor for developing DB schemas. Also, the
normative reference is the XML DTD and not the XML schema, thus
type checking is less precise.
Extensibility: IDMEF is extensible through the use of XML blobs.
The idea is useful, but there is currently no way to create and
share standard or useful patterns out of these blobs.
Uses
The IDMEF format is used mostly in the research community as a
standard back-end for intrusion detection and alert correlation
research projects and communities. It is also used by the Prelude
SIEM environment (footnote 7) as its back-end data format (although
the companion transport protocol IDXP is not used by Prelude).
2.5 InterFace to Metadata Access Point (IF-MAP)
2.5.1 Reference
- trustedcomputing.org
- http://www.trustedcomputinggroup.org/developers/trusted_network_connect/
- Specification document of IF-MAP 2.0 [11]
- Specification document of IF-MAP Metadata for Network Security [12]
2.5.2 Objectives
IF-MAP is an interface specification between a Metadata Access
Point (MAP) server and entities that either publish metadata to or
subscribe to metadata from the MAP. The entities are called
IF-MAP clients, while the server is referred to as the Metadata
Access Point (MAP) or as the IF-MAP server. The latter provides
functionalities to publish metadata, to search through the stored
metadata, and to enable clients to subscribe to specific data and
be notified when that data changes.
As IF-MAP aims to enable the structured collection and provision
of data, it is not only a language to describe (security) events.
Nevertheless, a specification of a metadata language for network
security is part of IF-MAP [12]. As IF-MAP has been created by the
TNC working group of the Trusted Computing Group, its foremost
purpose is the gathering of information that can be used to make
access decisions in a networking environment. Thus the
metadata comprises elements like registered address
7 http://www.prelude-ids.org/
Figure 2.1: An example metadata graph
bindings, authentication status, endpoint policy compliance
status, endpoint behavior, and authorization status. However, the
specification is open and the process is not finished, so it is
still possible to influence the definition of models for metadata
describing any kind of information.
2.5.3 Structure
Structure overview
The IF-MAP specification currently comprises two documents.
One is the general description and SOAP binding,
TNC_IFMAP_v2_0r36.pdf, also referred to as IF-MAP 2.0 [11]. The
other is the specification of IF-MAP Metadata for Network Security,
which is v1.0 revision 25 at the time of writing this document
[12]. Additionally, for a quick overview, we propose reading the
IF-MAP FAQ under www.trustedcomputing.org.
The session-based communication between a MAP client and server
is always initiated by the client and is based on SOAP. The commands
comprise different kinds of publish (update, delete, etc.),
subscribe (e.g. notification poll) and search.
The data model of IF-MAP comprises two types of data: identifiers
(e.g. identities of several types, mac-address, ip-address) and
metadata, which can be related to each other by links. Figure 2.1
visualises the data model used in IF-MAP, where identifiers are
represented by ovals, metadata is represented by rectangles, and
links are represented by lines connecting identifiers.
Links with other data formats
The metadata description language is XML, thus any event
description based on XML should be easily introduced.
Relationship with MASSIF
The IF-MAP specification allows a publish and subscribe model
for information collection and processing. This could have a
major impact on the different tasks of MASSIF, as it might
provide an interface for the security information. This does not
apply to a single use case only but to all four use cases. It
could even enable a combination of the security information of the
use cases and the different SIEM tools, in order to enable
convergence and collaboration as well as a uniform presentation of
the MASSIF appliances.
2.5.4 Critical assessment of the format
As some of these points have been described in previous
subsections, this section mainly provides bullet points.
Advantages
- Provision of an interface for various kinds of security
  information
- A central database for information based on one protocol
- A simple publish/subscribe data collector
- Standard enables integration of application & system input
  & output from different vendors
- Opportunity to create a vocabulary explicitly for the needs of
  MASSIF and thereby have an influence on the standardisation
  process
- IF-MAP is intrinsically defined to be extensible
- Close contact of SIT with FHH (open source IF-MAP server irond)
  and Infoblox (IF-MAP server IBOS and IF-MAP starter kit), and
  opportunities for cooperation (user group) and dissemination
  (through Infoblox, who are actively advertising every adoption of
  IF-MAP)
Issues
- As the specification of the metadata is not concluded and
  currently only consists of NAC information, there is no
  fully-fledged vocabulary. Nevertheless, one could add additional
  metadata types through the use of other tags.
- The standardisation of IF-MAP is not finished, so the
  specification might evolve during the run of MASSIF.
  Standardisation with the IETF is planned for summer 2011 but
  usually takes several years.
Uses
As the metadata definition does not yet exceed that of network
security information, normal applications according to the TCG
are:
- Federation between remote access and network access control
  (NAC).
- Integration of NAC with endpoint monitoring and e.g. data leak
  detection.
- Integration of physical access control with NAC.
- Federation of authentication information, single sign
  on/off.
- Real time information gathering and processing.
There are many potential applications of specific interest to the
goals of MASSIF. The TCG mentions applications in the fields of
smart grid and cloud security, for reasons that also enable IF-MAP
to facilitate SIEM integration, such as the aggregation,
correlation and distribution of data from various applications and
systems.
2.6 Incident Object Description and Exchange Format (IODEF)
2.6.1 Reference
The Incident Object Description and Exchange Format (IODEF) is
standardized by the Internet Engineering Task Force (IETF) as RFC
5070 [4].
2.6.2 Objectives
The Incident Object Description Exchange Format (IODEF) is a
format for representing computer security information commonly
exchanged between Computer Security Incident Response Teams
(CSIRTs). It provides an XML representation for conveying incident
information across administrative domains between parties that have
an operational responsibility of remediation or a watch-and-warning
over a defined constituency. The data model encodes information
about hosts, networks, and the services running on these systems;
attack methodology and associated forensic evidence; impact of the
activity; and limited approaches for documenting workflow. The
structured format provided by the IODEF allows for increased
automation in processing of incident data; decreased effort in
normalizing similar data from different sources; and a common
format on which to build interoperable tools for incident handling
and subsequent analysis, specifically when data comes from multiple
constituencies.
2.6.3 Structure
Structure overview
The IODEF implementation is specified as an Extensible Markup
Language (XML) document type definition. The data model is
composed of nineteen classes that describe the data related to the
incident (e.g. incident ID, related activity, time, assessment,
history, etc.). The data model serves as a transport format; it does
not attempt to dictate a definition for an incident, it rather
assumes a broad understanding of an incident that is flexible enough
to encompass most operators. Since describing an incident for all
definitions requires an extremely complex data model, the IODEF
intends to be a framework to convey commonly exchanged incident
information, ensuring that there are ample mechanisms for
extensibility to support organization-specific information and
techniques to reference the information kept outside the model.
Links with other data formats
The data model of the Intrusion Detection Message Exchange
Format (IDMEF) influenced the design of the IODEF. The classes of
the data model can be extended through the use of extensible
classes, which provide the ability to have new atomic or
XML-encoded data elements in all of the top-level classes of the
Incident class and a few of the more complicated subordinate
classes.
Similarly, while the IODEF supports different languages, the
data model relies heavily on standardized enumerated attributes
that can crudely approximate the contents of the document. With
this approach, a CSIRT should be able to make some sense of an
IODEF document it receives even if the text-based data elements are
written in a language unfamiliar to the analyst.
Relationship with MASSIF
2.6.4 Critical assessment of the format
Advantages
The overriding purpose of the IODEF is to enhance the
operational capabilities of CSIRTs. Community adoption of the IODEF
provides an improved ability to resolve incidents and convey
situational awareness by simplifying collaboration and data
sharing.
Implementing the IODEF in XML provides numerous advantages. Its
extensibility makes it ideal for specifying a data encoding
framework that supports various character encodings, such as UTF-8
and UTF-16. Likewise, the abundance of related technologies (e.g.,
XSL, XPath, XML-Signature) makes for simplified manipulation.
The data model supports multiple translations of free-form text.
The intent is to allow the identical text to be encoded in different
instances of the same class, but each being in a different
language. This approach allows an IODEF document author to send
recipients speaking different languages an identical document.
Issues
XML is fundamentally a text representation, which makes it
inherently inefficient when binary data must be embedded or large
volumes of data must be exchanged.
In order to support the changing activity of CSIRTs, the IODEF
data model will need to evolve along with them.
Internationalization and localization are of specific concern to
the IODEF, since it is only through collaboration, often across
language barriers, that certain incidents can be resolved. The
IODEF supports this goal by depending on XML constructs, and
through explicit design choices in the data model.
The domain of security analysis is not fully standardized and
must rely on free-form textual descriptions. The IODEF attempts to
strike a balance between supporting this free-form content and
still allowing automated processing of incident information.
As the data encoded by the IODEF might be considered privacy
sensitive by the parties exchanging the information or by those
described by it, care needs to be taken in ensuring the appropriate
disclosure during both document exchange and subsequent processing.
Similarly, care must be taken by the parser to properly authenticate
the recipient of the document and ascribe an appropriate confidence
to the data prior to action.
Uses
We do not have specific information about the actual use of the
IODEF by FIRST or CERT organizations.
2.7 IP Flow Information Export (IPFIX)
2.7.1 Reference
The Internet Protocol Flow Information Export (IPFIX)
requirements are published by the Internet Engineering Task Force
(IETF) as RFC 3917 [10]. The protocol itself is specified in
RFC 5101 [2].
2.7.2 Objectives
The Internet Protocol Flow Information Export (IPFIX) was
created from the need for a standard for exporting Internet Protocol
flow information collected from routers, probes and other devices
and used by mediation systems, accounting/billing systems and
network management systems. The IPFIX standard defines how IP flow
information has to be formatted and transferred from an exporter to
a collector. Previously, many data network operators were relying
on the proprietary Cisco Systems NetFlow standard for traffic flow
information export. The IPFIX Working Group chose NetFlow
version 9 as the basis for the standardization. The working group
submitted the IPFIX Protocol Specification to the IESG for approval
in 2006.
2.7.3 Structure
Structure overview
IPFIX defines a flow as any number of packets observed in a
specific timeslot and sharing a number of properties, like "same
source, same destination, same protocol". The IPFIX protocol
defines a precise architecture for exporting flow data. This
architecture includes an observation point for collecting IP
packets belonging to a specific observation domain. A metering
process filters data packets and aggregates information about these
packets; this information defines the Flow Records. A Flow Record
contains metrics related to packet header data, timestamping,
sampling and classification. Flow Records are sent by the IPFIX
exporter to an IPFIX collector, in charge of receiving and
cataloguing IPFIX packets; exporters and collectors are in a
many-to-many relation and work on a push-based paradigm.
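The flow definition above (packets in the same timeslot sharing the same source, destination and protocol) can be illustrated with a very small metering function. This is a conceptual sketch only: the packet dictionaries, the FlowRecord fields and the 60-second slot are our own assumptions, and a real metering process would also handle sampling, filtering and flow expiry.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass
class FlowRecord:
    packets: int = 0
    octets: int = 0
    first_seen: Optional[float] = None
    last_seen: Optional[float] = None

def meter(packets, slot_seconds=60):
    """Aggregate packets into flow records keyed by (slot, src, dst, proto)."""
    flows = defaultdict(FlowRecord)
    for pkt in packets:
        slot = int(pkt["ts"] // slot_seconds)
        rec = flows[(slot, pkt["src"], pkt["dst"], pkt["proto"])]
        rec.packets += 1
        rec.octets += pkt["length"]
        rec.first_seen = pkt["ts"] if rec.first_seen is None else min(rec.first_seen, pkt["ts"])
        rec.last_seen = pkt["ts"] if rec.last_seen is None else max(rec.last_seen, pkt["ts"])
    return dict(flows)

observed = [
    {"ts": 0.5, "src": "10.0.0.1", "dst": "10.0.0.2", "proto": 6, "length": 100},
    {"ts": 10.0, "src": "10.0.0.1", "dst": "10.0.0.2", "proto": 6, "length": 50},
    {"ts": 61.0, "src": "10.0.0.1", "dst": "10.0.0.2", "proto": 6, "length": 40},
]
for key, record in meter(observed).items():
    print(key, record)
```

The third packet falls into the next timeslot and therefore starts a new flow record, even though its source, destination and protocol are unchanged.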
The layout of the IPFIX data format is transmitted to the collector
by means of Template Records, which can be standard or user-defined.
A Template Record is an n-tuple of (type, size) pairs that entirely
defines the structure and the semantics of a specific set of metrics
sent to the collector. The collector distinguishes different Data
Records by means of their Template ID. Data Records are composed of
a number of Information Elements, which describe the attributes.
Links with other data formats
IPFIX is not strictly related to other data formats, apart from
Cisco Systems NetFlow 9, its predecessor before the standardization.
Despite this isolation, the IPFIX data format could contain
information for feeding an IDMEF message parser/sender: IP source
and destination addresses, IP of target machines, timestamps, data
information. The format translation needs a proper IPFIX collector,
in charge of extracting and classifying the needed information.
Relationship with MASSIF
The IPFIX messages and protocol architecture supply information
sent by several kinds of network devices, routers, sensors and
critical nodes and machines, like network management systems. These
devices are in turn present in almost all the scenarios.
2.7.4 Critical assessment of the format
Advantages
Modularity: The IPFIX architecture and its many-to-many paradigm
are operationally modular and fit well the needs of MASSIF for
a distributed data metering system and for collecting data from
remote sites.
Flexibility: The IPFIX standard, by means of Template Records,
provides solutions to extend the data message format with
user-defined fields, for example for introducing non-standard
Information Elements. Moreover, it allows the definition of the
message structure. The standard works on different transport
protocols, such as TCP, UDP or SCTP.
Interoperability: The IPFIX protocol is standard and can rely on
a large number of compliant devices from several vendors,
reducing the number of ad-hoc solutions.
Extensibility: IPFIX information is not limited to flows; it can
also cover network behavior, performance behavior, application
behavior, host behavior and security analysis, among others.
Issues
Encryption: Analysis of encrypted packets is a relevant issue for
proper data inspection. In encrypted scenarios, IP packet fields
are encrypted and unobservable at several layers, so some metrics,
related for example to protocol headers, cannot be evaluated.
Hardware requirements: Probes must be deployed on every link to
be monitored. Moreover, deep inspection on high bandwidth networks
cannot be sustained by a simple router device.
Collector flooding: Since the protocol is push-based, the
collector could suffer from excessive load coming from the probes.
The exporting configuration must be chosen carefully.
Uses
The IPFIX format is widely implemented and adopted by generic
network devices, like routers, and network analysis devices provided
by several vendors. IPFIX compliant devices are used as support for
effective network measurement, providing vital information on the
health of the managed networks. The collected network information
can be used for several purposes; in particular, the standard
provides a strong back-end for security functionalities, like
Intrusion Detection.
2.8 The Syslog Format
2.8.1 Reference
The Syslog Protocol is standardized by the Internet Engineering
Task Force (IETF) as RFC 5424 [6].
2.8.2 Objectives
The need for a new layered specification has arisen because
standardization efforts for reliable and secure syslog extensions
suffer from the lack of a Standards-Track and transport-independent
RFC. Without this, each other standard needs to define its own
syslog packet format and transport mechanism, which over time will
introduce subtle compatibility issues. The goal of this
architecture is to separate message content from message transport
while enabling easy extensibility for each layer.
2.8.3 Structure
Structure overview
This protocol utilizes a layered architecture, which allows the
use of any number of transport protocols for transmission of syslog
messages. It also provides a message format that allows
vendor-specific extensions to be provided in a structured way. The
syslog protocol does not provide acknowledgment of message delivery.
Though some transports may provide status information,
conceptually, syslog is a pure simplex communication protocol.
The syslog message has the following ABNF[3] definition:
SYSLOG-MSG = HEADER SP STRUCTURED-DATA [SP MSG]
HEADER = PRI VERSION SP TIMESTAMP SP HOSTNAME
SP APP-NAME SP PROCID SP MSGID
PRI = "<" PRIVAL ">"
PRIVAL = 1*3DIGIT ; range 0 .. 191
VERSION = NONZERO-DIGIT 0*2DIGIT
HOSTNAME = NILVALUE / 1*255PRINTUSASCII
APP-NAME = NILVALUE / 1*48PRINTUSASCII
PROCID = NILVALUE / 1*128PRINTUSASCII
MSGID = NILVALUE / 1*32PRINTUSASCII
TIMESTAMP = NILVALUE / FULL-DATE "T" FULL-TIME
FULL-DATE = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
DATE-FULLYEAR = 4DIGIT
DATE-MONTH = 2DIGIT ; 01-12
DATE-MDAY = 2DIGIT ; 01-28, 01-29, 01-30, 01-31
; based on month/year
FULL-TIME = PARTIAL-TIME TIME-OFFSET
PARTIAL-TIME = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND
[TIME-SECFRAC]
TIME-HOUR = 2DIGIT ; 00-23
TIME-MINUTE = 2DIGIT ; 00-59
TIME-SECOND = 2DIGIT ; 00-59
TIME-SECFRAC = "." 1*6DIGIT
TIME-OFFSET = "Z" / TIME-NUMOFFSET
TIME-NUMOFFSET = ("+" / "-") TIME-HOUR ":" TIME-MINUTE
STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
SD-ELEMENT = "[" SD-ID *(SP SD-PARAM) "]"
SD-PARAM = PARAM-NAME "=" %d34 PARAM-VALUE %d34
SD-ID = SD-NAME
PARAM-NAME = SD-NAME
PARAM-VALUE = UTF-8-STRING ; characters '"', '\' and ']'
; MUST be escaped.
SD-NAME = 1*32PRINTUSASCII
; except '=', SP, ']', %d34 (")
MSG = MSG-ANY / MSG-UTF8
MSG-ANY = *OCTET ; not starting with BOM
MSG-UTF8 = BOM UTF-8-STRING
BOM = %xEF.BB.BF
UTF-8-STRING = *OCTET ; UTF-8 string as specified
; in RFC 3629
OCTET = %d00-255
SP = %d32
PRINTUSASCII = %d33-126
NONZERO-DIGIT = %d49-57
DIGIT = %d48 / NONZERO-DIGIT
NILVALUE = "-"
Syslog message size limits are dictated by the syslog transport
mapping in use. There is no upper limit per se. Each transport
mapping defines the minimum value of the maximum message length
that must be supported, and this value must be at least 480 octets.
The TIMESTAMP field is a formalized timestamp derived from
[RFC3339].
The HOSTNAME field identifies the machine that originally
sent the syslog message.
The APP-NAME field should identify the device or application that
originated the message. It is a string without further semantics.
It is intended for filtering messages on a relay or collector.
The PROCID field is a value that is included in the message, having
no interoperable meaning, except that a change in the value
indicates there has been a discontinuity in syslog reporting. The
field does not have any specific syntax or semantics; the value is
implementation-dependent and/or operator-assigned.
The MSGID should identify the type of message. For example, a
firewall might use the MSGID "TCPIN" for incoming TCP traffic and
the MSGID "TCPOUT" for outgoing TCP traffic. Messages with the same
MSGID should reflect events of the same semantics. The MSGID itself
is a string without further semantics. It is intended for filtering
messages on a relay or collector.
STRUCTURED-DATA provides a mechanism to express information in a
well defined, easily parseable and interpretable data format. There
are multiple usage scenarios.
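As an illustration of how the STRUCTURED-DATA grammar can be consumed, the following is a minimal sketch in Python, not a complete RFC 5424 parser. The function name `parse_structured_data` and the returned dictionary layout are assumptions made for this example. As a simplification, the sketch does not check that names are restricted to printable US-ASCII.

```python
import re

# SD-ELEMENT = "[" SD-ID *(SP SD-PARAM) "]"
# SD-NAME: 1 to 32 characters, excluding '=', SP, ']' and '"'.
_NAME = r'[^ =\]"]{1,32}'
# PARAM-VALUE: '"', '\' and ']' may only appear in escaped form.
_VALUE = r'(?:\\.|[^"\\\]])*'
_SD_ELEMENT = re.compile(r'\[(%s)((?: %s="%s")*)\]' % (_NAME, _NAME, _VALUE))
_SD_PARAM = re.compile(r' (%s)="(%s)"' % (_NAME, _VALUE))
_ESCAPE = re.compile(r'\\(["\\\]])')


def parse_structured_data(text):
    """Parse a STRUCTURED-DATA field into {sd_id: {param_name: value}}."""
    if text == "-":  # NILVALUE: no structured data present
        return {}
    if not text:
        raise ValueError("STRUCTURED-DATA must be NILVALUE or 1*SD-ELEMENT")
    result = {}
    pos = 0
    while pos < len(text):
        match = _SD_ELEMENT.match(text, pos)
        if match is None:
            raise ValueError("malformed SD-ELEMENT at offset %d" % pos)
        sd_id, params = match.group(1), match.group(2)
        # Undo the escaping of '"', '\' and ']' inside each PARAM-VALUE.
        result[sd_id] = {
            name: _ESCAPE.sub(r"\1", value)
            for name, value in _SD_PARAM.findall(params)
        }
        pos = match.end()
    return result
```

For instance, `parse_structured_data('[exampleSDID@32473 iut="3" eventID="1011"]')` yields one entry keyed by `exampleSDID@32473` with the parameters `iut` and `eventID`.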
Links with other data formats
Relationship with BSD Syslog, RFC 3164[9].
Relationship with MASSIF
Given its widespread use, we expect many of the use cases to
partially rely on it. Beyond the project, supporting syslog is an
absolute requirement for commercial success of a SIEM platform, be
it as software or as a managed security service.
2.8.4 Critical assessment of the format
Advantages
The syslog format tries to provide a solid basis that allows
code to be written once for each syslog feature rather than once for
each transport. Without this format, each other standard would need
to define its own syslog packet format and transport mechanism,
which over time will introduce subtle compatibility issues.
Issues
The protocol allows message content to contain control characters,
including the NUL character. In addition, invalid UTF-8 sequences may
be used by an attacker to inject ASCII control characters. Similarly,
message truncation can be misused by an attacker to hide vital log
information.
There is no mechanism in the syslog protocol to detect message
replay. An attacker may record a set of messages that indicate
normal activity of a machine. At a later time, that attacker may
remove that machine from the network and replay the syslog messages
to the relay or collector.
Some messages may be lost because there is no mechanism to
ensure delivery, and the underlying transport may be unreliable
(e.g., UDP).
Syslog can generate unlimited amounts of data. The transfer of
this data over UDP is generally problematic, since UDP lacks
congestion control mechanisms.
The syslog protocol does not have mechanisms to provide
confidentiality for the messages in transit.
Network administrators must take the time to estimate the
appropriate capacity of the syslog collector. An attacker may
perform a Denial of Service attack by filling the disk of the
collector with false messages.
Uses
Syslog is in widespread use, both for UNIX operating system
hosts and for networking equipment.
2.9 Windows Management Instrumentation (WMI)
2.9.1 Reference
Windows Management Instrumentation (WMI) is the Microsoft
implementation (footnote 8) of Web-Based Enterprise Management
(WBEM), which is an industry initiative to develop a standard
technology for accessing management information in an enterprise
environment.
WMI uses the Common Information Model (CIM) (footnote 9) industry
standard to represent systems, applications, networks, devices, and
other managed components. CIM is developed and maintained by the
Distributed Management Task Force (DMTF). The Managed Object
Format (MOF) (footnote 10) language is used to create new CIM classes.
2.9.2 Objectives
The main target of WMI is to provide a standard for sharing
management information between Windows-based management applications
throughout the network. The aim of this set of specifications is to
establish a uniform model that allows working in different
environments and interacting with other existing management standards
to access information from any source, such as DMI (Desktop
Management Interface) or SNMP.
8 http://msdn.microsoft.com/en-us/library/aa384642(v=VS.85).aspx
9 http://www.dmtf.org/standards/cim
10 http://msdn.microsoft.com/en-us/library/aa823192%28v=vs.85%29.aspx
2.9.3 Structure
Structure overview
Microsoft WMI implements the three-tiered model of the WBEM
architecture for working with management data, which in this case
includes the following components: a standard mechanism for storing
object definitions (a CIM-compliant object repository), a
standard protocol for collecting and distributing management data
(such as COM/DCOM), and one or more Win32 dynamic-link libraries
(DLLs) that function as WMI data providers.
The following diagram shows the data flow in the WMI architecture
(footnote 11):
Figure 2.2: Windows Management Instrumentation architecture data flow
11 http://msdn.microsoft.com/en-us/library/ff566343%28v=VS.85%29.aspx
It is important to highlight that WMI is an object model and not
a language. Several scripting languages, such as VBScript or
Windows PowerShell, can be used with WMI to manage the different
Windows-based servers locally and remotely.
Windows Management Instrumentation defines the objects,
methods and properties which are needed to access the management
information data from the different parts of the operating
system. The model that WMI uses to store this information is the
standard Common Information Model (CIM).
According to the CIM Specification 2.3 (footnote 12), there are three
different levels of classes in the CIM model for storing
information: the Core, Common and Extension classes.
The core model defines an information model that applies to all
areas of management.
The common model applies to information that is common to
particular management areas (such as systems, applications, networks
and devices) but which is independent of a particular
implementation or technology.
The extension schemas are extensions to the common model for a
specific technology, for example for different operating systems
such as Microsoft Windows or Unix.
On the other hand, according to the CIM definition provided by
the DMTF, CIM is composed of a specification and a schema. The
specification defines the details for integration with other
management models, while the schema provides the actual model
descriptions.
The specification can be described in Unified Modeling Language
(UML), Managed Object Format (MOF), or Extensible Markup Language
(XML). To create and describe classes in the Common Information
Model (CIM), however, the Managed Object Format (MOF) (footnote 13)
is the most widely used language.
Links with other data formats
WMI is an implementation of Web-Based Enterprise Management
(WBEM) and is fully compliant with the Common Information Model
(CIM), defined by the DMTF, which is based upon UML.
MOF, the language that is used for describing the CIM classes,
is based on the Interface Definition Language (IDL).
It is possible to use Windows Remote Management (WinRM) instead
of the Distributed Component Object Model (DCOM) to obtain remote
WMI management data. WinRM uses the SOAP-based WS-Management
protocol, whose messages are formatted in XML.
Relationship with MASSIF
In the Olympic Games scenario there are Windows systems where
WMI might be used to grab the logs, but at present the standard
format is enforced and these logs are converted to syslog.
12 http://www.dmtf.org/sites/default/files/standards/documents/DSP0004V2.3_final.pdf
13 http://www.dmtf.org/sites/default/files/standards/documents/DSP0004_2.6.0_0.pdf
2.9.4 Critical assessment of the format
Advantages
WMI is widely present in Windows-based applications, so it is a
common way to access and share management information from local
and remote computers. In addition, a variety of scripting languages
(such as VBScript or Perl) can be used in enterprise
applications and administrative scripts to obtain WMI data or take
actions through WMI.
CIM permits both a common model that applies to
all areas and particular extensions to define different management
information for systems, networks, applications, devices and
services. This feature allows building semantically rich management
information that will be exchanged throughout the network.
Issues
The WMI log files are being replaced by Event Tracing for
Windows (ETW). Vulnerabilities can be found in some applications
that use Windows Management Instrumentation.
For example, in some applications, due to insufficient security
protections on WMI providers, a local attacker could gain elevated
privileges on the local system and use them to take control of
it.
Uses
WMI scripts and applications are used to obtain and exchange
management information on Windows-based systems. These scripts
allow performing administrative tasks on parts of the operating
system as well as sharing management data with different products,
such as Microsoft System Center Operations Manager or Windows
Remote Management (WinRM).
2.10 WS-Eventing and WS-Notification
2.10.1 Objectives
WS-Eventing[1] and WS-Notification[7] are two competing
specifications to standardize message formats and Web services
interfaces for subscription management and notification delivery in
WS-based event notification systems. A WS-based event
notification system utilizes Web services technologies to deliver
event notifications and manage subscriptions. In such a system, a
SOAP-formatted subscription is sent to an event producer Web service,
requesting that a certain kind of event notification be delivered to
one or more event consumer Web services. As events occur, the event
consumer Web services receive SOAP-formatted notification messages.
The notification messages can be transported through intermediaries
and use different transport mechanisms.
2.10.2 Structure
Architecture
The architectures presented in WS-Eventing and WS-Notification
are remarkably similar, despite their incompatibility. In
fact, subsequent versions of each specification have converged
towards each other, borrowing concepts from the other to mitigate
their own deficiencies.
WS-Eventing and WS-Notification both possess an identical WS-based
architecture and follow the Publisher/Subscriber design. Both define
subscriber and subscription manager entities. The event sink defined
in WS-Eventing is comparable to the notification consumer defined
in WS-Notification. The subscribers are separated from notification
consumers such that notification consumers are required to handle
only the received notification messages. They are not required to
know the message broker location or to manage subscriptions.
WS-Eventing does not separate the publisher from the event
source. The event source in WS-Eventing has the functions of both the
notification producer and the publisher defined in WS-Notification.
Function
WS-Eventing defines five operations, namely Subscribe, Renew,
GetStatus, Unsubscribe and SubscriptionEnd. The Subscribe
operation is used to create a subscription for an event sink. The
Renew, GetStatus and Unsubscribe operations are provided by
subscription managers so that subscribers can manage their existing
subscriptions. If an event source terminates unexpectedly, a
SubscriptionEnd message is generated and sent to the address
specified in the subscription request. If that address is not present
in the subscription request, this SubscriptionEnd message is not
generated.
WS-Notification has operations comparable to the above five
operations. Even though it does not define GetStatus and
SubscriptionEnd operations, they can be implemented using the
(optional) WS-ResourceFramework, since WS-Notification can treat
subscriptions as WS-Resources as defined in the WS-ResourceFramework
specification.
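The life cycle of a WS-Eventing subscription described above can be sketched with a small in-memory model in Python. This is only an illustration of the five operations: it exchanges no SOAP messages, and the class names, method names and the `outbox` list are assumptions made for this example rather than anything defined by the specification.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subscription:
    sink: str                      # address that receives notifications
    expires: float                 # absolute expiry time, in seconds
    end_to: Optional[str] = None   # optional address for SubscriptionEnd


class EventSource:
    """Toy event source that also acts as its own subscription manager."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.subscriptions = {}
        self.outbox = []  # (address, message) pairs standing in for SOAP messages

    def subscribe(self, sink, now, duration, end_to=None):
        """Subscribe: create a subscription for an event sink."""
        sub_id = "sub-%d" % next(self._ids)
        self.subscriptions[sub_id] = Subscription(sink, now + duration, end_to)
        return sub_id

    def renew(self, sub_id, now, duration):
        """Renew: extend the expiry of an existing subscription."""
        self.subscriptions[sub_id].expires = now + duration

    def get_status(self, sub_id):
        """GetStatus: report when the subscription expires."""
        return self.subscriptions[sub_id].expires

    def unsubscribe(self, sub_id):
        """Unsubscribe: delete the subscription."""
        del self.subscriptions[sub_id]

    def terminate_unexpectedly(self):
        """SubscriptionEnd: only sent if the request named an address."""
        for sub_id, sub in self.subscriptions.items():
            if sub.end_to is not None:
                self.outbox.append((sub.end_to, "SubscriptionEnd " + sub_id))
        self.subscriptions.clear()
```

In this model, a subscription created without `end_to` disappears silently when the source terminates, which mirrors the rule that no SubscriptionEnd message is generated when the address is absent from the request.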
Delivery mode
Both WS-Eventing and WS-Notification can use push, pull and
wrapped modes to deliver notification messages. The wrapped delivery
mode can encapsulate several notification messages into one for
efficient delivery. The pull mode enables the event sink or
notification manager to check an event source periodically for
relevant events. In push mode, the event source waits for an
acknowledgement for the notification message it sends.
Filters
WS-Notification defines three types of message filters, namely
TopicExpression, ProducerProperties and MessageContent. A subscriber
can use any or all of these filters. WS-Eventing allows at most one
filter in subscription requests. The default filter is a
content-based filter using an XPath expression, in a specified
dialect, that evaluates to a Boolean value used as the filtering
criterion. WS-Eventing does not specify a way to filter messages
using ProducerProperties of publishers.
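A content-based filter of this kind can be approximated with the limited XPath subset of the Python standard library. The sketch below is a simplification under stated assumptions: the notification layout (`notification`, `alert`, `severity`, `host`) is hypothetical, and a path is treated as "true" when it selects at least one element, which is not a full XPath Boolean evaluation.

```python
import xml.etree.ElementTree as ET


def matches(notification_xml, path):
    """Return True if the XPath-like path selects an element of the message."""
    root = ET.fromstring(notification_xml)
    return root.find(path) is not None


def deliver(notifications, path):
    """Keep only the notifications accepted by the subscription filter."""
    return [n for n in notifications if matches(n, path)]


HIGH = ("<notification><alert><severity>high</severity>"
        "<host>web01</host></alert></notification>")
LOW = ("<notification><alert><severity>low</severity>"
       "<host>web02</host></alert></notification>")

# A single filter per subscription, as WS-Eventing allows at most one.
accepted = deliver([HIGH, LOW], "alert[severity='high']")
```

Here `accepted` contains only the first notification, since the second one has no `alert` element whose `severity` child equals `high`.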
Links with other data formats
The WS-Eventing and WS-Notification specifications are composable
with other WS-* specifications. Hence they only define the key
publisher/subscriber functions and rely on other WS-*
specifications to provide various value additions such as security,
reliability and transactions. For instance, WS-Security can be used
with WS-Eventing or WS-Notification to provide secure delivery of
messages.
Relationship with MASSIF
Both specifications are candidates for receiving events from web
services platforms.
2.10.3 Advantages of the formats
Both specifications provide means to develop distributed event
notification systems utilizing existing Web services technology, which
intrinsically provides vendor-independent, platform-independent and
programming-language-independent interoperability.
They are composable with other WS-* specifications to provide
various value additions such as secure delivery, reliability and
transactions.
They fit well with the asynchronous Web services invocation paradigm.
Data source | Characteristics (SIEM / Standard / Experience) | Rationale summary
CEF | Y(1) / N / Y | CEF is an interesting glance at data collection from an important SIEM vendor and is a public specification.
CLF | Y(all) / Y / Y | CLF is a major log format for web servers, being supported by Apache out of the box. It can be directly integrated in many SIEMs, e.g. Prelude and RSA.
IDMEF | Y(1) / Y / N | While IDMEF is not widely used in the community, and its important overhead may prevent its further diffusion, it does provide a reference viewpoint for modeling alert information. At least 2 MASSIF partners have experience with IDMEF.
IF-MAP | N / Y / Y | IF-MAP is a recent newcomer and has industrial backing, although outside the SIEM community so far. One MASSIF partner has experience with IF-MAP.
IODEF | N / Y / N | IODEF addresses a different community than the classic SIEM world, so provides an additional, alternative viewpoint about decision support modeling, that has to our knowledge no equivalent, and that is important for the MASSIF decision support components.
IPFIX | N / Y / Y | IPFIX is becoming increasingly important in the networking world, where it may provide an alternative or a complement for syslog.
Syslog | Y(all) / Y / Y | This is the major data source. It is clearly used a lot in SIEMs, has standards backing and is used by professionals. It is the de-facto data source standard for the ATOS use case and for many network operators. While the analysis of syslog messages needs to be refined to really understand the content, it does provide a first entry point for syntactic and semantic analysis.
WMI | Y / Y / Y | WMI is one of the major interfaces for managing Microsoft Windows systems, and as such is a way to retrieve information from them, that is of interest to the MASSIF project.
WS-Eventing | N / Y / Y | While these languages are currently rarely included in SIEM environments, the focus of MASSIF on business processes attack detection makes these languages important.
Table 2.2: Included log sources summary
Data source | Characteristics (SIEM / Standard / Experience) | Rationale summary
ODBC | Y / N / N | While ODBC is used as a collection mechanism, it should be considered with caution. We believe that its use is oriented to Windows environments, and WMI provides a better alternative. Also, it is purely about transport and does not provide us with information about the data, thus is considered out of scope of this deliverable.
SNMP | Y / Y / N | While SNMP is cited as a collection mechanism by several SIEMs, its use seems to be limited to transporting data. The management information bases used by SIEMs would have been in scope, but SIEM products do not publicly document this, and the transport protocol only is out of the scope of this deliverable.
Log file pull | Y / N / Y | Several methods for pulling out log files are mentioned in SIEM documentation, such as FTP, SFTP, SSH or SCP. This does not provide information about the content of the information handled, thus does not fall into the scope of this deliverable.
Table 2.3: Eliminated log sources summary
Chapter 3
Use-case specific data streams
3.1 Olympic Games Scenario
3.1.1 Motivation and description
The Olympic Games SIEM definition follows business drivers; that
is, the definition is tied to the specific technology that the customer
(the Local Organizing Committee) decides on. Usually this decision
follows sponsorship interests.
Hence, the choice of event processing languages in the Olympic Games
Scenario is tied to the specific SIEM product development context. The
choice of the event processing protocol influences the internal
representation of the event data, its transmission and its storage,
but in any case it is usually tied to the specific SIEM
product. Current contexts are based on the Novell SIEM product
(i.e. Novell Sentinel 6.1 in the Vancouver Winter Olympic Games
project) and only two different protocols were used in the last
Olympic Games: Syslog and LEA.
The Olympic Games SIEM uses the Novell Sentinel product. Novell
Sentinel 6.1 (footnote 1) delivers real-time monitoring and
remediation for automated security and compliance. With a single view
of security and compliance events across the enterprise, Sentinel 6.1
combines identity management and security event management in
real time. Sentinel 6 streamlines labor-intensive and error-prone
processes, cuts costs through automation, and enables the delivery of
a more rigorous security and compliance program.
1 http://www.novell.com/products/sentinel/
3.1.2 Novell Sentinel Interface: Syslog data format
Description
Syslog (footnote 2; see section 2.8) is a standard for logging
program messages. It allows separation of the software that generates
messages from the system that stores them and the software that
reports and analyzes them. It also provides devices, which would
otherwise be unable to communicate, a means to notify administrators
of problems or performance.
There are three main topics when defining the Olympic Games
related events and languages:
1. How to collect data: transmission, syslog, WMI, SNMP, etc.
2. How to parse the data: format, spaces and commas.
3. How to make sense out of the collected data: meaning/logic of
the fields posed by the monitored application/system.
Mapping these three topics into Novell Sentinel 6.1 we get the
following Novell components:
Sources are systems that are being monitored.
Connectors define connectivity protocols. Only two different
protocols were used in the last Olympic Games: Syslog and LEA.
Collectors define parsing rules and the mapping of the internal data
presentation into the Sentinel taxonomy. Examples of collectors used
in the Olympic Games were Windows (through Snare agents), Sourcefire,
Nortel switches/routers and Sophos Antivirus.
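The division of labour between sources, connectors and collectors can be pictured with a short Python sketch. It does not use or reproduce any Novell Sentinel interface: the function names, the `key=value` log layout and the target field names (`source_ip`, `target_ip`, `event_name`) are all hypothetical and chosen only for illustration.

```python
def firewall_collector(raw):
    """Collector: parse 'key=value' pairs and map them to common field names."""
    fields = dict(pair.split("=", 1) for pair in raw.split() if "=" in pair)
    return {
        "source_ip": fields.get("src"),
        "target_ip": fields.get("dst"),
        "event_name": fields.get("action", "unknown").upper(),
    }


def run_pipeline(connector, collector):
    """Feed every raw record delivered by a connector through a collector."""
    return [collector(line) for line in connector]


# The connector is modelled as any iterable of raw lines received from a source.
received = ["src=10.0.0.1 dst=10.0.0.2 action=deny"]
events = run_pipeline(received, firewall_collector)
```

The point of the separation is that the same connector (for example a syslog listener) can feed different collectors, each of which knows the format and meaning of the fields produced by one kind of source.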
Advantages
Syslog provides flexibility when dealing with different SIEM
products and is a very widely used log format.
Syslog is the preferred (de facto) format in the Olympic Games
scenario.
Drawbacks and issues
We have used Syslog as the native logging function built into the
network devices, e.g. switches/routers, IDS, FW appliances, etc.
These devices cannot speak IDMEF or similar formats.
2 http://www.syslog.org/
When monitoring Windows systems we might have used WMI to grab the
logs, but we still enforced the standard format and moved to
syslog by deploying Snare agents on each Windows system to translate
Eventlog into Syslog.
Examples
The following are examples of valid syslog messages. A
description of each example can be found below it. The examples are
based on similar examples from RFC 3164[9] and may be familiar to
readers. The otherwise-unprintable Unicode BOM is represented as
"BOM" in the examples.
Example 1 - with no STRUCTURED-DATA
<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - BOM'su root' failed for lonvick on /dev/pts/8
In this example, the VERSION is 1 and the Facility has the value
of 4. The Severity is 2. The message was created on 11 October 2003
at 10:14:15pm UTC, 3 milliseconds into the next second. The message
originated from a host that identifies itself as
"mymachine.example.com". The APP-NAME is "su" and the PROCID is
unknown. The MSGID is "ID47". The MSG is "'su root' failed for
lonvick...", encoded in UTF-8. The encoding is defined by the BOM.
There is no STRUCTURED-DATA present in the message; this is indicated
by "-" in the STRUCTURED-DATA field.
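Facility and Severity are not transmitted as separate fields: they are packed into the PRI value at the start of the message, with PRI = Facility * 8 + Severity, so that Facility 4 and Severity 2 correspond to a PRI of 34. The following Python sketch shows how the header fields discussed above could be extracted. The function name and dictionary keys are assumptions made for this example, and the remainder of the line (STRUCTURED-DATA and MSG) is deliberately left unparsed.

```python
def parse_syslog_header(line):
    """Split a syslog line into PRI-derived values and HEADER fields."""
    if not line.startswith("<") or ">" not in line:
        raise ValueError("missing PRI")
    pri_text, rest = line[1:].split(">", 1)
    pri = int(pri_text)
    # The six HEADER fields are separated by single spaces; whatever
    # follows (STRUCTURED-DATA and MSG) is kept together as one string.
    version, timestamp, hostname, app_name, procid, msgid, remainder = \
        rest.split(" ", 6)
    return {
        "facility": pri // 8,
        "severity": pri % 8,
        "version": int(version),
        "timestamp": timestamp,
        "hostname": hostname,
        "app_name": app_name,
        "procid": procid,
        "msgid": msgid,
        "remainder": remainder,
    }


EXAMPLE = ("<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - "
           "\ufeff'su root' failed for lonvick on /dev/pts/8")
header = parse_syslog_header(EXAMPLE)
```

With this input, `header["facility"]` is 4 and `header["severity"]` is 2. By the same arithmetic, a PRI of 165 decodes to Facility 20 and Severity 5.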
Example 2 - with no STRUCTURED-DATA
<165>1 2003-08-24T05:14:15.000003-07:00 192.0.2.1 myproc 8710 - - %% It's time to make the do-nuts.
In this example, the VERSION is 1. The Facility is 20, the
Severity 5. The message was created on 24 August 2003 at 5:14:15am,
with a -7 hour offset from UTC, 3 microseconds into the next
second. The HOSTNAME is "192.0.2.1", so the syslog application did not
know its FQDN and used one of its IPv4 addresses instead. The
APP-NAME is "myproc" and the PROCID is "8710" (for example, this
could be the UNIX PID). There is no STRUCTURED-DATA present in the
message; this is indicated by "-" in the STRUCTURED-DATA field. There
is no specific MSGID and this is indicated by the "-" in the MSGID
field. The message is "%% It's time to make the do-nuts.". As the
Unicode BOM is missing, the syslog application does not know the
encoding of