Cartographer, or Building a Next-Generation
Management Framework
Bobby Krupczak, Chief Scientist
Krupczak.org, [email protected]
http://www.krupczak.org/cartographer
Overview
Background
Overview of network mgmt today
Cartographer
Yet another management framework
Software technology
Demo
Who Am I?
BS CISE from UF 1989
Worked in industry on SNMP
MS CS GaTech 1993
Co-founder of Empire Technologies
PhD CS GaTech 1997
Sold Empire to Concord 1999
Krupczak.org 2003
Management Model
Mgmt info is virtual representation
Managers, agents exchange mgmt info
Mgmt is therefore:
Inspection of
Alteration of
Creation of
Deletion of mgmt info
First-Generation
Dumb, lightweight (hopefully) agents
Heavyweight, complex, smart managers
Traditional command-control
Scaling becomes issue
Analogous to CEO managing entire enterprise
2nd-Generation
Push intelligence outwards towards agent
Empire/SystemEDGE, RMON
Increase scaling, reduce reaction time
Some delegation, middle-managers, remote pollers
Exception-management, event
de-duplication, root-cause
2nd-Generation (continued)
Agents still work in isolation (stovepipes)
Distribution overhead and agent administrative footprint still non-trivial
SNMPv1, v2c, v3 now deployed
Agent backlash?
CEO now has a bank of VPs but still manages/controls the enterprise
Cartographer
Discover, track relationships between components in distributed system
Dependencies between network, system, applications
Include network services as well as higher-layer abstractions
Agent based
Topography not topology
Others have examined this approach though mostly in academic research papers
Cartographer (II)
Model relationships using dependency graph borrowed from graph theory branch of mathematics
Systems represented as vertexes
Dependencies represented as edges
Directed graphs
System is server if it provides service to some client
System is client if it consumes service
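A minimal sketch of the dependency graph described above, assuming a plain in-memory representation. The class name, method names, and the example hosts (web1, db1, laptop7) are hypothetical illustrations, not Cartographer's actual data structures.

```python
from collections import defaultdict


class DependencyGraph:
    """Directed graph: an edge client -> server means the client consumes a service."""

    def __init__(self):
        self.servers_of = defaultdict(set)  # client -> {(server, service)}
        self.clients_of = defaultdict(set)  # server -> {(client, service)}

    def add_dependency(self, client, server, service):
        # Record the edge in both directions so either end can be queried cheaply.
        self.servers_of[client].add((server, service))
        self.clients_of[server].add((client, service))

    def is_server(self, system):
        # A system is a server if it provides a service to some client.
        return bool(self.clients_of.get(system))

    def is_client(self, system):
        # A system is a client if it consumes a service.
        return bool(self.servers_of.get(system))


graph = DependencyGraph()
graph.add_dependency("web1", "db1", "mysql")
graph.add_dependency("laptop7", "web1", "http")
```

In this example web1 is both a client (of db1) and a server (to laptop7), which is why roles are derived from edges rather than stored per system.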
Example Dependency Graph
What Do We Do With Data?
Discover, analyze dependencies
Diagnose and troubleshoot faults
Security spinoff
Monitor, test, & compare service experiences
Work bottom-up
But I Already Know My Network
You may be surprised what you find
Distributed systems are highly dynamic, not static
Automating management necessitates capturing this info and encoding it
What Do We Do? Discover/Analyze
Discover dependencies via:
OS and app configuration: /etc, .ini files, and the Windows registry
System APIs
Dynamically via protocol endpoints (IPv4 and IPv6)
Classify into ~ 30 different types
Inbound/outbound/transit
Per-system, per-user, per-app
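One way dynamic discovery via protocol endpoints might work is sketched below: a connection whose local port is one the system listens on is treated as inbound, anything else as outbound. This heuristic, the function name, and the sample addresses are assumptions for illustration; the transit class and the roughly 30 dependency types are not modeled.

```python
def classify(connections, listening_ports):
    """Tag each connection 'inbound' or 'outbound' based on the local port."""
    result = []
    for local_addr, local_port, remote_addr, remote_port in connections:
        if local_port in listening_ports:
            # A peer connected to a port we serve: someone depends on us.
            result.append(("inbound", remote_addr, local_port))
        else:
            # We connected out to a remote service: we depend on it.
            result.append(("outbound", remote_addr, remote_port))
    return result


# Hypothetical endpoint table: (local_addr, local_port, remote_addr, remote_port)
listening = {22, 80}
connections = [
    ("10.0.0.5", 80, "10.0.0.9", 51514),
    ("10.0.0.5", 40122, "10.0.0.2", 3306),
    ("fe80::1", 22, "fe80::2", 60001),
]
dependencies = classify(connections, listening)
```

The same logic applies to IPv4 and IPv6 endpoints because only the port decides the direction.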
What Do We Do? Discover/Analyze (II)
Dependencies tell us what a machine is doing
Validate configuration and operation
Discover misconfiguration
Seed automatic configuration for monitoring
If DB server => automatically monitor components
What Do We Do? Diagnose
Who/what is impacted?
If key app dies => know who is impacted
Determine root cause/impact
Given fault, which clients are affected?
Given a client, what faults are affecting it?
We know service A depends on X, Y, Z
If A fails, examine X, Y, Z
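Both diagnostic questions can be answered by walking the dependency graph: forward from a failing service to find what to examine, and backward from a faulty component to find who is impacted. The sketch below assumes a simple dictionary of edges and a breadth-first search; the component names beyond A, X, Y, Z and the function names are hypothetical.

```python
from collections import deque

# client -> set of servers it depends on (hypothetical example data)
DEPENDS_ON = {
    "A": {"X", "Y", "Z"},
    "X": {"dns1"},
    "Y": set(),
    "Z": {"dns1"},
    "B": {"Y"},
}


def reachable(start, edges):
    """Breadth-first search: every node reachable from start by following edges."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def invert(edges):
    """Turn client -> servers into server -> clients."""
    inverted = {}
    for client, servers in edges.items():
        for server in servers:
            inverted.setdefault(server, set()).add(client)
    return inverted


def suspects(service):
    """If this service fails, which components should be examined?"""
    return reachable(service, DEPENDS_ON)


def impacted(component):
    """If this component fails, which clients are affected?"""
    return reachable(component, invert(DEPENDS_ON))
```

Here `suspects("A")` includes X, Y, Z and also dns1, since the search follows indirect dependencies as well.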
What Do We Do? Security Spinoff
Track dependencies and interactions longterm
Develop model of typical behavior/role of system/app
Deviations from baseline could indicate issues
Social networking for computers
If my machine starts communicating with those in China . . . .
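As a rough illustration of baseline deviation, the sketch below compares observed interactions against a previously learned set. Representing the baseline as a set of (remote system, service) pairs is an assumption made here for brevity; a real behavioral model would likely be richer.

```python
def deviations(baseline, observed):
    """Return observed (remote, service) pairs that never appear in the baseline."""
    return sorted(set(observed) - set(baseline))


# Hypothetical long-term baseline and one day's observations
baseline = {("db1", "mysql"), ("mail1", "smtp")}
observed = [("db1", "mysql"), ("203.0.113.7", "ssh")]
unusual = deviations(baseline, observed)
```

Anything left in `unusual` is only a candidate for investigation, not proof of a problem.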
What Do We Do? Compare Service Experience
Do you see what I see?
Use dependency data to automatically test services
Global, centralized testing
Per-system active testing
Per-system passive monitoring
Detect localized hot-spots
Pinpoint infrastructure problems
What Is Next Generation About This?
Started with observations about how human corporations work
CEO sets broad policies and goals
Employees implement them, solve problems, run the show
Managers and agents become peers
Further push intelligence and command/control downward and outward
P2P architecture utilized
Every agent acts in dual role
Peer-to-Peer
Not based on polling and storing of data in central repository
Not to say this isn't important
Agents self-organize into p2p overlay networks
Exchange information with peers
Run distributed algorithms
Self-propagate, self-update
What Is A Peer?
Systems are peers if they both utilize same service from same server
Many p2p overlays
Increase scaling (unlimited?)
Reduce reaction time
Analyze more up-to-date info
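The peer definition above can be sketched by grouping clients on the (server, service) pair they share. The function name and sample hosts are hypothetical, and how agents actually find each other is not shown.

```python
from collections import defaultdict


def overlays(dependencies):
    """Group clients by (server, service); each group with 2+ members is one overlay."""
    groups = defaultdict(set)
    for client, server, service in dependencies:
        groups[(server, service)].add(client)
    # A lone client has no peers, so it forms no overlay.
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical dependency triples: (client, server, service)
deps = [
    ("web1", "db1", "mysql"),
    ("web2", "db1", "mysql"),
    ("web1", "dns1", "dns"),
    ("laptop7", "dns1", "dns"),
    ("laptop7", "mail1", "smtp"),
]
peer_groups = overlays(deps)
```

Because web1 appears in two groups, a single system can belong to many overlays at once.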
Example P2P Overlay
New Management Framework?
Why re-invent the wheel?
Could make existing Internet Management Framework (IMF) work given enough tape and glue
SNMPvX still too cumbersome, inefficient
Protocol limitations
ASN.1/BER too brittle and prone to interoperability problems
WBEM/CIM too heavyweight, complex
Spend all day modeling, not managing
Some existing work applying XML to IMF
XML Management Protocol
Framework in addition to just a protocol
SMI, protocol, MIBs
Borrow from and extend the IMF as much as possible
Utilize XML for:
Data modeling (SMI)
Specification (MIBs)
Transfer syntax (protocol)
Everything is text
More XML
ASN.1 could have been used?
More XML tools
More widely adopted than ASN.1
XML schemas for structured document
Modeling
Parsing
Conversion
Validating
Still need to test interoperability
XMP SMI
http://xmlns.krupczak.org/xsd/xmptypes-1.0.xsd
Start with SNMP SMI
Enhance only where necessary
Do away with OIDs
Tuple of MIB-name, object-name, key
MIB-2 ifInOctets
From: 1.3.6.1.2.1.2.2.1.10.1
To: mib2.ifInOctets.if0
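To illustrate the (MIB-name, object-name, key) tuple, the sketch below parses the dotted form shown above. Treating the dotted string as a parseable syntax, letting the key contain further dots, and the `ObjectName` and `parse_name` names are all assumptions for illustration.

```python
from typing import NamedTuple, Optional


class ObjectName(NamedTuple):
    mib: str
    obj: str
    key: Optional[str]  # None for scalars, which need no instance identifier


def parse_name(text):
    """Split 'mib.object[.key]' into its parts; the key may itself contain dots."""
    parts = text.split(".", 2)
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"expected mib.object[.key], got {text!r}")
    key = parts[2] if len(parts) == 3 else None
    return ObjectName(parts[0], parts[1], key)


name = parse_name("mib2.ifInOctets.if0")
```

Compared with the OID form, every component of the tuple is readable without a MIB compiler.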
XMP SMI (II)
SMI type enhancements
Added several data types and promoted several textual conventions
Everything 64-bit min, although with XML, numbers can be larger w/o breaking 2/3 of framework
With BER, changing from 32-bit to 64-bit breaks SMIs, MIBs, software
Textual conventions specify additional semantics; overloading is poor engineering
Promote several to standard types
XMP SMI (III)
Added extendedBoolean type
True, False, Unknown
Added unsupportedVariable so agent can answer queries honestly and completely
Avoid the complexities of inheritance and polymorphism (à la CIM)
Scalar and tabular objects
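The three-valued extendedBoolean could be modeled as below. The enum name, the lowercase string values, and the mapping from Python's None are assumptions; the real type is defined in the XMP types schema.

```python
from enum import Enum


class ExtendedBoolean(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"  # the agent cannot determine the answer


def to_extended(value):
    """Map True/False/None onto the three states instead of guessing a default."""
    if value is None:
        return ExtendedBoolean.UNKNOWN
    return ExtendedBoolean.TRUE if value else ExtendedBoolean.FALSE
```

The point of the third state is the same as unsupportedVariable: the agent can report that it does not know rather than return a misleading False.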
XMP SMI (IV)
Tables are relations
Support relational table operations
How to marry table permissions with object permissions?
Need a lot more work on MIB specification & schema
XMP Protocol
http://xmlns.krupczak.org/xsd/xmp-1.0.xsd
XMP Protocol (II)
Connection-oriented
Avoid much of intricacies of UDP-based protocols
What intricacies?
More efficient for larger data xfers
No need for MIB tricks
No need for object ordering
No built-in race conditions in large tables
Original rationale for SNMP/UDP valid then, not now?
XMP Protocol (III)
Entity initiates session
Also closes session
Stay connected as long as needed
RPC like semantics
Request/response semantics
Initiator makes requests
Is this a manager?
XMP Protocol (IV)
Message types borrowed from SNMP
GetRequest (scalars)
Response (scalars, tables)
SetRequest (scalars)
Trap
First two objects are core.trapType and core.sysObjectID
Information
Example GetRequest
Example Response
XMP Table Operations
SQL-like
SelectTableRequest
InsertTableRequest
DeleteTableRequest
UpdateTableRequest
No overloading, no side-effects
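The SQL-like operations can be pictured as one explicit method per request type on a keyed relation. The sketch below is an in-memory model only: the `Table` class, its method names, and the sample interface rows are hypothetical and say nothing about the XML wire format.

```python
class Table:
    """A relation whose rows are identified by the value of one key column."""

    def __init__(self, key_column):
        self.key_column = key_column
        self.rows = {}

    def insert(self, row):
        key = row[self.key_column]
        if key in self.rows:
            # Insert never silently overwrites: no overloading, no side effects.
            raise KeyError(f"row {key!r} already exists")
        self.rows[key] = dict(row)

    def select(self, predicate=lambda row: True):
        # Return copies so callers cannot modify the table by accident.
        return [dict(row) for row in self.rows.values() if predicate(row)]

    def update(self, key, changes):
        self.rows[key].update(changes)

    def delete(self, key):
        del self.rows[key]


if_table = Table("ifName")
if_table.insert({"ifName": "if0", "ifInOctets": 1200})
if_table.insert({"ifName": "if1", "ifInOctets": 0})
busy = if_table.select(lambda row: row["ifInOctets"] > 0)
```

A select with a predicate returns the matching rows in one request, which is the job GetNext-style traversal otherwise performs row by row.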
Example SelectTableRequest
No GetNext/GetBulk
No GetNext/GetBulk needed for table traversal
GetNext yields very little information and no additional semantics
But how do I walk a MIB?
You don't
In practice, walking only yields syntactic information
Tables, Keys
For scalars, no real instance identifier needed
For tables, relation keys
Keys can be strings, numbers, variable-length
No explicit notion of ordering
No need?
XMP Encapsulation in SSL/TCP
Utilize SSLv3/TLSv1 for privacy and authentication
Cartographer utilizes its own CA to create/sign X509v3 certs
Each entity embeds own CA
Agent -> Agent requires two-way authentication
Manager does not need to provide cert
TCP/UDP 5270
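The difference between agent-to-agent and manager sessions comes down to whether the listening side demands a client certificate. The sketch below uses Python's standard ssl module purely to illustrate that setting; the function name is hypothetical, certificate loading is omitted because the file locations are not given, and current Python versions negotiate newer TLS versions than the SSLv3/TLSv1 named above.

```python
import ssl

XMP_PORT = 5270  # port number taken from the slide


def listener_context(require_client_cert):
    """Build a server-side TLS context, optionally demanding a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if require_client_cert:
        # Agent -> agent: the connecting side must present a cert we can verify.
        ctx.verify_mode = ssl.CERT_REQUIRED
    else:
        # Manager -> agent: only the listening agent proves its identity.
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


agent_ctx = listener_context(require_client_cert=True)
manager_ctx = listener_context(require_client_cert=False)
```

Before accepting connections, a real listener would also load its own certificate with `load_cert_chain` and the embedded CA with `load_verify_locations`.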
XMP MIBs
Virtually compatible with SNMP SMI
Implemented MIB-2 in XMP
Can implement others
HostMIB, SysApplMIB
How MIBs are specified still under development
XML schema
Tables, objects, keys
Borrow from relational DB theory and SQL
XMP MIBs (II)
MIB names must be unique within universe of XMP
Within a MIB, object names must be unique
Can utilize private-enterprise numbers to help with uniqueness
Krupczak.org is 16050
Core MIB contains agent-engine stats and config
Cartographer MIB implemented
But How Do I Make Money?
License model:
Open source
Closed source
Dual-license
Traditional closed-source company
Market for management software mature and consolidating
Unlikely to gain much traction
Crippleware
Example OSS Companies
Example open-source companies:
Sendmail (OSS, add-on software and services)
Snort (dual license?)
Asterisk (dual license)
OpenNMS (OSS, services)
JBoss – sold for $400m to RedHat
MySQL – sold for $1B to Sun
An Island or Ecosystem?
Tremendous investment in existing products & frameworks
Add XMP as new management protocol to existing platforms
OpenNMS
MRTG
ZenOSS? Integration in research phase
Others?
Integration (continued)
SNMP/XMP gateway?
Not under active consideration
Very difficult computer science problem
Backport to SNMP, WBEM
Not under active consideration
More likely than gateway approach
Technologies, Platforms, Engines
Agent written entirely in C
No need to install interpreters, VMs, DLLs
In a past lifetime, having to install Java on all systems was a large barrier
Goal is to run agent out of box
Very small footprint
Footprint of less than 3% is the upper bound
Engine is 66k lines of C-code
Plugins 9k to 16k lines of C-code
Ship with libs/DLLs if needed
Platform support
Solaris 9+ Sparc (64-bit)
Solaris 9+ x86
Linux 2.4+ on x86 (32, 64-bit)
Windows 2000/XP/2003/Vista/2008
Win32 and Win64
Agent uses as few libs as possible
Libxml
Pthreads
Openssl
Iconv, zlib
Big Picture
Agent Pieces/Parts
Licenses
Agent engine, GPLv2
MIB-2 plugin, GPLv2
Example plugin, GPLv2
Cartographer plugin, closed source, shrinkwrap software license
Java GUI, closed source, shrinkwrap software license
See release notes and install instructions
Roadmap
1.0 released in November 2008
Framework
Infrastructure
1.1 release in Spring/Summer 2009
Bug fixes, additional platforms
MIB schema, SMI work
More MIB data
More intelligence
A lot more work on events
More Roadmap
2.0 TBD
Self-propagation (already do self-updating)
Distributed decision making
Root cause, impact
Automatic testing/measurement
More integration
Demo – Cartographer Main
Dependency View
Dependency Query
Dependency Query
Dependency Query (Asterisk)
Process Query
Process Query
Endpoint Query
Endpoint View
MRTG Integration
ONMS Integration