HACQIT: Hierarchical Adaptive Control of QoS for Intrusion Tolerance
James E. Just
James C. Reynolds
Karl Levitt
13 February 2001
[Figure: HACQIT cluster diagram. Labels shown: To Critical Users, VPN, FW, GW, Switch, Monitor & Adapter, Sensors, Controls, Primary Nodes, Backup Nodes, Decoys/Fishbowls, Servers.]
Outline
• Team
• HACQIT idea
• Goals
• Architecture
• Status
• Plans
• Current Capabilities
• Questions and Issues
Team
• Quorum component modification, monitor/adapter development, integration
– J. Just
– J. Reynolds
– L. Clough
– R. Maglich, E. Lawson
• UC Davis – attack modeling, sensing and response options
– K. Levitt
– R. Pandey, F. Wu
– J. Rowe
– M. Tylutki
The HACQIT Idea
• Utilize robust hierarchical control of QoS and other fault tolerance techniques to deliver critical COTS services to critical users while under attack
– Significantly raise adversary work factors
– Focus on useful military applications
– Policy driven
• Leverage current and new technologies
– QoS/Quorum – DeSiDeRaTa, AQuA, QCS, others
– IA&S – wrappers, intrusion and integrity sensors, active monitoring & response, randomization, VPNs, attack modeling (Jigsaw concepts), honeypots
Status
• Second round of “attack & solution space” exploration
– Test applications include Dynbench, Apache web server, Notepad (Exchange or sendmail in the works)
– Refining architecture & design -- HACQIT requirements and component responsibilities being refined
– DeSiDeRaTa code conquered (or at least subdued)
– Cross project coordination underway
• Attack and response modeling begun
• New code developed for:
– Secure Task Manager and heartbeat monitor
– Sensor manager, e.g., Tripwire and wrappers
– Response managers, e.g., firewall and auditing
– Policy driven wrappers
– HACQIT monitor/adapter
Plans
• Continue coordination and leveraging activities
• Development
– Continue design, rapid experimentation (risk reduction), and research efforts through February
– Write specification for first real prototype in March
– Develop solid prototype for June evaluation
– Evaluation – Red Team, informal hacker exposure via internet, other
– Go into next cycle
• Leverage
– SCC – firewall on NIC (ADF)
– Draper – Gateway “cleaner”
• Other activities
– IFIP WG 10 Dependability Benchmarking
Current Demonstration Purpose
• Use of policy driven wrapper technology to intercept suspicious calls and initiate failover
• Use of Quality of Service manager to effect switchover
• Use of diversity among primary and backup to reduce likelihood of renewed attack against backup
• Lab capability to test protective measures against actual attacks and mitigate their effects on simulated users
HACQIT Demonstration Configuration
• HACQIT primary is NT running Apache and/or Exchange and/or MS Word
• HACQIT backup is Linux
• Outside the HACQIT cluster firewall are three client workstations: Good and Continuing, two “weak” legitimate user clients, and Bad, the source of malicious attacks
• Bad could be inside or outside the Enclave that is protected by the second firewall
– Bad is inside the enclave firewall for convenience
– Eventually a user will be outside the enclave firewall
• Secure channel to the client
– Currently VPN is not used
– Eventually will use VPNs or IPSec.
HACQIT Lab
[Figure: HACQIT lab network diagram. Labels shown: two hubs; hosts PETER, MARSHA, JAN, BOBBY, and ALICE (SWITCH); roles Continuous User (control), Good User (victim), and Bad User (attacker, 192.168.250.253); other addresses 192.168.250.251, 192.168.250.252, 192.168.150.155, 192.168.200.205, 192.168.200.200, 192.168.150.150.]
HACQIT Software Implementation
[Figure: HACQIT software implementation block diagram. Legend: COTS/GOTS, Modified, Original. Components shown: FMC, Firewall, Tripwire, Host Control, STM, IM, HCI, Hub, Switch, Program Control, Monitor/Adapter, Primary, Backup, TCP DUMP.]
Current Demonstration Scenario
• Users (Good and Continuing) running NT in enclave
– Processes on Good and Continuing simulate client demands on Apache
– Other processes will simulate demands on Exchange or saving Word files on primary
• Automated attack launched from Bad to take over Good and then attack web server on Primary (NT)
– Exploit vulnerabilities in Apache or Exchange or Word
– Attack executes a program and/or modifies file
• Primary attack detected & mitigated by Apache wrapper
– Wrapper communicates with the Monitor/Adapter on our out-of-band machine
• Monitor/Adapter starts Apache (under Linux) on the backup and tells firewall controller to switch IP address from the primary to the backup
• Exchange or Word could start using Wine or VMWare technology
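The failover sequence in this scenario can be sketched in a few lines of Python. This is an illustration only, not the HACQIT code: the class names FirewallController and MonitorAdapter, the method names, and the log strings are all hypothetical, and no real firewall or Apache process is touched.

```python
from dataclasses import dataclass, field


@dataclass
class FirewallController:
    """Hypothetical stand-in for the firewall controller (not a real API)."""
    service_target: str = "primary"  # host currently receiving the service address

    def switch_service(self, host: str) -> None:
        self.service_target = host


@dataclass
class MonitorAdapter:
    """Hypothetical out-of-band Monitor/Adapter that reacts to wrapper alerts."""
    firewall: FirewallController
    running: dict = field(default_factory=lambda: {"primary": True, "backup": False})
    log: list = field(default_factory=list)

    def on_wrapper_alert(self, service: str, detail: str) -> None:
        # Step 1: record the alert reported by the wrapper on the primary.
        self.log.append(f"alert from {service} wrapper: {detail}")
        # Step 2: start the service on the backup (Apache under Linux in the demo).
        self.running["backup"] = True
        self.log.append(f"started {service} on backup")
        # Step 3: ask the firewall controller to move the service address.
        self.firewall.switch_service("backup")
        self.log.append("service address switched from primary to backup")


firewall = FirewallController()
ma = MonitorAdapter(firewall)
ma.on_wrapper_alert("apache", "suspicious call intercepted")
print(firewall.service_target)
```

The point of the sketch is the ordering: the backup service is started before the address is switched, so the switch never points at a host with nothing listening.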
New Issues
• Can we capture or redirect client requests so that users really are not interrupted during a migration?
• What does this mean for real time? How does Desi do failover without the DynBench applications losing something? Do they stop processing until the connection is reestablished?
• How can we save state and migrate an application? Do different types or classes of applications have different requirements? What is HACQIT’s ability to cover these different classes?
• Can we add a new user? We might want to disable all sessions and start over with trusted connections
Incremental Implementation Approach (I)
• Several capability levels are envisioned
• Lower levels are specified
• Level 0 – Insider attack and simple migration
– No firewall on HACQIT cluster
– Only critical application is Apache web server (no Microsoft Exchange)
– Simulations of user web server activities would be running on both Good and Continuing
– Attack from weak client Good against Apache web server on primary
– Sense compromise via wrapper integrity checker on Apache which then communicates with Monitor/Adapter (M/A)
– M/A migrates Apache web server from NT primary to NT backup
Incremental Implementation Approach (II)
• Level 1 – Outsider attack with simple, cross platform migration and increased sensing
– Firewall(s) added to HACQIT cluster
– Simulations of user web server activities added to a legitimate user machine outside the enclave
– Bad attacks weak client, Good, to compromise it and set up attack from Good against Apache web server on NT primary
– Same sensing and communication by wrapper as above
– M/A migrates web server to Linux
– M/A turns on increased auditing on firewall/gateway as another response
Incremental Implementation Approach (III)
• Level 2 – Uninterrupted Critical User during Above Attack and Migration
– Demonstrate uninterrupted user by changing the ARP table
– Note: we may still lose the response to the last user request for web services
• Level 3 – Block the Attack
– Identify the IP address of the attacker and block attacker at firewall or router, e.g., add blocking command via OPSEC interface or change a rule to shut out attacker’s access to primary
– At some point we’d like to know if the attack was from a compromised weak client or an insider – probably not at this capability level
Incremental Implementation Approach (IV)
• Level 4 – Multiple Critical Applications
– Add mail server (Microsoft Exchange or some mail server that is cross platform)
• Level 5 – Critical Application with State
– Save state of critical application and migrate
• Level 6 – Same Machine Failover
– Use wrappers to ensure non-compromise of OS and failover critical application to the same machine (i.e., start up clean application on same Primary and kill attacked process)
• Level 7 – Remediation of Compromised Primary
• Level 8 – Other Types of Diversity & Use of Decoys
• Level 9 – Randomization of Responses
• Note: Levels 4-9 are relatively independent and can be done in parallel or in a different order
Architectural Explorations
• Major focus of HACQIT is to develop Intrusion Tolerant – oops! – I mean Organically Assured and Survivable architecture
• Our levels of capability demonstrations suggest the Subsumption architecture (Brooks, 86)
• Brooks developed the architecture for his famous robot projects but many of our requirements are the same
– Need certain amount of “stupid” reactive behavior
– Need guaranteed fast response
• Brooks implemented each layer in his architecture as a deterministic finite state machine with simple I/O
• No world model is depended on
• Communication from higher to lower levels is done through suppression and injection
Traditional Architecture
Subsumption Architecture
HACQIT Mapping to Subsumption Architecture from Levels 0-2 Capabilities
• Pre-Level 0 capability features could be lowest layer like Brooks’ “Avoid” module
– Unauthorized process on primary boosts CPU utilization above threshold: kill process, move critical service to backup
– Unauthorized modification of file: move critical service to backup
• Level 0 capabilities would be second layer
– Wrapper intercepts suspicious call: move critical service to backup
– Diversity advantage: Backup runs different OS than primary
– TCPDump is turned on after suspicious call is intercepted (heightened awareness)
• Level 1 would be third layer
– Migration is effected without interrupting critical users (change ARP table)
• Level 2 would be fourth layer
– Source address of attack is identified
– Address blocked by change to firewall policy
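A minimal sketch of this layering, loosely in the spirit of subsumption, treats each layer as a stateless event-to-action rule. The event names, the action strings, and the arbitration rule (the highest layer with an output suppresses the layers below it) are assumptions made for illustration; the slides do not specify how suppression would be wired in HACQIT.

```python
from typing import Callable, List, Optional

# A layer maps an event name to an action, or None when it does not react.
Layer = Callable[[str], Optional[str]]


def layer_avoid(event: str) -> Optional[str]:
    """Lowest layer: reflexes on resource and file anomalies (event names assumed)."""
    rules = {
        "cpu_over_threshold": "kill process; move critical service to backup",
        "unauthorized_file_change": "move critical service to backup",
    }
    return rules.get(event)


def layer_wrapper(event: str) -> Optional[str]:
    """Second layer: wrapper intercepts a suspicious call."""
    if event == "suspicious_call":
        return "move critical service to backup; turn on TCPDump"
    return None


def layer_uninterrupted(event: str) -> Optional[str]:
    """Third layer: same trigger, but the migration is hidden from users."""
    if event == "suspicious_call":
        return "move critical service to backup via ARP table change"
    return None


def layer_block(event: str) -> Optional[str]:
    """Fourth layer: block the attacker once the source is known."""
    if event == "attack_source_identified":
        return "block source address in firewall policy"
    return None


LAYERS: List[Layer] = [layer_avoid, layer_wrapper, layer_uninterrupted, layer_block]


def react(event: str, layers: List[Layer]) -> Optional[str]:
    # Walk from the highest layer down; the first output found suppresses lower ones.
    for layer in reversed(layers):
        action = layer(event)
        if action is not None:
            return action
    return None


print(react("suspicious_call", LAYERS[:2]))  # only the two lowest layers installed
print(react("suspicious_call", LAYERS))      # third layer suppresses the second
```

Lower layers keep working when the higher ones are absent or silent, which is what allows the capability levels to be added one at a time.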
Higher Levels of Capability
• Multiple critical applications
• Failover which saves state
• Failover on the same machine
• Forensics
• All may be too complex, long-lived, or require global information in order to implement
• We’re looking at DICAM as architecture for these functions
DICAM Control
Technology Transfer Exposure Opportunity
• NSWC (Mike Masters) is technology transfer target for Quorum
– Annual demonstration in September
– Security and intrusion tolerance are of interest
– Willing to discuss inclusion of HACQIT in demonstration – leverages Quorum technologies
– Need to start coordination planning in April
Issues
• Looking for interested potential users for feedback (PACOM, NSWC, other?)
• Need help in getting ACOA server software
• Reuse of research prototypes
Thank you.
Questions?
Backup 1: Towards a Formal Methodology for Responding to Integrity and DOS Intrusions
Jim Just - Teknowledge
Karl Levitt, Jeff Rowe, Marcus Tylutki, Nicole Carlson, Steven Templeton, Mark Heckman -- UCD
Paradigm for Responding to Integrity Intrusion
REPEAT UNTIL ATTACK SYMPTOMS DISAPPEAR
Detect integrity violation on a critical file
Switchover to backup server; restore prior version of critical file on primary
Use Jigsaw model to determine possible causes and sources of attack
Deploy sensors and responders as determined by model
If attack persists block with responders
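The repeat-until paradigm above can be written as a small control loop. The sketch below is an assumption-laden illustration: the callback names, the max_rounds safety cap, and the simulated state are hypothetical, and the Jigsaw model is reduced to a function that returns a list of candidate causes.

```python
from typing import Callable, List


def respond_to_integrity_intrusion(
    violation_detected: Callable[[], bool],
    switchover_and_restore: Callable[[], None],
    possible_causes: Callable[[], List[str]],
    deploy: Callable[[List[str]], None],
    block: Callable[[List[str]], None],
    max_rounds: int = 10,  # safety cap, not part of the slide
) -> int:
    """Run the loop until symptoms disappear; return the number of rounds."""
    rounds = 0
    while rounds < max_rounds:
        if not violation_detected():
            break  # attack symptoms have disappeared
        rounds += 1
        switchover_and_restore()    # switch to backup, restore file on primary
        causes = possible_causes()  # stand-in for the Jigsaw model query
        deploy(causes)              # deploy sensors and responders
        if violation_detected():    # attack persists
            block(causes)
    return rounds


# Simulated environment: the attack stops once responders block it.
state = {"attacking": True, "actions": []}


def do_block(causes: List[str]) -> None:
    state["actions"].append("block")
    state["attacking"] = False


rounds = respond_to_integrity_intrusion(
    violation_detected=lambda: state["attacking"],
    switchover_and_restore=lambda: state["actions"].append("switchover"),
    possible_causes=lambda: ["unauthorized rcommand"],
    deploy=lambda causes: state["actions"].append("deploy"),
    block=do_block,
)
print(rounds, state["actions"])
```

Passing the steps in as callbacks keeps the loop itself independent of any particular sensor or responder.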
Paradigm for Responding to Internal DOS Attack
REPEAT UNTIL ATTACK SYMPTOMS DISAPPEAR
Detect denial of service violation on primary server
Switchover to newly created process on server; kill process causing denial of service
Use Jigsaw model to determine possible causes and sources of attack
Deploy sensors and responders on server and on firewall as determined by model
If attack persists block with responders
Connection Spoofing Attack
• Multiple stage
• Attacker establishes a TCP connection to a host (server) H, exploiting a trust relationship (through .rhosts) between H and some other host H1.
• Attack involves– denial of service on H1
– Connection number guessing
– Planting a trojan horse on H
• Many variants are possible
• Detection is assumed to occur when .rhosts file on H is erroneously modified
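One simple way to notice that a critical file such as .rhosts has changed is to compare its digest against a stored baseline. The sketch below assumes SHA-256 and uses a temporary file as a stand-in for .rhosts; the slides do not say which mechanism the HACQIT integrity sensors use, and the function names are hypothetical.

```python
import hashlib
import os
import tempfile


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of the file contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def is_modified(path: str, baseline: str) -> bool:
    """True when the file no longer matches the recorded baseline digest."""
    return file_digest(path) != baseline


# Demonstration on a temporary file standing in for .rhosts.
workdir = tempfile.mkdtemp()
rhosts = os.path.join(workdir, "rhosts")
with open(rhosts, "w") as handle:
    handle.write("kafka\n")
baseline = file_digest(rhosts)

before = is_modified(rhosts, baseline)  # nothing has changed yet

with open(rhosts, "a") as handle:
    handle.write("+ +\n")  # the kind of entry the attack appends

after = is_modified(rhosts, baseline)
print(before, after)
```

A digest comparison only says that the file changed, not who changed it, which is why the response slides go on to reason about possible causes.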
Scenario Attacks: an example
[Figure: three hosts, kafka, sarte, and spock; the same diagram is repeated for each step below.]
RSH trust relation: sarte trusts kafka, will execute programs for kafka
(1) Spock launches synflood attack against kafka
(2) Spock probes sarte for starting sequence number on RSH port
(3) Spock sends syn packet to TCP/RSH on sarte w/ source forged to be kafka.
(4) Sarte sends syn/ack to kafka
(5) Kafka drops packet due to DoS
(6) Spock sends forged ack packet to sarte, w/ guessed sequence number. Data in packet, “cat + + >> /.rhosts”, adds “all hosts” to sarte’s .rhosts file.
(7) The attacker rsh’s into sarte as root and installs a sniffer to collect passwords.
(8) Using one of these he telnets into kafka.
(9) Once on kafka, the attacker exploits a buffer overflow in amd to gain root privileges.
(10) Attacker then copies credit card number file back to spock.
HACQIT Actions in Responding to Connection Spoofing Attack
• Detect change to .rhosts file on primary
• Switchover to backup
• Restore previous version of primary, which is now the backup
• Use Jigsaw model of attacks to identify possible causes of integrity problem
– Change is legitimate by “clean” process -- no integrity problem
– Change is by an unauthorized process
– Change is by a legitimate rcommand
– Change is by an unauthorized rcommand
HACQIT response to Connection Spoofing (cont)
• HACQIT checks for erroneous processes -- finds none; so conclude change is legitimate or due to an rcommand
• HACQIT starts monitoring for rcommands
• Attack persists, but now on backup with arrival of rcommand
• HACQIT temporarily blocks rcommand until verification
• HACQIT monitoring detects symptoms of connection spoofing attack -- sequence number guessing, DOS on a host
• If traceback to true source s is possible, connections from s are blocked; otherwise, degraded mode (no rcommands)
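The reasoning in this response can be read as elimination over the four candidate causes. The sketch below is hypothetical: the observation names and the table of which observation rules out which cause are assumptions chosen to match the narrative, not the Jigsaw model itself.

```python
from typing import Iterable, Set

HYPOTHESES = {
    "legitimate process",
    "unauthorized process",
    "legitimate rcommand",
    "unauthorized rcommand",
}

# Assumed mapping from an observation to the causes it rules out.
ELIMINATED_BY = {
    "no erroneous processes found": {"unauthorized process"},
    "connection spoofing symptoms seen": {"legitimate process", "legitimate rcommand"},
}


def narrow(hypotheses: Set[str], observations: Iterable[str]) -> Set[str]:
    """Remove every cause that some observation rules out."""
    remaining = set(hypotheses)
    for observation in observations:
        remaining -= ELIMINATED_BY.get(observation, set())
    return remaining


after_process_check = narrow(HYPOTHESES, ["no erroneous processes found"])
after_monitoring = narrow(after_process_check, ["connection spoofing symptoms seen"])
print(sorted(after_process_check))
print(sorted(after_monitoring))
```

After the process check three causes remain (legitimate change or some rcommand), which is why monitoring of rcommands is the next step.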
Example w/ Capabilities
[Figure: Example attack composed of multiple concepts and capabilities. Labels shown: Connection Spoof, Address Forging, Execute Commands, Seq # Probe, Packet Spoofing, Synflood, Seq. Number Guess, Prevent Connection Response, RSH Active, Forged Src Address, Spoofed Packet, RSH Connection Spoof, Spoofed Connection, Remote Login, cat + + >> /.rhosts, Remote Execution.]
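The idea of composing an attack from concepts that require and provide capabilities can be sketched as forward chaining. The edges of the original figure did not survive, so the requires/provides assignments below are assumptions built from the figure labels and the scenario steps, not the authors' Jigsaw model.

```python
from typing import Dict, List, Set, Tuple

# concept name -> (capabilities required, capabilities provided); assumed edges
CONCEPTS: Dict[str, Tuple[Set[str], Set[str]]] = {
    "Synflood": (set(), {"Prevent Connection Response"}),
    "Seq # Probe": ({"RSH Active"}, {"Seq. Number Guess"}),
    "Address Forging": (set(), {"Forged Src Address"}),
    "Packet Spoofing": ({"Forged Src Address"}, {"Spoofed Packet"}),
    "RSH Connection Spoof": (
        {"Spoofed Packet", "Seq. Number Guess", "Prevent Connection Response"},
        {"Spoofed Connection"},
    ),
    "Execute Commands": ({"Spoofed Connection"}, {"Remote Execution"}),
}


def reachable(initial: Set[str], concepts: Dict[str, Tuple[Set[str], Set[str]]]):
    """Apply every concept whose requirements are met until nothing new fires."""
    have: Set[str] = set(initial)
    used: List[str] = []
    progress = True
    while progress:
        progress = False
        for name, (requires, provides) in concepts.items():
            if name not in used and requires <= have:
                have |= provides
                used.append(name)
                progress = True
    return have, used


caps, steps = reachable({"RSH Active"}, CONCEPTS)

# A response that removes one capability (here: the synflood is stopped).
no_flood = {name: spec for name, spec in CONCEPTS.items() if name != "Synflood"}
caps_defended, _ = reachable({"RSH Active"}, no_flood)
print("Remote Execution" in caps, "Remote Execution" in caps_defended)
```

Seen this way, a responder does not need to stop every step; denying any one required capability breaks the chain to Remote Execution.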
NFS Mount Attack-- Overview
• Certain partitions (directories) of an NFS system running on server H are exported.
• An attacker on host Ha performs information gathering commands remotely on H to identify exported partitions and their owners, e.g. user U.
• Once having this information, attacker creates an account for U on Ha.
• The last step is the erroneous account U mounting an exportable partition.
NFS Mount Attack -- as attack specification
1. rpcinfo -p Target-IP
Attacker learns that host H uses an NFS daemon. The preconditions specify that attacker A has remote network access to target host H and that host H has IP address Target-IP.
2. showmount -e Target-IP
Attacker learns that host H exports hard disk partition P via NFS. Preconditions deal with IP addresses and exported services not changing from step 1.
3. showmount -a Target-IP
Attacker learns that partition P is locally mounted by H; the preconditions of the previous steps are unchanged.
NFS Mount Attack (cont.)
4. finger @Target-IP
Attacker learns that user U is currently connected to H and that the ID for user U is Userid. Among the preconditions is that host H provides the finger service.
5. create-account(U, Userid)
The precondition assures that the attacker has an account on some host Ha. After this step, there is an account for U on Ha. Note there are alternatives to this step, such as modifying the password file.
6. mount -t Target-Partition /mnt
The attacker can now access the directory of U. The preconditions are that A is connected to Ha and that U is the owner of some directory in the exported partition P.
HACQIT response to NFS Mount Attack
• HACQIT detects modification to a critical file
• Switchover to backup server; restore file on old primary which becomes backup
• Through model of NFS determine possible causes of modification:
– By a legitimate user
– By a legitimate user, but spoofed
– …
• HACQIT increases monitoring for NFS
• Detects a “write” to critical file
• Correlates “write” from a user u with information gathering on the server and for user u
HACQIT response to NFS attack (cont)
• Temporarily, HACQIT operates in degraded mode, disallowing “writes” from unauthenticated users
• Through Jigsaw model of NFS attacks and NFS vulnerability analysis, HACQIT determines “mount” export problem and corrects configuration
Denial of Service Attack
• Compromised client launches a “synflood” attack on server
• Temporarily, HACQIT blocks all packets from client
• HACQIT identifies possible responses to flooding attack
– Block packets at firewall from client
– Kill half-open connections as they appear
– …
• HACQIT chooses 1st response, as it is quickly deployed
• HACQIT identifies user and processes on client responsible for attack, and disables them
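Choosing the response that deploys fastest is a one-line selection once each candidate carries an estimated deployment time. The times below are invented for illustration only; the slides give no figures.

```python
from typing import List, Tuple

# (response, estimated seconds to deploy); the numbers are made up for the sketch
CANDIDATES: List[Tuple[str, float]] = [
    ("block packets at firewall from client", 1.0),
    ("kill half-open connections as they appear", 5.0),
]


def choose_response(candidates: List[Tuple[str, float]]) -> str:
    """Pick the candidate with the smallest estimated deployment time."""
    return min(candidates, key=lambda item: item[1])[0]


chosen = choose_response(CANDIDATES)
print(chosen)
```

A fuller policy would also weigh side effects, since blocking every packet from a client also cuts off its legitimate traffic.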
Backup 2: Selected HACQIT July PI Slides
HACQIT Schedule
[Figure: HACQIT schedule (Gantt chart) spanning Year 1 through Year 4. Tasks shown:]
1.1 Program Coordination
1.2 System Design: 1.2.1 Architecture, 1.2.2 Design, 1.2.3 Interface Specification
1.3 Integrity Sensor/Maintenance: 1.3.1 Lightweight Integrity Sensors, 1.3.2 Wrappers
1.4 Adaptive Response Development: 1.4.1 Host Level, 1.4.2 Cluster Level
1.5 Component Enhancement: 1.5.1 Migration Subsystem, 1.5.2 Fault Tolerance Subsystem, 1.5.3 Instrumented Connectors, 1.5.4 QCS Extensions, 1.5.5 Sample Application
1.6 Integration: 1.6.1 Component Integration, 1.6.2 System Integration
1.7 Experimentation
Options: 2.0 New ITS Technology Integration, 3.0 Auto-Generation of Integrity Comp, 4.0 Diagnosis and Recovery Extensions
Other labels shown: Specification Extension, Testing, Integrity Measures, CDL Extensions, Location Transparency, Sensing, Control Interfaces, Human Oriented Client Server, NRT Distributed Application
Milestones
• Year 1
– Applications: office, email, collaboration, intranet web server
– Control: specification based performance and integrity; replication and switchover
• Year 2
– Applications: simple planning application, network-based military planning, real time application
• Options
– Integration of new ITS technologies
– Automatic generation of integrity monitors
– Extensions for diagnosis and recovery
Note that these milestones are more aggressive than the official SOW. Depending on the results of the detailed design effort, some adjustments may be necessary.
Technology Transfer
• Who needs intrusion tolerant server capabilities for critical users and services – user pull
– Government -- military and civilian
– Commercial -- large corporations, ISPs and others who offer out-sourced application services
• Development and maintenance organizations for above
– Government development efforts (e.g., IO COP, GCCS)
– Government ACTDs (e.g., AIDE, ACOA, CINC 21?)
– Commercial security product/service providers (including
• Distributed hierarchical control paradigm for enclave and wide area protection
• Separation is key requirement
• Anything not explicitly permitted is forbidden
• Intrusion resistance to support intrusion tolerance
• System boundary includes some protection for and control of weak clients
• Keep footprint small -- more active control than redundancy
• Focus on known vulnerable areas, e.g., weak client attacks (a la recent attack against Microsoft)
• Adaptive responses are key research area
General Use Case
• Development & setup: policies, application specs., etc.
• Operations: Assume a backup (hot or cold)
• Detect a performance problem in critical application
– Switchover to backup, increase auditing and sensing levels, determine if cause is an attack, then expunge attack from the primary and block future occurrences of the attack, return
• Detect an integrity problem in critical application (including data files), operating system, or other critical process that indicates an undetected intrusion
– Switchover to backup, expunge the attack from the primary, block future occurrences of the attack, return
• Detect an intrusion:
– If intrusion does not constitute a threat to the critical application, then start a procedure to expunge the attack, if necessary, and block future occurrences, return;
– If attack threatens the critical application, then switchover to backup, expunge the attack from the primary, block future occurrences of the attack, return;
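The three detection cases above amount to a small decision procedure. The sketch below encodes them as a function that returns the ordered response steps; the function name, the detection labels, and the step strings are hypothetical shorthand for the bullets.

```python
from typing import List

SWITCH = "switchover to backup"
EXPUNGE = "expunge attack from primary"
BLOCK = "block future occurrences"


def plan_response(detection: str, threatens_critical_app: bool = True) -> List[str]:
    """Return the ordered response steps for one kind of detection."""
    if detection == "performance":
        return [
            SWITCH,
            "increase auditing and sensing levels",
            "determine if cause is an attack",
            EXPUNGE,
            BLOCK,
        ]
    if detection == "integrity":
        return [SWITCH, EXPUNGE, BLOCK]
    if detection == "intrusion":
        if threatens_critical_app:
            return [SWITCH, EXPUNGE, BLOCK]
        # No threat to the critical application: no switchover is needed.
        return [EXPUNGE, BLOCK]
    raise ValueError(f"unknown detection type: {detection}")


print(plan_response("intrusion", threatens_critical_app=False))
```

Only the non-threatening intrusion case skips the switchover; every other case moves the critical service first and cleans up afterwards.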
Desiderata Software Control Flow
[Figure: Desiderata software control flow. Labels shown: all programs in the system exchange info with the Name Server; Path Information; Host-Program Allocation, Program Actions; Host Information (e.g., host load); QoS Spec; Program Allocations, Program Actions; Path Load Information, Violations, Latencies; Host Load Information; Program Commands; Shell Commands; Host Data; Path, Program latencies; Timestamps; Hardware Monitors.]
Mapping of Desiderata Middleware to Distributed Hardware
[Figure: middleware to testbed hardware mapping. Hosts on the UTA network, connected through a switch: Nujersy (SUN sparc 5), Viper (PC; WIN NT; Pentium), Desidrta (SUN Ultra), Texas (SUN Ultra), Virginia (SUN sparc 5), Stealth (PC; WIN NT; Pentium), Mustang (PC; WIN NT; Pentium); a radar console is also shown.]
Legend: NS = Name Server, SB = System Broker, HB = Host Broker, QM = QoS Manager, SD = Startup Daemon, HM = Host Monitor, PC = Program Control, RM = Resource Manager, HA = Host Analyser, HCI = Human-Computer Interface
Desiderata’s Real-time Path Paradigm
[Figure: real-time path paradigm. Labels shown: sensors, actuators, situation assess, engagement, monitor & guide, event.]
Dynbench Suite of Real-time Paths
[Figure: Dynbench suite of real-time paths. Programs shown: Sensor, FM, Filter, EDM, ED, AM, Action, MGM, MG, EG, Actuator, Radar Display. Inputs shown: Scenario file, Doctrine file, Command file, User. Paths shown: Situation Assess, Engagement, Monitor & Guide.]
Dynbench Subsystems (Paths)
• Situation assessment
– Filter Manager (FM): receives radar tracks from the sensor and divides them among the filter programs
– Filter: correlates point data of track into equations of motion of the track body
– Evaluate and Decide Manager (EDM): distributes workload among ED programs
– Evaluate and Decide (ED): determines if current position of radar track is within critical region
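The situation assessment path can be illustrated with three small functions. Everything specific here is an assumption made for the sketch: round-robin division of tracks, a two-point constant-velocity estimate as the "equations of motion", and a circular critical region centered on the origin. Dynbench itself may do each of these differently.

```python
import math
from typing import List, Sequence, Tuple


def divide_tracks(tracks: Sequence[str], n_workers: int) -> List[List[str]]:
    """Manager role (FM or EDM): spread work round-robin over n_workers programs."""
    shares: List[List[str]] = [[] for _ in range(n_workers)]
    for index, track in enumerate(tracks):
        shares[index % n_workers].append(track)
    return shares


def fit_motion(points: Sequence[Tuple[float, float, float]]) -> Tuple[float, float, float, float]:
    """Filter role: estimate (x0, y0, vx, vy) from the first and last (t, x, y) points."""
    (t0, x0, y0), (t1, x1, y1) = points[0], points[-1]
    dt = t1 - t0
    return x0, y0, (x1 - x0) / dt, (y1 - y0) / dt


def in_critical_region(x: float, y: float, radius: float) -> bool:
    """ED role: is the current position within `radius` of the origin?"""
    return math.hypot(x, y) <= radius


shares = divide_tracks(["a", "b", "c"], 2)
motion = fit_motion([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
print(shares, motion, in_critical_region(3.0, 4.0, 5.0))
```

The manager/worker split is what lets the resource manager add or move filter and ED programs when load rises, without changing the path itself.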
Dynbench Subsystems (Paths)
• Engagement
– Action Manager (AM): receives threat tracks from ED and divides them among the action programs
– Action: receives threat tracks from AM and commands actuator
– Actuator: receives action and executes
• Monitor and Guide
– Monitor and Guide Manager (MGM): receives threat tracks and interceptors from ED and divides them among the MG programs
– Monitor and Guide (MG): receives threat tracks along with interceptors, updates position of interceptor according to the position of the threat track