High-Performance Hardware Monitors to Protect Network Processors from Data Plane Attacks
Kekai Hu, Harikrishnan Chandrikakutty, Deepak Unnikrishnan, Tilman Wolf, and Russell Tessier
Department of Electrical and Computer Engineering
University of Massachusetts Amherst
Russell Tessier 2
Outline
The problem: software attacks on network routers
• Routers now include programmable processors
Our solution: include a monitor for the processor
Software to generate the monitor information
Implementation on DE4 NetFPGA board
Experimental results
Computer Networks
Networks provide connectivity between end-systems
Success of the Internet: hourglass architecture
Success is also a problem: diverse apps, diverse systems
Changing requirements for the network layer
[Figure: layered protocol stack (physical, link, network, transport, and application layers) with example protocols: IP; UDP, TCP; HTTP, TLS/SSL, DNS, BGP, SIP; Ethernet, DSL, FDDI; 1000BASE-T, SONET/SDH, 802.11a/b/g/n, RS-232]
Network Architecture
Extensions to current Internet
Customization of data plane
Requires router systems with ability to adapt
End system:
- IP security
- TCP termination
Server:
- Content-based switching
- Firewall
- SSL termination
- IP security
[…] co-located with each processor core
• Core reports hash of each executed instruction
Monitoring graph represents correct behavior
• Obtained from offline analysis of binary
• Deviations trigger reset
Change of software easy
• Just need matching monitoring graph
[Figure: network processor (core, instruction memory, data memory, packet buffer, processing code, network interface) next to the hardware monitor (comparison logic, monitor memory, monitoring graph). Runtime operation: hash of processing instruction, reset/recovery. Offline analysis: processing code binary, NFA monitoring graph, NFA-to-DFA transformation, DFA monitoring graph.]
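The runtime check on this slide (the core reports a hash per instruction, the monitor follows the monitoring graph, a deviation triggers reset) can be sketched in Python. This is a minimal illustration, not the hardware design: the graph layout, the function name `monitor_run`, and the toy graph and traces are all assumptions made for the example.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical DFA monitoring graph: state -> {4-bit hash: next state}.
Graph = Dict[int, Dict[int, int]]


def monitor_run(graph: Graph, start: int, hashes: Iterable[int]) -> Tuple[bool, int]:
    """Follow the reported hashes through the graph.

    Returns (ok, steps). ok becomes False at the first hash that has no
    matching edge, which is where the hardware would assert reset/recovery.
    steps counts the instructions accepted before that point.
    """
    state = start
    steps = 0
    for h in hashes:
        nxt = graph[state].get(h & 0xF)  # one lookup per instruction
        if nxt is None:
            return False, steps
        state = nxt
        steps += 1
    return True, steps


# Toy graph (assumed): state 1 models a branch with two valid successors.
TOY_GRAPH: Graph = {
    0: {0x3: 1},
    1: {0x7: 2, 0xA: 0},
    2: {0x1: 0},
}

valid_trace = [0x3, 0x7, 0x1, 0x3, 0xA]   # stays on the graph
attack_trace = [0x3, 0x7, 0x5]            # third hash has no edge

print(monitor_run(TOY_GRAPH, 0, valid_trace))
print(monitor_run(TOY_GRAPH, 0, attack_trace))
```

Swapping the processing code only requires a different `graph` argument, which mirrors the point that a software change needs just a matching monitoring graph.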
Offline Analysis of Processing Binary
Executed instruction reported by core as 4-bit hash
• Hash combines address, opcode, registers
• Hash allows for compact representation of information
Monitoring graph
• Each instruction represented as a state
• Edges correspond to execution of instruction
• Control-flow operations lead to multiple possible next states
[…]
49c: 97c20010 lhu v0,16(s8)
4a0: 00000000 nop
4a4: 2c420033 sltiu v0,v0,51
4a8: 1440000a bnez v0,4d4
4ac: 00000000 nop
4b0: 3c026666 lui v0,0x6666
4b4: 34430191 ori v1,v0,0x191
4b8: 97c20010 lhu v0,16(s8)
[…]
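As a rough illustration of the offline analysis, the Python sketch below parses the listing excerpt, assigns each instruction a 4-bit hash, and derives the possible next states. Two things are assumptions: `toy_hash` is a stand-in (the slides do not give the actual hash function, only that it combines address, opcode, and registers), and the successor rule assumes MIPS branch delay slots, so the split between fall-through and branch target happens after the instruction that follows the branch.

```python
LISTING = """\
49c: 97c20010 lhu v0,16(s8)
4a0: 00000000 nop
4a4: 2c420033 sltiu v0,v0,51
4a8: 1440000a bnez v0,4d4
4ac: 00000000 nop
4b0: 3c026666 lui v0,0x6666
4b4: 34430191 ori v1,v0,0x191
4b8: 97c20010 lhu v0,16(s8)
"""


def toy_hash(addr: int, word: int) -> int:
    """Hypothetical 4-bit hash: XOR of all nibbles of (address XOR word)."""
    mixed = addr ^ word
    h = 0
    while mixed:
        h ^= mixed & 0xF
        mixed >>= 4
    return h


def parse(listing: str):
    """Return (address, word, mnemonic, operands) for each listing line."""
    rows = []
    for line in listing.splitlines():
        addr, word, mnemonic, *ops = line.split()
        rows.append((int(addr.rstrip(":"), 16), int(word, 16),
                     mnemonic, ops[0] if ops else ""))
    return rows


def successors(rows):
    """Map each address to the set of possible next addresses."""
    succ = {}
    pending_target = None
    for addr, _word, mnemonic, ops in rows:
        if pending_target is not None:
            # Delay slot: afterwards either fall through or take the branch.
            succ[addr] = {addr + 4, pending_target}
            pending_target = None
        else:
            succ[addr] = {addr + 4}
        if mnemonic.startswith("b"):
            # Good enough for this excerpt, where bnez is the only branch.
            pending_target = int(ops.split(",")[-1], 16)
    return succ


rows = parse(LISTING)
succ = successors(rows)
for addr, word, mnemonic, _ops in rows:
    targets = ", ".join(f"{t:x}" for t in sorted(succ[addr]))
    print(f"{addr:x} {mnemonic:<6} hash={toy_hash(addr, word):x} next={{{targets}}}")
```

Under these assumptions every state has one successor except the delay slot at 4ac, which has two (4b0 and 4d4). That is the "multiple possible next states" case caused by control flow.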
[…] and branch information
Powerset construction to perform NFA-to-DFA transformation
Memory initialization file is loaded into memory
[Figure: tool flow with benchmark source code, MIPS-GCC compiler, instruction info and branch info, NFA-to-DFA conversion module, memory generator module, and memory initialization file]
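The powerset construction named on this slide can be sketched as follows. This is the textbook algorithm (without epsilon transitions), not the authors' conversion module. The NFA encoding, the function name `nfa_to_dfa`, and the toy graph are assumptions for illustration. Nondeterminism arises here because two successors can be reached on the same 4-bit hash.

```python
from collections import deque


def nfa_to_dfa(nfa, start):
    """Powerset construction.

    nfa maps state -> {symbol: set of next states}.
    Returns (dfa, start_set), where dfa maps a frozenset of NFA states
    to {symbol: frozenset of NFA states}.
    """
    start_set = frozenset([start])
    dfa = {}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current in dfa:
            continue  # already expanded
        moves = {}
        for state in current:
            for symbol, targets in nfa.get(state, {}).items():
                moves.setdefault(symbol, set()).update(targets)
        dfa[current] = {sym: frozenset(t) for sym, t in moves.items()}
        for nxt in dfa[current].values():
            if nxt not in dfa:
                queue.append(nxt)
    return dfa, start_set


# Toy NFA (assumed): from state 0, hash 0x2 may lead to state 1 or state 2.
TOY_NFA = {
    0: {0x2: {1, 2}},
    1: {0x5: {3}},
    2: {0x9: {3}},
    3: {0x1: {0}, 0x4: {1}, 0x6: {2}},
}

dfa, s0 = nfa_to_dfa(TOY_NFA, 0)
print(len(TOY_NFA), "NFA states ->", len(dfa), "DFA states")
```

In this toy case the 4-state NFA becomes a 5-state DFA, because the combined state {1, 2} is added alongside the single states {1} and {2}. Each DFA state has at most one successor per hash, which is what allows a single lookup per instruction at runtime.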
Evaluation
Monitoring speed
• Single memory access
• Lookup into fixed-size register file
Memory size of monitor
• More states due to NFA-to-DFA conversion
• More states due to multiple entries in memory for certain states
• In practice, overhead is below 10%
Very fast and compact hardware monitor
Benchmarks
• NpBench [1]
  – Modern network applications
  – Three specific functional groups
    • Traffic management and quality of service group
    • Security and media processing group
    • Packet processing group
[1] B.K. Lee and L.K. John, "NpBench: a benchmark suite for control plane and data plane applications for network processors," in Proc. of 21st International Conference on Computer Design, pp. 226-233, 13-15 Oct. 2003.
Prototype Implementation on FPGA
Small overhead compared to processor core (4096 states):
Correct operation: attack packet detected and dropped
Attack with Defense in Place
Attack packet dropped, router continues to operate
Throughput
Throughput performance of the network processor with security monitor
• CM protocol and IPv4 application
Multicore Monitor
Dynamic workloads pose problem for hardware monitor
• Processing may differ between packets
• Monitors need to match processing
Mapping between processors and monitors
• 1-to-1 mapping requires frequent reload of monitor
• Any-to-any mapping costly to implement
• Clusters with n-to-m mapping provide balance
Interconnect is configured dynamically depending on workload
• Mapping between core and monitor
[Figure: core-to-monitor mapping options, showing paired cores and monitors and clusters of cores sharing a larger set of monitors]
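The trade-off between 1-to-1 and clustered n-to-m mapping can be illustrated with a small Python model. Everything here is hypothetical: the `Cluster` class, its allocation policy (reuse a free monitor that already holds the matching graph, otherwise load an empty or free one), and the alternating workload are assumptions chosen only to show why extra monitors per cluster reduce graph loads.

```python
class Cluster:
    """Hypothetical cluster in which a core may use any monitor."""

    def __init__(self, num_monitors):
        self.loaded = [None] * num_monitors   # graph held by each monitor
        self.busy = [False] * num_monitors
        self.loads = 0                        # number of graph (re)loads

    def attach(self, app):
        """Return a monitor index for `app`, or None if all are busy."""
        # Prefer a free monitor that already holds the matching graph.
        for i, graph in enumerate(self.loaded):
            if graph == app and not self.busy[i]:
                self.busy[i] = True
                return i
        free = [i for i, b in enumerate(self.busy) if not b]
        if not free:
            return None
        # Prefer an empty monitor so graphs already loaded stay available.
        i = next((j for j in free if self.loaded[j] is None), free[0])
        self.loaded[i] = app
        self.busy[i] = True
        self.loads += 1
        return i

    def release(self, i):
        self.busy[i] = False


def count_loads(num_monitors, workload):
    """Process packets one at a time on a single core and count loads."""
    cluster = Cluster(num_monitors)
    for app in workload:
        monitor = cluster.attach(app)
        cluster.release(monitor)
    return cluster.loads


workload = ["ipv4", "cm", "ipv4", "cm"]   # processing differs per packet
print("1 monitor :", count_loads(1, workload), "loads")
print("3 monitors:", count_loads(3, workload), "loads")
```

With a single monitor the graph is replaced on every packet of this workload, while three monitors need only one load per application.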
System Architecture of Clustered System
Multiple cores can access multiple monitors
• Dynamic configuration of crossbar
Secure loading of monitors through external interface
[Figure: clustered system with n processors, inter-core interconnect, crossbars, m monitors, centralized monitor memory, AES, control processor and control signals, external memory, external interface, and network interface]
Cluster Design
Simple implementation of clustered monitor
• Dynamic configuration through programming of demultiplexers
[Figure: cluster of four NP cores and six monitors (Monitor_1 to Monitor_6). Labeled signals: Hash_1 to Hash_4 from the cores, Monitor Select, Proc Select, and Reset/Recover from Monitor_1 to 6 back to the cores.]
Dual-Ported Monitor Implementation
Memory of monitor can be shared between two monitors
• Effective use of dual-ported memory
• Two monitoring graphs can be used in parallel
[Figure: Monitor 1 and Monitor 2 sharing one memory that holds Monitoring Graph 1 and Monitoring Graph 2. Each monitor contains a 4-bit hash function fed by the processor instruction, one-hot encoding, hash comparison against the stored valid hashes, next state select, and graph select, and produces a reset/recover signal.]
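A software model of the shared-memory idea might look like the sketch below. It is an illustration under stated assumptions, not the slide's datapath: the `Entry` layout (a 16-bit valid-hash vector plus a per-hash next-state table), the `MonitorPort` class, and the use of a base offset in place of graph select are all hypothetical. Only the one-hot comparison against a vector of valid hashes follows the block labels in the figure.

```python
from dataclasses import dataclass
from typing import Dict, List


def one_hot(h4: int) -> int:
    """Encode a 4-bit hash as a 16-bit one-hot vector."""
    return 1 << (h4 & 0xF)


@dataclass
class Entry:
    valid_hashes: int            # bit h is set if hash h is allowed here
    next_state: Dict[int, int]   # assumed layout: hash -> next state


class MonitorPort:
    """One of two monitors reading the same memory (one per port)."""

    def __init__(self, memory: List[Entry], base: int):
        self.memory = memory
        self.base = base         # stands in for graph select
        self.state = 0

    def step(self, h4: int) -> bool:
        """Return False (reset/recover) if the hash is not valid here."""
        entry = self.memory[self.base + self.state]
        if (entry.valid_hashes & one_hot(h4)) == 0:
            self.state = 0
            return False
        self.state = entry.next_state[h4 & 0xF]
        return True


# One shared memory holding two small graphs (toy contents, assumed).
memory = [
    Entry(one_hot(0x3), {0x3: 1}),                          # graph 1, state 0
    Entry(one_hot(0x7) | one_hot(0xA), {0x7: 0, 0xA: 1}),   # graph 1, state 1
    Entry(one_hot(0x5), {0x5: 1}),                          # graph 2, state 0
    Entry(one_hot(0xC), {0xC: 0}),                          # graph 2, state 1
]

monitor1 = MonitorPort(memory, base=0)
monitor2 = MonitorPort(memory, base=2)

trace1 = [monitor1.step(h) for h in (0x3, 0x7, 0x7)]
trace2 = [monitor2.step(h) for h in (0x5, 0xC, 0x3)]
print(trace1, trace2)
```

The two ports keep independent state but read the same `memory` list, which is the software analogue of two monitoring graphs being checked in parallel from one dual-ported memory.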
Prototype Implementation on FPGA
Resource cost for a multi-core system (4 cores, 6 monitors):
Conclusions
Current and future Internet needs to meet new demands
Programmable routers provide packet processing platform
• Systems problem: security vulnerabilities
• Attacks can be launched within data plane (i.e., not control access)
• Monitor-based hardware defense mechanism is effective
Hardware monitor design and prototype
• Uses compact DFA (less than 10% more states than NFA)
• Verification with single memory access per instruction
• Defense shown for Harvard architecture attack
Technique extended to multicore network processors
Promising defense against attacks on network infrastructure