Research to Support Robust Cyber Defense Fred B. Schneider Study commissioned for Dr. Jay Lala DARPA Information Technology Office
Dec 17, 2015
2
Study Committee
Jim Anderson, University of North Carolina
Stephanie Forrest, University of New Mexico
Carl Landwehr, National Science Foundation
Teresa Lunt, Palo Alto Research Center
Mike Reiter, Carnegie Mellon University
Fred B. Schneider, Cornell University (chairman)
Kishor Trivedi, Duke University
3
Study Process
Two meetings in Washington, DC.
Briefings from subject-matter experts:
Tarek Abdelzaher, Univ Virginia
Massoud Amin, EPRI
Anish Arora, Ohio State Univ
Steve Bellovin, AT&T
Ken Birman, Cornell Univ
Alan Demers, Cornell Univ
Steve Goddard, Univ Nebraska
Mohamed Gouda, Univ Texas
Ted Herman, Univ Iowa
Erica Jen, Santa Fe Institute
Chandra Kintala, Avaya
Simon Levin, Princeton Univ
Alfred Spector, IBM Rsch
Wietse Venema, IBM Rsch
4
Study Goals
Strategy for defense:
– [Prior]: Prevention eliminates vulnerabilities.
– [Near term]: Render ongoing attacks ineffective through dynamic changes to state.
– [Longer term]: Alter vulnerabilities (viz. co-evolution).
– [Eventually]: Self-repair of identified problems.
Identify research areas to enable the design and implementation of networked computer systems that tolerate attacks and failures by automatically changing state or structure during execution.
5
Presentation Outline
– Where is industry heading?
– A characterization of robustness.
– New points of leverage.
– Complementary research.
6
Industry Context: IBM
IBM perceived customer concerns:
– Total Cost of Ownership. Solution: self-managing / self-configuring systems.
– Emphasis on Quality of Service. Solution: self-optimizing / scalability + isolation.
– Flexibility to deploy new applications.
New IBM initiative: autonomic computing.
Not concerned with Byzantine failures or highly malicious attacks.
7
Industry Context: Microsoft
Microsoft perceived customer concerns:
– Total Cost of Ownership. Solution: automatic patch and upgrade.
– Harness the network. Solution: interoperability and transparency.
– Security [Bill Gates internal memo on “Trustworthy Computing”, approx Jan 16, 2002]. Trade assurance for bugs and complexity (?)
New Microsoft initiative: .NET.
Not concerned with Byzantine failures or highly malicious attacks.
8
Industry Context: Power Grid
EPRI perceived concerns:
– Reliability of grid. Propagation of effect. Operation with reduced capacity cushion.
– Move to decentralized, market-based control. Separate delivery channel and control channels.
Little concern about hostile attacks (either in delivery channel or control channel).
9
Summary: Industry versus DoD Needs
[Figure: chart with two axes. Attacks range from random to malicious; failures range from benign to Byzantine. Two regions are labeled “Industry Direction” and “DoD Needs”.]
10
Addressing DoD Needs:
Dimensions of Robustness [S. Levin]
[Figure: Robustness = a combination of Redundancy, Modularity, and Diversity.]
The time is right to exploit new opportunities!
11
Addressing DoD Needs:
New Research Opportunities
– Temporal and spatial run-time diversity.
– Scalable redundancy.
– Self-stabilization.
– Natural robustness via biological metaphors and systemic effects.
12
Research Thrust:
Run-time Diversity
Limited success to date: Obtaining diversity manually is expensive.
– Multiplies costs associated with design, implementation, and test.
– Integration and interoperation are expensive.
Obtaining diversity automatically has not been explored aggressively.
– Modern compiler technology could help here.
– Run-time environments are also possible leverage points.
13
Creating Diversity at Run-time
Run-time diversity is associated with
– randomness, or
– non-determinacy.
The impact will depend on where it is applied:
– application-level programs
– system-level programs
– generation of application/system.
14
Run-time Diversity in Cryptography
Recent crypto advances introduce:
Spatial diversity: Different components hold different, but related, secrets.
– Compromising one doesn’t compromise all.
Temporal diversity: Secret state is changed from time to time.
– Limits adversary’s abilities after compromises.
15
Example of Spatial Diversity in Cryptography:
Function Sharing
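[Figure: a service made up of four servers, each holding one share: s1, s2, s3, s4.]
Public K / private k = [s1, s2, s3, s4]
Input m; three servers return partial results pr1, pr2, pr3.
sig = combine({pr1, pr2, pr3})
verify(K, m, sig) succeeds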
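The combining step can be illustrated with a toy sketch. It assumes Shamir shares of the key and the keyed function f(m) = m^k in a small prime-order group; these choices, the parameters P and Q, and the helper names are illustrative assumptions, not taken from the study. It shows only that any three partial results combine to the same value as the full key would give, without the key ever being reassembled. Checking sig against a public key K, as on the slide, is not modeled.

```python
import random
from itertools import combinations

# Toy parameters (hypothetical, far too small for real use): P = 2*Q + 1,
# and the squares mod P form a group of prime order Q.
P, Q = 1019, 509

def share_key(k, n=4, t=3, rng=random):
    """Split k into n Shamir shares over Z_Q; any t of them determine k."""
    coeffs = [k] + [rng.randrange(Q) for _ in range(t - 1)]
    return {x: sum(c * pow(x, j, Q) for j, c in enumerate(coeffs)) % Q
            for x in range(1, n + 1)}

def partial_result(m, share):
    """One server's contribution: m raised to that server's share."""
    return pow(m, share, P)

def combine(partials):
    """Combine {server_id: partial} using Lagrange coefficients in the exponent."""
    result = 1
    for i, pr in partials.items():
        lam = 1
        for j in partials:
            if j != i:
                lam = lam * j * pow((j - i) % Q, -1, Q) % Q
        result = result * pow(pr, lam, P) % P
    return result

k = 123                      # private key; never reassembled below
shares = share_key(k)        # s1..s4
m = pow(42, 2, P)            # message encoded as an element of the order-Q group
expected = pow(m, k, P)      # what a single holder of k would compute
for ids in combinations(shares, 3):
    sig = combine({i: partial_result(m, shares[i]) for i in ids})
    assert sig == expected   # every 3-of-4 subset agrees
```

A real deployment would use a cryptographically secure random source and a standard threshold signature scheme; the sketch is meant only to make the share/partial/combine flow concrete.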
16
Example of Temporal Diversity in Cryptography:
Forward-Secure Signatures
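[Figure: the public key K stays fixed while the private key rolls forward each time period.]
Time period i: private key k_i
(roll forward)
Time period i+1: private key k_{i+1}
verify(K, m, i+1, sign(k_{i+1}, m)) succeeds
verify(K, m, i, sign(k_{i+1}, m)) fails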
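The roll-forward idea can be sketched with a symmetric analogue. This is not a forward-secure signature scheme: it assumes a trusted verifier that holds the initial key k0 in place of a public key K, and it uses HMAC tags in place of signatures. The function names are hypothetical. What it does show is that period keys are derived one way, so a key captured in period i+1 does not yield tags that verify for period i.

```python
import hashlib
import hmac

def roll_forward(key: bytes) -> bytes:
    """Derive the next period's key with a one-way hash; the signer then discards the old key."""
    return hashlib.sha256(b"roll" + key).digest()

def key_for_period(k0: bytes, period: int) -> bytes:
    """Key for a given period, obtained by rolling k0 forward `period` times."""
    key = k0
    for _ in range(period):
        key = roll_forward(key)
    return key

def sign(key: bytes, m: bytes) -> bytes:
    """Tag m under the current period key (HMAC stands in for a signature)."""
    return hmac.new(key, m, hashlib.sha256).digest()

def verify(k0: bytes, m: bytes, period: int, tag: bytes) -> bool:
    """Assumed trusted verifier: recomputes the period key from k0 and checks the tag."""
    expected = sign(key_for_period(k0, period), m)
    return hmac.compare_digest(expected, tag)

k0 = b"initial secret for period 0"
i = 3
k_next = key_for_period(k0, i + 1)       # all the signer still holds in period i+1
tag = sign(k_next, b"m")
assert verify(k0, b"m", i + 1, tag)      # accepted for period i+1
assert not verify(k0, b"m", i, tag)      # rejected for the earlier period i
```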
17
Example of Spatial and Temporal Diversity:
Proactive Function Sharing
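[Figure: a service made up of four servers; shares s1, s2, s3, s4 are later replaced by new shares t1, t2, t3, t4.]
Before refresh: Public K / private k = [s1, s2, s3, s4]
After refresh: Public K / private k = [t1, t2, t3, t4]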
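One common way to refresh shares without changing the key is to add shares of a random polynomial whose constant term is zero. The sketch below assumes Shamir sharing with a 3-of-4 threshold; the modulus and helper names are illustrative assumptions. The key is reconstructed here only to check the sketch, which a function-sharing service would never do.

```python
import random

Q = 2**61 - 1  # a prime modulus, chosen only for illustration

def poly_shares(constant, n, t, rng):
    """Shares at x = 1..n of a random degree-(t-1) polynomial with the given constant term."""
    coeffs = [constant] + [rng.randrange(Q) for _ in range(t - 1)]
    return {x: sum(c * pow(x, j, Q) for j, c in enumerate(coeffs)) % Q
            for x in range(1, n + 1)}

def reconstruct(shares):
    """Lagrange interpolation at 0; used here only to check the sketch."""
    total = 0
    for i, s in shares.items():
        lam = 1
        for j in shares:
            if j != i:
                lam = lam * j % Q * pow((j - i) % Q, -1, Q) % Q
        total = (total + s * lam) % Q
    return total

def refresh(shares, t, rng):
    """Add shares of a zero-constant polynomial: same key k, entirely new shares."""
    zero = poly_shares(0, len(shares), t, rng)
    return {x: (s + zero[x]) % Q for x, s in shares.items()}

rng = random.Random(7)                 # a real system would use a secure random source
k = 123456789
s = poly_shares(k, 4, 3, rng)          # s1..s4
t_shares = refresh(s, 3, rng)          # t1..t4
assert reconstruct({x: s[x] for x in (1, 2, 3)}) == k
assert reconstruct({x: t_shares[x] for x in (2, 3, 4)}) == k
# Mixing old and new shares, e.g. {s1, s2, t3}, reconstructs k only with
# negligible probability, so shares stolen before a refresh do not add up
# with shares stolen after it.
```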
18
Run-time Diversity in Cryptography:
Next Steps
Deploy principles of crypto run-time diversity (both spatial and temporal) in the construction of distributed services.
Leverage existing crypto diversity more broadly:
Practical multi-party computation? (= “Spread-spectrum” computing.)
19
Research Thrust:
Scalable Redundancy
Redundancy has been widely studied as a method to achieve fault tolerance:
– Replication of servers
– Redundant routing
The key problem now is scalability.
20
Scalable Redundancy:
Central Challenge
Scalable methods for handling redundancy provide new, often weaker, types of guarantees:
– Probabilistic
– Eventual consistency
– Monotonic convergence
How to build systems with these new guarantees?
– Transform weak guarantees into stronger ones?
– Settle for combinations of the new guarantees?
21
Example of Scalable Redundancy:
Epidemic and Gossip Protocols
Key characteristic: Information exchanges involve randomly or opportunistically chosen gossip partners.
Resulting protocols are:
– fault-tolerant,
– scalable, and
– self-organizing.
The few actual deployments are promising:
– Xerox PARC Clearinghouse Replicated Database
– MIT Lazy Replication
– Xerox Bayou database system
– Astrolabe distributed spreadsheet
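The random choice of gossip partners can be sketched with a small simulation. It assumes a simple push model in which every informed node contacts one uniformly chosen node per round; the function name and parameters are hypothetical, and the sketch does not describe any of the systems listed above. Running it for several sizes gives a feel for how slowly the number of rounds grows with the number of nodes.

```python
import random

def push_gossip(n, rng, max_rounds=10_000):
    """Spread a rumor from node 0; return (rounds used, number of nodes informed)."""
    informed = {0}
    rounds = 0
    while len(informed) < n and rounds < max_rounds:
        # Every informed node pushes the rumor to one uniformly chosen partner.
        targets = {rng.randrange(n) for _ in informed}
        informed |= targets
        rounds += 1
    return rounds, len(informed)

for n in (100, 1_000, 10_000):
    rounds, reached = push_gossip(n, random.Random(1))
    print(f"n={n}: {reached} nodes informed after {rounds} rounds")
```

No node has a special role and no partner choice is essential, which is why lost messages or failed nodes only slow the spread instead of stopping it.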
22
Example of Scalable Redundancy:
Quorum Systems
Key characteristic: Operations access quorums of servers. Quorums can be a subset of all servers.
[Figure: two quorums marked within a set of servers.]
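A minimal sketch of quorum-based reads and writes follows. It assumes versioned values, N = 7 servers, and write and read quorums of size 4, so that any read quorum shares at least one server with any write quorum; the class and function names are hypothetical. Concurrency, server failures, and Byzantine servers are deliberately left out.

```python
import random

class Replica:
    """One server's copy of the data, tagged with a version number."""
    def __init__(self):
        self.version, self.value = 0, None

def write(replicas, quorum, version, value):
    """Install (version, value) on the servers of one write quorum only."""
    for i in quorum:
        if version > replicas[i].version:
            replicas[i].version, replicas[i].value = version, value

def read(replicas, quorum):
    """Ask one read quorum and return the highest-versioned value seen."""
    best = max((replicas[i] for i in quorum), key=lambda r: r.version)
    return best.version, best.value

N, W, R = 7, 4, 4    # W + R > N, so every read quorum meets every write quorum
rng = random.Random(0)
replicas = [Replica() for _ in range(N)]
for version in range(1, 6):
    write(replicas, rng.sample(range(N), W), version, f"v{version}")
    # A different, randomly chosen quorum still sees the latest write.
    assert read(replicas, rng.sample(range(N), R)) == (version, f"v{version}")
```

Each operation touches only 4 of the 7 servers, which is the source of the scalability benefit: load per server drops as quorums become a smaller fraction of the system.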
23
Scalable Redundancy:
Next Steps
Accommodate weaker properties of scalable redundancy technologies in higher-level apps.
Use realistic network topologies:
– Irregularity in interconnection.
– Clustering and non-uniform link bandwidths.
Understand and exploit interactions with QoS:
– Implement QoS guarantees using gossip protocols.
– Leverage existing QoS guarantees in gossip protocols.
Understand and exploit threshold phenomena.
24
Research Thrust:
Self-Stabilization
Key characteristic: System eventually transitions to normal operating states in response to arbitrary transitions (to arbitrary states).
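[Figure: a fault or attack moves the system from good states to bad states; the system then converges back to good states.]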
Self-stabilization expands the diversity of states from which a system can operate.
More states: Fewer assumptions. Fewer vulnerabilities.
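A classic illustration of this property, chosen here as an example and not drawn from the study, is Dijkstra's K-state token ring: from any initial assignment of states, the ring converges to a configuration in which exactly one machine holds the token. The sketch below assumes 5 machines, K = 6, and a scheduler that lets one privileged machine move at a time; the helper names are hypothetical.

```python
import random

def privileged(x):
    """Indices of machines that currently hold a privilege (token)."""
    n = len(x)
    holders = [0] if x[0] == x[n - 1] else []
    holders += [i for i in range(1, n) if x[i] != x[i - 1]]
    return holders

def step(x, K, rng):
    """Let one arbitrarily chosen privileged machine make its move."""
    i = rng.choice(privileged(x))      # at least one machine is always privileged
    if i == 0:
        x[0] = (x[0] + 1) % K          # machine 0 advances its counter
    else:
        x[i] = x[i - 1]                # other machines copy their left neighbor

rng = random.Random(0)
n, K = 5, 6
for trial in range(20):
    x = [rng.randrange(K) for _ in range(n)]   # arbitrary, possibly corrupted state
    for _ in range(500):
        step(x, K, rng)
    assert len(privileged(x)) == 1             # exactly one token remains
```

No machine detects that the state was bad; each follows the same local rule and correct behavior re-emerges, matching the "fewer assumptions" point above.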
25
Self-Stabilization:
Hallmarks of Systems
Highly decentralized: Convergence is an “emergent property” and error states are tolerated without being detected.
Forgetful: State is regenerated; old state is forgotten.
26
Self-Stabilization:
Promise of Success
The few actual deployments are promising:
– Sun’s Netra Proxy Server
– MS Research Aladdin Lookup Service
– DEC/Compaq Autonet Configuration Protocols
Self-stabilization is well suited to network protocols, where transient disruptions are already tolerated by upper system levels.
27
Self-Stabilization:
Next Steps
How might self-stabilization be extended?
– Convergence from only some configurations.
– Distinguish state components (e.g. keys, secrets, models of reality) and have only some converge.
Scalability?
– System size, convergence time, severity of transient.
Dimensions of containment:
– Space: bound infection / contamination.
– Time: speed for convergence.
– Safety: how badly is function degraded during repair.
Composition and control:
– Go beyond control structure to abstract data types, etc.
– Develop basis for compositional construction.
28
Research Thrust:
Natural Robustness
Biological and other robustness metaphors…
– Work at multiple levels:
  Time scale (lifetime of organism vs species).
  Structure (cell vs organism vs ecosystem).
– Hallmarks of such robustness:
  Robustness at one level translates into robustness at a different level.
  Highly decentralized: Convergence is an “emergent property.”
  Widespread use of diversity.
  Adaptive and always evolving.
  Use disposable components.
29
Natural Robustness:
Leveraging Systemic Effects
Natural robustness gains much from systemic effects. So can we.
– Epidemiology: logarithmic delays.
– Percolation theory: critical point phenomena, bimodal behaviors.
– Graph theory: small-world phenomena.
30
Natural Robustness:
Promise of Success
The few actual deployments are promising:
– Artificial immunology applied to cyber-security, robotics, and data mining.
Convergence of biology and computing:
– Trends in computing have biological interpretations: software rejuvenation (e.g. Apache web server).
– Biology making greater use of computing: gene-expression analysis, phylogenetic tree reconstruction, cell signaling models, minimal cell project, smart matter.
31
Natural Robustness:
Next Steps (1)
Pair new results from biology with robustness challenges in computer networks.
– Exploit information about software evolution. E.g., phylogenetic trees for predicting vulnerabilities.
– Intra-cellular signaling and cascades (chemotaxis).
– Inter-cellular signaling networks (e.g., immune systems).
– Genetics: genetic buffering, individual gene repairs, evolutionary mechanisms (genotype/phenotype mappings).
– Ecosystem modeling: diversity, keystone species, patch models, allometry, resource flows.
32
Natural Robustness:
Next Steps (2)
Further utilize systemic effects in networked systems:
– Epidemic and gossip protocols.
– Survivability of computer networks.
– Propagation of power failures in electrical grids.
– Epidemiological approaches to computer viruses.
33
Robust Cyber Defense: Complementary Research (1)
Support for on-the-fly system change:
– Software rejuvenation (refresh data or environment)
– Control structure / data representation change
– Adaptive fault tolerance (fault-tolerance assumptions change)
– Self-healing real-time schedulers
Enhanced detection:
– Growing memory size enables rollback to a previous state
– Application-specific monitoring
34
Robust Cyber Defense: Complementary Research (2)
Machine learning
– Reinforcement learning (to adjust parameters in accordance with new information or feedback).
– Genetic programming (to evolve small software components).