Architecting for the Cloud Len and Matt Bass Elasticity

Architecting for the cloud elasticity security

Aug 27, 2014


Len Bass

This is day 3 of the course Architecting for the Cloud
Transcript
Page 1: Architecting for the cloud elasticity security

Architecting for the Cloud

Len and Matt Bass

Elasticity

Page 2: Architecting for the cloud elasticity security

Link to Yesterday’s lectures

http://www.slideshare.net/lenbass/architecting-for-the-cloud-scabilityavailability

Page 3: Architecting for the cloud elasticity security

Topics

Scalability is about acquiring resources but once they are acquired, they still must be used.

Elasticity is about how to use the resources.

This requires understanding

• Concurrency

• State

and their interactions


Page 4: Architecting for the cloud elasticity security

What is concurrency?

• Concurrency means performing several activities simultaneously.

• Concurrency is used to improve performance.

Page 5: Architecting for the cloud elasticity security

How do concurrent activities come to be?

• Explicitly, through your code creating a new thread or process.

• Implicitly, through some support system creating a new thread or process

– Operating system

– Web server

– Database management system

• Implicitly, through the infrastructure creating a new virtual machine

– Elasticity in the cloud

– During deployment of your system

Page 6: Architecting for the cloud elasticity security

Key concepts

• Atomicity – An atomic operation cannot be divided. It is all or nothing.

• Time – It takes time to perform an operation.

– Computation

– Messages transferred over a network

– Reading/writing information from a disk (rotating or solid state)

• Dependency – Coordination among concurrent activities is necessary if they are sharing resources or results.

• Problems arise because operations take time and can be interrupted, i.e. they are not atomic.

Page 7: Architecting for the cloud elasticity security

Synchronous vs asynchronous

• Synchronous coordination between two concurrent processes means that process A sends a message to process B and waits for a response.

• Asynchronous coordination means that process A does not wait for a response.

– It can poll for a response

– A response from process B can be sent as an event.

• In either case, coordination takes time and so coordination is not an atomic operation.


Page 8: Architecting for the cloud elasticity security

Some problems with concurrent activities

• Time stamps

– Many protocols involve putting a time stamp on messages for error detection and ordering purposes.

– Time stamps are often used to identify log messages used for debugging problems.

– In some environments, e.g. the stock market, trades must be satisfied in the sequence in which they arrive.

• Race conditions – two processes simultaneously access the same resource.

• Inconsistency – If two activities are being performed simultaneously, data may become inconsistent.

Page 9: Architecting for the cloud elasticity security

Clock synchronization

• Suppose two different computers are connected via a network. How do they synchronize their clocks?

• If one computer sends its time reading to another, it takes time for the message to arrive.

• NTP (Network Time Protocol) can be used to synchronize time on a collection of computers.

– Accurate to around 1 millisecond in local area networks

– Accurate to around 10 milliseconds over the public internet

– Congestion can cause errors of 100 milliseconds or more.
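The problem that a time reading is already old when it arrives is usually handled by timestamping both ends of a request/response exchange. The following is a minimal sketch in C of the standard NTP-style offset and delay arithmetic. The struct and field names are illustrative, and the offset estimate assumes the network delay is the same in both directions, which is exactly the assumption that congestion breaks.

```c
/* Timestamps for one request/response exchange, in milliseconds.
 * t0: client sends request   (client clock)
 * t1: server receives it     (server clock)
 * t2: server sends reply     (server clock)
 * t3: client receives reply  (client clock)
 * Field names are illustrative, not taken from the lecture. */
struct exchange {
    double t0, t1, t2, t3;
};

/* Estimated amount the client clock must be advanced to match the server.
 * Assumes the outbound and return network delays are equal. */
double clock_offset(struct exchange e)
{
    return ((e.t1 - e.t0) + (e.t2 - e.t3)) / 2.0;
}

/* Round-trip network delay, excluding the server's processing time. */
double round_trip_delay(struct exchange e)
{
    return (e.t3 - e.t0) - (e.t2 - e.t1);
}
```

For example, with t0 = 100, t1 = 160, t2 = 162 and t3 = 112, the client clock is estimated to be 55 ms behind the server and the round trip took 10 ms.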


Page 10: Architecting for the cloud elasticity security

Suppose NTP is insufficiently accurate

• The financial industry is spending hundreds of millions of dollars to reduce latency between Chicago and New York by 3 milliseconds.

– Well within the error range of NTP

• GPS time is accurate within

– 14 nanoseconds (theoretically)

– 100 nanoseconds (mostly)

• Timestamp messages with GPS time

– Used by electric companies to measure phase angle

– Used by Google to coordinate time across all of their distributed systems.

– Requires specialized hardware and installation not yet cheaply available.

Page 11: Architecting for the cloud elasticity security

Example of a race condition

• Suppose withdrawals are being made from a bank account. If there are two users simultaneously withdrawing, the following sequence can occur.

User 1                    User 2                    Acct amount
                                                    1000
Read account (1000)                                 1000
                          Read account (1000)       1000
Withdraw 100 (900)                                  1000
Write new amount (900)                              900
                          Withdraw 100 (900)        900
                          Write new amount (900)    900

• Two withdrawals of 100 were made, but the account ends at 900 rather than 800.
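The interleaving above can be replayed deterministically. The sketch below is written in C and uses no threads: the steps are simply executed in the order shown, with each user holding a private copy of the balance between its read and its write. All names are illustrative, and the global variable stands in for the stored account.

```c
/* Simulated account store. In a real system this would be a database row. */
static int account = 1000;

/* Each user keeps a private copy of the balance between read and write. */
struct user {
    int local;
};

static void read_account(struct user *u)        { u->local = account; }
static void withdraw(struct user *u, int amt)   { u->local -= amt; }
static void write_account(const struct user *u) { account = u->local; }

/* Replays the interleaving from the table: both users read before either
 * writes, so the second write overwrites the first (a lost update). */
int run_interleaved(void)
{
    struct user u1, u2;
    account = 1000;
    read_account(&u1);      /* user 1 sees 1000 */
    read_account(&u2);      /* user 2 sees 1000 */
    withdraw(&u1, 100);     /* user 1 computes 900 */
    write_account(&u1);     /* account = 900 */
    withdraw(&u2, 100);     /* user 2 also computes 900 */
    write_account(&u2);     /* account = 900, one withdrawal is lost */
    return account;
}

/* Same withdrawals, but each read-modify-write runs to completion
 * before the next one starts. */
int run_serialized(void)
{
    struct user u1, u2;
    account = 1000;
    read_account(&u1); withdraw(&u1, 100); write_account(&u1);
    read_account(&u2); withdraw(&u2, 100); write_account(&u2);
    return account;
}
```

run_interleaved() leaves 900 in the account, while run_serialized() leaves the expected 800. The difference is only the order in which the same six steps are executed.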

Page 12: Architecting for the cloud elasticity security

Example of inconsistency

• A cache is frequently used to keep data locally rather than requiring it to be fetched for each request. Web browsers, for example, cache web pages.

• For every request, the sequence is

1) Look in the cache to see if the request can be satisfied with the contents of the cache.

2) If not, retrieve the information, return it to the requester, and place it in the cache.

• Now suppose the web page is changed at its source.

• Retrievals of the web page from the cache will retrieve an out-of-date version of the web page.
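A minimal sketch of this sequence in C, assuming a one-entry cache and a string standing in for the web page. The function names and the fixed buffer size are illustrative. The point is that update_source changes the source without the cache ever being told.

```c
#include <string.h>

#define PAGE_LEN 64

/* The "source" web page and a one-entry cache. Names are illustrative. */
static char source_page[PAGE_LEN] = "version 1";
static char cached_page[PAGE_LEN];
static int  cache_valid = 0;

/* Step 1: try the cache. Step 2: on a miss, fetch from the source,
 * store the result in the cache, and return it. */
const char *get_page(void)
{
    if (!cache_valid) {
        strncpy(cached_page, source_page, PAGE_LEN - 1);
        cached_page[PAGE_LEN - 1] = '\0';
        cache_valid = 1;
    }
    return cached_page;
}

/* The page changes at its source; nothing tells the cache. */
void update_source(const char *new_content)
{
    strncpy(source_page, new_content, PAGE_LEN - 1);
    source_page[PAGE_LEN - 1] = '\0';
}
```

After a first get_page() call and then update_source("version 2"), a second get_page() still returns "version 1".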


Page 13: Architecting for the cloud elasticity security

Solutions bring new problems

• One technique to prevent race conditions is to lock critical resources.

• Can lead to deadlock – two processes waiting for each other to release critical resources

– Process one gets a lock on row 1 of a database.

– Process two gets a lock on row 2.

– Process one waits for process two to release its lock on row 2.

– Process two waits for process one to release its lock on row 1.

– No progress.
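The deadlock sequence can be replayed with a small lock table. This is a sketch in C under simplifying assumptions: the lock table is a plain in-memory array standing in for database row locks, try_lock returns immediately instead of blocking, and all names are hypothetical.

```c
#define NUM_ROWS 3
#define FREE 0

/* lock_owner[r] holds the id of the process that locked row r, or FREE.
 * A hypothetical in-memory lock table, standing in for database row locks. */
static int lock_owner[NUM_ROWS];

void reset_locks(void)
{
    for (int r = 0; r < NUM_ROWS; r++)
        lock_owner[r] = FREE;
}

/* Returns 1 if `process` now holds the lock on `row`, 0 if it must wait. */
int try_lock(int process, int row)
{
    if (lock_owner[row] == FREE || lock_owner[row] == process) {
        lock_owner[row] = process;
        return 1;
    }
    return 0;
}

/* Replays the sequence above. Returns 1 if both processes end up
 * waiting for a lock the other holds, i.e. deadlock. */
int replay_deadlock(void)
{
    reset_locks();
    try_lock(1, 1);                    /* process one locks row 1 */
    try_lock(2, 2);                    /* process two locks row 2 */
    int p1_waits = !try_lock(1, 2);    /* process one needs row 2 */
    int p2_waits = !try_lock(2, 1);    /* process two needs row 1 */
    return p1_waits && p2_waits;
}
```

replay_deadlock() returns 1: each process holds one lock and waits for the other, and neither will ever release.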


Page 14: Architecting for the cloud elasticity security

Yet more problems

• Locks are logical structures maintained in software or in persistent storage.

• Getting a lock across distributed systems is not an atomic operation.

– It is possible that, while one process is requesting a lock, another process acquires it. This can go on for a long time (it is called livelock if there is no possibility of ever acquiring the lock).

• Suppose the virtual machine holding the lock fails. Then the owner of the lock can never release it.


Page 15: Architecting for the cloud elasticity security

Is there a solution?

• The general problem is that you want to manage synchronization of data across a distributed set of servers where up to half of the servers can fail.

• Paxos is a family of algorithms that use consensus to manage state concurrency. Complicated and difficult to implement.

• An example of the problems

– Choose one server as the master that keeps the “authoritative” state.

– Now the master server fails. Need to

• Find a new master

• Make sure it is up to date with the authoritative state.

Page 16: Architecting for the cloud elasticity security

Luckily

• Several open source systems are now available that

– Implement Paxos or an alternative consensus algorithm

– Are reasonably easy to use.

• Two such systems are

– Memcached – discussed at the end of this lecture

– Zookeeper – discussed in tomorrow’s lecture.

Page 17: Architecting for the cloud elasticity security

In general

• Introducing concurrency will improve performance but also introduces problems.

• Concurrency is a constant consideration when architecting for the cloud.

– Coordinating activities across concurrent processes is difficult and prone to many errors.

– Allowing for failure complicates coordination of activities.

• Systems are available to provide concurrency for small amounts of data without your having to worry about the details.


Page 18: Architecting for the cloud elasticity security

Topics

In order to understand how to achieve elasticity you must understand

• Concurrency

• State

and their interactions


Page 19: Architecting for the cloud elasticity security

Recall Load Balancer

• Client makes a request that is routed to a server through a load balancer

Page 20: Architecting for the cloud elasticity security

Message sequence – client makes a request

Servers

Clients

Load Balancer

Page 21: Architecting for the cloud elasticity security

Message sequence- request arrives at load balancer

Servers

Clients

Load Balancer

Page 22: Architecting for the cloud elasticity security

Message sequence – request is sent to one server

Servers

Clients

Load Balancer

Page 23: Architecting for the cloud elasticity security

Message sequence – reply goes back to client

Servers

Clients

Load Balancer

Page 24: Architecting for the cloud elasticity security

Message sequence – now client makes second request – does it matter which server it goes to?

Servers

Clients

Load Balancer

? ? ?

Page 25: Architecting for the cloud elasticity security

“Sticky” http requests

• Normally the load balancer will route requests depending on the load of the servers attached to it.

• This is why it is called a “load balancer”.

• A client can request to be always routed to the same server. This is done by making a “sticky” http request.

• Dangerous for two reasons:

– Server may be overloaded and response delayed

– Server may have failed and no response is forthcoming.

• We assume non-sticky http requests.
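The two dangers of sticky routing show up when the two routing rules are put side by side. The sketch below is in C and is only an illustration: the server struct, the load metric, and both function names are assumptions, and real load balancers implement stickiness through cookies or similar mechanisms rather than a bare server index.

```c
/* Hypothetical view of the servers behind the load balancer. */
struct server {
    int load;    /* requests currently in flight */
    int alive;   /* 1 if the server is responding */
};

/* Non-sticky routing: choose the live server with the lowest load.
 * Returns the server index, or -1 if no server is alive. */
int route_by_load(const struct server s[], int n)
{
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (s[i].alive && (best < 0 || s[i].load < s[best].load))
            best = i;
    }
    return best;
}

/* Sticky routing: always return the server the client was pinned to,
 * regardless of its load or health. */
int route_sticky(int pinned_server)
{
    return pinned_server;
}
```

With three servers where server 0 is heavily loaded and server 2 has failed, route_by_load picks server 1, while a client pinned to server 2 is still sent to the failed server.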

Page 26: Architecting for the cloud elasticity security

Suppose message is routed to an arbitrary instance.

• Understanding what happens requires a digression into state.

• A computation has two inputs

– Instructions

– Data

• The data input of a computation is called the state.

Page 27: Architecting for the cloud elasticity security

How does this work with functions?

• Consider a function that counts how many times it is called.

• Option 1:

int countv1()
{
    static int i = 0; // static: initialized once to 0, keeps its value between calls
    i = i + 1;        // add 1 to the last value of i
    return i;
}

• The function countv1 remembers i from one call to the next.

• State is maintained inside the function – it is stateful

Page 28: Architecting for the cloud elasticity security

Option 2

int countv2(int i)
{
    int a;
    a = i + 1; // add 1 to the value of i passed in by the client
    return a;
}

• The function countv2 does not remember the value of i from one call to the next.

• The client must pass the last value returned.

• State is passed into the function. The function is stateless

Page 29: Architecting for the cloud elasticity security

Option 3

int countv3()
{
    int a;
    a = dbase_get("count");    // retrieve current value
    a = a + 1;                 // add 1 to the last value of a
    dbase_write("count", a);   // save current value
    return a;
}

• The count is stored in a database.

• Neither the client nor the function remembers the value.

• The function is stateless.

Page 30: Architecting for the cloud elasticity security

What is the difference?

• In option 1, the function kept track of the count value.

• In option 2, the client must keep track of the count value.

• In option 3, the count value is kept in an external database.

• In each case, the state (count value) must be kept somewhere.


Page 31: Architecting for the cloud elasticity security

Suppose the functions are packaged as processes in virtual machines

[Figure: Option 1 (Countv1), Option 2 (Countv2), and Option 3 (Countv3), each running as a process with a client; Option 3 also uses a DB.]

Page 32: Architecting for the cloud elasticity security

Processes communicate via messages

• Message from client to process is call

• Message from process back to client is return of a value


Page 33: Architecting for the cloud elasticity security

Now suppose each process has two clients – what is computed by option 1?

Countv1

Page 34: Architecting for the cloud elasticity security

What is computed by option 2?

Countv2

Page 35: Architecting for the cloud elasticity security

What is computed by option 3?

Countv3

DB

Page 36: Architecting for the cloud elasticity security

Where state is kept matters

• Option 1 – counts number of times called by either client. Process remembers value

• Option 2 – counts number of times called by each client. Client remembers value

• Option 3 – counts number of times called by either client. Database remembers value.

Options 1 & 3 calculate different things than option 2.
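The two-client behaviour can be checked with a minimal C sketch of the three options. It assumes countv1 keeps i in a static variable so that the value persists between calls, and it uses a plain variable as a stand-in for the database. The dbase_get and dbase_write functions here are illustrative stubs with simplified signatures, not a real API.

```c
/* Option 1: state lives in the process (a static variable). */
int countv1(void)
{
    static int i = 0;
    i = i + 1;
    return i;
}

/* Option 2: state is passed in by the client. */
int countv2(int i)
{
    return i + 1;
}

/* Option 3: state lives in an external store. A plain variable stands in
 * for the database here; dbase_get/dbase_write are stubs, not a real API. */
static int db_count = 0;
static int  dbase_get(void)    { return db_count; }
static void dbase_write(int v) { db_count = v; }

int countv3(void)
{
    int a = dbase_get();
    a = a + 1;
    dbase_write(a);
    return a;
}
```

If clients A and B each make two calls, the fourth call to countv1 or countv3 returns 4, while with countv2 each client ends up holding 2.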


Page 37: Architecting for the cloud elasticity security

Now suppose each process has two instances– remember the load balancer

Countv1 Countv1

Load balancer distributes messages to servers

Page 38: Architecting for the cloud elasticity security

What is computed by option 1?


Countv1 Countv1

Page 39: Architecting for the cloud elasticity security

What is computed by option 2?

Countv2 Countv2

Page 40: Architecting for the cloud elasticity security

What is computed by option 3 ?

Countv3 Countv3

DB

Page 41: Architecting for the cloud elasticity security

Now what do the options compute?

• Option 1 – each instance of the function countv1 computes how many times it was invoked

• Option 2 – each instance of the function countv2 computes how many times each client invoked either instance

• Option 3 – the database contains the number of times either instance was invoked by either client.
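The two-instance case for options 1 and 3 can be sketched in C as follows. The structs, the function names, and the round-robin rule are all assumptions made for illustration. In particular, a real load balancer routes on load, and round robin is used here only to make the outcome predictable.

```c
/* One running copy of the count process. `calls` is state held inside
 * the instance, as in option 1. */
struct instance {
    int calls;
};

/* Shared external store, as in option 3 (a stand-in for a database). */
struct database {
    int count;
};

/* Option 1: the instance increments its own counter. */
int count_in_instance(struct instance *inst)
{
    inst->calls += 1;
    return inst->calls;
}

/* Option 3: whichever instance handles the request updates the shared
 * store; the instance itself keeps nothing. */
int count_in_database(struct database *db)
{
    db->count += 1;
    return db->count;
}

/* Round-robin load balancer: request number n goes to instance n mod 2.
 * Real load balancers route on load; round robin keeps the sketch simple. */
int pick_instance(int request_number)
{
    return request_number % 2;
}
```

After four requests spread over two instances, each instance under option 1 reports 2, while the shared store under option 3 reports 4.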


Page 42: Architecting for the cloud elasticity security

What have we seen?

• When there was one instance of a client and one instance of the count process – all three versions were identical

• When there were two clients and one instance of the count process – two versions were the same, one was different

• When there were two clients and two instances of the count process – all three versions produced different results.

Page 43: Architecting for the cloud elasticity security

Message so far

• How state is managed is important and will lead to different results when there are multiple instances of clients or functions.

• Now we return to elasticity

• Remember the sequence?


Page 44: Architecting for the cloud elasticity security

Message sequence – client makes a request

Servers

Clients

Load Balancer

Page 45: Architecting for the cloud elasticity security

Message sequence- request arrives at load balancer

Servers

Clients

Load Balancer

Page 46: Architecting for the cloud elasticity security

Message sequence – request is sent to one server

Servers

Clients

Load Balancer

Page 47: Architecting for the cloud elasticity security

Message sequence – reply goes back to client

Servers

Clients

Load Balancer

Page 48: Architecting for the cloud elasticity security

Message sequence – now client makes second request – does it matter which server it goes to?

Servers

Clients

Load Balancer

? ? ?

Page 49: Architecting for the cloud elasticity security

It depends where state is kept

• If state is kept in the client, then it does not matter since the client keeps track of the calls

• If state is kept in a database then it does not matter since the results are kept external to the servers

• If state is kept in the server then it does matter, since sending the message back to server 1 will give a different result than sending it to server 2.

Page 50: Architecting for the cloud elasticity security

Keeping servers stateless enables elasticity

• A new instance of a server can be

– Created/stopped

– Registered/unregistered with the load balancer

– Placed in/removed from service

without

– Requiring the client to be aware of which server instance it is interacting with

– Requiring that clients be notified if a server is taken out of service
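A sketch in C of what registering and unregistering means from the load balancer's side. The registry layout, the fixed server limit, and the function names are assumptions for illustration, not any particular load balancer's API. Clients only ever call route, so they never learn which servers came or went.

```c
#define MAX_SERVERS 8

/* Hypothetical load balancer registry: in_service[i] is 1 when server i
 * is registered and receiving traffic. */
struct balancer {
    int in_service[MAX_SERVERS];
    int next;   /* where the round-robin scan resumes */
};

void register_server(struct balancer *lb, int id)   { lb->in_service[id] = 1; }
void unregister_server(struct balancer *lb, int id) { lb->in_service[id] = 0; }

/* Returns the id of the next in-service server, or -1 if none is
 * registered. Because servers hold no state, any of them will do. */
int route(struct balancer *lb)
{
    for (int tried = 0; tried < MAX_SERVERS; tried++) {
        int id = (lb->next + tried) % MAX_SERVERS;
        if (lb->in_service[id]) {
            lb->next = (id + 1) % MAX_SERVERS;
            return id;
        }
    }
    return -1;
}
```

This only works because the servers are stateless: if server 0 held a client's count, unregistering it would silently lose that state.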

Page 51: Architecting for the cloud elasticity security

Types of State

• Session state

• Client side state

• Server side

• Persistent

Page 52: Architecting for the cloud elasticity security

What is a session?

• A session typically refers to a series of interactions between one client login to a system and the termination of that login – whether through logging out or through timing out.

• A session can also span multiple logins. E.g. Netflix keeps track of where you are in a movie and returns you to that location the next time you log in.

Page 53: Architecting for the cloud elasticity security

Session State

• Session state is information that persists for a session. We are considering a single login here. The multiple login case is a special case of persistent state.

• What happens when you log in

– When you successfully log in to a service, the service returns a code that identifies you. This is the session ID.

– Other information can also be included such as MAC address (to prevent man in the middle attacks).

– It is typically managed on the client side. Your browser does all of this.

Page 54: Architecting for the cloud elasticity security

Client Side State

• This can be difficult if there is significant state to save, however

– This means you’ll need to pass all of this state with each request

– This requires more network overhead

• This also means you’ll need to store data on the client machine

– This can have security implications

Page 55: Architecting for the cloud elasticity security

Stateful Services

• If your services are stateful that makes scalability more difficult

• If you’re able to design your system such that the services are stateless you’ll make scaling much easier

• If an operation is dependent on the results of a previous operation it’s more difficult to make services stateless

Page 56: Architecting for the cloud elasticity security

Management of state between services and persistent tier

• Non client side state can be either kept in the services or in a persistent store.

• The choice depends on the volume of data, the latency involved, the synchronization needs for the servers and the time the state is expected to persist.

Page 57: Architecting for the cloud elasticity security

Important latency numbers

Operation                               Latency
Main memory reference                   100 ns
Send 1K bytes over 1 Gbps network       0.01 ms
Read 4K randomly from SSD               0.15 ms
Read 1 MB sequentially from memory      0.25 ms
Round trip within same datacenter       0.5 ms
Read 1 MB sequentially from SSD         1 ms (4x memory)
Disk seek                               10 ms (20x datacenter round trip)
Read 1 MB sequentially from disk        20 ms (80x memory, 20x SSD)
Send packet CA->Netherlands->CA         150 ms

* dean-keynote-ladis2009_scalable_distributed_google_system

Page 58: Architecting for the cloud elasticity security

Implications of latency numbers

• State stored in persistent storage (disk or SSD) will take longer to fetch than state stored in memory.

• State stored in a different datacenter will take longer to access than state stored locally, especially across continents.

• Persistent store is typically replicated both for performance (latency) reasons and for availability (failure) reasons.

• => keeping data consistent across different occurrences of it is important but difficult.
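The latency figures from the previous slide can be combined into rough fetch-time estimates. The sketch below is in C and is a back-of-the-envelope model only: the function names are illustrative, and it ignores bandwidth limits, queuing, and serialization costs.

```c
/* Latency figures from the previous slide, in milliseconds. */
#define MEM_READ_1MB_MS        0.25
#define SSD_READ_1MB_MS        1.0
#define DISK_READ_1MB_MS       20.0
#define DATACENTER_RTT_MS      0.5
#define INTERCONTINENT_RTT_MS  150.0

/* Time to fetch `megabytes` of state that sits on the same machine. */
double local_fetch_ms(double megabytes, double ms_per_mb)
{
    return megabytes * ms_per_mb;
}

/* Time to fetch state from another machine: one network round trip plus
 * the remote read. Ignores bandwidth limits and queuing (a simplification). */
double remote_fetch_ms(double megabytes, double ms_per_mb, double rtt_ms)
{
    return rtt_ms + megabytes * ms_per_mb;
}
```

Under this model, 1 MB held in the memory of another server in the same datacenter (0.5 + 0.25 = 0.75 ms) arrives sooner than 1 MB read from a local disk (20 ms), while the same fetch across continents costs about 150 ms regardless of where the remote copy is stored.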

Page 59: Architecting for the cloud elasticity security

Topics

In order to understand how to achieve elasticity you must understand

• Concurrency

• State

and their interactions


Page 60: Architecting for the cloud elasticity security

Keeping data consistent

• We will discuss persistent data consistency when we discuss databases.

• Memcached is an open source tool that provides in-memory synchronization of data across different instances of a service.

Page 61: Architecting for the cloud elasticity security

Layers of a service

[Figure: layers of a service: business logic for the service, Memcached]

• Now consider these layers deployed onto multiple servers.

Page 62: Architecting for the cloud elasticity security

Memcached in multiple servers

• Memcached keeps a small amount of state in all servers consistent.

• This comes at a small cost in latency as long as the servers are in the same physical location.

[Figure: two servers, each with a business logic layer and a Memcached layer]

Page 63: Architecting for the cloud elasticity security

When to use Memcached

• Data must be synchronized among servers.

• Memcached takes care of concurrency issues

• Data is relatively small

– One object < 1 MB

– Total memory used per server depends on how much you are willing to give it, since the data is stored in memory, not in a persistent store.

• The lifetime of the data should not exceed the time the servers are alive, i.e. if all the servers die, then the data disappears.

Page 64: Architecting for the cloud elasticity security

Summary

• The cloud doesn’t guarantee elasticity

• You’ll need to design your system to be elastic

• State management, your storage solution, and consistency, are all factors that you’ll need to consider

Page 65: Architecting for the cloud elasticity security

QUESTIONS?

Page 66: Architecting for the cloud elasticity security

Architecting for the Cloud

Introduction to Security

Page 67: Architecting for the cloud elasticity security

Agenda

• What is security?

• Understanding the threat

• Architectural approaches to security

• Designing for security

• Summary

Page 68: Architecting for the cloud elasticity security

Agenda

• What is security?

• Understanding the threat

• Architectural approaches to security

• Designing for security

• Summary

Page 69: Architecting for the cloud elasticity security

Your Experience

• Think about your past experience

– How have you thought about security?

– What steps have you (or your organization) taken to protect the system?

• Do you remember Assignment 2?

– Security was equivalent to having a login feature or encryption

Page 70: Architecting for the cloud elasticity security

Security … What is it?

• What do we mean when we say security?

• In your experience what does this mean?

Page 71: Architecting for the cloud elasticity security

Let’s Look at some Examples

Page 72: Architecting for the cloud elasticity security

Fort Knox

• Fort Knox is a US Army post in Kentucky

• In addition to housing various US Army functions it is also the home to a gold bullion depository

– 5000+ tons of gold housed there

Page 73: Architecting for the cloud elasticity security

Security

• What is the business asset that needs protection in this case?

• What does protect mean here?

Page 74: Architecting for the cloud elasticity security

What About the CIA?

• The Central Intelligence Agency (CIA) is a US civilian intelligence organization

• Primary purpose is to collect information about foreign governments, corporations, and individuals

• It uses this information to influence public policymakers

– It does at times engage in tactical operations as well

Page 75: Architecting for the cloud elasticity security

Security

• What is the business asset that needs protection?

• What does protect mean in this case?

Page 76: Architecting for the cloud elasticity security

Power Distribution

• What would security mean if you have a system that manages the power grid?

Page 77: Architecting for the cloud elasticity security

Business Context

• The business need differs from one context to another

• Organizations have assets they need to protect

• They need to protect these assets for different reasons

– Business continuity

– Liability reasons

– Regulation

– Protection of IP

– …

Page 78: Architecting for the cloud elasticity security

Security – A Set of Concerns

• The related concerns are typically classified as “security” concerns

• In software these concerns are typically:

– Confidentiality

– Data integrity

– Non repudiation

– Availability

Page 79: Architecting for the cloud elasticity security

Confidentiality

• The property that reflects the extent to which:

– Data and services are only available to those that are authorized to access them

• Is this a concern for a Museum? How about a Financial Institution?

Page 80: Architecting for the cloud elasticity security

Integrity

• This property can also refer to data or services

• It reflects the extent to which data or services can be delivered as intended

• E.g. hopefully the grade that we have recorded for you in this course is correct …

Page 81: Architecting for the cloud elasticity security

Non Repudiation

• Nonrepudiation refers to the ability to guarantee that the sender cannot later repudiate or deny having sent the message

• It can also refer to the guarantee that the recipient cannot later deny having received the message

• When might this be important?

Page 82: Architecting for the cloud elasticity security

Availability

• This is the property that reflects the extent to which the system will be available for legitimate use

• A denial of service attack is meant to disrupt the availability of a system

Page 83: Architecting for the cloud elasticity security

Protection Against What?

• Now that we understand the business asset, what are we protecting against?

• In order to appropriately protect our system we need to understand the threat

• Let’s look at example exploits …

Page 84: Architecting for the cloud elasticity security

Agenda

• What is security?

• Understanding the threat

• Architectural approaches to security

• Summary

Page 85: Architecting for the cloud elasticity security

Threat Sources?

• Insider threats

• Physical threats

• Social engineering

• External attacks

Page 86: Architecting for the cloud elasticity security

Who is Leveraging These Techniques?

• The art of hacking has gone from an individual activity to a highly coordinated and sophisticated effort

– It can now be quite lucrative as well

• Today many legitimate and illegitimate organizations routinely launch attacks

– Just run a port scan detector on your system

• Let’s look at the progression of exploits

Page 87: Architecting for the cloud elasticity security

Progression of Exploits

• Mischievous individuals:

– The first generation of hackers were technical youth performing mischievous acts

• Revenue generation: a proof of concept

– These were the first example of hacking for money

– Still small scale

• Organized crime

– These were criminal organizations involved in larger scale criminal activity

• Widespread adoption

– The infrastructure needed to launch cyber attacks is now widespread

– The barrier to entry has been lowered

– Legitimate entities enter the game

• Advanced persistent threats

Page 88: Architecting for the cloud elasticity security

Hackers – First Generation

• In the 1990s hackers were by and large not malicious

• They were in it for the challenge

• Notable hackers

– Kevin Mitnick

– Chen Ing-Hau

– Jeffrey Lee Parson

– Sven Jaschan

Page 89: Architecting for the cloud elasticity security

Kevin Mitnick

• Broke into dozens of computer networks

– Pac Bell

– DEC

– MCI

– Digital

– …

• Wasn’t in it for financial gain

• Largely used “social engineering” techniques

• Arrested twice: in 1988 and again in 1995

Page 90: Architecting for the cloud elasticity security

Mitnick 1995

Page 91: Architecting for the cloud elasticity security

Mitnick’s Techniques

• Largely used “social engineering” to gain access to passwords and insider information

• Used this information to gain access to target system

• Mitnick claims that he never “hacked” a system (still a point of controversy)

Page 92: Architecting for the cloud elasticity security

Chen Ing-Hau

• University student who created and released the CIH virus in 1998

– Wrote the virus to “make a fool of the software vendors”

• Virus that would render the computer essentially inoperable on a specified date

• Became one of the most widespread viruses

• Some versions of this virus have shown up multiple times

Page 93: Architecting for the cloud elasticity security

CIH Virus

• Exploited vulnerability in Windows 95, 98, & ME

– Along with an issue in various BIOS chipsets

• Would overwrite the first megabyte of the hard drive and attempt to overwrite the flash BIOS

• Result rendered the PC inoperable

Page 94: Architecting for the cloud elasticity security

Jeffrey Lee Parson

• Was 18 when he confessed to being the creator of the Blaster worm

• A Chinese “cracking” collective reverse engineered a MS patch

• Parson created a worm to exploit a buffer overflow issue

• Affected DCOM’s RPC service

– Worm could spread without users opening an attachment

Page 95: Architecting for the cloud elasticity security

Blaster Worm

• In addition to changing RPC service it would

– Change registry to launch msblast.exe

• Worm would launch a distributed denial of service attack from infected computers

– Attack was against windowsupdate.com

• Sent messages to Bill Gates

Page 96: Architecting for the cloud elasticity security

Sven Jaschan

• Authored Sasser and Netsky worms

• Claims to have written them to remove Mydoom and Bagle worms

• Worms were responsible for 70% of the infections in 2004

Page 97: Architecting for the cloud elasticity security

Netsky

• Sent out as an email attachment

• Contained insults aimed at the author of Mydoom and Bagle

• Other symptoms included “beeping” in the early morning hours of specific dates

Page 98: Architecting for the cloud elasticity security

Sasser

• Would connect to computers through a particular port that was often open by default

• Exploited a buffer overflow

• Would shut the computer down after displaying a shutdown timer

Page 99: Architecting for the cloud elasticity security

Cyber Criminals – Proof of Concept

• After the turn of the century a new breed emerged

• They took the techniques employed by the mischievous youth and used them for monetary gain

• These were the first real “cyber criminals”

– Farid Essebar

– Atilla Ekici

– Jeanson James Ancheta

Page 100: Architecting for the cloud elasticity security

Farid Essebar & Atilla Ekici

• The two people behind the Zotob computer worm

• Worm affected CNN, ABC News, NY Times, US Dept of Homeland Security, …

• Intention was to facilitate credit card forgery scams

Page 101: Architecting for the cloud elasticity security

Zotob

• Exploited vulnerability in Windows 2000

• Caused the computer to restart continuously

• Files would be created with every reboot

• Spyware was installed on the system

– The spyware remained after the virus was removed

• The goal was to facilitate scams (for money)

Page 102: Architecting for the cloud elasticity security

Jeanson James Ancheta

• First person to be arrested for controlling a large number of hijacked computers

• Created a large Botnet

– Network of bots or “software robots”

• Offered his collection of bots for hire

• Leveraged rxbot to increase his network

Page 103: Architecting for the cloud elasticity security

Rxbot

• Contained a proxy server

• Server can be spawned by a remote attacker

• Typically used for denial of service attacks

Page 104: Architecting for the cloud elasticity security

Cyber Gangs

• “Organized” crime gets involved

• Coordinated attacks against high value targets

• Often involve groups and large sums of money

• Examples

– Yaron Bolondi

– Maria Zarubina

– Albert Gonzalez

Page 105: Architecting for the cloud elasticity security

Yaron Bolondi

• Part of a gang that attempted to steal £220 million from Japanese bank

• Used keylogging to gain access to bank’s computers

• Software is installed on employees’ computers

– Via malware or other virus

Page 106: Architecting for the cloud elasticity security

Maria Zarubina

• Part of a gang that used cyber attacks as a means for extortion

• Attacked British “bookmakers”

– Agreed to stop attacks if ransom was paid

• Used denial of service attacks to shut down gambling sites

• Would then threaten additional attacks unless payment was made

Page 107: Architecting for the cloud elasticity security

Albert Gonzalez

• Responsible for largest credit card theft in history

• Stole and resold more than 170 million cards

• Used SQL injection to introduce “malware backdoors”

– These allowed packet sniffing attacks

• Targets included Target, TJ Maxx, Dave & Buster’s, 7-Eleven, JCPenney, …

Page 108: Architecting for the cloud elasticity security

ARP Spoofing

• Used to attack an Ethernet network

• Allows attacker to “sniff” data on a LAN and modify or stop the traffic

• Attacker sends a spoofed ARP message onto the Ethernet LAN

• “Man in the middle” attack

– Attacker’s computer masquerades as the destination computer and receives the traffic intended for it
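
One symptom of the attack above is an IP address that suddenly maps to a different MAC address. A minimal sketch of that check, assuming ARP replies have already been captured as (ip, mac) pairs. The function name and the sample addresses are hypothetical, and a conflict is only a hint, not proof of spoofing.

```python
def find_arp_conflicts(observations):
    """Return {ip: set_of_macs} for every IP seen with more than one MAC.

    `observations` is an iterable of (ip, mac) pairs taken from ARP replies.
    A replaced network card or a failover event produces the same pattern,
    so the result needs human review.
    """
    seen = {}
    for ip, mac in observations:
        seen.setdefault(ip, set()).add(mac.lower())
    return {ip: macs for ip, macs in seen.items() if len(macs) > 1}


# Hypothetical capture: two different machines claim the gateway address.
replies = [
    ("192.168.1.1", "aa:bb:cc:00:00:01"),   # the real gateway
    ("192.168.1.20", "aa:bb:cc:00:00:14"),
    ("192.168.1.1", "DE:AD:BE:EF:00:99"),   # attacker claiming the gateway IP
]
conflicts = find_arp_conflicts(replies)
```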

Page 109: Architecting for the cloud elasticity security

Advanced Persistent Threat

• Today we’ve started to see a new class of threat emerge

• These threats are against specific high value targets

• They are characterized by coordinated activity taking place over a long period of time

– The individual actions may seem isolated

• The perpetrator doesn’t act on the exploit until sufficient penetration has been achieved

• Has anyone heard of Stuxnet?

• How about Gauss or Flame?

Page 110: Architecting for the cloud elasticity security

Software as a Weapon

• In 2010 Iran announced they put their nuclear program on hold

– No one was sure why

• It turns out the reason was that more than 1000 centrifuges in their uranium enrichment facilities were destroyed

• How were these centrifuges destroyed?

– By the first known weapon that was 100% software

Page 111: Architecting for the cloud elasticity security

Stuxnet

• Stuxnet was a worm that infected SCADA systems made by Siemens

– Think power plant and power distribution control systems

• It was capable of

– Increasing the pressure inside nuclear reactors

– Switching off oil pipelines

• Additionally it would report that the systems were operating normally

Page 112: Architecting for the cloud elasticity security

Sophisticated Attack

• Why is Stuxnet special?

• First, it didn’t use forged digital certificates

– It used genuine certificates that had been stolen

• Second, it had a specific target

– It infected many systems worldwide but remained dormant until it found the systems controlling the intended target

• Third, it exploited not 1, but 4 zero-day vulnerabilities

Page 113: Architecting for the cloud elasticity security

Response

• Iran responded to the attack with an open call for hackers to join the Iranian Revolutionary Guard

• Iran now has reportedly amassed the 2nd largest online army in the world

Page 114: Architecting for the cloud elasticity security

Side Note

• Stuxnet is now open source

• This is code that is capable of crashing power plants and disrupting oil pipelines

• Go to YouTube and search for Stuxnet

– You’ll get many videos of people dissecting Stuxnet …

Page 115: Architecting for the cloud elasticity security

Advanced Persistent Threats

• Stuxnet is an example of what we call “Advanced Persistent Threats”

• In some cases exploits are not opportunistic reactions to discovering a vulnerability

• They are coordinated multipronged attacks that can take place over an extended period of time

Page 116: Architecting for the cloud elasticity security

Coordinated Attack

• Intruders will look for some way to find access to a system

• They will then try to move laterally until they are able to access the intended target

• This can take days, weeks, months, or even years

Page 117: Architecting for the cloud elasticity security

Email

Page 118: Architecting for the cloud elasticity security

What’s the Point?

• Almost all of these incidents exploited vulnerabilities

• These vulnerabilities came along with the commercially available software used in the attacked systems

• Vulnerabilities continue to exist in the software that we use

Page 119: Architecting for the cloud elasticity security

Vulnerabilities

• Many organizations (legitimate and illegitimate) try to find these vulnerabilities

– CERT is an example of such an organization

• Organizations like CERT would inform the developers of the software of the vulnerability

• Historically companies were slow to react

• CERT didn’t want to release it publicly without a fix being available

• So CERT would notify the organization and then release the vulnerability publicly after a given time elapsed

Page 120: Architecting for the cloud elasticity security

X Day Vulnerabilities

• Vulnerabilities are characterized by the time since they were made public

– 1 day vulnerabilities were released 1 day ago

• The newer the vulnerability the less likely it is to be patched

• Zero day vulnerabilities are those that the manufacturer doesn’t yet know about

– Clearly these are the most attractive to attackers

Page 121: Architecting for the cloud elasticity security

Vulnerability Market

• A market has emerged for these vulnerabilities

• If you discover a vulnerability you can sell it

• The value of the vulnerability is determined by:

– The “day” of the vulnerability

– The number of instances of the software containing the vulnerability

Page 122: Architecting for the cloud elasticity security

Selling The Vulnerability

• Many entities buy these vulnerabilities

– Governments (including the US)

– Organized crime syndicates

– Individuals

• Prices range from $10 to $250,000 or more

– Depending on the exclusivity of the sale as well as the value of the exploit

• Check out:

– http://www.forbes.com/sites/andygreenberg/2012/03/23/shopping-for-zero-days-an-price-list-for-hackers-secret-software-exploits/

– http://www.zdnet.com/blog/security/black-market-for-zero-day-vulnerabilities-still-thriving/2108

Page 123: Architecting for the cloud elasticity security

Exploit Auction Houses

• There are now auction houses that sell vulnerabilities (or exploits)

– Like the eBay of exploits

– In fact exploits were originally sold on eBay

• It’s actually legal to sell these exploits

– Even though the attacks themselves may be illegal

Page 124: Architecting for the cloud elasticity security

Exploit as a Service (EaaS)

• Believe it or not you can now get a service to manage your attacks

• One issue if you’re going to launch an attack is finding a “bulletproof” provider

– A provider willing to host a malware server

• These services will provide “exploit kits” and manage the hosting

• In some cases they even offer analytics for the consumer’s campaigns (think Google Analytics)

Page 125: Architecting for the cloud elasticity security

Widespread Adoption

• All of this has lowered the barrier to entry for exploiting vulnerabilities

• There are large numbers of people with the means and motive to attack any system online

• Furthermore secure practices are often not followed

– See next slide

Page 126: Architecting for the cloud elasticity security

Many Systems Remain Vulnerable

• Remember the issues with OpenSSL that surfaced in early 2014?

– Despite widespread news reports, many systems continue to be vulnerable

• June 2014 survey of TLS vulnerabilities

Page 127: Architecting for the cloud elasticity security

Cloud Related Issues

• In many respects security in the cloud is not different from security for a traditional system

• Some threats are magnified, and some additional threats exist

• We’ll look at:

– VM sprawl

– Insecure interfaces or APIs

– Malicious insiders

– Shared resources

Page 128: Architecting for the cloud elasticity security

VM Sprawl

• VM creation is quick and easy

– It can be done in seconds without procuring hardware, having administrative knowledge, or securing permissions

• As a result it’s done often

– Sometimes for transient needs

• Once created the VM is often forgotten about

– It might still exist even if it is no longer doing any work

• Keeping track of the existing VMs is difficult to do

– It requires different processes than tracking physical assets

• This results in something called VM Sprawl
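
Tracking VMs needs its own process, and one simple form of it is a periodic sweep for instances with no recent activity. A minimal sketch, assuming an inventory is already available as a list of records. The record fields, the VM names, and the 30 day cutoff are all hypothetical choices for illustration, not any provider's API.

```python
from datetime import datetime, timedelta


def find_stale_vms(inventory, now, max_idle=timedelta(days=30)):
    """Return the sorted names of VMs idle for longer than `max_idle`."""
    return sorted(
        vm["name"] for vm in inventory
        if now - vm["last_activity"] > max_idle
    )


# Hypothetical inventory export.
now = datetime(2014, 8, 27)
inventory = [
    {"name": "web-1", "last_activity": datetime(2014, 8, 26)},
    {"name": "load-test-tmp", "last_activity": datetime(2014, 3, 2)},
    {"name": "demo-2013", "last_activity": datetime(2013, 11, 15)},
]
stale = find_stale_vms(inventory, now)
```

The stale list is a starting point for review: each entry is either shut down or brought back into the normal patching process.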

Page 129: Architecting for the cloud elasticity security

Consequences of VM Sprawl

• VM Sprawl is bad for many reasons

• First, it imposes additional overhead on the overall solution

– The VM still costs money even if it is offline

• Second, it is less likely to be included in the normal maintenance efforts

– Updates and patches might not be applied

• As a result the VM can remain vulnerable

Page 130: Architecting for the cloud elasticity security

Insecure Interfaces or APIs

• IaaS and PaaS providers expose a set of APIs

• These APIs are used by customers to:

– Provision

– Manage

– Orchestrate

– Monitor

– …

• The security of the cloud is dependent on the security of these APIs

• These APIs must be designed in a way to resist accidental and malicious attempts to circumvent policy

Page 131: Architecting for the cloud elasticity security

3rd Party APIs

• We not only need to trust the expertise and procedures of the cloud providers but 3rd party vendors as well

• Organizations often layer capability on top of the provided APIs in order to add value to the consumer, e.g.

– Deployment tools

– Monitoring aggregation tools

– Data management services

– …

• The security of these providers also needs to be trusted

Page 132: Architecting for the cloud elasticity security

How Does This Work?

[Diagram: User, 3rd Party Service, Cloud Provider]

Page 133: Architecting for the cloud elasticity security

Malicious Insiders

• Malicious insiders are a known and significant threat to corporate security

– E.g. former and disgruntled employees

• When deploying your application on the cloud you need to worry about employees of the cloud provider as well

Page 134: Architecting for the cloud elasticity security

Shared Resources

Page 135: Architecting for the cloud elasticity security

Shared Resources

• When software running in a process within a VM can elevate its privileges sufficiently, it can “escape” the bounds of the VM

• This is called “guest to host VM escape”

• Once this happens the software is able to control all of the instances within that hypervisor

Page 136: Architecting for the cloud elasticity security

Hypervisor Vulnerabilities

• The most commonly used hypervisors have all been exploited

• Vulnerabilities continue to be discovered in all of the major hypervisor software

– Discovered by both the good guys and bad guys

• Do a Google search on VM Escape for the latest vulnerabilities …

Page 137: Architecting for the cloud elasticity security

Addressing Security Issues

• The strategies for dealing with security issues typically fall into one of three categories

– Secure coding practices

– Processes and policy

– Architectural approaches

Page 138: Architecting for the cloud elasticity security

Secure Coding Practices

• Looking at the source of the vulnerabilities it may seem that secure coding practices will solve the problem

• While this is true to some extent, as we said, these vulnerabilities exist in most commercially available software

• We must therefore assume that our software is to some extent insecure

• It’s also the case that we will miss issues

• Inevitably the software will have defects, will be used in a context other than what was intended, or will be used with software that it wasn’t intended to work with

Page 139: Architecting for the cloud elasticity security

Processes and Policy

• A large aspect of dealing with security includes processes and procedures

• The security of the system is impacted by things like:

– Physical security

– IT policy governing computers on the network

– Updating and patching procedures

– Organizational structure and access policies

• Defining appropriate practices is a key component to security

Page 140: Architecting for the cloud elasticity security

Agenda

• What is security?

• Understanding the threat

• Architectural approaches to security

• Designing for security

• Summary

Page 141: Architecting for the cloud elasticity security

Security Strategies

• Security strategies fall in one of several categories

– Policy/process

– Secure coding practices

– Architectural

• We will now look at some architectural strategies

• The thing to keep in mind is that you cannot easily eliminate all vulnerabilities

– Some of the approaches are aimed at minimizing vulnerabilities

– Some are aimed at reducing the impact if the vulnerabilities are exploited

Page 142: Architecting for the cloud elasticity security

Resisting Attacks

• Resisting attacks is analogous to securing the perimeter

• Strategies for resisting attacks include:

– Encryption

– Checking data integrity

– Limiting exposure

– Limiting access

Page 143: Architecting for the cloud elasticity security

Encryption

• Encryption applied to data and communications can help maintain confidentiality

• Can be symmetric

– Both parties use the same key

• Or asymmetric

– Public/private key

Page 144: Architecting for the cloud elasticity security

Encryption

• What kind of attack would encryption protect against?

• What kind of attack would it not protect against?

• What kind of security concern would it address?

Page 145: Architecting for the cloud elasticity security

Data Integrity

• Encoding data with checksum or hash results can help ensure the data has not been tampered with

• This additional data can be encrypted along with or independently from the original data
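
A plain checksum catches accidental corruption, but an attacker who can change the data can usually recompute the checksum too. A keyed hash (HMAC) is one common way to close that gap, and it is the technique shown here rather than the encrypted checksum the slide mentions. A minimal sketch using Python's standard library; the key and the message are made-up values.

```python
import hashlib
import hmac


def sign(data: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for `data` under the shared `key`."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(data: bytes, key: bytes, tag: str) -> bool:
    """Check `tag` against `data` using a constant-time comparison."""
    return hmac.compare_digest(sign(data, key), tag)


# Hypothetical values for illustration only.
key = b"demo-key-not-for-production"
message = b"transfer 100 to account 42"
tag = sign(message, key)

untouched_ok = verify(message, key, tag)
tampered_ok = verify(b"transfer 900 to account 42", key, tag)
```

The tag travels with the data. The receiver recomputes it with the same key and rejects the data when the two tags differ.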

Page 146: Architecting for the cloud elasticity security

Data Integrity

• Think about data integrity concerns in the context of some of the recent attacks

– Stuxnet

– Gauss

– …

• These techniques can be important for detecting an attack

– Additional techniques might be needed to recover

Page 147: Architecting for the cloud elasticity security

Limiting Exposure

• Attacks depend on exploiting weaknesses to gain access to data and services

• Limiting access to the attack surface limits risk*

• The following are approaches to limiting exposure

* Manadhata 2006

Page 148: Architecting for the cloud elasticity security

Client Data Storage

• Problem: many applications store data at potentially untrusted clients.

– These clients could tamper with the data

• Solution: this pattern uses encryption to store security-critical data client-side

Page 149: Architecting for the cloud elasticity security

Client Data Storage II

• Manual inspection of this data could reveal details of the application that could be used to compromise the site

Page 150: Architecting for the cloud elasticity security

Client Input Filters

• Problem: in many cases clients execute outside the control of the system developer.

– These clients can be tampered with to behave in an untrustworthy manner

• Solution: treat all data provided by clients as suspect

Page 151: Architecting for the cloud elasticity security

Client Input Filters II

• Perform (or re-execute) data validity checks on the server

• Examine headers and URLs for malicious code

• Text input should be checked for scripts

• Calculated fields should be re-computed on the server

• Considerations:

– Should use a symmetric key as it’s less computationally expensive

– The key should not be stored in a file
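
The server-side checks above can be sketched as a single validation function. This is an illustration under stated assumptions: the form fields, the price list, and the limits are hypothetical, and the script check is deliberately crude (a real system would also escape text on output).

```python
import re

PRICES = {"widget": 250, "gadget": 1200}   # server-side price list, in cents
SCRIPT_PATTERN = re.compile(r"<\s*script", re.IGNORECASE)


def validate_order(form):
    """Return (order, errors). The client-supplied total is never trusted."""
    errors = []
    item = form.get("item")
    if item not in PRICES:
        errors.append("unknown item")
    try:
        quantity = int(form.get("quantity", ""))
    except (TypeError, ValueError):
        quantity = 0
    if not 1 <= quantity <= 100:
        errors.append("quantity out of range")
    note = str(form.get("note", ""))
    if SCRIPT_PATTERN.search(note):
        errors.append("note contains script markup")
    if errors:
        return None, errors
    total = PRICES[item] * quantity     # recomputed on the server
    return {"item": item, "quantity": quantity, "note": note, "total": total}, []


# The client claims a total of 1 cent; the server ignores it.
order, errors = validate_order(
    {"item": "widget", "quantity": "3", "note": "gift wrap", "total": "1"}
)
```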

Page 152: Architecting for the cloud elasticity security

Trusted Proxy

• Problem: it may be necessary to expose inadequately protected aspects of the system to untrusted users

• Solution: create a trusted proxy that acts as a buffer between the component and the users

Page 153: Architecting for the cloud elasticity security

Trusted Proxy II

• This proxy intercepts and filters all communication

• In that way it can compensate for the lack of protections

• Typically two options

– Filter requests for bad input

– Recreate a new request with only the essential parts of the old request
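
The second option, rebuilding the request, can be sketched as copying only known fields into a fresh object. The field names and permitted actions below are hypothetical; the point is that anything the proxy does not recognize never reaches the protected component.

```python
ALLOWED_FIELDS = {"account_id": int, "action": str}   # field name -> converter
ALLOWED_ACTIONS = {"balance", "statement"}


def rebuild_request(raw):
    """Build a fresh request from `raw`, copying only known, valid fields.

    Raises ValueError when a required field is missing or malformed.
    Fields not listed in ALLOWED_FIELDS are dropped.
    """
    clean = {}
    for name, convert in ALLOWED_FIELDS.items():
        if name not in raw:
            raise ValueError(f"missing field: {name}")
        try:
            clean[name] = convert(raw[name])
        except (TypeError, ValueError):
            raise ValueError(f"malformed field: {name}") from None
    if clean["action"] not in ALLOWED_ACTIONS:
        raise ValueError("action not permitted")
    return clean


# Extra fields supplied by the client are not forwarded.
forwarded = rebuild_request(
    {"account_id": "42", "action": "balance", "debug": "1", "is_admin": "true"}
)
```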

Page 154: Architecting for the cloud elasticity security

Single Access Point

• Problem: a system is more difficult to secure if it has multiple entry points

• With multiple entry points:

– You may need to separately secure multiple applications

– You may have duplicate authentication logic to maintain

– Unix is an example with multiple entry points

– Different services can be set up on different machines

Page 155: Architecting for the cloud elasticity security

Single Access Point II

• The solution is to create a single point of entry

• A session is then created

• This allows global tracking of session state and authorization information

• There is a single “gateway” or “check point” through which each user’s login is validated

Page 156: Architecting for the cloud elasticity security

Single Access Point III

• Which aspects of security does this pattern address?

• What are some of the implications of using this pattern?

Page 157: Architecting for the cloud elasticity security

Partitioned Application

• Problem: large complex applications often require root privileges in some portions of the application

– If these elements are compromised the entire system is at risk

• Solution: partition the large application into smaller elements each adhering to least privilege principle

Page 158: Architecting for the cloud elasticity security

Partitioned Application II

• This becomes more difficult to manage

• Additionally performance can suffer as interprocess communication increases

• Additional points of entry are introduced

– Even though the impact of being compromised is diminished

Page 159: Architecting for the cloud elasticity security

Password Propagation

• Problem: most applications manage user data under a single database account

– Thus if the single account is compromised all user data can be accessed

• Solution: the user’s password is required with each backend database request

Page 160: Architecting for the cloud elasticity security

Password Propagation II

• This is essentially an instance of application partitioning

• The front end will cache the password and provide it with each back end request

Page 161: Architecting for the cloud elasticity security

Limiting Access

• You can think of this as “securing the perimeter”

• This is a widely used approach of limiting access to data and services

• The following are examples of techniques for limiting access

Page 162: Architecting for the cloud elasticity security

Session

• Background: Systems need to keep track of each user’s login status, level of authorization, and so forth

– The Singleton pattern is often used for this

– This pattern can be difficult to use when the system supports concurrent logins

• The solution is to create a “session” object to hold these global variables

Page 163: Architecting for the cloud elasticity security

Session II

• This session object is accessible by all components of the application

• This facilitates having a common interface for accessing this information

– Easier to implement and maintain than having a number of variables passed around
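
A minimal sketch of the session idea, assuming one session object per login kept in a store keyed by an opaque ID. The class and attribute names are hypothetical. Unlike a single global object, two concurrent logins each get their own state.

```python
import secrets


class Session:
    """Holds the per-user state that would otherwise live in globals."""

    def __init__(self, user, role):
        self.user = user
        self.role = role
        self.authenticated = True


class SessionStore:
    """Maps opaque session IDs to Session objects, one per login."""

    def __init__(self):
        self._sessions = {}

    def create(self, user, role):
        session_id = secrets.token_hex(16)   # hard-to-guess identifier
        self._sessions[session_id] = Session(user, role)
        return session_id

    def get(self, session_id):
        return self._sessions.get(session_id)

    def end(self, session_id):
        self._sessions.pop(session_id, None)


store = SessionStore()
alice_id = store.create("alice", "admin")
bob_id = store.create("bob", "viewer")     # concurrent login, separate state
```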

Page 164: Architecting for the cloud elasticity security

Roles

• Background: when an application supports many types of users security becomes more complicated

– It can be difficult to track and maintain all of the things that every user has access to

• It eases implementation issues if a smaller number of “roles” are created

• Each role has a given set of rights
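
A minimal sketch of role-based checks, assuming each user has exactly one role. The role names, rights, and users are hypothetical. Access decisions consult the role table, so adding a user means assigning a role rather than listing individual rights.

```python
ROLE_RIGHTS = {
    "viewer": {"read"},
    "editor": {"read", "write"},
    "admin": {"read", "write", "manage_users"},
}
USER_ROLES = {"alice": "admin", "bob": "viewer"}


def is_allowed(user, right):
    """Return True if the user's role grants `right`; unknown users get nothing."""
    role = USER_ROLES.get(user)
    return right in ROLE_RIGHTS.get(role, set())


bob_can_write = is_allowed("bob", "write")
alice_can_manage = is_allowed("alice", "manage_users")
```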

Page 165: Architecting for the cloud elasticity security

Roles II

• What kinds of security does this address?

• Implications?

Page 166: Architecting for the cloud elasticity security

Account Lockout

• Problem: there is an increased number of password guessing tools to compromise systems requiring user authentication

• Solution: lock the user account after some number of incorrect attempts

• How it works:

– The system records each incorrect login attempt

– When a predetermined number of attempts is reached the account is locked

– Each time there is a correct login the account is reset
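
The three steps above can be sketched as a small in-memory tracker. The class name and the limit of three attempts are assumptions for illustration; a real system would persist the counts and usually add a timed unlock.

```python
class LockoutTracker:
    """Counts failed logins per account and locks after `max_attempts`."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self._failures = {}

    def is_locked(self, user):
        return self._failures.get(user, 0) >= self.max_attempts

    def record_attempt(self, user, success):
        """Record a login attempt; return True if the login is accepted."""
        if self.is_locked(user):
            return False                     # locked: even a correct password fails
        if success:
            self._failures.pop(user, None)   # a correct login resets the count
            return True
        self._failures[user] = self._failures.get(user, 0) + 1
        return False


tracker = LockoutTracker(max_attempts=3)
for _ in range(3):
    tracker.record_attempt("alice", success=False)
locked = tracker.is_locked("alice")
```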

Page 167: Architecting for the cloud elasticity security

Account Lockout II

• Issues:

– Doesn’t address the situation where different user IDs are used

– Usability can be adversely affected

– Availability can be adversely affected

• Can facilitate denial of service

Page 168: Architecting for the cloud elasticity security

Detecting Attacks

• Detect Intrusion

• Detect Denial of Service

Page 169: Architecting for the cloud elasticity security

Minefield

• Problem: hackers are likely familiar with the vulnerabilities of various configurations

– Once they figure out your setup they’ll know how to get in

• Solution: change your setup to a non-standard configuration

Page 170: Architecting for the cloud elasticity security

Minefield II

• Even small changes can increase the effort enough to discourage hackers

• You can do things like:

– Alter file structure

– Rename common administrative commands

– Instrument commands to alert administrators

– Add booby traps that will recognize tampering

Page 171: Architecting for the cloud elasticity security

Secure Assertion

• Problem: the activities performed by a malicious intruder may look legitimate at the local level

– E.g. transferring money from an account

• Solution: create a framework for reporting specific activities that violate assertions

Page 172: Architecting for the cloud elasticity security

Secure Assertion II

• The application developer is in a position to determine activities that may be suspicious

– They can create assertions

• If the application is being developed in an environment that supports exceptions, assertion violations could be reported in a similar fashion

• The violations could be collected globally to provide additional insight on the current activities
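
A minimal sketch of such a framework in a language with exceptions. The names, the daily limit, and the in-memory list are hypothetical; the list stands in for whatever global collection point the system uses.

```python
class AssertionViolation(Exception):
    """Raised when an application-level security assertion fails."""


VIOLATIONS = []   # global collection point for later analysis


def secure_assert(condition, user, description):
    """Record and raise when `condition` does not hold."""
    if not condition:
        VIOLATIONS.append({"user": user, "description": description})
        raise AssertionViolation(description)


def transfer(user, amount, daily_limit=10_000):
    # The developer knows which transfers look suspicious for this application.
    secure_assert(amount > 0, user, "non-positive transfer amount")
    secure_assert(amount <= daily_limit, user, "transfer exceeds daily limit")
    return f"transferred {amount}"


result = transfer("alice", 500)
try:
    transfer("mallory", 50_000)
except AssertionViolation:
    pass   # the violation has already been recorded in VIOLATIONS
```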

Page 173: Architecting for the cloud elasticity security

Recovering From Attacks

• Availability tactics

– We will discuss these in a future class

• Auditing

– Keeps a trail of the users and their actions

– Helps to maintain a record of the attack

Page 174: Architecting for the cloud elasticity security

Network Address Blacklist

• Problem: all systems with an online presence are subject to attack

– Locking individual accounts doesn’t address systemic attacks

• Solution: block network addresses that are the source of attack

Page 175: Architecting for the cloud elasticity security

Network Address Blacklist II

• The server will monitor requests from clients

– Any suspicious requests will be logged

– If there are repeated suspicious requests the address is blocked

• One question is where to implement the check

– Network (e.g. firewall) or application

• Performance as list grows can be an issue

• Can still be subject to denial of service attack
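
A minimal sketch of the application-level variant, assuming some other component decides which requests count as suspicious. The class name and the threshold of three are hypothetical. Blocked addresses are kept in a set so the lookup cost stays roughly constant as the list grows.

```python
class AddressBlacklist:
    """Blocks an address after `threshold` suspicious requests."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self._suspicious = {}    # address -> count of suspicious requests
        self.blocked = set()

    def report_suspicious(self, address):
        count = self._suspicious.get(address, 0) + 1
        self._suspicious[address] = count
        if count >= self.threshold:
            self.blocked.add(address)

    def allows(self, address):
        return address not in self.blocked


blacklist = AddressBlacklist(threshold=3)
for _ in range(3):
    blacklist.report_suspicious("203.0.113.7")   # documentation-range address
blacklist.report_suspicious("198.51.100.4")
```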

Page 176: Architecting for the cloud elasticity security

Agenda

• What is security?

• Understanding the threat

• Architectural approaches to security

• Designing for security

• Summary

Page 177: Architecting for the cloud elasticity security

So How Do We Decide?

• There are many options, which ones are required?

• What are the side effects of selecting these security mechanisms?

Page 178: Architecting for the cloud elasticity security

Fit for Purpose

• It is (hopefully) clear that each of these techniques addresses a different concern

• What concerns does your organization have?

– This depends on the business assets that need protection

– And the ways in which these assets could be compromised given the system

Page 179: Architecting for the cloud elasticity security

Threat Modeling

Threat Modeling and Analysis in a nutshell:

– Identify the business asset to protect

– Brainstorm the known threats to the system

– Rank the threats by decreasing risk

– Choose techniques to mitigate the threats

– Choose appropriate technologies from the identified techniques

Page 180: Architecting for the cloud elasticity security

Business Asset

• The reason for security is to protect some aspect of the business

• You need to identify those aspects of the business that need protection

• You also need to determine what “protection” means

Page 181: Architecting for the cloud elasticity security

Brainstorm Threats

• Given a particular design what might happen to compromise the business asset?

• You should think about these from two perspectives

– Likelihood

– Impact

• At this point you don’t worry about whether they need mitigation

Page 182: Architecting for the cloud elasticity security

Rank the Threats

• Based on the likelihood and the impact you can determine the “risk exposure”

– Look at risk management techniques

• Prioritize the risks according to the exposure

• Determine the threshold that requires mitigation
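
A worked example of the ranking step, assuming the common definition risk exposure = likelihood × impact. The threats, the numbers, and the threshold are invented for illustration; likelihood is a probability and impact is in arbitrary cost units.

```python
def rank_threats(threats, threshold):
    """Return (name, exposure) pairs at or above `threshold`, highest first."""
    scored = [
        (name, likelihood * impact)
        for name, likelihood, impact in threats
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(name, exposure) for name, exposure in scored if exposure >= threshold]


# (name, likelihood, impact) with hypothetical values
threats = [
    ("Password guessing", 0.5, 20),
    ("Forgotten unpatched VM", 0.75, 40),
    ("SQL injection on login form", 0.5, 100),
    ("Malicious insider at provider", 0.25, 160),
]
to_mitigate = rank_threats(threats, threshold=30)
```

With these numbers password guessing (exposure 10) falls below the threshold, so it is accepted rather than mitigated.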

Page 183: Architecting for the cloud elasticity security

Mitigation Techniques

• Look for generic patterns that will mitigate the risks

• Mitigate means lower the risk exposure to a tolerable level

– You lower the exposure by reducing the likelihood or reducing the impact

– A tolerable level means below the threshold defined previously

Page 184: Architecting for the cloud elasticity security

Choose Technologies

• Basically you need to map the generic pattern to some concrete solution

• This is where you factor in the costs

• Costs could come in terms of level of effort to implement

• Costs could also come in terms of tradeoffs

– You might need to iterate these steps

Page 185: Architecting for the cloud elasticity security

Consider Trade Offs

• Most of these mechanisms adversely impact performance

– Blindly selecting these capabilities can bring the system to a standstill

• They also have an impact on the flexibility of the system

• Balancing concerns is key

Page 186: Architecting for the cloud elasticity security

References

• STRIDE: http://msdn2.microsoft.com/en-us/library/aa302419.aspx

• Hinton, Hondo, Hutchison: Security Patterns within a Service Oriented Architecture, IBM, 2005

• Hafiz, Johnson: Security Patterns and their Classification Schemes

• Thomas Erl: Service Oriented Architecture, Chapters 4 and 11

• SEI/CERT OCTAVE: Operationally Critical Threat, Asset, and Vulnerability Evaluation: http://www.cert.org/octave

• Manadhata et al.: Measuring the Attack Surfaces of Two FTP Daemons, 2006

Page 187: Architecting for the cloud elasticity security

Questions??