What Is Cloud Computing?
Wikipedia Definition
  Cloud computing is a concept of using the Internet to allow people to access technology-enabled services
  It allows users to consume services without knowledge of, or control over, the technology infrastructure that supports them
NIST Definition
  5 essential characteristics
  3 cloud service models
  4 cloud deployment models
The NIST Cloud Definition Framework
[Figure: the NIST framework as three layers]
Deployment Models: Private Cloud, Community Cloud, Public Cloud, Hybrid Clouds
Service Models: Software as a Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS)
Essential Characteristics: On Demand Self-Service, Broad Network Access, Resource Pooling, Rapid Elasticity, Measured Service
Essential Characteristics
On-demand self-service: get computing capabilities as needed, automatically
Broad network access: services available over the net using desktop, laptop, PDA, mobile phone
Resource pooling: provider resources pooled to serve multiple clients
Rapid elasticity: ability to quickly scale the service out or in
Measured service: control and optimize services based on metering
Cloud Service Models
Software as a Service (SaaS)
  We use the provider’s apps
  User doesn’t manage or control the network, servers, OS, storage or applications
Platform as a Service (PaaS)
  User deploys their apps on the cloud
  Controls their apps
  User doesn’t manage servers, OS, storage
Infrastructure as a Service (IaaS)
  Consumer gets access to the infrastructure to deploy their stuff
  Doesn’t manage or control the underlying infrastructure
  Does manage or control the OS, storage, apps, selected network components
Service Delivery Model Examples
[Figure: example SaaS, PaaS, and IaaS offerings from Amazon, Google, Microsoft, and Salesforce]
Products and companies shown for illustrative purposes only and should not be construed as an endorsement
Cloud Deployment Models
Private cloud: enterprise owned or leased
Community cloud: shared infrastructure for a specific community
Public cloud: sold to the public, mega-scale infrastructure
Hybrid cloud: composition of two or more clouds
Cloud Computing Example - Google AppEngine
PaaS
http://code.google.com/appengine/
Google AppEngine API
  Python runtime environment
  Datastore API
  Images API
  Mail API
  Memcache API
  URL Fetch API
  Users API
A free account can use up to 500 MB of storage, plus enough CPU and bandwidth for about 5 million page views a month
Conventional Computing vs. Cloud Computing

Conventional                      Cloud
Manually provisioned              Self-provisioned
Dedicated hardware                Shared hardware
Fixed capacity                    Elastic capacity
Pay for capacity                  Pay for use
Capital & operational expenses    Operational expenses
Managed via sysadmins             Managed via APIs
Cloud Computing Summary
Cloud computing is a kind of network service and is a trend for future computing
Scalability matters in cloud computing technology
Users focus on application development
Users do not know where the services are located geographically
Kai Hwang and Deyi Li, “Trusted Cloud Computing with Secure Resources and Data Coloring”, IEEE Internet Computing, Sept. 2010
Cloud Providers and Security Measures
General Security Advantages
Shifting public data to an external cloud reduces the exposure of the internal sensitive data
Cloud homogeneity makes security auditing/testing simpler
Clouds enable automated security management
Redundancy / Disaster Recovery
General Security Challenges
Trusting vendor’s security model
Customer inability to respond to audit findings
Obtaining support for investigations
Indirect administrator accountability
Proprietary implementations can’t be examined
Loss of physical control
10 Security Concerns
Where’s the data?
Who has access?
What are your regulatory requirements?
Do you have the right to audit?
What type of training does the provider offer their employees?
What type of data classification system does the provider use?
What are the service level agreement (SLA) terms?
What is the long-term viability of the provider?
What happens if there is a security breach?
What is the disaster recovery/business continuity plan (DR/BCP)?
7 Potential Risks
Privileged user access
Regulatory compliance
Data location
Data segregation
Recovery
Investigative support
Long-term viability
What Is New?
Accountability
No Security Perimeter
Larger Attack Surface
New Side Channels
Lack of Auditability
Regulatory Compliance
Data Security
No Security Perimeter
Little control over physical or network location of cloud instance VMs
Network access must be controlled on a host by host basis
New Side Channels
You don’t know whose VMs are sharing the physical machine with you.
  Attackers can place their VMs on your machine.
  See the “Hey, You, Get Off of My Cloud” paper for how.
Shared physical resources include
  CPU data cache: Bernstein 2005
  CPU branch prediction: Onur Acıiçmez 2007
  CPU instruction cache: Onur Acıiçmez 2007
In a single-OS environment, attackers can extract cryptographic keys with these attacks.
Lack of Auditability
Only the cloud provider has access to full network traffic, hypervisor logs, physical machine data.
Need mutual auditability
  Ability of cloud provider to audit potentially malicious or infected client VMs.
  Ability of cloud customer to audit cloud provider environment.
Data Security
Confidentiality: only those authorized may know the data
Integrity: data has not been tampered with
Availability: data is never lost, machines never fail

                  Storage                 Processing                Transmission
Confidentiality   Symmetric encryption    Homomorphic encryption    SSL
Integrity         MAC                     Homomorphic encryption    SSL
Availability      Redundancy              Redundancy                Redundancy
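The MAC entry listed above for the integrity of stored data can be illustrated with a minimal sketch. It assumes HMAC-SHA256 as the MAC and a key that stays with the customer; the slides do not name a specific MAC, and the file content here is hypothetical.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the data before it is uploaded."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key: bytes, data: bytes, tag: bytes) -> bool:
    """Recompute the tag on download and compare in constant time."""
    return hmac.compare_digest(make_tag(key, data), tag)


# Assumption: the key is generated and kept by the customer, never by the provider.
key = secrets.token_bytes(32)
stored = b"quarterly-report-v1"   # hypothetical file content
tag = make_tag(key, stored)

print(verify(key, stored, tag))                   # unmodified data
print(verify(key, b"quarterly-report-v2", tag))   # data changed while in storage
```

A MAC only detects tampering; it does not hide the data, which is why the table pairs it with encryption for confidentiality.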
Data Security Is A Major Concern
Security concerns arise because both customer data and customer code reside on the provider’s premises.
Security is always a major concern in Open System Architectures
[Figure: the customer on one side; customer data and customer code held on the provider premises on the other]
Why Data Is Not Secure
Cloud security problems come from
  Loss of control
  Lack of trust
  Multi-tenancy
These problems mainly exist in the public cloud
Loss of Control in the Cloud
Consumer’s loss of control
  Data, applications, resources are located with the provider
  User identity management is handled by the cloud
  User access control rules, security policies and enforcement are managed by the cloud provider
  Consumer relies on provider to ensure
    Data security and privacy
    Resource availability
    Monitoring and repairing of services/resources
Lack of Trust in the Cloud
A brief deviation from the talk
  Trusting a third party requires taking risks
Defining trust and risk
  Opposite sides of the same coin
  People only trust when it pays
  Need for trust arises only in risky situations
Defunct third party management schemes
  Hard to balance trust and risk
  e.g. Key Escrow
  Is the cloud headed toward the same path?
Multi-tenancy Issues in the Cloud
Conflict between tenants’ opposing goals
  Tenants share a pool of resources and have opposing goals
How does multi-tenancy deal with conflict of interest?
  Can tenants get along together and ‘play nicely’?
  If they can’t, can we isolate them?
How to provide separation between tenants?
Possible Solutions
Loss of control
  Take back control
    Data and apps may still need to be on the cloud
    But can they be managed in some way by the consumer?
Lack of trust
  Increase trust (mechanisms)
    Technology
    Policy, regulation
    Contracts (incentives): topic of a future talk
Multi-tenancy
  Private cloud
    Takes away the reasons to use a cloud in the first place
  Strong separation
Cloud Security Summary
Cloud computing is sometimes viewed as a reincarnation of the classic mainframe client-server model
  However, resources are ubiquitous, scalable, highly virtualized
  Contains all the traditional threats, as well as new ones
In developing solutions to cloud computing security issues it may be helpful to identify the problems and approaches in terms of
  Loss of control
  Lack of trust
  Multi-tenancy problems
Selected Publications
G. Wang, Q. Liu, F. Li, S. Yang, and J. Wu, "Outsourcing Privacy-Preserving Social Networks to a Cloud," to appear in the 32nd IEEE International Conference on Computer Communications (IEEE INFOCOM 2013).
Q. Liu, C. C. Tan, J. Wu, and G. Wang, "Efficient Information Retrieval for Ranked Queries in Cost-Effective Cloud Environments," Proceedings of the 31st IEEE International Conference on Computer Communications (IEEE INFOCOM 2012).
G. Wang, Q. Liu, and J. Wu, "Hierarchical Attribute-Based Encryption for Fine-Grained Access Control in Cloud Computing," Proceedings of the 17th ACM Conference on Computer and Communications Security (CCS-10).
Q. Liu, C. C. Tan, J. Wu, and G. Wang, "Towards Differential Query Services in Cost-Efficient Clouds," to appear in IEEE Transactions on Parallel and Distributed Systems (TPDS).
Q. Liu, G. Wang, and J. Wu, "Time-Based Proxy Re-encryption Scheme for Secure Data Sharing in a Cloud Environment," Information Sciences.
Q. Liu, C. C. Tan, J. Wu, and G. Wang, "Cooperative Private Searching in Clouds," Journal of Parallel and Distributed Computing (JPDC).
G. Wang, Q. Liu, and J. Wu, "Hierarchical Attribute-Based Encryption and Scalable User Revocation for Sharing Data in Cloud Servers," Computers & Security.
Multi-User Data Sharing Environment
Cloud security problems come from:
  Loss of control
  Lack of trust (mechanisms)
  Multi-tenancy
Security Issues
Data Security Revocation Retrieval Privacy
The cloud service provider is a potential attacker!!
Data Security
Natural way: adopt cryptographic techniques
Current solutions
  Traditional symmetric/asymmetric encryption
    Low cost for encryption and decryption
    Supports key delegation (HIBE)
    Hard to achieve fine-grained access control
  Attribute-based encryption
    Easy to achieve fine-grained access control
    High cost for encryption and decryption
    Does not support key delegation
Public Key Cryptography
Hierarchical Attribute-Based Encryption (HABE)
Application scenario
[Figure: sample URA]
Requirements
  Fine-grained access control
  Hierarchical key generation
  Efficiency
Hierarchical Attribute-Based Encryption (HABE)
Key technique
  Combine hierarchical identity-based encryption and attribute-based encryption
  Use the attributes and the exact ID to identify each user
HABE Architecture
User Revocation
Naïve solution
  The data owner re-encrypts the data and distributes new keys to the data users
  Frequent revocation makes the data owner a performance bottleneck
Proxy re-encryption (PRE)
Time-Based Proxy Re-Encryption
PRE in clouds
  The data owner sends re-encryption instructions to the cloud
  The cloud performs the re-encryption using proxy re-encryption
T2<T1: Potential security risk
How to achieve automatic revocation without sending any instructions?
Time-Based Proxy Re-Encryption
Key technique
  Incorporate time into PRE
  This scheme suits applications where the validity period of access is predetermined
  A time tree is constructed
  The data owner and the cloud share a secret seed s
  The cloud re-encrypts data automatically, based on its internal time, when it receives a data access request
[Figure: time tree rooted at year 2012, with month nodes 1 to 12 and day nodes 1 to 31 under each month]
Keys for attribute a are derived down the tree, for year y, month m, and day d:
  $s_a = H_s(a)$
  $s_a^{(y)} = H_{s_a}(y)$
  $s_a^{(y,m)} = H_{s_a^{(y)}}(m)$
  $s_a^{(y,m,d)} = H_{s_a^{(y,m)}}(d)$
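The derivation down the time tree, from the shared seed s to a per-day key, can be sketched as follows. This is only an illustration under stated assumptions: the keyed hash H is instantiated here with HMAC-SHA256, and the seed, the attribute name, and the string encoding of year, month, and day are hypothetical choices not taken from the slides.

```python
import hashlib
import hmac


def H(key: bytes, label: str) -> bytes:
    """Keyed hash H_key(label); HMAC-SHA256 is an assumed instantiation."""
    return hmac.new(key, label.encode("utf-8"), hashlib.sha256).digest()


def attribute_key(seed: bytes, attribute: str) -> bytes:
    """s_a = H_s(a): root key for attribute a, derived from the shared seed s."""
    return H(seed, attribute)


def year_key(seed: bytes, attribute: str, year: int) -> bytes:
    """s_a^(y) = H_{s_a}(y)."""
    return H(attribute_key(seed, attribute), str(year))


def month_key(seed: bytes, attribute: str, year: int, month: int) -> bytes:
    """s_a^(y,m) = H_{s_a^(y)}(m)."""
    return H(year_key(seed, attribute, year), str(month))


def day_key(seed: bytes, attribute: str, year: int, month: int, day: int) -> bytes:
    """s_a^(y,m,d) = H_{s_a^(y,m)}(d)."""
    return H(month_key(seed, attribute, year, month), str(day))


seed = b"hypothetical-shared-seed"   # shared by the data owner and the cloud
k1 = day_key(seed, "staff", 2012, 12, 31)
k2 = day_key(seed, "staff", 2012, 12, 30)
print(k1.hex())
print(k1 != k2)   # different days give different keys
```

Because both parties hold the same seed, the cloud can derive the key for the current day on its own, without receiving an instruction from the data owner.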
User Privacy
User privacy
  Search privacy: the cloud cannot know what the users are searching for
  Access privacy: the cloud cannot know what/which files are returned to the users
Existing solutions
  Private search (PS) can protect user privacy while searching public data
  Searchable encryption (SE) can protect search privacy while searching private data
Searchable Encryption (SE)
Bob sends to Alice an email encrypted under Alice’s public key.
Alice’s email gateway wants to test whether the email contains the keyword “urgent” so that it can route the email to her PDA immediately.
But Alice does not want the email gateway to be able to decrypt her messages.
Efficient Searchable Encryption
Problem
  The user needs to perform decryption
  A thin client has only limited resources
Requirements
  Enable the cloud to perform partial decryption without compromising search privacy
  User can access data from the cloud anytime and anywhere with any device
Efficient Searchable Encryption
Key technique
  Alice takes both Bob’s and the CSP’s public keys as inputs to the encryption algorithm
  The CSP uses its secret key to perform partial decryption and generate an intermediate value
  Bob uses the intermediate value to quickly recover the data
Private Search (PS)
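Given a public dictionary that contains all keywords, e.g., dictionary = <A, B, C, D>
Bob wants to retrieve files with keywords A and B
Files on the cloud: F1: {A, B}, F2: {B, D}, F3: {C, D}
[Figure: Bob sends the query [1] [1] [0] [0], one entry per dictionary keyword, to the cloud; the cloud returns a buffer (F1, F2, 0, NA) that is a compressed version of all files]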
Private Search (PS)
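Homomorphic encryption
  E(x)*E(y) = E(x+y)
  E(x)^y = E(x*y)
Key trick: map unmatched files to 0
[Figure: with query [1] [1] [0] [0], files F1: {A, B}, F2: {B, D}, F3: {C, D} are mapped into the buffer. Slots holding F1 or F2 survive, the unmatched F3 becomes 0, and a slot where F1 and F2 collide is unusable (NA)]
A matched file that shares a slot with an unmatched file survives: E(F2)*E(0) = E(F2)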
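The two homomorphic properties and the "map unmatched files to 0" trick can be tried out with a toy sketch. It assumes Paillier encryption as the additively homomorphic scheme (the slides do not name one), uses primes far too small for real use, and represents each file by a hypothetical small integer. Returning both E(c) and E(c*F), where c counts matched keywords, is one assumed way to recover F; buffer slots and collisions are left out.

```python
import math
import secrets

# Toy Paillier parameters (assumption: illustration only, not secure sizes).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)   # modular inverse, needs Python 3.8+


def encrypt(m: int) -> int:
    """E(m) = (1 + m*N) * r^N mod N^2 for a random r coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Recover m from E(m) using PHI and its inverse modulo N."""
    u = pow(c, PHI, N2)
    return (u - 1) // N * MU % N


dictionary = ["A", "B", "C", "D"]
# Hypothetical file contents encoded as small integers.
files = {"F1": ({"A", "B"}, 111), "F2": ({"B", "D"}, 222), "F3": ({"C", "D"}, 333)}

# Bob's query: an encrypted bit per dictionary keyword, 1 for A and B.
query = {w: encrypt(1 if w in {"A", "B"} else 0) for w in dictionary}


def cloud_process(keywords, content):
    """Cloud side: works on ciphertexts only, never sees the query bits."""
    enc_count = 1                      # 1 is a valid (trivial) encryption of 0
    for w in keywords:
        enc_count = enc_count * query[w] % N2      # E(x)*E(y) = E(x+y)
    enc_payload = pow(enc_count, content, N2)      # E(c)^F = E(c*F)
    return enc_count, enc_payload


def recover(enc_count, enc_payload):
    """Bob's side: unmatched files decrypt to 0 and are dropped."""
    c = decrypt(enc_count)
    return None if c == 0 else decrypt(enc_payload) // c


results = {name: recover(*cloud_process(kw, content))
           for name, (kw, content) in files.items()}
print(results)
```

The cloud performs the same operations on every file, so it cannot tell matched files from unmatched ones; only Bob, after decryption, sees that F3 collapsed to 0.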
Cooperative Private Search (COPS)
Problem for simple PS
  Processing each query is expensive
  Given n users, the cloud needs to execute n queries
  Performance bottleneck on the cloud
COPS architecture
  A proxy server (ADL) is introduced between the users and the cloud (trusted)
    Aggregates user queries
    Distributes search results
Cooperative Private Search (COPS)
Key technique
  The user and the cloud share
    Shuffle functions: shuffle the dictionary and the query, to preserve search privacy
    Pseudonym function: hides file names
    Obfuscated function: hides file content, to preserve access privacy
Key merits
  User privacy is preserved from
    The cloud
    The proxy server
    Other users
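The shared shuffle function can be illustrated with a small sketch. It assumes the shuffle is a permutation that the user and the cloud each derive from a shared secret seed; the seed value and helper names are hypothetical, and `random.Random` is used only for illustration (it is not a cryptographic generator). The pseudonym and obfuscated functions are not modeled.

```python
import random


def shared_permutation(seed, size):
    """Derive the same permutation on both sides from a shared seed."""
    rng = random.Random(seed)
    perm = list(range(size))
    rng.shuffle(perm)
    return perm


def apply_permutation(items, perm):
    """Reorder items so that position k holds items[perm[k]]."""
    return [items[i] for i in perm]


dictionary = ["A", "B", "C", "D"]
query_bits = [1, 1, 0, 0]      # the user wants keywords A and B
SHARED_SEED = 2012             # hypothetical secret known to user and cloud only

# User side: shuffle the query before handing it to the proxy server.
user_perm = shared_permutation(SHARED_SEED, len(dictionary))
shuffled_query = apply_permutation(query_bits, user_perm)

# Cloud side: shuffle the dictionary with the same function.
cloud_perm = shared_permutation(SHARED_SEED, len(dictionary))
shuffled_dictionary = apply_permutation(dictionary, cloud_perm)

# Positions still line up, so the cloud matches the right keywords.
matched = {w for w, bit in zip(shuffled_dictionary, shuffled_query) if bit}
print(sorted(matched))   # ['A', 'B']
```

A party without the seed, such as the proxy server, sees only the shuffled positions and cannot map them back to dictionary keywords.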
Efficient Information Retrieval for Ranked Queries (EIRQ)
Problem for simple COPS
  No ranked queries
  The cloud returns all matched files
Queries are classified into ranks 0, 1, …, r-1
A rank-i query retrieves a fraction (1 - i/r) of the matched files
  Files that match rank-0 queries: will not be filtered
  Files that match rank-1 queries: filtered with probability 1/r
  …
  Files that match rank-i queries: filtered with probability i/r
The cloud
  Cannot know which files are filtered/returned
  Cannot know each query’s rank
Efficient Information Retrieval for Ranked Queries (EIRQ)
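Key techniques
  Construct a mask matrix to protect query ranks
  Filter files without knowing which files are filtered
[Figure: workflow between User, ADL, and Cloud]
  Step 1: QueryGen. The user sends keywords and rank to the ADL
  Step 2: Matrix Construct. The ADL sends the mask matrix to the cloud
  Step 3: FileFilter. The cloud returns a buffer to the ADL
  Step 4: File Recovery. The user obtains a certain percentage of the files matching the user’s keywords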
Efficient Information Retrieval for Ranked Queries (EIRQ)
Construct Mask Matrix
The ADL constructs a mask matrix, encrypted with its public key, and sends it to the cloud
User queries: Alice {A, B}, rank 0; Bob {A, C}, rank 1
Number of ranks: r = 2
Mask matrix (one row per keyword, one column per rank):
  A    [1] [1]
  B    [1] [1]
  C    [1] [0]
  D    [0] [0]
  …    …
       [0] [0]
Filter Files
Files on the cloud: F1: {A, B}, F2: {B, D}, F3: {C, D}
[Figure: the ADL sends the mask matrix (rows A [1] [1], B [1] [1], C [1] [0], D [0] [0]) to the cloud; the cloud writes the filtered files into a buffer]
The cloud chooses a random column for each file
A file that matches a rank-i query is filtered with probability i/r
F1 and F2 will be returned; F3 will be filtered with probability 50%
For F3:
  With probability 50%: E(1)*E(0) = E(1), so E(1)^F3 = E(F3)
  With probability 50%: E(0)*E(0) = E(0), so E(0)^F3 = E(0)
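The mask matrix and the resulting filter probabilities can be checked with a plaintext sketch. It leaves out the encryption entirely, so it shows only the counting argument, and it assumes that a keyword whose best rank is i gets ones in its first r - i columns, as in the example matrix; the function names are hypothetical.

```python
from fractions import Fraction


def build_mask_matrix(dictionary, queries, r):
    """One row per keyword: r - i ones then i zeros, where i is the keyword's
    best (lowest) rank over all queries; unqueried keywords get all zeros."""
    best = {}
    for keywords, rank in queries:
        for w in keywords:
            best[w] = min(rank, best.get(w, r))
    return {w: [1] * (r - best.get(w, r)) + [0] * best.get(w, r)
            for w in dictionary}


def return_probability(file_keywords, matrix, r):
    """The cloud picks one of the r columns uniformly at random; the file
    survives if any of its keywords has a 1 in that column."""
    kept = sum(1 for j in range(r)
               if any(matrix[w][j] for w in file_keywords))
    return Fraction(kept, r)


dictionary = ["A", "B", "C", "D"]
queries = [({"A", "B"}, 0),   # Alice, rank 0
           ({"A", "C"}, 1)]   # Bob, rank 1
r = 2
matrix = build_mask_matrix(dictionary, queries, r)

files = {"F1": {"A", "B"}, "F2": {"B", "D"}, "F3": {"C", "D"}}
for name, keywords in files.items():
    print(name, return_probability(keywords, matrix, r))
# F1 1
# F2 1
# F3 1/2
```

In the actual scheme the matrix entries are ciphertexts, so the cloud multiplies and exponentiates them without learning which files end up as E(0).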