Umeå universitet Bachelor Thesis Spring -13 Cloud computing from a privacy perspective Author: Daniel Evertsson Supervisor: Jerry Eriksson September 6, 2013

Cloud computing from a privacy perspective - umu.se · 2013-09-11 · 2.4 Wuala Wuala4 uses something called convergent encryption. Based on each file’s...


Umeå universitet

Bachelor Thesis

Spring -13

Cloud computing from a privacy perspective

Author: Daniel Evertsson

Supervisor: Jerry Eriksson

September 6, 2013


Abstract

The cloud can simplify everyday life for private individuals as well as big enterprises by renting out resources. Resources such as storage capacity, computational power or cloud-based applications can be accessed without the need to invest in expensive infrastructure. Even though many enterprises could benefit from using cloud services, they hesitate, partly because they fear data leakage when storing sensitive data in the cloud environment.

The goal has been to prevent unauthorized users from accessing the users' data by using client-side encryption. The solution must be able to support existing features. For example, many applications support multiple devices, which means that the user can access the same data from devices such as smartphones, tablets and desktop computers.

The result showed that there are two main approaches to implementing client-side encryption. The first approach bases the encryption key on random elements. It is without a doubt the most secure method to use, but it is not user-friendly: the user has to distribute the generated encryption key between all the devices, for example by moving files back and forth. The second approach bases the encryption key on a password. Security decreases, but the approach is more user-friendly.

It appears that the biggest problem related to client-side encryption isn't the encryption itself, but the distribution of encryption keys. As the number of users increases, the key distribution problem becomes more pronounced. Key distribution is often handled by something called a key manager, which can operate at different levels: it can be built into the application or it can be an external application. Some organizations have published guidelines for how to design key management systems.


Acknowledgements

First of all I would like to thank Cristian Klein at the department for distributed systems for coming up with the idea for this thesis. He has also provided a lot of valuable input and support. I would also like to thank the teachers Jerry Eriksson and Pedher Johansson for valuable input to this project.


Contents

1 Introduction
  1.1 Client-side encryption
  1.2 Problem statement
  1.3 Definitions

2 Existing solutions that offer Storage-as-a-service
  2.1 CrashPlan
  2.2 Mozy
  2.3 TeamDrive
  2.4 Wuala
  2.5 Summary of common encryption techniques
  2.6 Other solutions

3 Client-side encryption strategies
  3.1 User supplied key
  3.2 Password based key
    3.2.1 Test implementation
  3.3 PBKDF vs. Random based encryption key

4 Conclusion
  4.1 Client-side encryption
  4.2 PBKDF or Random based encryption key
  4.3 Dynamic iteration
  4.4 Client-side encryption drawbacks
  4.5 Future work

Bibliography


Chapter 1

Introduction

In today's society the use of internet-connected devices has increased dramatically. We access the internet through devices such as smartphones, tablets, laptops and desktop computers. Between 2003 and 2010 the number of devices increased from 500 million to 12.5 billion [1]. This is a 25-fold growth (an increase of 2,400%) in seven years. In 2010 there were almost twice as many devices as there were people in the world. Users have developed a need to store and access the same data from their different devices.

As a solution to the problem, a concept called cloud computing has been developed. The idea is to let the user access the cloud's resources, such as storage, software, platforms and infrastructure¹. As a user you get access to these resources through the internet, often by using a thin client like a web browser or a client application. You get access to the resources without having to invest in new infrastructure or develop new software. Another benefit of cloud computing is that the user only pays for the resources consumed.

Even though there are many advantages with cloud computing, many companies hesitate to use it. In 2012 Varonis Systems Inc presented a study which showed that 80 percent of the interviewed companies didn't want to invest in cloud-based solutions. They didn't even allow their employees to use existing cloud-based services [2]. The main reasons were that they feared data leakage, security breaches and compliance issues. 70 percent said that they would use cloud-based services if they were as robust as internal tools.

¹ If you want to know more about different kinds of cloud services, visit TechNet Magazine (http://technet.microsoft.com/en-us/magazine/hh509051.aspx)

Because security is a crucial element in whether companies will start using cloud-based services or not, it will be the main focus of this thesis. This thesis will study different encryption techniques which could be used to encrypt data stored at the cloud provider.

Since users often need to access data from multiple devices, this factor should be taken into account. The users should be able to access their files from devices like desktop computers, laptops, smartphones and tablets. In order to identify the user, a single user account should be used. Since all devices involved should be able to use the encryption technique presented in this thesis, hardware limitations, like computational power, should be taken into account.

1.1 Client-side encryption

To make it more difficult for unauthorized people² to access the user's data, it should be encrypted. One option would be to let the cloud provider encrypt all the data that is stored in the cloud. This method is called server-side encryption. The problem with this approach is that if an attacker gets access to the cloud provider's systems, or if an employee of the cloud provider tries to access the data, they will also have access to the decryption key, which makes it very easy to decrypt the data.

To make the data less accessible, a method called client-side encryption will be used to encrypt all the user's data before it's sent to the cloud provider. In contrast to server-side encryption, where the encryption key is stored by the cloud provider, the client-side encryption approach only stores the encryption key locally. This will prevent the cloud provider from accessing the data, since they won't know how to decrypt it.

² Unauthorized people could be employees of the cloud provider or people who have broken into the cloud provider's system.


1.2 Problem statement

First I will look at existing solutions that offer Storage-as-a-service. The interesting solutions are those that offer some kind of client-side encryption. Secondly, the most common client-side encryption techniques will be identified and described in more detail. Advantages and disadvantages of the different approaches will be pointed out. The goal is to decide which encryption technique offers the highest security level. Then, in order to see how the encryption affects the performance of the client application, a test will be implemented to measure how the encryption of large files affects the execution time of the application. In the last part a discussion about the different encryption techniques will be presented. Hopefully this thesis will be able to identify the biggest problems related to client-side encryption.

1.3 Definitions

In this section, terms often used in this thesis will be defined.

Salt:

A salt is data, often randomly generated, that is used when encrypting data. The purpose of the salt is to make so-called rainbow attacks more difficult [3]. In a rainbow attack the hacker generates a table of encryption keys. The table is generated once and then used to test all the generated keys against a given number of users. The idea is to add a salt when generating the encryption key. The salt should be generated at random, or at least be different for every user. This forces the hacker to generate a new rainbow table for every user, which is a very expensive operation. The salt is considered public information, which means that even if the salt is known to the hacker, it will still increase the resources needed to crack the encryption.
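The effect of a salt can be illustrated with a minimal Python sketch. It assumes a single SHA-256 pass over the salt followed by the password; the function name `derive_key`, the 16-byte salt length and the example password are chosen for illustration only and are not taken from any of the applications discussed in this thesis.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes) -> bytes:
    """Hash salt and password together into a 256-bit value (one SHA-256 pass)."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


# Two hypothetical users who happen to choose the same password.
salt_alice = os.urandom(16)  # random, public, stored next to the data
salt_bob = os.urandom(16)

key_alice = derive_key("hunter2", salt_alice)
key_bob = derive_key("hunter2", salt_bob)

# The keys differ, so a table precomputed for one salt is useless for the other.
print(key_alice.hex())
print(key_bob.hex())
```

Because the two salts differ, the same password leads to two unrelated keys, which is why a precomputed table would have to be rebuilt for every user.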

SHA:

Secure Hash Algorithm (SHA) was developed by the United States National Security Agency. Together with MD5, SHA is among the most commonly used hash functions in cryptography.

AES:

Advanced Encryption Standard (AES) is an encryption algorithm standardized by the National Institute of Standards and Technology. The algorithm is built to use encryption keys with a length of 128, 192 or 256 bits [4].

Account password:

This is a password that is used to authenticate a user when logging in to the system. The account password will be stored in the cloud and is thereby accessible to anyone who has access to the cloud provider's systems.

Archive password:

This is a password used to encrypt data. It is only stored locally, unlike an account password, which is stored online. It's also worth mentioning that if the archive password is lost there will be no way to decrypt the data.


Chapter 2

Existing solutions that offer Storage-as-a-service

There are cloud providers who try to ensure the privacy of their users. Researchers from Fraunhofer Institute for Secure Information Technology have written a report in which they compare different cloud storage providers and evaluate the applications based on different criteria [5]. Among other things, the criteria cover whether the applications support any kind of encryption technique. Out of the seven applications that are benchmarked, the four applications that support client-side encryption have been selected in order to identify common techniques used for client-side encryption. The applications that will be presented in this chapter are CrashPlan, Mozy, TeamDrive and Wuala. In the last part of this chapter other applications, which are not covered in the report written by Fraunhofer Institute for Secure Information Technology, will be studied in order to see if they have come up with any other solution to the client-side encryption problem.

2.1 CrashPlan

CrashPlan¹ offers three kinds of encryption techniques. By default the account password, which is known by CrashPlan, is used to generate a 128-bit encryption key. Secondly, the user can choose an archive password, which is not known to CrashPlan; it is used to encrypt the encryption key. The encrypted key is stored in the cloud and distributed to other clients. In the third alternative the user enters an encryption key which is only stored locally.

¹ Application created by Code 42 Software

2.2 Mozy

Mozy² offers two methods for encryption. All data is encrypted on the client before being sent to the cloud provider. The first option is to use a 448-bit encryption key provided by, and also known to, Mozy. The user can also enter a private 256-bit encryption key which will only be stored locally.

2.3 TeamDrive

TeamDrive³ uses a concept called a space, which is similar to a folder. When created, the space can be empty or based on an existing folder. All files that are stored in the space will be transmitted to the cloud provider. For every space a unique AES-256 key is generated, which means that every space has an individual encryption key. In order to share spaces between different devices, the encryption key for that particular space has to be distributed to the other devices. This is done by letting the user export the key to a “.pss” file. The file then has to be transferred by the user to the new device.

2.4 Wuala

Wuala⁴ uses something called convergent encryption. A hash is calculated from each file's content, and the hash is used as the key to encrypt the file. The hash is then encrypted using the account key. The only way to access the key is to own the original file. The method has one big flaw: it is open to a so-called “confirmation of a file attack”, where the attacker knows the content of a file. If this is the case, they can verify that a user owns a copy of that file. The attack is most efficient if the content is publicly available, for example copyrighted material. It's also very simple to see if two users share the same file.

² Application created by EMC Corporation
³ Application created by TeamDrive Systems
⁴ Application created by LaCie
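The weakness can be made concrete with a small Python sketch. It is not Wuala's implementation: SHA-256 as the content hash is an assumption, and `toy_encrypt` is a deliberately simple stand-in for a real cipher such as AES (an XOR with a hash-based keystream), used only so that the example runs with the standard library.

```python
import hashlib


def convergent_key(content: bytes) -> bytes:
    """Derive the file key from the file's own content (SHA-256 is an assumption)."""
    return hashlib.sha256(content).digest()


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Stand-in for a real cipher: XOR with a SHA-256 counter keystream.

    For illustration only; do not use this to protect real data.
    Applying it twice with the same key restores the input.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def convergent_encrypt(content: bytes) -> bytes:
    """Encrypt a file under the key derived from its own content."""
    return toy_encrypt(convergent_key(content), content)


# Two users store the same publicly available file; a third file is private.
alice_upload = convergent_encrypt(b"publicly available text")
bob_upload = convergent_encrypt(b"publicly available text")
other_upload = convergent_encrypt(b"a private diary entry")

# An attacker who knows the plaintext can recompute the ciphertext and compare.
attacker_guess = convergent_encrypt(b"publicly available text")

print(alice_upload == bob_upload)      # identical files give identical ciphertexts
print(attacker_guess == alice_upload)  # ownership of the known file is confirmed
print(attacker_guess == other_upload)  # unknown content is not revealed
```

The sketch shows both properties named in the text: identical files produce identical ciphertexts, so shared files are visible, and anyone who knows a file's content can confirm that it has been stored.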


2.5 Summary of common encryption techniques

Both CrashPlan and Mozy offer server-side encryption, or rather a key generated and stored by the cloud provider. Both applications also let the user enter an encryption key which is only stored locally. CrashPlan also offers a third alternative where the user enters an archive password. TeamDrive, on the other hand, generates a key when a so-called space is created, and the key is only stored locally. Wuala uses convergent encryption, where the encryption key is calculated based on the content of the file being encrypted.

2.6 Other solutions

There are other cloud providers, not mentioned in the report written by Fraunhofer Institute for Secure Information Technology, which offer client-side encryption, for example IDrive⁵, SwissDisk⁶ and SpiderOak⁷. They have solved client-side encryption by using the techniques mentioned in the previous section. To be more specific, IDrive lets the user enter a private encryption key. SwissDisk and SpiderOak use an archive password in order to generate an encryption key.

⁵ Application created by IDrive Inc
⁶ Application created by SwissDisk ICS
⁷ Application created by SpiderOak


Chapter 3

Client-side encryption strategies

By studying the existing solutions I have identified two main approaches to solving the problem of client-side encryption. In this chapter these approaches will be presented and their strengths and weaknesses will be pointed out.

3.1 User supplied key

It's pretty common to let the user enter a generated encryption key which will only be stored locally. The key can sometimes be generated by the client application; in other cases third-party programs, like an online key generator, can be used. In order to make it harder to crack the encryption, the user should make sure that the encryption key is based on some random element. The length of the key is also an important factor. Today the recommended length of an encryption key is 256 bits, since AES supports encryption keys of up to 256 bits [4].

One flaw with this technique is that there can be many devices connected to the same user account. If that's the case, the encryption key has to be distributed between the different devices. One simple solution would be to let the user memorize the 256-bit encryption key. If the encryption key is represented using the common characters¹ used in passwords, the key will be approximately 43 characters long. It is not reasonable to expect the user to memorize such a long randomly generated key.

¹ Common characters are defined as [0-9], [a-z] and [A-Z].
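The figure of 43 characters follows from the size of the alphabet: 62 characters give a little under 6 bits per character. A short Python sketch of the calculation, in which the function name `characters_needed` is chosen for illustration:

```python
ALPHABET_SIZE = 10 + 26 + 26  # [0-9], [a-z] and [A-Z]


def characters_needed(key_bits: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Smallest string length whose number of combinations covers 2**key_bits."""
    length = 0
    while alphabet_size ** length < 2 ** key_bits:
        length += 1
    return length


print(characters_needed(256))  # 43
```

Exact integer arithmetic is used so that no rounding is involved: 62^42 is smaller than 2^256, while 62^43 is larger.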

There are other ways to distribute the encryption key, like the approach used by TeamDrive, where the encryption key is exported to a “.pss” file. One thing to remember is that no information about the encryption key should be stored in the cloud, for security reasons. The cloud provider can't be involved in the key distribution, for the same reasons that server-side encryption shouldn't be used. The risk that the encryption key is hijacked by the cloud provider is too great a threat.

3.2 Password based key

Another common way to achieve client-side encryption is to let the user enter an archive password, which will be used to encrypt the data. Based on research made by a scientist from the Council for Scientific and Industrial Research in 2009, most passwords are between 6 and 9 characters long [6]. For more detailed statistics see Figure 3.1. Compared to the 43 characters that a 256-bit encryption key corresponds to, a password would most likely result in a reduced number of possible key combinations. See Table 3.1 for information on how the password length affects the number of possible combinations.

Characters   Number of combinations      Number of bits
6            $5.68002 \cdot 10^{10}$     ~36 bits
7            $3.52161 \cdot 10^{12}$     ~42 bits
8            $2.18340 \cdot 10^{14}$     ~48 bits
9            $1.35371 \cdot 10^{16}$     ~54 bits
10           $8.39299 \cdot 10^{17}$     ~60 bits
20           $7.04423 \cdot 10^{35}$     ~120 bits
30           $5.91222 \cdot 10^{53}$     ~180 bits
40           $4.96212 \cdot 10^{71}$     ~240 bits
43           $1.18261 \cdot 10^{77}$     ~256 bits

Table 3.1: How the number of characters ([0-9][a-z][A-Z]) used in a password affects the number of possible key combinations. The last column shows how many bits are needed to represent the number of combinations.
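The values in Table 3.1 can be reproduced with a few lines of Python. This is a sketch under the same assumption as the table, namely an alphabet of 62 characters; the function names are chosen for illustration. The bit counts in the table appear to be rounded to about 6 bits per character, so the exact values printed here are slightly lower.

```python
import math

ALPHABET_SIZE = 62  # [0-9], [a-z] and [A-Z]


def combinations(length: int) -> int:
    """Number of distinct passwords of the given length."""
    return ALPHABET_SIZE ** length


def equivalent_bits(length: int) -> float:
    """Number of bits needed to represent that many combinations."""
    return length * math.log2(ALPHABET_SIZE)


for length in (6, 7, 8, 9, 10, 20, 30, 40, 43):
    print(f"{length:>2}  {combinations(length):.5e}  {equivalent_bits(length):6.1f} bits")
```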


Figure 3.1: The diagram shows what percentage of the 46,000 MySpace users used a given number of characters in their passwords.

To increase security, a Password-Based Key Derivation Function (PBKDF) could be used. The purpose of a PBKDF is to take a password and, based on it, generate a more complex encryption key, thereby increasing the time needed to crack the encryption [7]. The function adds a salt to the password. The purpose of the salt is to prevent rainbow attacks; see section 1.3 for more information. To make this possible the salt has to be different for every user. A simple solution would be to use the username as salt. This will ensure that every user gets a unique salt.

Another solution could be to use a so-called "keyfile", where the salt is based on the content of a file. The file could be any file, for example a family photo. The strategy is used by applications like TrueCrypt [5]. Like the client-generated encryption key, the information has to be distributed between the clients. Since the salt is considered public information, the file could be stored in the cloud unencrypted.

To make it even harder to get access to the encrypted data, a unique randomly generated salt could be used. The salt has to be stored together with the encrypted data.

After the salt has been added, the resulting string is hashed using an approved hash function, like SHA-256, to generate a 256-bit key. In order to increase the resources needed to crack the encryption, the encryption key is hashed a given number of times. Like the salt, the number of iterations is considered public information. According to a report from the National Institute of Standards and Technology, the number of iterations should be at least 1000 [7]. This means that an attacker would have to do 1000 hash computations for every password, which increases the time needed to test a given password. This is based on the assumption that the attacker knows the hash function and the number of iterations.
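As a minimal sketch of such a key derivation, Python's standard library provides `hashlib.pbkdf2_hmac`, which implements PBKDF2. Note that PBKDF2 iterates an HMAC rather than literally re-hashing one string, so it is a standardized variant of the procedure described above and not a line-by-line transcription of it. The function name `derive_archive_key`, the example password and the use of an account name as salt are assumptions made for illustration.

```python
import hashlib


def derive_archive_key(archive_password: str, salt: bytes, iterations: int = 1000) -> bytes:
    """Derive a 256-bit key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        archive_password.encode("utf-8"),
        salt,
        iterations,
        dklen=32,  # 32 bytes = 256 bits, the largest AES key size
    )


# Hypothetical account name used as salt, as suggested in the text.
salt = b"alice@example.com"
key = derive_archive_key("correct horse", salt)

print(len(key) * 8)  # 256
print(key.hex())
```

The salt and the iteration count are not secret; both must be available on every device that needs to derive the same key.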

From the user's perspective the time needed to make the calculations won't make a big difference, as long as the number of iterations is not so high that it causes a noticeable delay in the application. 1000 iterations is considered the minimum when using a PBKDF. Since an increased number of iterations increases the resources needed to calculate the encryption key, the higher the number the better. Since the system should support different devices, the device with the least computational power should determine the number of iterations. Smartphones should probably be considered the weakest link.

According to a report from the Horst Görtz Institute for IT-Security, a smartphone with a 1 GHz ARM processor should be able to do 4000-10000 iterations in what the authors defined as a reasonable amount of time [8]. Since the number of iterations has a huge impact on the time needed to break the encryption, it is desirable to use as large a number of iterations as possible. Using 4000 iterations instead of 1000 would increase the time by a factor of four.

In their report they also suggested the use of a dynamic iteration count, where the number of iterations depends on the current computational power, for example how many iterations the system is able to do in a limited amount of time. The iteration count is then stored with the encrypted data to make sure that the data can be decrypted. With this method the number of iterations would increase over time in line with technological scaling effects.
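A dynamic iteration count could be calibrated roughly as in the following Python sketch. It is not the procedure from the cited report: the time budget of 0.2 seconds, the probe size and the function name `calibrate_iterations` are assumptions, and the measurement is a single coarse timing of a PBKDF2 run on the current device.

```python
import hashlib
import time

MINIMUM_ITERATIONS = 1000  # lower bound mentioned in the text


def calibrate_iterations(time_budget_seconds: float = 0.2,
                         probe_iterations: int = 10_000) -> int:
    """Estimate how many PBKDF2 iterations fit into the time budget on this device."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"probe password", b"probe salt",
                        probe_iterations, dklen=32)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
    estimate = int(probe_iterations * time_budget_seconds / elapsed)
    return max(MINIMUM_ITERATIONS, estimate)


iterations = calibrate_iterations()

# The chosen count is public and has to be stored next to the encrypted data,
# otherwise another device cannot derive the same key.
stored_header = {"kdf": "pbkdf2-hmac-sha256", "iterations": iterations}
print(stored_header)
```

A drawback of this design is that a count calibrated on a fast desktop computer may cause a long delay when the same data is opened on a smartphone.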


3.2.1 Test implementation

In order to test the time needed to encrypt data, a small-scale implementation has been made. To keep it simple, a client-server application which handles notes was developed. First, client-side encryption was implemented using Java's Crypto library. In order to generate an encryption key, an existing password-based key derivation function was used. The function used the account username as salt and an archive password provided by the user. It hashed the salt and password combination 2000 times using SHA-1. The produced key is a valid AES key.

The implementation was used to test how the encryption affects the performance of the client application. To do the test, a number of files of given sizes were encrypted. The test showed that the encryption time depends linearly on the size of the file. It takes less than a second to encrypt 20 megabytes of data, which must be considered relatively fast. The test was made on a laptop with a 2.4 GHz Intel Core Duo processor and 2 GB DDR3 RAM. The operating system used was Windows 7 (32-bit).

Since users access the cloud through the internet, a comparison between the encryption speed and the upload speed of the internet connection was made.

According to a report from Akamai Technologies, the average internet speed in Sweden is 7.3 Mbit/s [9]. Let's convert it to megabytes per second in order to see how fast data could be sent to the cloud provider.

\[ \frac{\text{Megabits per second}}{\text{Number of bits per byte}} = \text{Speed in megabytes per second} \]

\[ \frac{7.3}{8} = 0.9125 \]

In Figure 3.2 the time needed to encrypt data is compared to the time needed to upload the data to the cloud provider. The figure shows that the time needed to upload a file is much higher than the time needed to encrypt the data. In this case the time needed to encrypt the data is insignificant. In order to see whether a faster internet connection would be able to compete with the encryption time, I chose an internet speed of 200 Mbit/s. In this case encryption took longer than the upload, at least for files smaller than 30 megabytes. The result is presented in Figure 3.3.
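The upload side of the comparison is simple arithmetic and can be sketched in Python. The function names are chosen for illustration, and the 20-megabyte file size is taken from the encryption test above; the sketch assumes that the full nominal connection speed is available for the upload.

```python
BITS_PER_BYTE = 8


def megabytes_per_second(megabits_per_second: float) -> float:
    """Convert a connection speed from Mbit/s to megabytes per second."""
    return megabits_per_second / BITS_PER_BYTE


def upload_seconds(file_megabytes: float, megabits_per_second: float) -> float:
    """Time needed to upload a file of the given size at the given speed."""
    return file_megabytes / megabytes_per_second(megabits_per_second)


for speed in (7.3, 200):
    print(f"{speed} Mbit/s: 20 MB takes {upload_seconds(20, speed):.1f} s")
```

At 7.3 Mbit/s a 20-megabyte file needs roughly 22 seconds to upload, far more than the sub-second encryption time measured in the test, while at 200 Mbit/s the upload itself takes less than a second.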


Figure 3.2: The time needed to encrypt data of different sizes compared with the time needed to send the data to the cloud. Based on an internet connection of 7.3 Mbit/s.


Figure 3.3: The time needed to encrypt data of different sizes compared with the time needed to send the data to the cloud. Based on an internet connection of 200 Mbit/s.

3.3 PBKDF vs. Random based encryption key

In order to show how much time would be needed to break an encryption key made by a PBKDF compared to a generated encryption key based on random elements, a small example will be presented. In this example it is assumed that a computer is able to test $10^{9}$ passwords per second in a brute force attack.

PBKDF:

The PBKDF creates an encryption key based on an 8-character² password. The number of password combinations would then be approximately $10^{14}$. It is assumed that the time needed to generate a key is 0.2 seconds. To clarify, the time needed to generate the key is the time it takes to combine the password and the salt and to perform the given number of hash computations. This means that the attacker would be able to test 5 keys every second when a PBKDF is used, since he has to compute the corresponding key for every given password. To be more exact, testing 5 keys would take 1 second + $5/10^{9}$ seconds, but this is rounded to one second.

² The characters that could be used in the password are [a-z], [A-Z] and [0-9].

Random based encryption key:

Since this encryption key is based on random elements, it does not have a common denominator as the PBKDF-based key has. If a 256-bit encryption key is generated, there will be approximately $10^{77}$ possible key combinations. As mentioned before, it is assumed that the attacker is able to test $10^{9}$ keys every second. So for every key tested using PBKDF, $2 \cdot 10^{8}$ keys would be tested using the random based encryption key approach.

In order to get the number of seconds it would take to break an encryption key, the total number of combinations has to be divided by the number of tested keys per second.

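Generic formula:

\[ \frac{\text{Number of possible combinations}}{\text{Number of tested keys per second}} = \text{Seconds needed to crack encryption} \]

PBKDF:

\[ \frac{10^{14}}{5} = 2 \cdot 10^{13} \text{ seconds} \approx 634196 \text{ years} \]

Random based key:

\[ \frac{10^{77}}{10^{9}} = 10^{68} \text{ seconds} \approx 3 \cdot 10^{60} \text{ years} \]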
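The two results can be checked with a few lines of Python. The sketch uses the rounded numbers of combinations from the example and assumes a year of 365 days; the function name `seconds_to_exhaust` is chosen for illustration. It gives the time needed to try every combination, which is the worst case for the attacker.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a year of 365 days


def seconds_to_exhaust(combinations: int, keys_per_second: int) -> int:
    """Seconds needed to test every possible key at the given rate."""
    return combinations // keys_per_second


# PBKDF: about 10^14 passwords, 5 keys per second (0.2 s per derivation).
pbkdf_seconds = seconds_to_exhaust(10**14, 5)
# Random 256-bit key: about 10^77 keys, 10^9 keys per second.
random_seconds = seconds_to_exhaust(10**77, 10**9)

pbkdf_years = pbkdf_seconds / SECONDS_PER_YEAR
random_years = random_seconds / SECONDS_PER_YEAR

print(f"PBKDF:  {pbkdf_seconds:.0e} s = {pbkdf_years:.0f} years")
print(f"Random: {random_seconds:.0e} s = {random_years:.2e} years")
```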

A summary of the number of possible key combinations and the time needed to crack a given encryption key is presented in Table 3.2. Let's compute the ratio between the numbers of combinations and the ratio between the times needed to crack the encryption.


Ratio between the numbers of combinations:

\[ \frac{10^{77}}{10^{14}} = 10^{63} \]

Ratio between the times needed to crack the encryption:

\[ \frac{10^{68}}{2 \cdot 10^{13}} = 5 \cdot 10^{54} \]

Going from the number of possible combinations to the time needed to break the encryption, the advantage of the random based key shrinks from a factor of $10^{63}$ to a factor of $5 \cdot 10^{54}$. Even though the PBKDF increases the time needed to break the encryption, this is still far from enough to compensate for the lack of key combinations.
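The difference between the two ratios can be traced back to the slowdown the PBKDF imposes on the attacker. With the PBKDF the attacker tests 5 keys per second instead of $10^{9}$, so each guess is $10^{9}/5 = 2 \cdot 10^{8}$ times slower. Dividing the ratio of combinations by this factor gives the ratio of cracking times, using only the figures already stated above:

```latex
\[
\frac{10^{77}/10^{14}}{10^{9}/5}
  = \frac{10^{63}}{2 \cdot 10^{8}}
  = 5 \cdot 10^{54}
\]
```

In other words, the PBKDF recovers roughly 8 of the 63 orders of magnitude that separate the two key spaces.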

| | PBKDF | Random based key |
|---|---|---|
| Number of key combinations | $\sim 10^{14}$ | $\sim 10^{77}$ |
| Time needed to crack encryption | $\sim 2 \cdot 10^{13}$ seconds | $\sim 10^{68}$ seconds |

Table 3.2: A summary of the number of possible key combinations and the time needed to crack a given encryption key. The PBKDF is based on a password containing 8 characters while the Random based key is a 256-bit encryption key.


Chapter 4

Conclusion

This chapter discusses whether PBKDF or Random based encryption keys should be used. The benefits and drawbacks of dynamic iterations are pointed out. Then other major obstacles that companies should take into account before using client-side encryption are presented. Finally, areas which could be studied further are suggested.

4.1 Client-side encryption

The problem that prevents users from encrypting all data using client-side encryption, as I see it, is the fact that if the encryption key is lost all data will be irretrievable. Therefore the user should have a choice whether to use client-side encryption or not. The users should be informed that the data will be irretrievable if the encryption key is lost, and also about the benefits of client-side encryption. An example of an application that does not show the benefits of client-side encryption is Mozy, even though it offers the service. The user is only informed that the data will be irretrievable if the password is lost. The fact that client-side encryption would increase security is never mentioned.


4.2 PBKDF or Random based encryption key

One of the most secure ways of encrypting data is to use the "User supplied key" approach mentioned in Section 3.1. Even though the encryption is very hard to crack, the task of distributing the encryption key is rather complex. There are applications like TeamDrive where an encryption key is generated and stored locally. In the end the same key distribution problem occurs with this approach. In order to make it easier to distribute the encryption key, TeamDrive has a feature where the user can export the encryption key to a ".pss" file. The user then has to transfer the ".pss" file to all the different devices. I personally would not appreciate having to transfer a file between all my devices to be able to access my data. For example, some applications make it possible to access files through a web browser. Sometimes you are at a public place like an internet cafe and want to access data through the web browser. In order to do so you have to access your key in some way. It is not a practical solution, though I guess it could be solved in some way, maybe by using third party software which could store the encryption key.

A more practical alternative would be to base the encryption key on a password and then use a Password-based key derivation function. As mentioned before, the biggest disadvantage of this approach is that users tend to use short passwords that often follow some kind of pattern. This results in a weak encryption key, since the only private information is the password. If someone uses this approach I think it is important that the software informs the user when the archive password is considered weak. Personally I would prefer this approach because it is more practical. However, since the goal was to increase the security of the user's data, a Random based encryption key approach is clearly more desirable.
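One way software could warn about a weak archive password is to estimate the brute-force search space. The sketch below is a deliberately naive heuristic and not a method from this thesis: the function names and the 60-bit threshold are assumptions, and because it treats every character as independent it overestimates the strength of passwords that follow a pattern.

```python
import math
import string


def estimate_search_space(password: str) -> int:
    """Rough upper bound on the candidates a brute-force attack must try.

    Assumes each character is drawn independently from the character
    classes that occur in the password.
    """
    alphabet = 0
    if any(c in string.ascii_lowercase for c in password):
        alphabet += 26
    if any(c in string.ascii_uppercase for c in password):
        alphabet += 26
    if any(c in string.digits for c in password):
        alphabet += 10
    if any(c in string.punctuation for c in password):
        alphabet += len(string.punctuation)
    return alphabet ** len(password)


def is_weak(password: str, min_bits: float = 60.0) -> bool:
    """Flag passwords whose estimated search space is below min_bits bits."""
    space = estimate_search_space(password)
    return space < 2 or math.log2(space) < min_bits


for candidate in ("password", "Tr0ub4dor&3xyz!Q"):
    print(candidate, "-> weak" if is_weak(candidate) else "-> acceptable")
```

A production check would also need to compare the password against lists of commonly used passwords, since those are tested first in a real attack.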

4.3 Dynamic iteration

In Section 3.2, where the "Password based key" approach was presented, the use of a dynamic iteration count was introduced. The idea is good since the number of iterations increases relative to the computer's computational power. The technique has one big flaw, however. When creating applications for multiple devices, where the difference in computational power is large, there could be cases where devices will not be able to do the hash computations in a reasonable amount of time. For example, suppose a device with high computational power, let's say a desktop computer, encrypts a file. Then the file is shared with a device with low computational power, let's say a smartphone. In order for the smartphone to decrypt the file it has to do as many hash iterations as the computer. Because of the difference in computational power, it will probably take the smartphone a noticeable amount of time to decrypt the data.
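The flaw can be illustrated with a small sketch of a dynamic iteration count. It assumes PBKDF2-HMAC-SHA-256 from the Python standard library and a hypothetical file header that stores the salt and the iteration count; the function names, the probe size and the minimum count are assumptions made for the example.

```python
import hashlib
import os
import time


def calibrate_iterations(target_seconds: float = 0.2,
                         probe_iterations: int = 10_000,
                         minimum: int = 10_000) -> int:
    """Estimate how many PBKDF2 iterations this device performs in target_seconds."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"probe", b"probe-salt", probe_iterations)
    elapsed = time.perf_counter() - start
    scaled = int(probe_iterations * target_seconds / elapsed)
    return max(minimum, scaled)


def derive_key(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a 256-bit key with the given iteration count."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


# Encrypting device: picks a count suited to its own speed and stores it,
# together with the salt, in the unencrypted header of the file.
header = {"salt": os.urandom(16), "iterations": calibrate_iterations()}
key_on_desktop = derive_key("archive password", header["salt"], header["iterations"])

# Decrypting device: has to reuse the stored count, however slow that is locally.
key_on_phone = derive_key("archive password", header["salt"], header["iterations"])

print(header["iterations"], key_on_desktop == key_on_phone)
```

The count is tuned on the encrypting device only, so a slower device that opens the file inherits a count it would never have chosen for itself.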

4.4 Client-side encryption drawbacks

There are factors that companies have to take into account before they decide to use client-side encryption. The problem is that so far we have only considered systems involving a single user. Even though the use of multiple devices has been considered, a single user has been responsible for distributing the encryption key.

What if a company wants to start using a cloud provider? Let's say they decide to use the "User supplied key" approach in Section 3.1 to encrypt their data. A key is generated and distributed to all the employees. Later an employee, in this example we will call him John, gets fired. Now he poses a security threat since he may have stored the encryption key used by the company. To prevent John from accessing the files, the company could download all the files from the cloud provider and re-encrypt them using a newly generated encryption key. This is not really efficient. As shown in the test implementation in Section 3.2.1, internet connections are for the time being rather slow. There are examples where cloud providers limit the download speed of their users, making the download even slower. Wouldn't it be easier if John simply never had access to the decryption key?

Note that there are algorithms which separate encryption and decryption keys. These are called asymmetric algorithms, and their keys come in pairs consisting of a private and a public key. The private key is used to decrypt data, and the corresponding public key can be calculated mathematically from it. The public key is used to encrypt data, which can then only be decrypted using the private key.

So, back to the example: suppose every employee gets a public key for encrypting the data and only a few administrators get access to the private key. In order to keep the private key hidden and still enable regular employees to decrypt data, a centralized server could be used for decrypting the data. The centralized server could also be called a key manager.


A key manager is responsible for storing encryption keys. Key managers can operate at different levels: one could be built into an application, and sometimes an external key manager is used to handle keys for multiple applications at once. Normally new encryption keys are generated at given time intervals, for example once every month, as a security precaution. If a key is compromised, the intruder will then not have access to all of the company's data. As mentioned before, it is not efficient to re-encrypt existing data with a new encryption key, since the data has to be downloaded every single time. In order to be able to access files encrypted with outdated encryption keys, a history of keys has to be maintained by the key manager.
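The idea of a key history can be sketched as follows. This is a toy in-memory example under several assumptions: the class name `KeyManager` and its methods are hypothetical, the keys are random 256-bit symmetric keys rather than private keys, and a real key manager would also need persistent storage, access control and protection of the stored keys.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class KeyManager:
    """Toy key manager that keeps every key it has issued."""
    _keys: Dict[int, bytes] = field(default_factory=dict)
    _current: int = 0

    def rotate(self) -> int:
        """Generate a fresh 256-bit key and make it the current one."""
        self._current += 1
        self._keys[self._current] = secrets.token_bytes(32)
        return self._current

    def current_key(self) -> Tuple[int, bytes]:
        """Key for new data, plus the version to store beside the ciphertext."""
        return self._current, self._keys[self._current]

    def key_for(self, version: int) -> bytes:
        """Look up the key for data encrypted under an older version."""
        return self._keys[version]


manager = KeyManager()
manager.rotate()                                  # first interval
old_version, old_key = manager.current_key()
manager.rotate()                                  # next interval
new_version, new_key = manager.current_key()

# Files written during the first interval remain readable after rotation.
print(manager.key_for(old_version) == old_key, old_key != new_key)
```

Storing the key version next to each file is what lets old data stay in place instead of being downloaded and re-encrypted at every rotation.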

In the example a centralized server was used, but some key management systems use a distributed approach where the data is encrypted and decrypted locally and then sent to the cloud provider. A distributed solution requires significantly less bandwidth, since the data is not sent to the centralized server for decryption. A distributed solution also eliminates the single point of failure. However, the implementation will probably be more complex than a centralized solution.

The real problem with client-side encryption is not the encryption itself but how to manage the encryption keys. As shown in this thesis, there are a few common strategies to solve this when only a single user is involved. The problem gets rather complex when multiple users are involved. There are organizations which have tried to create guidelines for how to design key management systems. For example, the National Institute of Standards and Technology wrote a report in 2012 in which they described the problems related to key management together with some guidelines [10]. The same year Securosis, L.L.C. wrote a report in which they described the different levels of key management and when to apply them to get the best result. They also mentioned that there is an increased standardization of communication protocols between key management systems and encryption systems [11]. In 2008 Nubridges published a report based on what they consider to be the eight best practices for designing a key management system [12].

4.5 Future work

Key management is a topic which could be investigated even further. As mentioned in this thesis, key management is a crucial element in whether client-side encryption can be used or not. It would be interesting to investigate which standards exist. Sometimes companies exchange encrypted data with each other. How does this affect the key management system?


Bibliography

[1] D. Evans, “The internet of things - how the next evolution of the internet is changing everything.” http://www.cisco.com/web/about/ac79/docs/innov/IoT_IBSG_0411FINAL.pdf, 2011. [Online; accessed 2013-05-17].

[2] R. Hartmann, “The bring your own services (byos) paradox.” http://www.varonis.com/news-events/press-releases/2012/byos-paradox.html, 2012. [Online; accessed 2013-04-29].

[3] J. Ullrich, “Isc diary – hashing passwords.” http://www.dshield.org/diary/Hashing+Passwords/11110, 2011. [Online; accessed 2013-05-08].

[4] J. McCaffrey, “Keep your data secure with the new advanced encryption standard.” http://msdn.microsoft.com/en-us/magazine/cc164055.aspx, 2003. [Online; accessed 2013-05-17].

[5] M. H. T. K. M. R. U. V. Moritz Borgmann, Tobias Hahn and S. Vowe, “On the security of cloud storage services.” https://www.sit.fraunhofer.de/fileadmin/dokumente/studien_und_technical_reports/Cloud-Storage-Security_a4.pdf, 2012. [Online; accessed 2013-04-24].

[6] R. van Heerden and J. Vorster, “A statistical analysis of large passwords lists, used to optimize brute force attacks.” http://researchspace.csir.co.za/dspace/bitstream/10204/3328/1/Van%20Heerden_2009.pdf, 2009. [Online; accessed 2013-04-24].

[7] W. B. Meltem Sönmez Turan, Elaine Barker and L. Chen, “Recommendation for password-based key derivation - part 1: Storage applications.” http://csrc.nist.gov/publications/nistpubs/800-132/nist-sp800-132.pdf, 2010. [Online; accessed 2013-04-24].


[8] M. K. C. P. T. Y. Markus Dürmuth, Tim Güneysu and R. Zimmermann, “Evaluation of standardized password-based key derivation against parallel processing platforms.” http://www.emsec.rub.de/media/crypto/veroeffentlichungen/2013/01/29/esorics_pbkdf2.pdf, 2013. [Online; accessed 2013-05-15].

[9] B. R. David Belson, Tom Leighton, “State of the internet.” http://www.akamai.com/stateoftheinternet/, 2012. [Online; accessed 2013-06-06].

[10] W. B. W. P. Elaine Barker, William Barker and M. Smid, “Recommendation for key management – part 1: General.” http://csrc.nist.gov/publications/nistpubs/800-57/sp800-57_part1_rev3_general.pdf, 2012. [Online; accessed 2013-08-16].

[11] R. Mogull, “Pragmatic key management for data encryption.” https://securosis.com/assets/library/reports/Pragmatic-Key-Management.v.1.pdf, 2012. [Online; accessed 2013-08-16].

[12] Nubridges, “Best practices in encryption key management data security.” http://www.northdoor.co.uk/_assets/_download/CBBF299D-5056-897F-ED891771F53907B2.pdf, 2008. [Online; accessed 2013-08-16].