AN APPROACH FOR SECURE AND LEAKAGE RESILIENT SEARCH OVER ENCRYPTED NOSQL DATABASES IN A PUBLIC CLOUD by MOHAMMAD AHMADIAN M.S. University of Central Florida, 2014 M.S. Amirkabir University of Technology, 2009 A Proposal submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in the Department of Electrical Engineering and Computer Science in the College of Engineering and Computer Science at the University of Central Florida Orlando, Florida Fall Term 2016 Major Professor: Dan C. Marinescu

AN APPROACH FOR SECURE AND LEAKAGE RESILIENT SEARCH OVER ENCRYPTED NOSQL DATABASES IN A PUBLIC CLOUD

by

MOHAMMAD AHMADIAN
M.S. University of Central Florida, 2014
M.S. Amirkabir University of Technology, 2009

A Proposal submitted in partial fulfilment of the requirements
for the degree of Doctor of Philosophy
in the Department of Electrical Engineering and Computer Science
in the College of Engineering and Computer Science
at the University of Central Florida
Orlando, Florida

Fall Term
2016

Major Professor: Dan C. Marinescu

© 2016 Mohammad Ahmadian

ABSTRACT

Processing the vast volume of data generated by web and mobile applications requires a scalable and flexible data management system. Database-as-a-Service (DBaaS) is a paradigm offered by cloud computing that promises cost-effective and efficient database functionality meeting these requirements. However, outsourcing data storage to the cloud significantly changes the threat landscape and adds a new dimension to data security. While many traditional data processing threats remain, DBaaS introduces new challenges such as confidentiality violations and information leakage caused by privileged malicious insiders. We consider the problem of building a secure DBaaS on top of a public cloud infrastructure where the Cloud Service Provider (CSP) is not completely trusted by the data owner. We present a high-level description of several architectures that combine recent cryptographic primitives to achieve this goal. This thesis proposes a novel searchable security scheme that supports secure query processing in the presence of a malicious cloud insider without disclosing sensitive information. A comprehensive database security scheme comprises more than just encryption, and this thesis therefore addresses information leakage prevention as a key challenge. The main contributions of our work are:

i. A searchable security scheme for non-relational databases in cloud DBaaS;
ii. Leakage minimization in the untrusted cloud.

The analysis of experiments that employ a set of established cryptographic techniques to protect databases and minimize information leakage shows that the performance of our solution is bounded by communication cost and not by cryptographic computation.

To Ghazal & my dear parents.

ACKNOWLEDGMENTS

I would like to express my sincere gratitude to my advisor, Prof. Dan C. Marinescu, for the continuous support of my Ph.D. study and research, and for his patience, motivation, enthusiasm, and immense knowledge. His guidance helped me throughout the research and the writing of this thesis. Moreover, I would like to thank the rest of my thesis committee, Prof. Joseph Brennan, Dr. Mark Heinrich, and Dr. Pawel Wocjan, for their encouragement and insightful comments.

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
CHAPTER 1: INTRODUCTION AND MOTIVATION
1.1 Cloud Relational Databases
1.1.1 Searchable Security Scheme For RDBMS
1.1.2 Cloud Data Storage And Management Components
1.2 Cloud NoSQL Databases
1.2.1 Data Models For NoSQL Databases
1.2.2 Searchable Security Scheme For NoSQL Databases
1.3 Leakage Proof Data Processing In Public Cloud
1.3.1 Cryptosystems For Outsourced Data Store
1.4 Roadmap
CHAPTER 2: RELATED WORK
CHAPTER 3: RESEARCH OBJECTIVES AND APPROACH
3.1 Research Objectives
3.2 Motivation
3.3 Threat Model
3.4 JSON And BSON
3.5 Problem Statement
CHAPTER 4: CURRENT WORK AND PRELIMINARY RESULTS
4.1 SecureNoSQL
4.1.1 The Proposed Construction: SecureNoSQL Scheme
4.1.2 Security Plan
4.1.3 Processing Queries On Encrypted Data
4.1.4 Measurements And Experimental Results
4.2 Leakage Prevention In DBaaS
4.2.1 Problem Statement
CHAPTER 5: CONCLUSION
5.1 Work In Progress And Tasks Time Table
5.2 Future Work
LIST OF REFERENCES

LIST OF FIGURES

4.1 Architecture of SecureNoSQL.

4.2 The high-level structure of the security plan.

4.3 Structure and description of Collection: (a) The chart outlines the structure of a collection, containing the name of the collection and the names of all fields, which are considered meta-data and thus should be protected with a proper cryptographic module, together with the pointer to a cryptomodule, the encryption key, and the initialization vector used for the encryption of the items. (b) The description of a collection and its security parameters in the designed JSON-based language. In this specific case the Advanced Encryption Standard in deterministic (AES-DET) mode with a 128-bit key and an initialization vector (IV) is assigned to encrypt the name of the collection and the field names.

4.4 Structure and description of Cryptographic modules: (a) Security plan with the second section, the cryptographic module, expanded. The attributes included for each module are: name, type, key size, key, input and output size. (b) The OPE encryption including the cryptosystems and their attributes. The proxy applies these modules using the key-value pairs (KVP).

4.5 Structure and description of Data element: (a) The chart outlines the structure of data elements, containing attributes such as name, type and value, and then introduces the security parameters for each data element. (b) The data element section of a sample database, represented in the designed notation. A data item has 7 fields: id, name, salary, balance, ccn, ssn, and email. The id, name, email and salary are required fields.

4.6 Structure and description of Mapping cryptographic modules to the Data element: (a) Security plan with the fourth section expanded. This section establishes a correspondence between the data fields and the cryptographic modules used to encrypt and decrypt them. (b) The mapping section of the schema for a sample database with 7 fields. For example, the id and the name will be encrypted with 128-bit OPE and AES-DET, respectively.

4.7 SecureNoSQL applied to: (a) The key-value data model; Key1, ..., Keyn are all encrypted using the cryptographic module z while the corresponding values, Value1, ..., Valuen, are encrypted with cryptographic modules 1, 2, ..., n, respectively. (b) The document store data model; the meta-data, such as the collection name, is encrypted, as are the attributes, with the assigned cryptographic modules.

4.8 The validation process of input data against the security plan on the client side.

4.9 Security plan designed for sample input: (a) Data element section of the sample security plan. (b) Output of JSON data validation for the sample database.

4.10 The query db.customers.find({salary:{$gt:5000}, balance:{$lt:2000}}) received from an application. (a) The parsing tree of the query. (b) The cryptographic modules applied to the data elements according to the schema definition.

4.11 Query processing time in milliseconds (ms) for the unencrypted database and for the encrypted databases when the 32-bit keys are encrypted as 64, 128, 256 and 512-bit integers.

5.1 Estimated work plan and timeline

LIST OF TABLES

2.1 Information leakage management methods comparison
4.1 Overhead of encryption upon security level
4.2 Sample queries and their corresponding encrypted version
5.1 List of publications

CHAPTER 1: INTRODUCTION AND MOTIVATION

Cloud computing is an appealing alternative for data processing but, at the same time, it raises serious concerns regarding the security of sensitive data. Is it feasible to outsource computation without revealing private information? This thesis investigates this question. Web and mobile applications based on cloud services are ubiquitous, and Database as a Service (DBaaS) is one of the most important cloud services for data storage. Outsourcing the storage and processing of private data to a third party is a high-risk decision that makes applications vulnerable to unauthorized access by external attackers or by malicious cloud insiders. Security and privacy in the cloud environment are a critical concern for cloud users. The majority of cloud service providers (CSPs) support features that allow system administrators to deploy a basic level of security controls for hosted datasets. Nevertheless, there seems to be no accepted foolproof solution to prevent unauthorized access by malicious insiders who have unlimited access to the entire system. The security and privacy threats associated with cloud computing negatively affect all cloud services and act as an inhibitor for potential cloud customers. Many cloud users hold sensitive enterprise data, so unauthorized data access can devastate their business. We propose a solution that satisfies the security requirements of applications using DBaaS for database functionality, with either relational (RDBMS) or non-relational (NoSQL) database management systems.

In the proposed searchable security scheme there are three interested parties: the data owner, the cloud service provider, and the users’ applications. This thesis assumes that all three parties interact in a public cloud environment. The proposed scheme is easily adaptable to hybrid and community cloud environments, where the security risk is lower than in the public cloud. Most CSPs, such as Amazon Web Services (AWS), Google App Engine, and Microsoft Azure, provide full-featured RDBMS and NoSQL database systems. We first examine the security requirements of DBaaS. Then, we report on our scheme, called “SecureNoSQL”, for query processing over encrypted NoSQL databases, the first such scheme for NoSQL databases.

1.1 Cloud Relational Databases

RDBMS are widely used by most organizations to support data management for many applications, and many IT professionals are trained to implement applications using RDBMS. For those reasons, RDBMS have been offered as a service by CSPs since their early development stages. Nowadays, cloud RDBMS such as the Relational Database Service (RDS) offered by Amazon Web Services (AWS) provide cost-efficient database functionality. AWS RDS provides six popular database engines to choose from: Amazon Aurora, Oracle, Microsoft SQL Server, PostgreSQL, MySQL and MariaDB.

1.1.1 Searchable Security Scheme For RDBMS

RDBMS are used by operational database systems for On-Line Transaction Processing (OLTP). Cloud computing adopted RDBMS, equipped them with more features, and delivers them as a fully managed and integrated service. Therefore, the cloud RDBMS is ideally suited for complex query-intensive analytic workloads. In cloud DBaaS the application developer plays a more important role than in on-premise computing because the cloud eliminates the need for database administrators; this can be seen as another reason for the popularity of DBaaS. Major CSPs such as Amazon Web Services, Microsoft Azure and Google Cloud Platform offer a broad range of cloud storage and data management services that help organizations move faster from on-premise computing to cloud computing.

1.1.2 Cloud Data Storage And Management Components

Cloud storage is a cost-effective and scalable service that allows customers to store and access data anywhere and anytime over the Internet. In the cloud storage model, the CSP is a third party that provides a reliable storage service for users who pay on a per-use basis. CSPs store multiple copies of data redundantly across different geographical locations to reduce access time and facilitate disaster recovery. Although cloud storage is cost-effective, it poses significant security and privacy risks. Once data is in cloud storage, its owner no longer has control over where it is stored and how it is protected against unauthorized access. For instance, AWS offers an array of flexible and affordable data management services including Simple Storage Service (S3), SimpleDB, RDS, Elastic Compute Cloud (EC2) and DynamoDB.

Amazon Simple Storage Service: AWS S3 uses a simple data model consisting of two types of storage entities: objects and buckets. Objects, like files, contain data and metadata, but they are not organized in a hierarchy; every object exists at the same level. A bucket is a logical unit of storage used to store objects. From the security viewpoint, S3 only provides an access control mechanism based on rules that either grant or deny access permission to S3 objects or buckets. Access control alone does not protect S3 data against a malicious insider. Data in a bucket can be encrypted to protect it from both insider and outsider threats.

Amazon Elastic Compute Cloud: EC2 provides on-demand virtual servers that users can manage like physical machines. EC2 instances can be created through the API or the management console. AWS has defined a unit for measuring the processing power of an EC2 instance to ensure that performance remains consistent over time. AWS offers a variety of EC2 instance types that provide different levels of performance and resources, with corresponding differences in pricing. EC2 uses the public key of the key pair associated with the AWS account to secure login, so that only someone with the corresponding private key can access the EC2 instance. In addition, the traffic of an EC2 instance can be managed using security groups, which are essentially collections of rules.

1.2 Cloud NoSQL Databases

The name NoSQL given to the storage model discussed in this thesis is misleading. Michael Stonebraker notes that “blinding performance depends on removing overhead. Such overhead has nothing to do with SQL, but instead revolves around traditional implementations of ACID transactions, multi-threading, and disk management” [46]. The “soft-state” approach in the design of NoSQL databases allows data to be inconsistent and transfers to the application developer the task of implementing only the subset of the ACID properties required by a specific application. NoSQL ensures that data will be “eventually consistent” at some future point in time, instead of enforcing consistency at the time when a transaction is “committed”. Data partitioning among multiple storage servers and data replication are also tenets of the NoSQL philosophy; they increase availability, reduce the response time, and enhance scalability.

Scalability and availability are critical requirements for E-commerce, social networks and other

applications dealing with very large data sets. Companies heavily involved in cloud computing

discovered early on that traditional RDBMS cannot handle the massive amount of data and the

real-time demands of on-line applications critical for their business model. RDBMS schema is of

little use for such applications and conversion to NoSQL databases seems a much better approach.

Big data and mobile applications are the two most important growth areas of cloud computing. Big data growth can be viewed as a three-dimensional phenomenon: it implies an increased volume of data, requires an increased processing speed to produce more results, and, at the same time, involves a diversity of data sources and data types [37]. A delicate balance between data security and privacy on the one hand and efficiency of database access on the other is critical for such applications. Many cloud services used by these applications operate under tight latency constraints; moreover, these applications have to deal with extremely high data volumes and are expected to provide reliable services for very large communities of users. Nowadays NoSQL databases are widely supported by cloud service providers. Their advantages over traditional databases are critical for big data applications.

Amazon DynamoDB: AWS offers DynamoDB, a fully managed and flexible NoSQL database service that provides fast performance with consistent scalability. DynamoDB supports both the document and the key-value store models, which are very flexible data models; this makes DynamoDB well suited for mobile, web, gaming and Internet of Things (IoT) applications. The AWS Management Console or the Amazon DynamoDB Application Program Interface (API) can be used to scale up or down without downtime or performance degradation.

1.2.1 Data Models For NoSQL Databases

In recent years more than 120 NoSQL databases have been created, including CouchDB, Neo4j, VaultDB, MongoDB, Cassandra, and BigTable¹, all of which are referred to by the umbrella term NoSQL. They are classified based on their data models. Choosing a proper data model has an extremely important influence on the performance and scalability of a data store. Since our work is tightly connected to NoSQL data models, we give a brief definition of each data model.

¹ For a complete list, refer to http://www.nosql-database.org/

Key-value stores: This simple data model resembles an associative map or a dictionary where a key uniquely identifies the value. The value can be either a primitive data type such as a string, an integer, or an array, or it can be an object. This model is effective for storing distributed data; it is therefore highly scalable, which motivates its use by cloud data management systems. Systems such as Bigtable [14], CouchDB², DynamoDB [44], MemcacheDB³ and Redis⁴ use this model. This model is not suitable for applications demanding relations or structures.
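As a rough illustration of the key-value model described above, the following Python sketch maps each unique key to an opaque value and spreads keys over several in-memory "nodes" by hashing. The class name `PartitionedKeyValueStore`, the method names, and the hash-based placement are assumptions made for this example only; they do not describe the internals of any of the systems named in the text.

```python
import hashlib
from typing import Any, Dict, List, Optional


class PartitionedKeyValueStore:
    """Toy key-value store whose keys are spread over several in-memory nodes."""

    def __init__(self, num_nodes: int = 3) -> None:
        # Each node is just a dictionary here; a real system would use servers.
        self._nodes: List[Dict[str, Any]] = [{} for _ in range(num_nodes)]

    def node_for(self, key: str) -> int:
        """Pick a node from a hash of the key, so no global index is needed."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._nodes)

    def put(self, key: str, value: Any) -> None:
        self._nodes[self.node_for(key)][key] = value

    def get(self, key: str) -> Optional[Any]:
        return self._nodes[self.node_for(key)].get(key)


store = PartitionedKeyValueStore(num_nodes=3)
store.put("customer:1", {"name": "Alice", "balance": 1200})  # value is an object
store.put("visits", 42)                                      # value is a primitive

print(store.get("customer:1"))
print(store.get("visits"))
```

Because a lookup needs only the key, each node can be queried independently, which is one way to read the scalability argument. The store cannot answer a question such as "all customers with a balance above 1000" without scanning every value, which reflects the limitation noted for applications that need relations or structure.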

Column-family stores: In this model the data are stored in a column-oriented style and the dataset comprises several rows, each indexed by a unique key, the so-called primary key. Each row is composed of a set of column families, and different rows can have different column families. Similarly to key-value stores, the row key resembles the key, and the set of column families resembles the value represented by the row key. However, each column family further acts as a key for the one or more columns that it holds, where each column consists of a key-value pair. Hadoop HBase directly implements the Google Bigtable concepts, whereas Amazon SimpleDB and DynamoDB contain only a set of column name-value pairs in each row, without column families. Sometimes, SimpleDB and DynamoDB are classified as key-value stores. Typically, the data belonging to a row is stored together on the same server node. Cassandra provides the additional functionality of super-columns, which are formed by grouping various columns together. Cassandra can store a single row across multiple server nodes using composite partition keys. In column-family stores, the configuration of column families is typically performed during start-up. A column family in different rows can contain different columns. A prior definition of columns is not required and any data type can be stored in this data model. In general, column-family stores provide more powerful indexing and querying than key-value stores because they are based on column families and columns in addition to row keys. As with key-value stores, any logic requiring relations must be implemented in the client application.
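The nesting described above (row key, then column family, then column name-value pairs) can be pictured with plain Python dictionaries. This is only a sketch of the logical layout; the row keys, family names and the helper `read_column` are hypothetical, and real column-family stores add persistence, partitioning and indexing that are not modelled here.

```python
from typing import Any, Dict, Optional

# row key -> column family -> column name -> value
Row = Dict[str, Dict[str, Any]]

table: Dict[str, Row] = {
    "cust-001": {
        "profile": {"name": "Alice", "email": "alice@example.com"},
        "account": {"balance": 1200, "salary": 6000},
    },
    "cust-002": {
        # Same family, different columns; this row has no "account" family.
        "profile": {"name": "Bob"},
    },
}


def read_column(data: Dict[str, Row], row_key: str,
                family: str, column: str) -> Optional[Any]:
    """Address a single cell by (row key, column family, column name)."""
    return data.get(row_key, {}).get(family, {}).get(column)


print(read_column(table, "cust-001", "account", "balance"))
print(read_column(table, "cust-002", "account", "balance"))
```

The second lookup returns `None` because the row simply lacks that family, which illustrates that columns need not be defined in advance and may differ from row to row.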

² http://couchdb.apache.org
³ http://www.Memcached.org
⁴ http://redis.io

Document stores: In this model the internal structure of the stored data is visible to the database, whereas in a key-value store the data are opaque to the database. The database engine can therefore use meta-data to create a higher level of granularity and deliver a richer experience for modern programming techniques. Document-oriented databases use a key to locate a document inside the data store. Most document stores use JSON or BSON (Binary JSON). Document stores are suited for applications where the input data can be represented in a document format. A document can contain complex data structures such as nested objects. Document stores allow documents to be grouped into collections. A document in a collection should have a unique key. Unlike an RDBMS, where every row in a table follows the same schema, each document in a document store may have a different structure. Document stores provide the capability of indexing documents based on the primary key as well as on the contents of the documents. Like key-value stores, they are inefficient in multiple-key transactions involving cross-document operations.
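To make the document model concrete, the sketch below stores a small collection of JSON-like documents with differing structure and filters them by content. The matcher supports only the `$gt` and `$lt` operators and is an assumption written for this example; it imitates the shape of a document-store query but is not the query engine of any particular product. The field names are borrowed from the sample database mentioned in the list of figures.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]

customers: List[Document] = [
    {"_id": 1, "name": "Alice", "salary": 6000, "balance": 1500,
     "address": {"city": "Orlando"}},                           # nested object
    {"_id": 2, "name": "Bob", "salary": 4000, "balance": 900},
    {"_id": 3, "name": "Carol", "email": "carol@example.com"},  # no salary field
]


def matches(doc: Document, query: Dict[str, Any]) -> bool:
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if op == "$gt":
                    ok = value > operand
                elif op == "$lt":
                    ok = value < operand
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif value != condition:
            return False
    return True


def find(collection: List[Document], query: Dict[str, Any]) -> List[Document]:
    """Content-based lookup over a collection of schemaless documents."""
    return [doc for doc in collection if matches(doc, query)]


result = find(customers, {"salary": {"$gt": 5000}, "balance": {"$lt": 2000}})
print([doc["name"] for doc in result])  # prints ['Alice']
```

Unlike the key-value case, the store can inspect the fields of each document, which is what makes querying and indexing by content possible.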

Graph Databases: This data model, based on graphs, can be used to represent the complex structures and highly connected data often encountered in real-world applications. In graph databases, the nodes and edges have individual properties consisting of key-value pairs. Graph databases are a good alternative for social networking applications, pattern recognition, dependency analysis and recommendation systems. Some graph databases such as Neo4j⁵ support ACID⁶ properties. Graph data stores are not as efficient as other NoSQL data stores and do not scale well horizontally when related nodes are distributed to different servers.

⁵ http://neo4j.com
⁶ ACID (Atomicity, Consistency, Isolation, Durability) properties guarantee that transactions are processed reliably.

1.2.2 Searchable Security Scheme For NoSQL Databases

Data security in a cloud platform is critical for applications running on public clouds because multiple virtual machines (VMs) often share the same physical platform [50, 51, 52]. Classic cryptographic primitives can protect data while in storage, but even encrypted data has to be decrypted for processing. This is particularly troubling when searching databases containing personal information such as healthcare or financial records, because the entire plaintext database is then exposed to attack. This motivates us to investigate methods for searching encrypted NoSQL databases. Though general computations with encrypted data are theoretically feasible using the algorithms for Fully Homomorphic Encryption (FHE) [24], this is by no means a practical solution at this time. Existing algorithms for homomorphic encryption increase the processing time of encrypted data by many orders of magnitude compared with the processing of plaintext data. A recent implementation of FHE [28] requires about six minutes per batch; after optimization this time drops to almost one second for computing a simple operation on encrypted data [20]. Other related methods are Learning With Errors (LWE) [7], lattice-based encryption [39, 10], and Attribute-Based Encryption [26].

1.3 Leakage Proof Data Processing In Public Cloud

Encryption is a common practice to protect the privacy of data and queries, but encrypted data and queries are still vulnerable to information leakage in a cloud platform. A database can be encrypted by the data owner before it is outsourced to the cloud in such a way that client queries can still be processed on the transformed data. However, encryption does not hide all information about the encrypted data: for instance, the collection name (or table name in an RDBMS), the field names, the number of fields involved in a query, and their lengths often reveal information about the encrypted data. Moreover, a cloud insider can infer sensitive information from a sequence of queries. This type of attack on an encrypted database is classified as information leakage. An outsourced encrypted data set should leak as little sensitive information as possible. An acceptable level of security for searchable encryption can be achieved with the Oblivious RAM (ORAM) [25, 40, 34] method. The major problem of ORAM is its inefficiency: it incurs a high computational cost and intense communication between client and server.
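A small experiment illustrates why encryption alone does not prevent leakage. The sketch below uses an HMAC as a stand-in for a deterministic encryption scheme, since equal plaintexts map to equal tokens in both cases. This substitution is an assumption made to keep the example within the standard library: an HMAC cannot be decrypted and is not the cryptosystem used in this work. The key and the sample values are hypothetical.

```python
import hashlib
import hmac
from collections import Counter


def det_token(key: bytes, value: str) -> str:
    """Deterministic keyed token: the same value always yields the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


key = b"demo-key-not-for-production"
diagnoses = ["flu", "flu", "diabetes", "flu", "asthma", "diabetes"]

# What the server stores: opaque tokens instead of plaintext values.
tokens = [det_token(key, d) for d in diagnoses]

# What a curious insider can still compute without the key: a histogram.
histogram = Counter(tokens)
most_common_token, count = histogram.most_common(1)[0]

print(len(histogram), "distinct values;", "most frequent appears", count, "times")
```

An insider who knows from public statistics that one value dominates the column can link the most frequent token to that value without breaking the cipher. The number of distinct values and their frequencies are leaked even though every stored value is unreadable.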

We will argue that any query is an object with several features; therefore, any query can be considered a point in an n-dimensional feature space. We then use a linear classifier with a training data set to extract implicit information from the encrypted dataset. Every query is distinct from others in terms of measurable features, such as the length of the query string, the number of fields involved, the number of objects, the operations between objects, aggregate functions, the domain of the query, and timing information. These features form a fingerprint by which each unique query can be identified. Furthermore, the fingerprint of each specific client can be obtained with high confidence by combining the fingerprints of its most regularly issued queries. In this research work we will formulate the information leakage from encrypted data sets, and then define metrics and cost coefficients for leakage prevention solutions in order to measure their performance.
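The fingerprinting idea can be sketched in a few lines of Python. The example extracts two structural features from query strings whose field names and constants are already encrypted, then trains a perceptron, one simple kind of linear classifier, to separate range queries from point lookups. The query strings, the choice of two features, and the perceptron itself are assumptions for illustration; the text does not specify which classifier or feature encoding is used.

```python
import re
from typing import List, Sequence, Tuple

FIELD_RE = re.compile(r"(?<![\$\w])(\w+):")  # field names, excluding $operators
OPERATOR_RE = re.compile(r"\$\w+")           # operators such as $gt or $lt


def fingerprint(query: str) -> Tuple[int, int]:
    """Map a query to a point (number of fields, number of operators)."""
    return (len(FIELD_RE.findall(query)), len(OPERATOR_RE.findall(query)))


def train_perceptron(samples: Sequence[Tuple[int, int]], labels: Sequence[int],
                     epochs: int = 20) -> Tuple[List[float], float]:
    """Classic perceptron rule: adjust weights on every misclassified sample."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = w[0] * x[0] + w[1] * x[1] + b
            if y * score <= 0:
                w[0] += y * x[0]
                w[1] += y * x[1]
                b += y
    return w, b


def predict(w: List[float], b: float, x: Tuple[int, int]) -> int:
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1


# Encrypted-looking queries: names and constants are ciphertext, structure is not.
training_queries = [
    ("db.c1.find({f9a2:{$gt:83121}, 77be:{$lt:4410}})", 1),  # range query
    ("db.c1.find({f9a2:{$gt:90017}})", 1),                   # range query
    ("db.c1.find({c03d:55102})", -1),                        # point lookup
    ("db.c1.find({c03d:55102, 1e8f:73})", -1),               # point lookup
]
samples = [fingerprint(q) for q, _ in training_queries]
labels = [label for _, label in training_queries]
w, b = train_perceptron(samples, labels)

unseen = "db.c1.find({aa01:{$lt:12}})"
print(fingerprint(unseen), predict(w, b, fingerprint(unseen)))
```

Even though the observer never learns a field name or a constant, the structure that survives encryption is enough to tell the two query types apart. A fuller fingerprint would add the remaining features listed above, such as query length and timing.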

1.3.1 Cryptosystems For Outsourced Data Store

Data in cloud computing can be in one of three states: storage, transit, or processing. Developers of web applications need efficient tools to protect sensitive information from third parties, including the CSP. In an effort to maintain security and privacy, any comprehensive data security mechanism must take into account the protection criteria for data in each of these states.

The communication channels can be secured by using standard HTTP over the Secure Socket Layer (SSL) communication protocol. Most CSPs provide an API for their web services that enables developers to use both standard HTTP and its secure version, the HTTPS protocol. The security requirements of data in transit can be fully satisfied by using HTTPS for communication with the cloud. In addition, the endpoint authentication feature of the SSL protocol makes it possible to ensure that clients are communicating with an authentic cloud server.
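As a small illustration of endpoint authentication, the sketch below inspects the default client-side context of Python's standard `ssl` module (which implements TLS, the successor of SSL). It is only an example of how a client can require certificate and host name verification; it does not contact any server, and the helper name `describe` is introduced here for illustration.

```python
import ssl

# Default client-side context: verifies the server certificate chain and host name.
context = ssl.create_default_context()

def describe(ctx: ssl.SSLContext) -> dict:
    """Summarize the endpoint-authentication settings of a TLS context."""
    return {
        "verifies_certificate": ctx.verify_mode == ssl.CERT_REQUIRED,
        "checks_hostname": ctx.check_hostname,
    }

settings = describe(context)
```

A context configured this way can be passed to an HTTPS client so that a connection to a server presenting an invalid or mismatched certificate is rejected.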


The basic idea is to encrypt the data before uploading it to the cloud. However, with conventional encryption the data must be decrypted by the cloud server before it can be processed. In other words, the data owner must disclose the decryption key to the server so that it can decrypt the data before performing any required operation. The problem is that once the decryption key is compromised, data confidentiality is lost. Therefore, the cloud computing model requires a new set of cryptosystems. Encryption schemes that support operations on encrypted data are called homomorphic encryption schemes, and they have a very wide range of applications in cloud computing. In a nutshell, a fully homomorphic encryption scheme is a cryptosystem that allows the evaluation of arbitrarily complex operations on encrypted data.

A cloud developer is responsible for ensuring that the data in cloud storage is protected by authentication based on the user’s credentials. Moreover, for highly sensitive data, the risk of illegitimate access should be considered. For instance, the data should be protected from a malevolent insider who may gain access to it. Thus, for protection purposes, sensitive information should be encrypted before being uploaded to the cloud. Any type of encryption can be used, since cloud storage does not require a particular data format.

Random (RND). In an RND-type encryption scheme, a message is combined with a key and a random Initialization Vector (IV). This scheme is called probabilistic, since encrypting the same message with the same key yields a different ciphertext each time. This randomness provides the highest level of security. The randomness property is achievable with different encryption algorithms. The Advanced Encryption Standard (AES) in Cipher Block Chaining (CBC) mode [19] is used for RND encryption. AES is a symmetric block cipher with a key size of 128, 192, or 256 bits and a block size of 128 bits. RND-type schemes are semantically secure against chosen-plaintext attacks and hide all kinds of information about the plaintext. As a result, an RND scheme does not allow any efficient computation on the ciphertext. Equation 1.1 describes the encryption and decryption of a block cipher in CBC mode.

\[
\begin{aligned}
C_1 &= E_k(P_1 \oplus IV), & P_1 &= IV \oplus D_k(C_1) \\
C_j &= E_k(P_j \oplus C_{j-1}), & P_j &= C_{j-1} \oplus D_k(C_j), \qquad j = 2, \ldots, n
\end{aligned}
\tag{1.1}
\]

where $E_k$ is the encryption algorithm, $D_k$ is the decryption algorithm, $k$ is the secret key, $P_j$ is a block of plaintext data, and $C_j$ is a block of ciphered data.
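The chaining in Equation 1.1 can be sketched in a few lines. To stay within the Python standard library, the block cipher below is a toy four-round Feistel network built on HMAC-SHA256; it is an assumed stand-in for AES and must not be used for real data. Padding is omitted, so the plaintext length must be a multiple of the block size. All function names are illustrative.

```python
import hashlib
import hmac
import os

BLOCK = 16  # bytes, the same block size as AES

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    """Keyed round function of the toy Feistel cipher."""
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:BLOCK // 2]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """E_k: toy stand-in for the AES block encryption."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """D_k: inverse of toy_encrypt_block (rounds undone in reverse order)."""
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key: bytes, plaintext: bytes):
    """C_1 = E_k(P_1 xor IV), C_j = E_k(P_j xor C_{j-1}); returns (IV, ciphertext)."""
    if len(plaintext) % BLOCK != 0:
        raise ValueError("plaintext length must be a multiple of the block size")
    iv = os.urandom(BLOCK)  # fresh random IV makes the scheme probabilistic
    prev, blocks = iv, []
    for j in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[j:j + BLOCK], prev))
        blocks.append(prev)
    return iv, b"".join(blocks)

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """P_1 = IV xor D_k(C_1), P_j = C_{j-1} xor D_k(C_j)."""
    prev, blocks = iv, []
    for j in range(0, len(ciphertext), BLOCK):
        block = ciphertext[j:j + BLOCK]
        blocks.append(_xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(blocks)

key = b"hypothetical key"
message = b"sixteen byte msg" * 2  # two identical plaintext blocks
iv1, c1 = cbc_encrypt(key, message)
iv2, c2 = cbc_encrypt(key, message)
```

Encrypting the same message twice gives different ciphertexts, and the two identical plaintext blocks map to different ciphertext blocks, which is why the server cannot evaluate equality on RND-encrypted fields.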

Deterministic (DET). A DET encryption scheme is a cryptosystem which always produces the same ciphertext for the same pair of plaintext and key. Block ciphers in Electronic Code Book (ECB) mode, which uses no initialization vector, or in a chained mode with a constant initialization vector, are deterministic. A deterministic encryption scheme leaks which ciphertexts correspond to the same plaintext. The AES encryption scheme in ECB mode is used for DET encryption over document-oriented NoSQL databases. This DET scheme enables the server to process pipeline aggregation stages such as group, count, retrieving distinct values, and equality match⁷ on the fields within an embedded document. The embedded document can maintain the link with the primary document through the application of DET encryption. Equation 1.2 displays the encryption and decryption operations in a DET scheme.

\[
C_j = E_k(P_j), \qquad P_j = D_k(C_j), \qquad j = 1, \ldots, n \tag{1.2}
\]
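The following sketch shows why deterministic ciphertexts let a server group, count, and match values without the key. As an assumption made to keep the example within the standard library, HMAC-SHA256 is used as a keyed deterministic stand-in for AES in ECB mode; unlike a real DET scheme it cannot be decrypted, so it only illustrates the equality-preserving property. The key, column data, and function name are hypothetical.

```python
import hashlib
import hmac
from collections import Counter

def det_token(key: bytes, value: str) -> str:
    """Keyed deterministic token: equal plaintexts always give equal tokens."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

key = b"hypothetical-demo-key"
cities = ["Orlando", "Tampa", "Orlando", "Miami", "Orlando"]
encrypted_column = [det_token(key, city) for city in cities]

# Server side (no key needed): group/count and distinct work directly on tokens.
group_counts = Counter(encrypted_column)
distinct_values = set(encrypted_column)

# Client or proxy side: an equality query is just the token of the search value.
needle = det_token(key, "Orlando")
matches = [i for i, token in enumerate(encrypted_column) if token == needle]
```

The same property is the source of leakage: the histogram in `group_counts` reveals the frequency distribution of the plaintext values even though the values themselves stay hidden.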

⁷ Equality matches over specific common fields in an embedded document will select documents in the collection where the embedded document contains the specified fields with the specified values.

Order-Preserving Encryption (OPE). OPE carries the order relation between plaintext data elements over to their ciphertext values. OPE leaks the order of the plaintexts, so it provides a lower degree of security. Even Modular Order-Preserving Encryption (MOPE) [38], an extension of basic OPE that improves security, leaks information. OPE makes efficient inequality comparisons on encrypted data elements possible, and thus supports range queries, comparison, Min(), and Max() on the ciphertext. We use the algorithm introduced in [6] and implemented in [4] for the cloud environment. Equation 1.3 shows the preservation of the order relation between the plaintext and the ciphertext.

\[
\forall x, y \in \text{Data Domain}: \quad x < y \implies \mathrm{OPE}_k(x) < \mathrm{OPE}_k(y) \tag{1.3}
\]
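The property in Equation 1.3 can be demonstrated with a toy keyed mapping. This is not the OPE algorithm of [6]; it is an assumed, deliberately simple construction (a running sum of pseudo-random positive gaps) that is strictly increasing for any key and only practical for small non-negative integers. The key and the salary values are hypothetical.

```python
import hashlib
import hmac

def toy_ope(key: bytes, x: int, max_gap: int = 1000) -> int:
    """Strictly increasing keyed map: sum of pseudo-random gaps, each >= 1."""
    if x < 0:
        raise ValueError("toy_ope is defined for non-negative integers only")
    total = 0
    for i in range(x + 1):
        digest = hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
        total += 1 + int.from_bytes(digest[:4], "big") % max_gap
    return total

key = b"hypothetical-ope-key"
salaries = [52, 17, 88, 35]
encrypted = [toy_ope(key, s) for s in salaries]

# Proxy encrypts the range bounds; the server compares ciphertexts only.
low, high = toy_ope(key, 30), toy_ope(key, 60)
range_hits = [i for i, c in enumerate(encrypted) if low <= c <= high]

# Max() can be evaluated directly on ciphertexts.
index_of_max = encrypted.index(max(encrypted))
```

Because the mapping preserves order, the server answers the range query and finds the maximum without learning the plaintext values, while still learning their relative order.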

Additive Homomorphic Encryption (AHOM). AHOM is a scheme that allows the server to conduct computations on ciphertext, with the final result being decrypted at the proxy. In spite of sustained research efforts [24, 8] on Fully Homomorphic Encryption (FHE), there is no efficient FHE scheme, except for limited operations. We applied the Paillier [41] scheme, which supports additive operations, as shown by Equation 1.4. Here $m_1, m_2 \in \mathbb{Z}_n$ are the messages to be encrypted, and $r_1, r_2 \in \mathbb{Z}_n^*$ are randomly selected. In other words, the product of two ciphertexts decrypts to the sum of their corresponding plaintexts.

\[
D_k\big(E_k(m_1, r_1) \times E_k(m_2, r_2) \bmod n^2\big) = m_1 + m_2 \pmod{n} \tag{1.4}
\]
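Equation 1.4 can be checked numerically with a minimal Paillier sketch. The tiny primes, the choice of generator g = n + 1, and the function names are assumptions made for illustration; a real deployment would use primes of at least 1024 bits and a vetted library. The code requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random

def keygen(p: int = 1009, q: int = 1013):
    """Toy Paillier key generation with small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    # With g = n + 1, L(g^lam mod n^2) equals lam mod n, so mu is its inverse.
    mu = pow(lam, -1, n)
    return n, (lam, mu)

def encrypt(n: int, m: int, r: int = None) -> int:
    """E(m, r) = (n + 1)^m * r^n mod n^2, with r drawn from Z_n^*."""
    n2 = n * n
    while r is None or math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n: int, private_key, c: int) -> int:
    """D(c) = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) / n."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

n, private_key = keygen()
c1 = encrypt(n, 1234)
c2 = encrypt(n, 4321)
# The server multiplies ciphertexts; the proxy decrypts the sum.
homomorphic_sum = decrypt(n, private_key, (c1 * c2) % (n * n))
```

Here `homomorphic_sum` equals 1234 + 4321 reduced modulo n, although the party that multiplied `c1` and `c2` never saw either plaintext.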

Definition 1 (Information leakage)

Information leakage is the ability of an attacker to infer sensitive information either through multiple database searches or through statistical analysis of cloud database queries. In a nutshell, information leakage can be defined as using a combination of data, metadata, and queries that are classified at a lower level L1 to extract information that is classified at a higher level L2.

In this research, we restrict our discussion to secure query processing, particularly over encrypted NoSQL databases, with minimum information leakage. The key part of SecureNoSQL is the evaluation of a set of operations on encrypted databases. Moreover, novel algorithms designed to prevent information leakage from data or queries are added to SecureNoSQL. We also introduce a novel descriptive language based on the JSON⁸ notation, which enables users to generate a security plan. The security plan is a useful tool that lets data owners manage security parameters without getting involved in the details. Every security plan has four sections: the collection, the data elements, the cryptographic modules, and the mapping between them. Concurrent queries are supported by the present design; however, the relevant concurrency experiments require a network of multiple servers and clients. At this moment, such configurations and hardware setups are not available. Thus, for some experiments in this research we have used EC2 instances, which is consistent with the final goal of this study. Since standard Database Management Systems (DBMS) are used in this work, concurrent queries over encrypted distributed datasets are automatically supported without extra cost.

1.4 Roadmap

We discuss all of the approaches and solutions outlined above in the rest of this proposal, which is organized as follows. The latest related work on secure query processing and information leakage prevention is reviewed in Chapter 2. Chapter 3 presents the research objectives, motivation, threat model, JSON and BSON, and finally the problem statement.

All the experiments with the prototype systems are presented in Chapter 4. We propose two schemes, for secure query processing over encrypted data sets and for information leakage management. The organization and structure of the security plan and the notation of the descriptive language for generating security plans are discussed in Section 4.1. Afterwards, the mechanism for information leakage prevention is discussed in Section 4.2. Finally, this proposal is concluded in Chapter 5 with the timetable of in-progress and completed tasks as well as the published and under-review papers.

⁸ JSON (JavaScript Object Notation) is a lightweight text-based syntax for storing and exchanging data objects consisting of key-value pairs. It is used primarily to transmit data between a server and a web application. JSON’s popularity is due to the fact that it is self-describing and easy for both humans and machines to understand.


CHAPTER 2: RELATED WORK

High scalability and distribution are the most important requirements for processing the large volumes of data created by humans or connected devices. DBaaS is extensively used for data processing and meets both of these requirements. Furthermore, DBaaS enables users to use a database without running their own server. In a DBaaS setup, the CSP takes responsibility for maintaining the hardware and the software. The cost of the service is proportional to the usage of resources. Although the easy launch of a database through a web-based console is an alluring option, DBaaS brings a series of new security risks which need to be addressed. Some of the studies on DBaaS focus on information leakage caused by sharing physical infrastructure among multiple virtual machines. The study conducted by Ristenpart et al. [43] showed that the Infrastructure as a Service (IaaS) model is susceptible to information leakage despite the isolation of virtual machines. A method called “Advanced Cloud Protection System (ACPS)” for secure virtualization in the cloud environment, proposed by Lombardi et al. [36], mitigates security risks from external attackers, assuming the cloud is trustworthy.

The performance and efficiency of DBaaS have been extensively studied in the literature [27, 18, 17]. Techniques to improve workload balancing between clients and server, and a graph-based partitioning algorithm for improving performance and obtaining almost linear elastic scale-out, are introduced in [18]. Furthermore, a new benchmark framework that compares the performance of DBaaS offerings from various CSPs is presented in [17].

The first SQL-aware query processing system over encrypted databases was CryptDB [42]. CryptDB satisfies data confidentiality for relational databases. However, CryptDB cannot perform queries over data encrypted with different keys. One important application of searching on encrypted data [11, 45, 48] is in cloud computing, where clients outsource their storage and computation. In [11] a practical searchable security scheme is introduced which can search encrypted data sets in sub-linear time by using different types of indices; however, it is not practical for NoSQL data sets, which are designed to scale to millions of users doing updates simultaneously [13].

NoSQL databases suffer from a lack of proper data protection mechanisms because they were designed to support high performance and scalability requirements. In order to protect personal and sensitive information, a privacy- and security-preserving mechanism is required in big data platforms. The integration of privacy-aware access control features into existing big data systems is discussed in [30]. The evolution of big data systems from the perspective of an information security application is studied in [23, 47]. A cloud-based monitoring and threat detection system is proposed in [16] as a critical component for making infrastructure systems secure. Security in DBaaS has been studied by several research projects [42, 29, 48, 31]. In all of these works, cryptosystems are applied to encrypt databases before outsourcing them to the CSP; in the same way, queries are encrypted and processed on the server. This is a practical general approach for the protection of sensitive data in an off-site data store. For example, CryptDB is introduced in [42] for processing queries over encrypted relational databases. Similarly, SecureNoSQL is proposed for processing queries over encrypted NoSQL databases on a cloud platform. The system supports access to an encrypted MongoDB¹ document-store database. SecureNoSQL is a secure proxy that allows applications to access and process queries on encrypted datasets. The proxy receives queries from clients, extracts the elements of each query, applies the security parameters to them, and finally forwards them to the cloud database server. After an encrypted query is processed by the database, the proxy receives the results, decrypts them, and forwards them to the client. SecureNoSQL is an open infrastructure that is easily extended with new encryption modules. To implement the leakage prevention algorithms, the construction of SecureNoSQL has been further developed for the study discussed in this research. The leakage prevention mechanism is also implemented inside SecureNoSQL. We have implemented a number of cryptosystems for different types of queries, and now we describe the characteristics of these cryptosystems and their applications.

¹ MongoDB is a document-oriented NoSQL database which avoids the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas.
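The proxy workflow described above (encrypt the query elements, forward them, decrypt the results) can be sketched as follows. This is not the SecureNoSQL implementation: the class names are hypothetical, the cloud store is an in-memory list, only string values and equality queries are handled, and the cipher is a toy fixed-keystream XOR that is deterministic and reversible but insecure. It stands in for a DET module purely to keep the example self-contained.

```python
import hashlib

class ToyDetCipher:
    """Deterministic, reversible toy cipher (fixed keystream XOR). NOT secure."""

    def __init__(self, key: bytes):
        self._key = key

    def _keystream(self, length: int) -> bytes:
        out, counter = b"", 0
        while len(out) < length:
            out += hashlib.sha256(self._key + counter.to_bytes(4, "big")).digest()
            counter += 1
        return out[:length]

    def encrypt(self, text: str) -> str:
        data = text.encode("utf-8")
        return bytes(a ^ b for a, b in zip(data, self._keystream(len(data)))).hex()

    def decrypt(self, token: str) -> str:
        data = bytes.fromhex(token)
        return bytes(a ^ b for a, b in zip(data, self._keystream(len(data)))).decode("utf-8")

class CloudStore:
    """Stand-in for the remote document store; it only ever sees ciphertext."""

    def __init__(self):
        self.documents = []

    def insert(self, document: dict) -> None:
        self.documents.append(document)

    def find(self, query: dict) -> list:
        return [d for d in self.documents
                if all(d.get(k) == v for k, v in query.items())]

class Proxy:
    """Encrypts field names and values on the way out, decrypts results on the way back."""

    def __init__(self, key: bytes, store: CloudStore):
        self._cipher = ToyDetCipher(key)
        self._store = store

    def _transform(self, document: dict, fn) -> dict:
        return {fn(k): fn(v) for k, v in document.items()}

    def insert(self, document: dict) -> None:
        self._store.insert(self._transform(document, self._cipher.encrypt))

    def find(self, query: dict) -> list:
        hits = self._store.find(self._transform(query, self._cipher.encrypt))
        return [self._transform(d, self._cipher.decrypt) for d in hits]

store = CloudStore()
proxy = Proxy(b"hypothetical-proxy-key", store)
proxy.insert({"name": "alice", "city": "Orlando"})
proxy.insert({"name": "bob", "city": "Tampa"})
result = proxy.find({"city": "Orlando"})
```

The client works with plaintext documents throughout, while `store.documents` contains only hexadecimal ciphertext for both field names and values.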

The information leakage issue in a single untrusted server is studied in [49], and the statistical measurement of information leakage is investigated in [15]. The weakness of the k-anonymity solution for protection against identity disclosure is remedied by t-closeness, introduced in [33], which requires the distribution of sensitive attributes in each equivalence class to be close to the global distribution of those attributes.

To protect sensitive data from an untrusted CSP, existing cryptographic primitives that require the decryption key for processing are not applicable; consequently, the search for cryptosystems that allow processing over ciphertext has become an appealing research track. Most research has focused on homomorphic encryption, which allows computations to be carried out over encrypted data [24]. Another cryptosystem, which relaxes the security notion, is Order-Preserving Encryption (OPE), introduced in [6] and implemented in [4] for the cloud platform. An untrusted CSP can still extract information from encrypted data. The majority of research works in the literature assume that applying cryptographic techniques provides adequate protection on an untrusted cloud platform, but this assumption is not entirely true. Information leakage from encrypted data in the cloud is a plausible risk, and very few works address it. The research reported in this thesis aims at leakage-free query processing over very large scale encrypted datasets. The ultimate goal is to minimize information leakage with efficient solutions; therefore, a diversity of techniques is utilized.


Table 2.1: Information leakage management methods comparison

| Method | Description | Context | Advantage | Downside | Reference |
| Oblivious Cross-Tags (OXT) | Searchable symmetric encryption | Searches for a set of keywords | Practical | (1) Multiple rounds of interaction; (2) Pre-processing | Cash et al. [12] |
| Extended-OXT | Searchable symmetric encryption | Searches for a set of keywords | Extends OXT to: (1) Substring; (2) Wildcards and Phrase; (3) Substring | (1) Multiple rounds of interaction; (2) Pre-processing | Faber et al. [21] |
| CryptDB | Secure query processing | SQL-aware database | Efficient | Leakage from encrypted data | Popa et al. [42] |
| SecureNoSQL | Leakage-resilient query processing over encrypted database | NoSQL database | Covers: (1) search over encrypted NoSQL databases; (2) leakage prevention | Requires extra hardware resources for the proxy | Current work * |

* The paper related to this work is currently under review.


CHAPTER 3: RESEARCH OBJECTIVES AND APPROACH

3.1 Research Objectives

The primary research interests of this work are at the intersection of cloud computing and information security, in an area known as secure outsourcing of computation to a third party. We seek to understand the security and privacy needs of both individual users and large organizations in a public cloud environment. The security of users’ data in cloud computing, as a large scale distributed computational platform, is a demanding challenge that affects all users. The evidence shows that the importance of information security in the cloud is increasing as more on-line systems move into the cloud. In general, our research vision is to design security schemes that enable cloud users to securely receive the productivity and computational benefits of cloud DBaaS without compromising security and privacy.

3.2 Motivation

The principal research challenge is to answer this question: “Is it possible to delegate processing of your data without getting your private information revealed?” In other words, the goal of my research is to resolve the conflict between the availability of data on a public-access cloud and providing the required security level. With classic encryption, the cloud server needs to decrypt the data with the secret decryption key before being able to process the data; however, this process reveals users’ private information to an adversary or malicious insider. Resolving this issue requires a multidisciplinary approach that ties computer science and mathematics together with application-specific knowledge such as finite fields. In summary, my research objective is to design a secure solution for cloud-based on-line applications that addresses the corresponding security requirements. The key contributions of this research are expected to have an impact on cloud-based large-scale database systems, on-line transaction processing (OLTP), and web applications.

Technology research analyses indicate that a large number of enterprises are using cloud DBaaS from major CSPs. The number of websites hosted on AWS increased from 6.8M in September 2012 to 11.6M in May 2013, a 71% upsurge [1] [35]. Furthermore, a 67% annual growth rate is predicted for DBaaS by 2019. Undoubtedly, considering the cloud threat model, an efficient security scheme is required for the high volume of data stored and processed in the cloud. Threats to cloud computing can be analyzed from multiple viewpoints; this work investigates them from the adversarial perspective, which is a holistic, multifaceted procedure that considers the whole system’s security end-to-end. The adversarial threat analysis starts with thinking like a hacker and continues with preparing a corresponding countermeasure. The model identifies two classes of threats: external and internal attackers. Both classes are addressed by the proposed solution. The two major threats are described as follows.

3.3 Threat Model

A threat model describes the threats against a system. The threat model of cloud computing can be analyzed from multiple viewpoints. In this work we investigate this issue from the adversarial perspective. The adversarial threat model for DBaaS is a holistic process based on end-to-end security. The model identifies two classes of threats: external and internal attackers.

External attacker: An attacker from outside the cloud environment might obtain unauthorized access to the data by applying techniques or tools to monitor the communication between clients and cloud servers. External attackers, in most cases, face a more complex task because they must bypass firewalls, intrusion detection systems, and other defensive tools without any authorization.


Cloud malicious insiders: An internal attacker has the primary advantages of being within the protected area of the cloud and having access to resources. A major side effect of hosting a database in the cloud is unauthorized access to data by cloud internals, who are referred to as malicious insiders. More specifically, certain employees or contractors of the CSP will have access to the servers, software, and hardware and, therefore, to users’ data. The data protection efforts of the CSP could be bypassed by a malicious insider. Encrypted datasets accompanied by a secure proxy construction such as SecureNoSQL guarantee that malicious insiders never obtain the decryption keys. The proxy encrypts and decrypts data, queries, and responses between clients and the cloud. The proxy construction ensures that a malicious insider cannot explicitly access the sensitive information; however, there is still a risk of information leakage from the ciphered datasets. The malicious insider exploits the leaked information to organize more extensive attacks that amplify the leakage. On a traditional single-purpose on-site server, a malicious insider has access to only a few databases as sources for a data inference attack, but a cloud server has access to millions of datasets belonging to a large variety of enterprises. With an initial brute-force inference attack the adversary can extract implicit information. Analyzing the cloud infrastructure entirely from the viewpoint of information leakage is a novel idea of this work.

3.4 JSON And BSON

JavaScript Object Notation (JSON) is an open standard format which can be used to transmit data objects consisting of key-value pairs in a self-describing manner. JSON was primarily used as a main data format for interchanging data between servers. JSON supports all the basic data types of the JavaScript programming language [9]. JSON is a simple, lightweight, and efficient data structure, and these features make it an appealing option for database vendors. Thus, several NoSQL document-store databases such as MongoDB, CouchDB, and Google Cloud DataStore adopt JSON as their primary data representation for storage and indexing. There is a binary extension of JSON known as Binary JSON (BSON) that is used by most document databases to represent JSON documents in a binary-encoded format for back-end processing. BSON extends the JSON model to provide additional data types, and it is more efficient than JSON. In fact, BSON provides users with the ease of use and flexibility of JSON together with the speed and efficiency of a lightweight binary format. In a document database, data, queries, and query responses are all represented in the JSON format, so document databases are also referred to as JSON databases.

In this work we use JSON to create a new concept called the security plan. In particular, the security plan is a document that contains a hierarchical collection of key-value pairs describing data elements, parameters of cryptosystems, and the mapping between the two. Every security plan document includes four top-level sections represented as key-value pairs (see Section 4.1).
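To make the four-section structure concrete, the sketch below builds a hypothetical security plan and runs two simple checks on it. Every key name, identifier, and value is an assumption introduced for illustration; the actual notation of the descriptive language is the one defined in Section 4.1 and may differ.

```python
import json

# Hypothetical plan with the four top-level sections: collection, data elements,
# cryptographic modules, and the mapping between elements and modules.
security_plan = {
    "collection": {"database": "hr", "name": "employees"},
    "data_elements": [
        {"id": "e1", "field": "name", "type": "string"},
        {"id": "e2", "field": "salary", "type": "int"},
    ],
    "crypto_modules": [
        {"id": "m1", "scheme": "DET", "key_bits": 128},
        {"id": "m2", "scheme": "OPE", "key_bits": 128},
    ],
    "mappings": [
        {"element": "e1", "module": "m1"},
        {"element": "e2", "module": "m2"},
    ],
}

REQUIRED_SECTIONS = ("collection", "data_elements", "crypto_modules", "mappings")

def plan_errors(plan: dict) -> list:
    """Return a list of structural problems found in a plan (empty if none)."""
    errors = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in plan]
    if errors:
        return errors
    elements = {e["id"] for e in plan["data_elements"]}
    modules = {m["id"] for m in plan["crypto_modules"]}
    for mapping in plan["mappings"]:
        if mapping["element"] not in elements:
            errors.append(f"unknown element: {mapping['element']}")
        if mapping["module"] not in modules:
            errors.append(f"unknown module: {mapping['module']}")
    return errors

def unknown_query_fields(plan: dict, query: dict) -> list:
    """Fields used by a query that the plan does not describe."""
    known = {e["field"] for e in plan["data_elements"]}
    return sorted(field for field in query if field not in known)

plan_as_json = json.dumps(security_plan, indent=2)
```

A check like `unknown_query_fields` is one way a proxy could reject a malformed request locally instead of forwarding it to the cloud server.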

3.5 Problem Statement

The data owner has a database containing sensitive information and wants to encrypt it, upload it to a cloud, and give search permission to a group of users using DBaaS. The data owner wants to keep the data and the users’ queries private from the CSP. Users should be able to retrieve from the encrypted database all documents that satisfy the specific conditions posed by their queries. An additional privacy requirement, critical for some applications such as stock market data, is to hide any information about the access pattern from a cloud insider.

The proposed solution requires only one interaction per query, with minimal communication between the user’s application and the DBaaS server. The work of the DBaaS server for processing a requested query over the encrypted database still remains linear in the size of the database. We address both the confidentiality and the leakage prevention requirements.


• We propose a descriptive language based on the JSON notation that enables users to create a security plan for a database, describe security parameters, and assign proper cryptographic primitives to the data elements.

• A multi-key, multi-level mechanism. The lifetime of an encryption key is shorter than that of an encryption module, so keys are subject to change more frequently than encryption parameters. Furthermore, keys are assigned to single data elements, while an encryption algorithm can be applied to several data elements with several keys. This separation allows more efficient enforcement of the security policy and of key management.

• We design an effective procedure in SecureNoSQL that validates requests against the security plan. It evaluates all requests locally first, rather than forwarding large numbers of invalid key-value pairs to the remote cloud server. This mechanism avoids unnecessary increases in the workload and response time of the remote cloud server.

• Support for comprehensive, flexible protection. The solution is open-ended: users can add new customized cryptographic modules simply by using the designed descriptive language.

• A balanced system with overhead proportional to the security level. The overhead of the scheme is proportional to the desired level of security.

• SecureNoSQL addresses information leakage from fully or partially encrypted databases in the cloud. A malicious insider could potentially pool all databases and extract sensitive information from correlations among the various hosted databases. We propose a novel algorithm that minimizes information leakage in the untrusted cloud.

The details of the SecureNoSQL proxy are discussed in Chapter 4.


CHAPTER 4: CURRENT WORK AND PRELIMINARY RESULTS

Our research started with exploring the characteristics of cloud architecture and cloud services, followed by a study of classic and modern cryptography. The research results were published in a paper entitled “Application of order-preserving encryption (OPE) among multiple organizations in hyper cloud environment” [4]. In that paper, we report a novel solution for efficient query processing on encrypted data. The security scheme we proposed adds overhead due to the increased size of the ciphered data. Processing the transformed data increases time and space requirements within an acceptable range. This part of our research delivers an encryption solution through a straightforward relaxation of standard security notions such as indistinguishability against chosen-plaintext attack, which is infeasible for a practical OPE scheme. As a result, a security notion is proposed in the spirit of pseudo-random functions and related primitives such that the OPE scheme becomes “as random as possible” while fulfilling the order-preserving constraints.

Afterwards, our research continued with the design of a scheme for read-intensive large scale databases. In this scheme, classic cryptography is used for securing geographic information databases for location-based systems [3]. This work concentrates on the security of distributed large scale databases with a high rate of read operations and a low rate of write operations. More specifically, a wide variety of applications, ranging from social networks to military applications, use location information for delivering different services. Moreover, smart phones and hand-held devices are increasingly being used for mobile transactions. These devices are mostly GPS-enabled and can provide location information to the service providers. In some cases, the geographical information of clients is integrated with location-based applications as an authentication factor to enhance security. Yet, since it is easy for attackers to forge location information, the security of geographical information is a critical issue. The features of geographical databases were discussed, and an effective security scheme was proposed accordingly for mobile devices with limited resources.


Our research is currently in progress, exploring security solutions for Big Data in cloud computing. Many applications in areas as diverse as computational science and engineering, computational finance and economics, mobile computing, and social media require access to very large NoSQL databases stored on computer clouds. NoSQL databases are preferred to relational databases for such Big Data applications due to their faster response time and scalability. Considering the importance of Big Data in research communities, my investigations are extended to the field of security of cloud Big Data. A security scheme for Big Data applications is developed based on the encryption of data in JavaScript Object Notation (JSON) format. The results of the experiments carried out on very large NoSQL data stores were inspected for different types of queries.

A related topic of my research is grouped homomorphic operations over encrypted data stores and leakage prevention on an untrusted cloud server. We explored the problem of minimizing information leakage from encrypted databases in a cloud environment. A fundamental issue in the cloud environment is preserving data security. Data encryption is the basic means by which sensitive information can be protected from intruders and from malicious insider or external attacks. However, users need to interact with the encrypted data stores through queries. By analyzing queries, the cloud can illegitimately gain knowledge about the underlying sensitive information; this is considered information leakage. Protecting queries against the cloud is therefore also a fundamental issue for data owners. The literature focuses on data and query privacy; however, information leakage from encrypted data sets and queries has not been addressed in prior studies. Therefore, a diversity of techniques will be used, including analysis of the information leakage of different cryptosystems on large encrypted data sets, implementation, experimental setup, theoretical analysis, and simulation in the cloud.


4.1 SecureNoSQL

This section introduces a construction, denoted SecureNoSQL, which is a framework that incorporates data confidentiality and information leakage prevention algorithms. SecureNoSQL provides secure query processing for web and mobile applications that use DBaaS. Two system organizations can fulfill our design objectives. The first, shown in Figure 4.1, is suitable when all database users belong to the same organization. In that case the proxy runs on a trusted server behind a firewall, so that the communication between clients and the proxy is secure. In the second, clients are unrelated to one another and access the system through public lines. In this case, either each client's software includes a copy of the proxy and only encrypted data is transmitted over public lines, or the Secure Sockets Layer (SSL) protocol is used to establish a secure connection to the proxy. Figure 4.1 illustrates the high-level architecture of SecureNoSQL as a secure proxy between users' applications and the cloud NoSQL database server.

[Figure 4.1 diagram, three layers. Application layer: Data Owner (data, query, security plan) and Client 1 ... Client n (query, response). SecureNoSQL proxy: security layer, data integrity, leakage prevention. Cloud NoSQL database: query language, data model, storage engine, data store, replica set.]

Figure 4.1: Architecture of SecureNoSQL.


4.1.1 The Proposed Construction: SecureNoSQL Scheme

SecureNoSQL is based on the general principles of NoSQL database products. We introduce a new concept, the security plan, which is a JSON description of the data elements, the meta-data, and the parameter configuration of the cryptosystems. The proposed solution introduces a descriptive language so that the security plan can be generated and read automatically. JSON, the dominant format in NoSQL databases, was selected to express the security plan. We use a subset of JSON notation that is readable by both humans and machines. Document databases such as MongoDB store documents inside a collection in JSON representation, much as an RDBMS stores records inside tables. A query and the corresponding response are also represented in JSON; therefore, JSON is the governing format in document databases. Additionally, there is a binary extension of JSON, known as BSON, which document-oriented databases use for efficient encoding and decoding. The JSON query model is a functional, declarative notation designed especially for working with large volumes of structured, semi-structured, and unstructured JSON documents. The data owner develops the security plan, which maps each data element to a chosen crypto-primitive with specific parameters.

The schema of a NoSQL database is flexible, which allows different documents corresponding to the same object to have different numbers of attributes. On the other hand, comprehensive protection of all data elements in the database requires a full list of attributes, so that each can be assigned a proper level of protection. Therefore, we define a logical operator, denoted the Super Document, which is the union of all attributes across the different versions of the documents related to the same object. As described in Equation 4.1, each pair ⟨k_i, v_i⟩ represents an attribute, and the Super Document represents all attributes of a specific object.

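\begin{equation}
\begin{aligned}
d_1 &= \big\langle \langle k_1, v_1 \rangle, \langle k_2, v_2 \rangle, \ldots, \langle k_i, v_i \rangle \big\rangle \\
d_2 &= \big\langle \langle k_1, v_1 \rangle, \langle k_2, v_2 \rangle, \ldots, \langle k_j, v_j \rangle \big\rangle \\
    &\;\;\vdots \\
d_n &= \big\langle \langle k_1, v_1 \rangle, \langle k_2, v_2 \rangle, \ldots, \langle k_l, v_l \rangle \big\rangle \\
\text{Super Document } D &= \bigcup_{i=1}^{n} d_i
\end{aligned}
\tag{4.1}
\end{equation}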

In addition, a match function M(d_i, d_j) is required to determine whether two given documents d_i and d_j can be merged. Two documents can be merged provided that they share the same attribute from the identifying class, or the same group of attributes from the semi-identity class.
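The Super Document and the match function can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the attribute classes `IDENTIFYING` and `SEMI_IDENTITY_GROUPS` are hypothetical (the text does not enumerate them), documents are modeled as plain dictionaries, and the Super Document is represented by its set of attribute names.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple

Document = Dict[str, Any]

# Hypothetical attribute classes; the text does not enumerate them.
IDENTIFYING: Set[str] = {"id", "ssn"}
SEMI_IDENTITY_GROUPS: List[Tuple[str, ...]] = [("name", "email")]


def _same(d_i: Document, d_j: Document, key: str) -> bool:
    """True when both documents hold the same value under `key`."""
    return key in d_i and key in d_j and d_i[key] == d_j[key]


def match(d_i: Document, d_j: Document) -> bool:
    """M(d_i, d_j): decide whether two documents can be merged."""
    # One shared identifying attribute is enough.
    if any(_same(d_i, d_j, k) for k in IDENTIFYING):
        return True
    # Otherwise every attribute of some semi-identity group must agree.
    return any(
        all(_same(d_i, d_j, k) for k in group)
        for group in SEMI_IDENTITY_GROUPS
    )


def super_document(docs: Iterable[Document]) -> Set[str]:
    """Attribute names of the Super Document: the union over all versions."""
    keys: Set[str] = set()
    for doc in docs:
        keys |= set(doc)
    return keys


# Two versions of the same object and one unrelated document.
d1 = {"id": 7, "name": "A", "salary": 10}
d2 = {"id": 7, "email": "a@example.org"}
d3 = {"id": 8, "name": "A"}
```

With these sample documents, `match(d1, d2)` holds because both share `id`, while `d3` is kept apart; the Super Document of `d1` and `d2` then lists every attribute that the security plan has to cover.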

4.1.2 Security Plan

The security plan identifies the mechanism applied to maintain the security of the data elements in a database. It also determines how to interpret queries issued by a specific user's applications. As Figure 4.2 shows, we organized the security plan into four subdivisions, which let us describe security rules efficiently, not only for data elements but also for meta-data such as field names (keys) and the collection name. These subdivisions are the building blocks of the security plan, which specifies how those rules are enforced over the given data. The subdivisions are structured as follows:

1. Collection. The first section includes the name of a collection and a reference to the encryption module used to encrypt the name of the collection and the names of the fields (meta-data).

2. Cryptographic modules. The second section lists the cryptographic modules for encrypting the fields of the database entries in the query.

3. Data elements. The third section lists the properties of each data field, including the data type. The data type determines the cryptographic modules to be applied to each field.

4. Mapping cryptographic modules to the fields. The fourth section specifies the cryptographic modules used to encrypt the values of the fields. The proxy uses this information to encrypt and decrypt the data fields.

[Figure 4.2 diagram: Security Plan, with four sections: Collection; Cryptographic modules; Data elements; Mapping cryptographic modules to the fields.]

Figure 4.2: The high level structure of the security plan.

Collection. A collection is a group of NoSQL documents and is the equivalent of a table in a relational database. A collection has properties, such as its name, that need to be protected by encryption. The structure of a collection is illustrated and described in Figure 4.3. The listing in Figure 4.3b shows how to secure a sample collection using our descriptive language.

Key-value pairs (KVP) are the primary data model of a NoSQL database. The key is used as an index to access the associated value of the data pointed to by the reference ref. The initialization vector (IV) is a fixed-size, random input to the encryption function of the cryptographic module. Additionally, a collection exists within a single database. Documents within a collection can have different fields.


(a) [Structure diagram. Collection: name, with encryption (ref, key, iv); fieldName, with encryption (ref, key, iv).]

(b)
{
  "$collection": {
    "name": "Personnel",
    "$encryption": {
      "$ref": "/AES-DET",
      "key": "02468acebdf135790369cf258be147ad",
      "iv": "2468ace0"
    }
  },
  "$fieldName": {
    "$encryption": {
      "$ref": "/AES-DET",
      "key": "0123456789abcdef0123456789abcdef",
      "iv": "ffeeddcc"
    }
  }
}

Figure 4.3: Structure and description of Collection: (a) The chart outlines the structure of a collection, containing the name of the collection and the names of all fields, which are considered meta-data and thus should be protected with a proper cryptographic module, together with the pointer to a cryptomodule, the encryption key, and the initialization vector used to encrypt these items. (b) The description of a collection and its security parameters in the designed JSON-based language. In this specific case, the Advanced Encryption Standard in deterministic mode (AES-DET) with a 128-bit key and an initialization vector (IV) is assigned to encrypt the name of the collection and the field names.

Typically, all documents in a collection are related to one another.

Cryptographic modules. There are various encryption algorithms for different applications, each with its own strengths and weaknesses. The choice of a particular cryptosystem depends on the security policy of the application. Criteria for algorithm selection include security against theoretical attacks, cost of implementation, and performance, including whether encryption and decryption can be parallelized on a pool of CPUs such as a computing cloud. Other factors that may be involved in the selection of an algorithm are the memory requirements and the integration into the overall system design. In the proposed format, the Cryptographic modules section introduces all encryption modules and their parameters, such as key, key size, initialization vector, and output size. The structure of this section is depicted in Figure 4.4a, and the listing in Figure 4.4b shows the second section of the security plan for the previous example.

(a) [Structure diagram. Cryptographic modules: Module #1 and Module #2, each with name, type, keySize, key, inputSize, outputSize.]

(b)
{
  "OPE": {
    "properties": {
      "encryptionMethod": { "type": "string", "enum": [ "OPE" ] },
      "keySize": { "type": "integer", "minimum": 64, "maximum": 4096, "default": 128 },
      "key": { "type": "string", "pattern": "^([0-9a-fA-F]{2})+$" },
      "inputSize": { "type": "integer", "minimum": 8, "maximum": 128, "default": 32 },
      "outputSize": { "type": "integer", "minimum": 64, "default": 128 }
    },
    "required": [ "key", "encryptionMethod" ],
    "additionalProperties": false
  }
}

Figure 4.4: Structure and description of Cryptographic modules: (a) Security plan with the second section, the cryptographic modules, expanded. The attributes included for each module are: name, type, key size, key, input size, and output size. (b) The description of the OPE encryption module and its attributes. The proxy applies these modules using the key-value pairs (KVP).

Our proof of concept uses the parametric Order Preserving Encryption (OPE) and Advanced Encryption Standard (AES) modules. The system is open-ended: users can add the cryptosystems best suited to the security requirements of their application. In our design, the definitions of the cryptographic modules are separated from the pairs of encryption key and initialization value, following the so-called key separation principle [22]. This security practice is based on the observation that users have long- and short-term security policies. The cryptographic modules are less likely to change, while the key and the initialization value change frequently.

The data elements. The third section of the security plan covers the data elements and their properties. Figure 4.5 presents the structure and description of the Data elements section of the security plan. The listing in Figure 4.5b shows the data elements and their JSON description for the previous example. To ensure the desired level of security, this section should describe all sensitive data elements of the database.

(a) [Structure diagram. Data elements: Field #1 and Field #2, each with name, type, value.]

(b)
{
  "id": { "type": "integer" },
  "name": { "type": "string" },
  "salary": { "type": "integer" },
  "balance": { "type": "integer" },
  "ccn": { "type": "integer" },
  "ssn": { "type": "integer" },
  "email": { "type": "string" },
  "required": [ "id", "name", "email", "salary" ]
}

Figure 4.5: Structure and description of Data elements: (a) The chart outlines the structure of the Data elements section, which contains the attributes of each data element, such as name, type, and value. (b) The Data elements section of a sample database, represented in the designed notation. A data item has 7 fields: id, name, salary, balance, ccn, ssn, and email. The id, name, email, and salary are required fields.

Mapping cryptographic modules to the fields. The last section of the security plan specifies the cryptographic modules for all sensitive data fields. Figure 4.6a and the listing in Figure 4.6b show the mapping of the cryptographic modules and the corresponding JSON format for a sample application.

(a) [Structure diagram. Mapping cryptographic modules to the fields: Field #1 to cryptographic module m; Field #2 to cryptographic module n.]

(b)
{
  "id": { "$ref": "#/definitions/ope128" },
  "name": { "$ref": "#/definitions/AES-DET" },
  "email": { "$ref": "#/definitions/AES-DET" },
  "salary": { "$ref": "#/definitions/ope256" },
  "ssn": { "$ref": "#/definitions/ope256" },
  "ccn": { "$ref": "#/definitions/ope256" },
  "balance": { "$ref": "#/definitions/ope256" }
}

Figure 4.6: Structure and description of Mapping cryptographic modules to the Data element: (a) Security plan with the fourth section expanded. This section establishes a correspondence between the data fields and the cryptographic modules used to encrypt and decrypt them. (b) The mapping section of the schema for a sample database with 7 fields. For example, the id and the name will be encrypted with 128-bit OPE and AES-DET, respectively.
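A proxy that reads the mapping section has to follow each "$ref" to the module it names. The Python sketch below shows one way this lookup could work. It is an assumption-laden illustration: the layout of the `definitions` block and the parameters stored in it are hypothetical, since the text shows only the references themselves, and `resolve_ref` and `modules_for_fields` are illustrative names.

```python
from typing import Any, Dict


def resolve_ref(plan: Dict[str, Any], ref: str) -> Any:
    """Follow a local reference such as '#/definitions/ope128' inside the plan."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled: {ref!r}")
    node: Any = plan
    for part in ref[2:].split("/"):
        node = node[part]  # raises KeyError when the module is not defined
    return node


def modules_for_fields(plan: Dict[str, Any],
                       mapping: Dict[str, Dict[str, str]]) -> Dict[str, Any]:
    """Return, for every mapped field, the module definition it points to."""
    return {field: resolve_ref(plan, entry["$ref"])
            for field, entry in mapping.items()}


# Hypothetical definitions block; only the reference names come from the text.
plan = {
    "definitions": {
        "ope128": {"encryptionMethod": "OPE", "keySize": 128},
        "AES-DET": {"encryptionMethod": "AES-DET"},
    }
}
mapping = {
    "id": {"$ref": "#/definitions/ope128"},
    "name": {"$ref": "#/definitions/AES-DET"},
}
resolved = modules_for_fields(plan, mapping)
```

Keeping the mapping as references rather than inline definitions means that a module's parameters can be changed in one place without touching the per-field entries.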

As outlined in Section 1, the method presented in this work can be easily extended to the other NoSQL data models discussed in Section 2. Figure 4.7 shows how this extension from the key-value (KV) model to the document store model can be carried out.

[Figure 4.7 diagrams. (a) Key-value model: Key 1 ... Key n under cryptographic module z; Value 1 ... Value n under cryptographic modules 1 ... n. (b) Document store model: collection name under cryptographic module x; document ID under cryptographic module y; Key 1 ... Key n under cryptographic module z; Value 1 ... Value n under cryptographic modules 1 ... n.]

Figure 4.7: SecureNoSQL applied to: (a) The key-value data model; Key_1, ..., Key_n are all encrypted using cryptographic module z, while the corresponding values, Value_1, ..., Value_n, are encrypted with cryptographic modules 1, 2, ..., n, respectively. (b) The document store data model; meta-data such as the collection name is encrypted as well, and attributes are encrypted with their assigned cryptographic modules.

Query and data validation. The proxy validates the data and the query, as JSON-formatted input, against the reference security plan. Then, enforcing the assigned crypto-primitives, it generates a new query that respects NoSQL query semantics; in this process it applies to each field the cryptographic module specified in the mapping section of the schema. Finally, the proxy forwards the new encrypted query or data to the NoSQL database server. Figure 4.8 depicts the schema validation process.

For illustration, consider the listing in Figure 4.9a as input data; running the validation process generates the output shown in Figure 4.9b. The output of the validation process is a single file that contains descriptive information for the data and meta-data in the designed format and is ready to execute on SecureNoSQL. As noted earlier in Section 3.5, the data overhead of the proposed scheme is proportional to the desired security level, which is explicitly expressed in the security plan of each database. Table 4.1 contrasts the data overhead of several crypto-primitives for different parameters.

[Figure 4.8 flowchart: JSON data/query and security plan; validation of data elements (format matching); extraction of encryption parameters; applying cryptomodules to the data and meta-data; forwarding the encrypted data/query to the cloud NoSQL server.]

Figure 4.8: The validation process of input data against the security plan on the client side.

Table 4.1: Overhead of encryption as a function of security level

Database     Plain   OPE64   OPE128   OPE256   OPE512
Size (MB)    170     430     508      662      1000
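The sizes in Table 4.1 can be turned into expansion factors relative to the plain database. The short Python computation below uses only the numbers from the table; the variable names are illustrative.

```python
# Database sizes in MB, copied from Table 4.1.
sizes_mb = {"Plain": 170, "OPE64": 430, "OPE128": 508, "OPE256": 662, "OPE512": 1000}

plain = sizes_mb["Plain"]

# Expansion factor of each encrypted database relative to the plain one.
factors = {
    name: round(size / plain, 2)
    for name, size in sizes_mb.items()
    if name != "Plain"
}

for name, factor in factors.items():
    print(f"{name}: {factor}x the plain size")
```

The factors grow from about 2.5 for OPE64 to about 5.9 for OPE512, so doubling the OPE output size increases storage by well under a factor of two at each step.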

4.1.3 Processing Queries On Encrypted Data

In the proposed scheme, to process queries over encrypted data, each query must be transformed into its encrypted version according to the security plan; this task is carried out by our secure proxy. The security plan provides the cryptographic modules to be applied to the different fields of the query. Figure 4.10 shows the processing and rewriting of a sample query.


(a)
{
  "id": 1,
  "name": "Mohammad Ahmadian",
  "email": "[email protected]",
  "salary": 17000,
  "ssn": 433042664,
  "ccn": "47162552387",
  "balance": 1320
}

(b)
{
  "id": {
    "$encryption": { "encryptionMethod": "ope128", "key": "ADBDBC3B439DB495A81DA1BE56ACA" },
    "value": 1
  },
  "name": {
    "$encryption": { "encryptionMethod": "AES-DET", "key": "00112233445566778899aAbBcCdDeEfF" },
    "value": "Mohammad Ahmadian"
  },
  "email": {
    "$encryption": { "encryptionMethod": "AES-DET", "key": "00112233445566778899aAbBcCdDeEfF" },
    "value": "[email protected]"
  },
  "balance": {
    "$encryption": { "encryptionMethod": "ope256", "key": "A75C644DF2E4EFE5328BB35E3C636" },
    "value": 1320
  }
}

Figure 4.9: Security plan designed for sample input: (a) Data element section of sample security plan. (b) Output of JSON data validation for the sample database.
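The format-matching step of the validation process can be sketched in a few lines of Python. This is a minimal illustration, not the proxy's implementation: the function name `validate`, the split of the Data elements section into a `fields` dictionary and a `required` list, and the placeholder e-mail address are all assumptions made for the example.

```python
from typing import Any, Dict, List

# Field types and required fields as given in the Data elements section.
FIELDS: Dict[str, Dict[str, str]] = {
    "id": {"type": "integer"}, "name": {"type": "string"},
    "salary": {"type": "integer"}, "balance": {"type": "integer"},
    "ccn": {"type": "integer"}, "ssn": {"type": "integer"},
    "email": {"type": "string"},
}
REQUIRED: List[str] = ["id", "name", "email", "salary"]

_PY_TYPES = {"integer": int, "string": str}


def validate(doc: Dict[str, Any], fields: Dict[str, Dict[str, str]],
             required: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the document matches."""
    errors: List[str] = []
    for name in required:
        if name not in doc:
            errors.append(f"missing required field: {name}")
    for name, value in doc.items():
        spec = fields.get(name)
        if spec is None:
            errors.append(f"field not described in the plan: {name}")
            continue
        expected = _PY_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{name}: expected {spec['type']}")
    return errors


# Sample record modeled on Figure 4.9a; the e-mail address is a placeholder.
record = {"id": 1, "name": "Mohammad Ahmadian", "email": "user@example.org",
          "salary": 17000, "ssn": 433042664, "ccn": "47162552387",
          "balance": 1320}
problems = validate(record, FIELDS, REQUIRED)
```

Under these assumptions the check reports one problem for the sample record: `ccn` is written as a quoted string in Figure 4.9a, while the Data elements section declares it as an integer.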

To illustrate query encryption, Table 4.2 lists sample queries and their encrypted versions after the security plan is enforced. As can be seen, data elements and immediate values are encrypted, yet the output remains consistent with NoSQL query semantics.


[Figure 4.10 diagrams. (a) Parse tree: and( salary 5000, balance 2000 ). (b) Encrypted parse tree: and( 9mnGu8Q2VDstE+T9jFw2wQ== 3986410786398723978941641627711702, 5pgAxn6BF08WtM7zyuYaKg== 161374267674800082431533686937402 ).]

Figure 4.10: The query db.customers.find({salary:{$gt:5000}, balance:{$lt:2000}}) received from an application. (a) The parse tree of the query. (b) The cryptographic modules applied to the data elements according to the schema definition.

Table 4.2: Sample queries and their corresponding encrypted versions

1  Query:     db.customers.find({ssn:936136916})
   Encrypted: db["k/IevnbanDMQHNkb9cRgUg=="].find({"5pgAxn6BF08WtM7zyuYaKg==":74172405478441908041711118833862143778})

2  Query:     db.customers.find({balance:{$gte:5084610},balance:{$lte:9911843}})
   Encrypted: db["k/IevnbanDMQHNkb9cRgUg=="].find({"3iXpo2l8xZpW7J7TezFdeA==":{$gte:402982988013604629517872370128473753},"3iXpo2l8xZpW7J7TezFdeA==":{$lte:785596355698717592780268633369454231}})

3  Query:     db.customers.aggregate([{$group:{_id:null,minBalance:{$min:"$balance"}}}])
   Encrypted: db["k/IevnbanDMQHNkb9cRgUg=="].aggregate([{$group:{_id:null,EncMinBalance:{$min:"$3iXpo2l8xZpW7J7TezFdeA=="}}}])

4  Query:     db.customers.aggregate([{$group:{_id:null,maxBalance:{$max:"$balance"}}}])
   Encrypted: db["k/IevnbanDMQHNkb9cRgUg=="].aggregate([{$group:{_id:null,EncmaxBalance:{$max:"$3iXpo2l8xZpW7J7TezFdeA=="}}}])

5  Query:     db.customers.find({$or:[{Salary:{$gt:516046}},{balance:{$lt:285462}}]})
   Encrypted: db["k/IevnbanDMQHNkb9cRgUg=="].find({$or:[{"9mnGu8Q2VDstE+T9jFw2wQ==":{$gt:40994186216785746613193244129885849}},{"3iXpo2l8xZpW7J7TezFdeA==":{$lt:22657430453144634679791167652174833}}]})
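The rewriting of find-style filters can be sketched as a recursive walk that encrypts field names and literal values and leaves the `$` operators untouched. The Python sketch below is illustrative only and makes strong simplifications, all of them assumptions: `enc_name` uses an HMAC as a deterministic stand-in for AES-DET (it is not reversible), `enc_value` uses a toy increasing affine map in place of a real OPE scheme (it is not secure), and the key material and constants are made up. Aggregation pipelines, where strings such as "$balance" name fields, are not handled.

```python
import base64
import hashlib
import hmac
from typing import Any

NAME_KEY = b"demo-name-key"          # hypothetical key material
OPE_SCALE, OPE_SHIFT = 7919, 104729  # toy parameters, NOT a secure OPE


def enc_name(name: str) -> str:
    """Deterministic stand-in for AES-DET on names (HMAC, not reversible)."""
    digest = hmac.new(NAME_KEY, name.encode(), hashlib.sha256).digest()
    return base64.b64encode(digest[:16]).decode()


def enc_value(value: Any) -> Any:
    """Toy order-preserving map for integers; other literals use enc_name."""
    if isinstance(value, int) and not isinstance(value, bool):
        return value * OPE_SCALE + OPE_SHIFT  # strictly increasing in value
    return enc_name(str(value))


def rewrite(query: Any) -> Any:
    """Encrypt field names and literals of a filter; keep $operators as-is."""
    if isinstance(query, dict):
        return {
            (key if key.startswith("$") else enc_name(key)): rewrite(val)
            for key, val in query.items()
        }
    if isinstance(query, list):
        return [rewrite(item) for item in query]
    return enc_value(query)


plain_filter = {"$or": [{"salary": {"$gt": 5000}}, {"balance": {"$lt": 2000}}]}
encrypted_filter = rewrite(plain_filter)
```

Because the value map is increasing, a range predicate such as `$gt` selects the same documents on the encrypted values as on the plaintext ones, which is the property that lets the server evaluate the rewritten filter without decrypting anything.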

4.1.4 Measurements And Experimental Results

Experiments that measure query time must be designed carefully. To obtain an average query processing time, each experiment has to be carried out repeatedly. We noticed a significant reduction in the database management system's response time after the first execution of a query, a sign that MongoDB is optimized and caches the results of the most recent queries. A solution is to disable the cache or, if this is not feasible, to clear the cache before repeating the query. Another important observation is that modern processors have a 64-bit architecture and are optimized for operations on 64-bit integers. This explains why, for three of the five types of queries, Q2 (range), Q3 (equality), and Q4 (logical), the database response time is slightly shorter for the encrypted database than for the unencrypted one when the keys are 32-bit integers.
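A measurement loop that follows this procedure might look like the Python sketch below. It is a generic harness under stated assumptions: `run_query` and `before_each` are hypothetical callables supplied by the experimenter (for example, one that issues the query and one that clears the cache), and no particular database driver call is implied.

```python
import statistics
import time
from typing import Callable, Dict, List, Optional


def time_query(run_query: Callable[[], object],
               repeats: int = 10,
               before_each: Optional[Callable[[], object]] = None) -> Dict[str, float]:
    """Run a query repeatedly and summarize its wall-clock time in milliseconds."""
    samples: List[float] = []
    for _ in range(repeats):
        if before_each is not None:
            before_each()  # e.g. clear the cache so every run starts cold
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1e3)
    return {
        "runs": float(len(samples)),
        "mean_ms": statistics.mean(samples),
        # stdev needs at least two samples
        "stdev_ms": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


# Stand-in workload so the sketch runs without a database.
calls: List[str] = []
stats = time_query(lambda: calls.append("query"),
                   repeats=5,
                   before_each=lambda: calls.append("reset"))
```

Resetting inside the loop, outside the timed region, keeps the cost of clearing the cache out of the reported query time.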

[Figure 4.11 bar chart. Horizontal axis: query type (comparison, equality, range, logical, aggregation). Vertical axis: query processing time, labeled in microseconds, with ticks from 300 to 700. Series: 32-bit, 64-bit, 128-bit, 256-bit, 512-bit.]

Figure 4.11: Query processing time in milliseconds (ms) for the unencrypted database and for the encrypted databases when the 32-bit keys are encrypted as 64, 128, 256 and 512-bit integers.

Our measurements show that the response time of the NoSQL database management system on encrypted data depends on the type of query. The shortest and longest database response times occur for Q1 (comparison) and Q5 (aggregation) queries, respectively; between these two extremes the time for the unencrypted database almost doubles, but the time for the encrypted databases increases by only 70-80%. As expected, the processing time for a given type of query increases with the key length, but only slightly, by less than 5%, when the key length increases from 64 to 128, 256, and 512 bits. Also as expected, the OPE encryption time increases significantly with the size of the encryption space; it increases almost tenfold when the size of the encrypted output increases from 64 bits to 1024 bits, and it is about 10 ms for 256 bits. The decryption time is considerably smaller; it increases only slightly, from 0.11 ms to 0.17 ms, when the size of the encrypted key increases from 64 bits to 1024 bits.

The secure proxy is an important element of the proposed architecture; therefore, potential attacks on the proxy should also be taken into consideration. In general, the two major possible attacks on the proxy are Denial of Service (DoS) and unauthorized access. In a DoS attack, the attacker sends so much network traffic to the proxy that the system cannot process it within the expected time frame. A successful DoS attack can turn the proxy into a bottleneck of the system. In unauthorized access attacks, attackers use a proxy to mask their connections while attacking different targets. Solutions for hardening the proxy against DoS attacks and reducing their impact include blocking undesired packets and using multiple proxies with load balancers. To prevent unauthorized access attacks, access to the proxy must be subject to appropriate authorization. User authentication based on group membership with different authorizations is a practical solution.

4.2 Leakage Prevention In DBaaS

Encryption is a common practice for protecting the privacy of data and queries, but encrypted data and queries are still vulnerable to information leakage on a cloud platform. A database can be encrypted by the data owner before being outsourced to the cloud in such a way that client queries can still be processed on the transformed data. However, encryption does not hide all information about the encrypted data. For instance, the collection name (or table name in an RDBMS), the field names, the number of fields involved in a query, and their lengths often reveal sensitive information about the encrypted data. Moreover, a cloud insider can infer sensitive information from a sequence of queries. This type of attack on an encrypted database is categorized as information leakage. An outsourced encrypted data set should leak as little sensitive information as possible. An acceptable level of security for searchable encryption can be achieved with the proposed scheme. To study information leakage in the DBaaS model, we chose the NoSQL database model with its flexible schema. In the NoSQL data model, a database is a collection of documents C = {d_1, d_2, ..., d_n}, and a document is modeled as a set of key-value pairs {⟨k_i, v_i⟩}, each of which represents an attribute of an object.

4.2.1 Problem Statement

We assume that the data is fully or partially encrypted before being outsourced to the cloud service provider (CSP). However, fully or partially encrypted databases in the cloud are at risk of information leakage in the presence of a malicious cloud insider, who could pool all databases and extract sensitive information from correlations between the various hosted databases. This work characterizes the most common sources of information leakage from encrypted NoSQL databases. We propose and analyze a secure query processing system with minimum information leakage in an untrusted cloud. We also introduce a metric to quantify information leakage. This work is currently in progress, and the experimental results will be presented in the dissertation.


CHAPTER 5: CONCLUSION

We presented a novel searchable security scheme over encrypted NoSQL databases that protects sensitive information in the presence of two important threats confronting database-backed applications. The proposed scheme meets all design objectives with respect to three principles: i. Running queries efficiently over encrypted data using a novel JSON-aware encryption strategy. The evaluation on a large trace of queries from a variety of databases running on a cloud DBaaS shows that SecureNoSQL can support search operations over encrypted NoSQL data. The throughput penalty of SecureNoSQL is modest, a reduction of 14-25% in query processing performance compared to the plain database. ii. An application security plan, a novel notion introduced in this work for automating the configuration of security parameters to enforce the security policy on the database and the relevant queries. Intuitively, the lifetime of an encryption key is shorter than that of an encryption algorithm, and we expect key changes to happen more frequently than changes of the cryptosystem itself. By using the designed descriptive language, the data owner conveys the security parameters to the secure proxy with minimum effort. iii. Our security analysis shows that SecureNoSQL protects the most sensitive attributes of a collection with highly secure encryption schemes for a variety of applications. Furthermore, the server application is kept unmodified, and the user is never involved in the complexity of the security measures.

The secure proxy is a critical component of the system; it is multi-threaded and its cache management is non-trivial. The management of the security attributes is rather involved. On the other hand, a proxy integrated into the client-side software can be lightweight and considerably simpler. We are currently implementing the two versions of the proxy. Experimental results for multiple large datasets with up to one million documents show that SecureNoSQL is rather efficient. Our approach can be extended to a multi-proxy structure for big data applications. We are now implementing a mechanism based on PAXOS [32, 37] for keeping the database of hash values consistent across the proxies. Outsourcing encrypted data sets to a third party such as a cloud environment provides a good level of security; however, encrypted queries and data are still vulnerable to data leakage on a cloud platform. Encryption does not hide all information about the encrypted data, and this is a new area for research and investigation in future work. We introduced novel techniques that protect encrypted data sets and prevent a malicious insider from discovering implicit information, especially through cross-referencing attacks. The proposed method introduces a data overhead that is proportional to the desired security level.

5.1 Work In Progress And Tasks Time Table

My research in progress addresses leakage prevention for both plaintext and ciphertext databases hosted by DBaaS. We propose a solution to this problem that uses data encryption as the primary approach for protecting sensitive information from intruders and malicious insiders. In the rest of this research, we will implement the proposed algorithm on a real-world cloud service and on NoSQL databases hosted by DBaaS. In Figure 5.1, the tasks completed so far are shown as blue bars and the tasks in progress as red bars. The titles of all papers resulting from this research are listed in Table 5.1.

[Gantt chart: monthly timeline, months 1 to 12 of each year from 2013 to 2016. Rows: Study; Published papers (Paper 1, Paper 2); Submitted papers (Paper 3, Paper 4, Paper 5); Ready to submit (Paper 6); Revision of papers.]

Figure 5.1: Estimated work plan and timeline

Table 5.1: List of publications

| Paper No | Title | Authorship | Journal or Conference | Status |
|---|---|---|---|---|
| Paper 1 | Security of Applications Involving Multiple Organizations-OPE in Hybrid Cloud Environments [4] | M.Ahmadian, A.Paya, D.Marinescu | IEEE 28th International Parallel & Distributed Processing | Published (2014) |
| Paper 2 | A security scheme for geographic information databases in location based systems [3] | M.Ahmadian, J.Kho., D.Marinescu | IEEE SoutheastCon | Published (2015) |
| Paper 3 | SecureNoSQL: An approach to secure search on encrypted NoSQL databases in public cloud [5] | M.Ahmadian, F.Plochan, Z.Roessler, D.Marinescu | International Journal of Information Management (IJIM) | Published (2017) |
| Paper 4 | An Analysis of Information Leakage due to Insider and some Outsider Attackers in Computer Clouds | M.Ahmadian, D.Marinescu | Journal of Information Security and Applications | Under review |
| Paper 5 | Secure Query Processing in Cloud NoSQL [2] | M.Ahmadian | IEEE International Conference on Consumer Electronics | Published (2017) |
| Paper 6 | On information leakage in cloud database services | M.Ahmadian, D.Marinescu | Transaction of sustainable computation | Under review |

5.2 Future Work

The current research can be continued in the following directions:


• Supporting multiple proxies in order to serve a very large number of clients,

• Developing an efficient fully homomorphic encryption scheme that supports an unlimited number of operations over the encrypted data,

• Developing an encryption key management mechanism that periodically assigns new keys to the cryptosystems in order to obtain higher levels of security.
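As a rough illustration of the last item, periodic key assignment can be sketched by deriving one key per time epoch from a master key. This is only one possible design, not a mechanism specified in this proposal: the rotation period, the `epoch:` label, and the function names `epoch_for` and `key_for_epoch` are assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical rotation period: one new key per day.
ROTATION_PERIOD_SECONDS = 24 * 60 * 60


def epoch_for(timestamp: float, period: int = ROTATION_PERIOD_SECONDS) -> int:
    """Map a Unix timestamp to the index of the rotation epoch that contains it."""
    return int(timestamp // period)


def key_for_epoch(master_key: bytes, epoch: int) -> bytes:
    """Derive a 32-byte epoch key from the master key with HMAC-SHA256."""
    label = f"epoch:{epoch}".encode("utf-8")
    return hmac.new(master_key, label, hashlib.sha256).digest()


# Hypothetical master key, used only for illustration.
master_key = b"hypothetical-master-key"
key_day_0 = key_for_epoch(master_key, epoch_for(0))
key_day_0_late = key_for_epoch(master_key, epoch_for(86_399))
key_day_1 = key_for_epoch(master_key, epoch_for(86_400))
```

Timestamps inside the same epoch yield the same key, and the key changes at the epoch boundary. Data encrypted under an earlier epoch key must either be re-encrypted or remain decryptable by re-deriving that epoch's key, so the epoch index would need to be stored alongside each ciphertext.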


LIST OF REFERENCES

[1] Amazon web services growth unrelenting. (last accessed 3rd May, 2016).

[2] M. Ahmadian. Secure query processing in cloud NoSQL. In 2017 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, USA, Jan. 2017.

[3] M. Ahmadian, J. Khodabandehloo, and D. Marinescu. A security scheme for geographic information databases in location based systems. IEEE SoutheastCon, pages 1–7, April 2015.

[4] M. Ahmadian, A. Paya, and D. Marinescu. Security of applications involving multiple organizations and order preserving encryption in hybrid cloud environments. IEEE International Conf. on Parallel Distributed Processing Symposium Workshops (IPDPSW), pages 894–903, May 2014.

[5] M. Ahmadian, F. Plochan, Z. Roessler, and D. C. Marinescu. SecureNoSQL: An approach for secure search of encrypted NoSQL databases in the public cloud. International Journal of Information Management, 37(2):63–74, 2017.

[6] A. Boldyreva, N. Chenette, Y. Lee, and A. O’Neill. Order-preserving symmetric encryption. In Advances in Cryptology–EUROCRYPT 2009, pages 224–241. Springer, 2009.

[7] Z. Brakerski and V. Vaikuntanathan. Fully homomorphic encryption from ring-LWE and security for key dependent messages. Advances in Cryptology–CRYPTO, pages 505–524, 2011.

[8] Z. Brakerski and V. Vaikuntanathan. Efficient fully homomorphic encryption from (standard) LWE. SIAM Journal on Computing, 43(2):831–871, 2014.

[9] T. Bray. The JavaScript Object Notation (JSON) data interchange format. 2014.

[10] D. Cash, D. Hofheinz, E. Kiltz, and C. Peikert. Bonsai trees, or how to delegate a lattice basis. Journal of Cryptology, 25(4):601–639, 2012.

[11] D. Cash, J. Jaeger, S. Jarecki, C. Jutla, H. Krawczyk, M.-C. Rosu, and M. Steiner. Dynamic searchable encryption in very-large databases: Data structures and implementation. Network and Distributed System Security Symposium (NDSS14), 2014.

[12] D. Cash, S. Jarecki, C. Jutla, H. Krawczyk, M.-C. Rosu, and M. Steiner. Highly-scalable searchable symmetric encryption with support for boolean queries. Advances in Cryptology–CRYPTO 2013, pages 353–373, 2013.

[13] R. Cattell. Scalable SQL and NoSQL data stores. ACM SIGMOD Record, 39(4):12–27, 2011.

[14] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed storage system for structured data. ACM Transactions on Computer Systems (TOCS), 26(2):4, 2008.

[15] K. Chatzikokolakis, T. Chothia, and A. Guha. Statistical measurement of information leakage. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 390–404. Springer, 2010.


[16] R. Chow, P. Golle, M. Jakobsson, E. Shi, J. Staddon, R. Masuoka, and J. Molina. Controlling data in the cloud: outsourcing computation without outsourcing control. Proc. of the ACM Workshop on Cloud Computing Security, pages 85–90, 2009.

[17] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears. Benchmarking cloud serving systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing, pages 143–154. ACM, 2010.

[18] C. Curino, E. P. Jones, R. A. Popa, N. Malviya, E. Wu, S. Madden, H. Balakrishnan, and N. Zeldovich. Relational cloud: A database-as-a-service for the cloud. 2011.

[19] J. Daemen and V. Rijmen. AES proposal: Rijndael. 1999.

[20] L. Ducas and D. Micciancio. FHEW: Bootstrapping homomorphic encryption in less than a second. Advances in Cryptology–EUROCRYPT 2015, pages 617–640, 2015.

[21] S. Faber, S. Jarecki, H. Krawczyk, Q. Nguyen, M. Rosu, and M. Steiner. Rich queries on encrypted data: Beyond exact matches. In European Symposium on Research in Computer Security, pages 123–145. Springer, 2015.

[22] F. Galiegue and K. Zyp. JSON Schema: core definitions and terminology. Internet Engineering Task Force (IETF), 2013.

[23] J. Gantz and D. Reinsel. The digital universe in 2020: Big data, bigger digital shadows, and biggest growth in the far east. IDC iView: IDC Analyze the Future, 2007:1–16, 2012.

[24] C. Gentry. A fully homomorphic encryption scheme. PhD thesis, Stanford University, 2009.

[25] O. Goldreich and R. Ostrovsky. Software protection and simulation on oblivious RAMs. Journal of the ACM (JACM), 43(3):431–473, 1996.

[26] S. Gorbunov, V. Vaikuntanathan, and H. Wee. Attribute-based encryption for circuits. Proc. of the Forty-fifth Annual ACM Symposium on Theory of Computing, pages 545–554, 2013.

[27] H. Hacigumus, B. Iyer, and S. Mehrotra. Providing database as a service. In Data Engineering, 2002. Proceedings. 18th International Conference on, pages 29–38. IEEE, 2002.

[28] S. Halevi and V. Shoup. Algorithms in HElib. CRYPTO–Advances in Cryptology, pages 554–571, 2014.

[29] H. Hu, J. Xu, C. Ren, and B. Choi. Processing private queries over untrusted data cloud through privacy homomorphism. In Data Engineering (ICDE), 2011 IEEE 27th International Conference on, pages 601–612. IEEE, 2011.

[30] M. Islam and M. Islam. An approach to provide security to unstructured big data. 8th International Conf. on Software, Knowledge, Information Management and Applications (SKIMA), pages 1–5, Dec 2014.

[31] M. Kuzu, M. S. Islam, and M. Kantarcioglu. Distributed search over encrypted big data. In Proceedings of the 5th ACM Conference on Data and Application Security and Privacy, CODASPY ’15, pages 271–278, New York, NY, USA, 2015. ACM.

[32] L. Lamport. Paxos made simple. ACM SIGACT News, 32(4):18–25, 2001.


[33] N. Li, T. Li, and S. Venkatasubramanian. t-closeness: Privacy beyond k-anonymity and l-diversity. In IEEE 23rd International Conference on Data Engineering, pages 106–115, 2007.

[34] C. Liu, L. Zhu, M. Wang, and Y.-a. Tan. Search pattern leakage in searchable encryption: Attacks and new construction. Information Sciences, 265:176–188, 2014.

[35] H. Liu. Amazon data center size. Published March 13, 2012.

[36] F. Lombardi and R. D. Pietro. Secure virtualization for cloud computing. Journal of Network and Computer Applications, 34:1113–1122, 2011. Advanced Topics in Cloud Computing.

[37] D. C. Marinescu. Cloud computing: theory and practice. Newnes, 2013.

[38] C. Mavroforakis, N. Chenette, A. O’Neill, G. Kollios, and R. Canetti. Modular order-preserving encryption, revisited. Proc. of the 2015 ACM SIGMOD International Conf. on Management of Data, pages 763–777, 2015.

[39] D. Micciancio. Lattice-based cryptography. Encyclopedia of Cryptography and Security, pages 713–715, 2011.

[40] R. Ostrovsky. Efficient computation on oblivious RAMs. In Proceedings of the Twenty-second Annual ACM Symposium on Theory of Computing, pages 514–523. ACM, 1990.

[41] P. Paillier. Public-key cryptosystems based on composite degree residuosity classes. In Advances in Cryptology–EUROCRYPT ’99, pages 223–238. Springer, 1999.

[42] R. A. Popa, C. M. S. Redfield, N. Zeldovich, and H. Balakrishnan. CryptDB: Protecting confidentiality with encrypted query processing. Proc. of the Twenty-Third ACM Symposium on Operating Systems Principles, pages 85–100, 2011.

[43] T. Ristenpart, E. Tromer, H. Shacham, and S. Savage. Hey, you, get off of my cloud: exploring information leakage in third-party compute clouds. In Proceedings of the 16th ACM Conference on Computer and Communications Security, pages 199–212. ACM, 2009.

[44] S. Sivasubramanian. Amazon DynamoDB: A seamlessly scalable non-relational database service. Proc. of ACM SIGMOD Int. Conf. on Management of Data, pages 729–730, 2012.

[45] D. X. Song, D. Wagner, and A. Perrig. Practical techniques for searches on encrypted data. Proc. IEEE Symposium on Security and Privacy, pages 44–55, 2000.

[46] M. Stonebraker. SQL databases v. NoSQL databases. Commun. ACM, 53(4):10–11, Apr. 2010.

[47] C. Tankard. Big data security. Network Security, 2012(7):5–8, 2012.

[48] S. Tu, M. F. Kaashoek, S. Madden, and N. Zeldovich. Processing analytical queries over encrypted data. Proc. of the VLDB Endowment, 6(5):289–300, 2013.

[49] S. E. Whang and H. Garcia-Molina. Managing information leakage. 2010.

[50] L. Xu, C. Jiang, J. Wang, J. Yuan, and Y. Ren. Information security in big data: Privacy and data mining. Access, IEEE, 2:1149–1176, 2014.

[51] L. Xu, X. Zhang, X. Wu, and W. Shi. ABSS: An attribute-based sanitizable signature for integrity of outsourced database with public cloud. Proc. of the 5th ACM Conf. on Data and Application Security and Privacy, pages 167–169, 2015.


[52] X. Yu and Q. Wen. A view about cloud data security from data life cycle. International Conf. on Computational Intelligence and Software Engineering (CiSE), pages 1–4, Dec 2010.
