Technical Report.docx
Jun 02, 2018

    Technology Analysis Report

    Cloud Storage Gateways

Kiran Srinivasan, ATG, CTO Office; Shwetha Krishnan, ATG, CTO Office; Monika Doshi, SPT, CTO Office; Chris Busick, V-series product group; Sonali Sahu, V-series product group; Kaladhar Voruganti, ATG, CTO Office


    Cloud Storage Gateways NetApp Confidential



1 SUMMARY

In this section, we present the key observations and insights of the report; these are elaborated at length in the rest of the report.

1.1 CLOUD GATEWAYS (Section 2)

    A cloud storage gateway can be defined as follows:

A cloud storage gateway is a hardware- or software-based appliance located on your organization's premises. It enables applications located in your local datacenter to access data over a WAN from external cloud storage. The applications continue to use iSCSI, CIFS, and NFS protocols, while the cloud storage gateway accesses data over the WAN using APIs such as SOAP or REST.

Cloud gateways act as a bridge between enterprise data centers and storage resident in an external service provider, supporting the trend toward hybrid clouds.

    Why are gateways important for our customers?

    1. Agile storage delivery

a. Provide access to elastic storage for enterprises with simpler and more rapid provisioning.

2. Lower infrastructure costs:

    a. Pay only for storage used (pay-as-you-go model).

    b. Lower capital costs in the data center.

    c. Reduced storage management, offloaded to the cloud provider.

    d. No separate off-site disaster recovery solution needed.

    Why are gateways important for NetApp?

    1. Another storage tier offering with different SLA properties in our storage portfolio.

a. Cloud storage is viewed as low-SLA storage. A cloud gateway can enhance the value of cloud storage for enterprises with features like security, deduplication, storage management, etc.

2. Opportunity to offer MSEs a compelling alternative to dedicated backup appliances (e.g., Data Domain).

1.2 RATIONALE FOR CLOUD GATEWAYS (Section 2.3)

Enterprise backups, archival data and tape data can leverage elastic cloud storage:

o Roughly three copies of primary data are created for backup and secondary purposes, leading to provisioning issues.

o Cloud storage in a remote data center can be an equivalent for off-site tape (for DR).

o Storage in the cloud is the largest growing cloud service (750 billion objects in S3 by late 2011), mainly archival and online backups from the consumer space.

o Movement of enterprise data can be facilitated by the advent of cloud gateways.

All enterprise applications might not move to the cloud:

o Migration of compute and storage to the cloud is not cheap unless the application is offered as SaaS (Software as a Service), e.g., Salesforce.com, Microsoft's Office365.

o Security, control and process concerns for many larger enterprises will force them to keep at least some applications on-premises.

o On-premises enterprise applications can benefit from elasticity and other cloud storage advantages via cloud gateways.


1.3 KEY CLOUD GATEWAY USE CASES (Sections 3, 4, 5)

Short term (1-2 yrs): Conduit for secondary storage: backup streams, archival data and tape data.

Longer term (2-5 yrs): Conduit for primary Tier-2 application data: Microsoft Exchange, Microsoft SharePoint, home directories.

1.4 OPPORTUNITIES AND THREATS FOR NETAPP (Section 7)

Threat: Tier-2 application primary data, especially in virtualized environments, forms the bulk of our revenue. It can move to the cloud via cloud gateways, impacting our revenue significantly. Amazon's AWS gateway suggests that their next version will aim at primary enterprise data.

Opportunity-1: Currently NetApp does not have a compelling solution against Data Domain's backup appliances. A cloud gateway solution with inline deduplication and WAN latency performance optimizations, integrated with NetApp data management features (like SnapVault, SyncMirror), allows us to compete with them in this $2.18B market.

Opportunity-2: Cloud gateways can enable easier migration of data to cloud service providers who use NetApp storage. In addition, an integrated solution between NetApp-based cloud storage and a NetApp cloud gateway can be efficient and compelling.

1.5 KEY COMPETITORS IN THIS SPACE (Section 6)

Startup vendors: Nasuni (primary focus), StorSimple (SharePoint integration), Panzura (global file system), CTERA (consumer oriented). Enterprise readiness is a question with most of them; only a couple of them have more than 50 customers.

Established vendors: Amazon AWS Gateway, Riverbed's Whitewater appliance, Microsoft Azure appliance. Amazon's gateway as well as Google's foray into cloud storage highlight the fact that established players are keen to move enterprise data to the cloud.

EMC has partnerships with almost all gateway vendors. EMC also has Atmos for cloud storage.

Mode of deployment: All have VSAs; some have both VSAs and physical appliances. Very few have HA capabilities.

1.6 NETAPP ADVANTAGES/DISTINGUISHING FEATURES (Section 8)

NetApp's data management value: Expose NetApp's value-add in data management (like snapshots, cloning, mirroring, SnapVault) on another storage tier: cloud storage.

SLO-based management: Enable migration of data between traditional storage tiers and cloud storage via SLOs.

Leverage NetApp technologies: Cloud gateways require write-back caching for performance; NetApp can leverage existing technologies to create an efficient write-back cache that is protected by HA.

1.7 KEY ADDITIONAL IP FOR A CLOUD GATEWAY VIS-À-VIS NETAPP (Section 9)

Basic cloud gateway infrastructure (for both secondary and primary storage):

o File-to-object protocol conversion.

o Volume to objects/group-of-objects data granularity mapping.

o Security of objects in the cloud (encryption).


Value-added features (for both secondary and primary storage):

o Compression.

o Deduplication.

o Application integration.

o Cloud-aware data management (like auditing cloud costs).

Optimizations for a viable primary-specific storage solution:

o WAN latency optimization via read and write-back caching, pre-fetching.

Infrastructure for global collaboration on a primary storage repository:

o Global locking across a WAN.

1.8 RECOMMENDATIONS FOR NETAPP (Section 9, Section 10)

BUY (Near Term: 1 yr)

o Pros: Lower time to market; compelling and unique IP (as listed above in Section 1.7).

o Cons: Enterprise readiness of many startup vendors; integration of an acquired vendor's IP with NetApp data management features requires time and effort.

o Recommendation: Buy only when IP is hard to develop; chart out a pathway to integrate.

PARTNER (Short Term: 3 to 6 months)

o Pros: Lower time to market; parity with EMC; integrated solutions that lower TCO.

o Cons: Limited gains; NetApp's data management value-add could be hidden.

BUILD (Long Term: 2 yrs)

o Pros: NetApp's distinguishing features can be fully leveraged; enterprise readiness.

o Cons: Building a cloud gateway in ONTAP would take beyond 2015 (LB+). Building a non-ONTAP solution might limit exposing our value-adds.

The overall recommendation is to partner immediately with cloud gateway vendors and pursue the buy and build options in parallel. Specific projects in the build option are outlined below:

Enhance our V-series offering to have a cloud storage backend (already underway).

ATG Projects:

o Understand the performance of primary, Tier-2 applications on a cloud gateway.

o Reliability and security aspects of cloud gateways.

o Unique data management functionality required for cloud gateways.

o Global file system using cloud gateways.


2 INTRODUCTION

The growth of cloud technologies, both public and private clouds, has been driven primarily by perceived reduction in IT costs. The commoditization of server hardware resources (especially CPU cycles, memory capacity and disk capacity) has been the biggest enabler. In addition, the growth of hypervisor technologies to help increase resource utilization and the consolidation of application servers have contributed significantly to this trend. Also, analytics on large data repositories (big data) have assumed significance for many organizations that derive revenue from web services, e.g. Google, Yahoo, Amazon, etc. The scale of data, and the need to compute analytics cost-effectively, have forced them to adopt a cloud-based infrastructure along with novel paradigms for computing like MapReduce and Hadoop.

    2.1 PRIVATE, PUBLIC AND HYBRID CLOUDS

In the context of our discussion, we primarily deal with cloud storage as opposed to cloud compute. Private cloud storage has been applicable in situations where enterprises feel insecure about certain types of data leaving their controlled admin domains, e.g. payroll, source code, corporate email, etc. On the other hand, public clouds are applicable for cases where flexibility in terms of compute and/or storage, as well as ease of management, trumps other admin considerations.

Private clouds require both upfront cost (capital expenses) to create them as well as recurring operational expenses for administration and management. In contrast, the inherent sharing of resources, economies of scale, and multi-vendor competition attributed to public cloud vendors enable a pure operational-expenses model with an expected downward tendency in prices.

From an enterprise storage context, it is clear that there are always some types of data that can be moved to a public cloud, e.g. backups and archival data. Therefore, for enterprises, a hybrid cloud model, a combination of private and public clouds, is expected to emerge. However, it is speculated that all data will eventually move to public clouds, provided the issues around security, control and service levels are addressed adequately.

    2.2 CLOUD GATEWAYS

For both hybrid clouds as well as for enterprise datacenters to leverage public clouds, there is a requirement for functionality that can bridge the two worlds and enable data migration between them. We call such functionality the Cloud Gateway; it can reside in an appliance or in a virtual machine. Typically, the raw public cloud storage is accessed via a simple, object-based (PUT/GET) interface using a SOAP/REST-based API over HTTPS. In addition, the Cloud Gateway would employ the local disks or flash storage associated with the appliance to cache cloud data. The local disks could also be employed as the final resting place for certain types of data that need to be stored permanently in the gateway, e.g., filesystem metadata.
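To make the object interface concrete, the sketch below is a minimal in-memory stand-in for the PUT/GET contract the gateway programs against. The class and key names are hypothetical; a real gateway would issue these requests over HTTPS to a provider such as S3, and a miss would surface as an HTTP 404.

```python
class ObjectStore:
    """Minimal stand-in for a cloud object store (PUT/GET over HTTPS in practice)."""

    def __init__(self):
        self._objects = {}  # object name -> bytes

    def put(self, name: str, data: bytes) -> None:
        # Whole-object write: the basic object API has no partial update.
        self._objects[name] = data

    def get(self, name: str) -> bytes:
        # Whole-object read; a missing name is an error (HTTP 404 in practice).
        return self._objects[name]

    def delete(self, name: str) -> None:
        self._objects.pop(name, None)


store = ObjectStore()
store.put("vol1/block/0007", b"payload")
assert store.get("vol1/block/0007") == b"payload"
```

The flat name space and whole-object granularity are exactly why the gateway must map volumes, files, and blocks onto object names itself.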

The caching can be write-back or write-through. Typically, to reduce latency for write operations, the cache would be a write-back cache. However, with write-back caching, we need to satisfy these requirements: the dirty data in the write-back cache has adequate protection against failures, and these protection mechanisms do not debilitate the appliance's performance. The storage for the cache is the only upfront storage investment needed to leverage the cloud gateway. This implies that the cost for the customer is proportional to the working sets of their workloads as opposed to the entire data generated by their workloads. Depending on the workloads, the former could be much smaller than the latter. In addition, for enterprise storage vendors, their ability to sell actual storage (in bytes) reduces drastically.

For a viable cloud gateway to an external cloud storage provider, there are some very specific requirements due to the nature of the raw cloud data and the API offered to access it. In Section 3.1, we list these requirements and the rationale for having them. In Figure 1, an illustration of a cloud gateway appliance is presented in the context of an enterprise datacenter. As can be seen, the cloud


gateway is an on-premises appliance or a VSA (virtual storage appliance) and talks via NAS (CIFS, NFS) or SAN (iSCSI, FC) protocols with traditional clients in the datacenter. It interacts with the cloud storage using an object-based interface and uses local disks for caching hot content or as a permanent store for primary data. Typically, the gateway would be responsible for ensuring the security of the data before it leaves the data center. In addition, the gateway might contain features (depending on its use) that enable performance optimizations and latency reductions while accessing cloud storage over the WAN. Last but not least, to enhance the reliability of data stored in the cloud, the gateway might simultaneously store data on multiple cloud vendors to protect against cloud access outages and vendor lock-in, and to leverage price changes. Overall, all of the functionality in the gateway is aimed at lowering TCO by enabling the flexibility of cloud storage, i.e. lower/simpler provisioning (pay-as-you-go), lower admin cost and infinitely scalable storage.

The current cloud storage gateway market is very nascent and the offerings are not fully featured. Most of the vendors are smaller startup companies that are new to the storage space and do not own a complete storage portfolio like NetApp or EMC. As of now, only a few offer traditional enterprise capabilities like high availability, and very few of them actually target enterprise storage. Currently, gateways have been primarily targeted at backup streams, archival data and as a replacement for tape (offsite disaster recovery). These are primarily offline workloads, with limited performance requirements, that are typically tolerant to variation in throughput and latency, such as in a WAN. Moreover, since most of the vendors are startups, they would like to target workloads that are relatively easy to support from a performance perspective, as opposed to primary workloads that have stringent requirements.

    2.3 MOTIVATING FACTORS

In this section, we provide motivation for cloud gateways from the perspective of two different enterprise workloads: secondary storage and Tier-2 application data.

    2.3.1 Cloud gateway for secondary storage

    Figure 1: Cloud Storage Gateway Architecture


In the enterprise data center, a rule of thumb is that for every byte of primary data, three bytes of secondary data are stored. This includes backups within the data center and a copy on tape at a remote site for disaster recovery (DR) purposes. With backups, the typical enterprise workflow entails a full backup every week followed by daily incremental backups, leading to secondary data bloat. This bloat is the primary reason deduplication technologies are adopted heavily in this realm. In spite of deduplication technologies, we observe from certain case studies that efficient provisioning of storage to address secondary growth is very difficult.

The number of objects in Amazon's S3 now exceeds 700 billion [ref]. Backup and archival data constitute nearly 55% of S3's objects. These backups are largely expected to be online backups of personal laptops (consumer space); the fraction of enterprise backups is not known. However, such a large percentage raises the question of whether cloud storage can be an efficient option for enterprise backup data as well. Also, the features expected of an offsite copy (for disaster recovery purposes) of enterprise data maintained on tape are very similar to those offered by cloud storage. But cloud storage has other inherent advantages relative to tape, like WAN-based global access and a variable cost structure (due to multiplexing of cloud resources across clients and workloads). We observe that these advantages make the migration of enterprise backups and off-site copies to cloud storage imminent.

The inherent elastic nature of cloud storage can help to address the provisioning of secondary data. Thus, the need is for a conduit to send enterprise backups to the cloud with the right level of security (protection) and recovery semantics. We envision the cloud gateway acting as this conduit, enabling existing backup applications to transparently leverage cloud storage. From another perspective, we observe that within Amazon S3, an overwhelming percentage of data currently consists of online backups and archives from individual users (or consumers). Enabling enterprise backups to utilize cloud storage would be a natural extension of this trend.

As mentioned before, a copy is typically stored on an off-site tape archive, primarily for DR. Similar to cloud storage, the tape is maintained at a remote data center. Moreover, like cloud storage, the tape archive could be managed by a different company, maintained as a repository across their multiple customers. Therefore, the functionality and requirements are almost identical. This implies that cloud storage can be an effective and cheaper tape replacement, because the extra costs of copying data over to tapes and transporting them are not applicable. Compared to tape, cloud storage has one distinct advantage: data can be accessed at any time or place, independent of the actual physical location. With the cloud gateway as the bridge to the archive in the cloud, the archive can be kept online indefinitely. Moreover, this online archive can be accessed using traditional protocols and recovered efficiently, with little logistical overhead.

    2.3.2 Cloud gateway for Tier-2 application data (primary storage)

The cloud market as a whole, both private and public, is growing rapidly. The cloud vendors classify their services in many ways: Infrastructure as a Service (ITaaS, e.g. Amazon's S3), Platform as a Service (PaaS, e.g. Joyent), Software as a Service (SaaS, e.g. Microsoft's Office365), etc. Within ITaaS, there is further classification into Storage as a Service (StaaS, e.g. Amazon's S3) and Compute as a Service (CaaS, e.g. Amazon's EC2, Microsoft's Azure cloud). Among these different categories, StaaS is experiencing the highest growth (cumulative growth rate of 25% annually), but CaaS leads in terms of revenue [1,2,11].

The key question remains whether other workloads will adopt cloud storage. A related question is whether hybrid clouds will become mainstream, where some data resides in an enterprise datacenter or a private cloud and the rest resides in a public cloud. An interesting perspective is provided by Intel's whitepaper on the future of IT, datacenters and their evolution vis-à-vis cloud technologies [4]. Figure 2 shows an illustration from the whitepaper on the evolution of hybrid clouds and where the different

  • 8/10/2019 Technical Report.docx

    9/44

    Cloud Storage Gateways NetApp Confidential

workloads will reside. It can be observed that in the mid-term, only selective functions will move to public clouds, like caching of content on ITaaS and sales support to SaaS (e.g. salesforce.com). However, in the longer term, they expect more workloads to go to public clouds: backups, storage, manageability and client VM images to ITaaS, as well as CRM, collaboration and productivity tools to SaaS. As per this report, the implications for NetApp are pretty clear: a significant portion of enterprise storage data is moving to public clouds. Also, in the picture, we can see that internal clients in the enterprise datacenter are expected to make use of ITaaS services like cloud storage over the WAN. This observation points to the importance and development of cloud storage gateway technologies in the near future that will enable this evolution.

Another aspect in the evolution of cloud workloads is the role of SaaS. SaaS provides you the application as a cloud service, accessible typically over a web-based interface. For business applications like CRM and ERP, such cloud services are readily available and are being adopted zealously. A unique case in point is Microsoft's Office365, which offers the entire Microsoft Office suite of applications as web services. Such services are clearly cost-effective for enterprises. In addition to the advantages of a cloud service, i.e. low management and admin costs and instant deployment, such services eliminate the extra servers (and associated data center costs) required to run the application servers in the datacenter. However, the key disadvantage is that there is very little control over the application data, i.e. the storage and security policies on it. Thus, while adopting a SaaS service, we trust the provider considerably.

The SaaS security model might be suitable for MSEs but not completely for large enterprises. We expect that large enterprises would still like the security and control over resources and processes that standalone application servers offer when they are on-premises. However, they would like to take advantage of the elasticity and cost advantages of cloud storage if possible. Cloud gateways enable this exact requirement: to serve as a bridge that enables applications to reside in the data center but leverage cloud storage for their storage needs.

With cloud gateways, beyond SaaS use cases, we expect that there are a significant number of workloads that use compute in the datacenter or private cloud but leverage storage in a public cloud. A key question is whether this assumption is valid. A contrary opinion is that for all enterprise applications

    Figure 2: Intel's IT evolution - hybrid clouds (Source: Intel whitepaper [4])


(standard as well as custom), both compute and storage would move to a public cloud and render the gateway functionality useless. There are no clear answers to this question as of now. We need to wait for the evolution to take place before we make our judgement. However, a hybrid cloud scenario, with a split of compute in the private cloud (or enterprise datacenter) and storage in the public cloud, is plausible given the characteristics of these applications and resources (compute and storage). We list a few of them here:

1. Change in applications: For applications to move completely to a public cloud, a few changes have to happen:

a. These applications need to be portable, i.e. encapsulated in a VM, before they can be moved to a compute cloud service.

b. Typically these applications work with unencrypted data in the datacenter; when run in a public cloud compute infrastructure, data traffic into and out of the VM needs to be encrypted. Moreover, data at rest created by the applications needs to be maintained in encrypted form.

c. The application or the layer below the application needs to talk to cloud storage via a different protocol. Enterprise applications or the layer(s) below them need an object protocol to access cloud storage, as opposed to the typical enterprise storage protocols (NAS/SAN). Either a translation from NAS/SAN protocols to object protocols needs to be made, or new storage client software that natively talks object protocols needs to be introduced in the application VM stack.

d. Also, along with translation, to reduce the cost of cloud storage, features like storage efficiency need to be incorporated into the application's VM stack.

    None of these changes are insignificant for many legacy enterprise applications.

2. Compute and storage are different kinds of resources: With increasing processor speeds, compute as a resource is one of the cheapest in the datacenter and the most flexible in terms of usage. Contrarily, the cost per CPU cycle, as seen with Amazon's EC2, is not low. This observation has translated to CaaS generating more revenue than StaaS. Moreover, compute is a renewable resource; the moment a CPU cycle is used up, it is available for use again. Storage costs, on the other hand, are going down, but the hidden costs of storage administration are still considerable. In addition, storage is a consumable resource: once a byte of storage has been used, it has been consumed and cannot be used again until the data is erased. These factors might make storage a candidate to migrate to the public cloud but not necessarily compute.

Given these observations and the growth of hybrid clouds, we feel that cloud gateways might be the conduit for enterprise storage to move to public clouds, at least for some workloads. Burton Group's report [1] on cloud gateways classifies workloads that have already moved and the ones that can move to the public cloud via gateways. A key observation is that many Tier-2 applications where NetApp has enjoyed significant market share and revenue growth are listed as ones that might move. The lure of a flexible, pay-as-you-go, low-capital-expenditure model for enterprise storage is the key motivating factor behind this prognosis.


3 CLOUD GATEWAY ARCHITECTURES

In this section, we first outline the mandatory capabilities that a cloud gateway should possess, mainly dictated by design considerations and partly by first movers' differentiation in this space. Second, we provide three alternative models of usage for a cloud gateway. These models are not mutually exclusive of each other.

    3.1 MANDATORY CAPABILITIES

Given the market space for cloud gateways, the following are the mandatory features expected of an enterprise-class appliance/VSA:

Operations and protocols (NAS/SAN) that emulate conventional storage arrays and file servers: To enable existing enterprise (storage) clients to access data.

Translate between files/blocks and objects on the cloud storage: Since the cloud storage API offered is typically object-based, the gateway needs to translate appropriately.

Data leaving the enterprise datacenter needs to be secure: Enterprise data cannot leave the premises in clear text and cannot be stored in clear text in the cloud. This requirement is usually achieved by encryption before the data leaves the datacenter.

Perform smart caching of the data to avoid WAN latency: Typically, write-back caching and efficient pre-fetching strategies are employed in this context. Also, having an effective cache reduces the number of network requests to the cloud storage provider, enabling extra savings.

Minimize WAN bandwidth usage by deduplicating data: Most external cloud SSPs charge for both the data stored as well as for network requests. To ensure minimal data is stored in the cloud, deduplication is essential. Moreover, you pay the SSPs only for network requests of unique data.

Provide access to multiple public cloud storage vendors: Mainly to prevent a single point of failure as well as single-vendor lock-in.

Export cloud storage semantics to the end admin: Cloud storage features such as on-demand capacity and the pay-as-you-go pricing model need to be exported to the admins in a transparent way.

Monitoring, reporting and other data management capabilities: Since the customer would be paying the cloud SSP for the storage as well as network requests that originate from the cloud gateway, it is essential to audit all the requests efficiently and present them to the customer on demand.
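Several of the capabilities above meet in the gateway's write path: data is chunked, deduplicated against what the cloud already holds, secured, and only then written to the object store. The sketch below is a simplified, hypothetical illustration (fixed-size chunks, a dict standing in for the cloud, and encryption left as a pluggable placeholder rather than real cryptography, which a real gateway would apply before data leaves the premises):

```python
import hashlib

CHUNK = 4096  # assumed fixed-size chunking; real gateways may use variable-size chunks


def write_through_gateway(data: bytes, store: dict, encrypt=lambda b: b):
    """Sketch of the gateway write path: chunk -> dedupe -> (encrypt) -> PUT.

    `store` maps content hash -> stored chunk; only unique chunks cost a
    PUT request and cloud capacity. Returns the ordered chunk recipe (enough
    to reassemble the data) and the number of PUTs actually issued.
    """
    puts = 0
    recipe = []
    for off in range(0, len(data), CHUNK):
        chunk = data[off:off + CHUNK]
        key = hashlib.sha256(chunk).hexdigest()
        if key not in store:          # dedupe: skip chunks the cloud already holds
            store[key] = encrypt(chunk)
            puts += 1
        recipe.append(key)
    return recipe, puts


cloud = {}
recipe, puts = write_through_gateway(b"A" * 8192 + b"B" * 4096, cloud)
assert len(recipe) == 3 and puts == 2   # two identical "A" chunks stored once
```

Because SSPs charge per stored byte and per request, the gap between `len(recipe)` and `puts` is exactly the cost the dedupe step saves.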

    3.2 DESIGN APPROACHES TO CLOUD GATEWAY

Different approaches or models have surfaced among the cloud gateway vendors to facilitate cloud storage integration. These models also have a strong relationship to the typical dataset they would support. The performance characteristics of all enterprise workloads cannot be satisfied by all models. However, these models are not exclusive of each other; some appliances blend them. The models are:

Caching device model: The gateway provides advanced caching algorithms to mask cloud performance limitations, both WAN latency and bandwidth constraints. Typically, write-back caching is done on local disk or SSD devices.


Tiered device model: The gateway enables the creation of an explicit enterprise storage tier with specific performance and capacity characteristics. By definition, in this model, the gateway is part of a larger ecosystem that provides the other storage tiers.

Copy device model: The gateway provides conventional on-premises storage with scheduled replication services to the cloud to facilitate backup/recovery functionality as well as a disaster recovery solution.

As gateway offerings increase, we expect them to be combinations of these models. The following subsections detail each of these models.

    3.3 CACHING DEVICE MODEL

Figure 3 illustrates a gateway modeled as a caching device. With this approach, a cached copy, i.e., a virtual storage volume (filesystem or LUN), is presented to the datacenter clients by the appliance, whereas the actual volume is in the cloud. The cached copy need not be in sync with the volume in the cloud. Moreover, the cache type, write-through or write-back, dictates the invalidation and consistency requirements.

Typically, in order to mask WAN latencies, the caches are designed as write-back caches. This implies that during steady state, some amount of dirty data (unflushed writes) will be present in the cache. In this case, since we are dealing with enterprise data, data loss is not acceptable. Therefore, we need to ensure that the dirty data can survive the loss of the gateway appliance via appropriate reliability mechanisms (e.g., mirroring to another appliance within the datacenter). Since this requirement is similar to the reliability requirements of primary enterprise data, vendors need to build these functionalities into their gateways to be feasible. In addition, in the event of a datacenter disaster, the volume recovered from the cloud needs to be in a consistent state. To enable this, by design, we require well-defined cut-off points (checkpoints/snapshots/consistency points) in time for synchronizing dirty data from the gateway to the cloud.
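The flow described above can be sketched as a toy write-back cache with explicit consistency points. All names are hypothetical, and the mirroring that protects dirty data against appliance loss is assumed rather than shown:

```python
class WriteBackCache:
    """Sketch of a gateway write-back cache with consistency points.

    Writes land locally and are acknowledged immediately; dirty blocks are
    pushed to the cloud only at a consistency point, so the cloud copy always
    reflects a well-defined cut-off in time.
    """

    def __init__(self, cloud: dict):
        self.cloud = cloud   # stand-in for the object store
        self.local = {}      # block number -> data cached on local disk/SSD
        self.dirty = set()   # blocks not yet synchronized to the cloud

    def write(self, block: int, data: bytes) -> None:
        self.local[block] = data      # low-latency local write, acked at once
        self.dirty.add(block)

    def read(self, block: int) -> bytes:
        if block in self.local:       # cache hit: served at LAN speed
            return self.local[block]
        data = self.cloud[block]      # miss: synchronous WAN fetch
        self.local[block] = data
        return data

    def consistency_point(self) -> int:
        """Flush all dirty blocks as one cut-off point; returns blocks flushed."""
        flushed = len(self.dirty)
        for block in self.dirty:
            self.cloud[block] = self.local[block]
        self.dirty.clear()
        return flushed


cloud = {}
cache = WriteBackCache(cloud)
cache.write(0, b"x")
cache.write(1, b"y")
assert cloud == {}                       # nothing leaves until the consistency point
assert cache.consistency_point() == 2 and cloud[1] == b"y"
```

Between consistency points the cloud copy is stale but internally consistent, which is precisely the property a disaster-recovery restore needs.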

A vendor-proprietary caching algorithm attempts to minimize data transfers between the gateway and the cloud storage provider for both reads and writes. Cache reads can be served to the enterprise data clients at speeds consistent with NAS or SAN systems. Any time the cache experiences a read "miss," the gateway must retrieve data from the cloud, incurring both latency and bandwidth penalties while the data moves from the cloud through the cloud connection and finally to the gateway. Typically, for writes, the cache aggregates, compresses/deduplicates, and encrypts the data for transfer to the

    Figure 3: Gateway Caching Model (Source: Gartner)


    Cloud Storage Gateways NetApp Confidential

cloud at opportune times. The gateway compresses, deduplicates, and encrypts data before transfer in order to minimize the cloud data footprint, improve performance, and preserve data privacy.
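The write-path sequence just described (deduplicate, compress, then encrypt before upload) can be sketched as below. The function name and the fingerprint index are illustrative assumptions; encryption is left as a comment because the cipher choice is vendor-specific.

```python
import hashlib
import zlib


def prepare_upload(chunks, seen_hashes):
    """Deduplicate, compress, and (eventually) encrypt chunks before transfer.

    `seen_hashes` models the gateway's index of fingerprints already stored
    in the cloud; duplicates are skipped and only referenced.
    """
    to_send = []
    for chunk in chunks:
        fp = hashlib.sha256(chunk).hexdigest()
        if fp in seen_hashes:
            continue                    # duplicate: only a reference is kept
        seen_hashes.add(fp)
        payload = zlib.compress(chunk)  # shrink the WAN transfer
        # A real gateway would encrypt `payload` here before the PUT.
        to_send.append((fp, payload))
    return to_send
```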

Some vendors, like StorSimple, have taken a hybrid approach, where some data, like the filesystem metadata, always resides in the gateway, whereas the file system data is cached. This approach enables fast access to metadata even in the event of a disconnect from the cloud.

The following key technical issues are relevant to the caching approach:

Cacheable workloads: To mask the WAN latency effectively, the workloads need to be cache-friendly, i.e., have temporal locality properties that lead to a relatively small working set. In addition, the cache replacement policy has a big impact on performance and has to be designed with due care. Lastly, effective prefetching strategies need to be devised in order to minimize cold misses, which require synchronous retrieval from the cloud and lead to unexpected and unacceptable WAN latencies (possibly three orders of magnitude higher than local disk).

Sizing: Since the data stored in the gateway is a function of the working set sizes and not the actual data size, ideally we can serve an ever-increasing cloud storage footprint with the same local storage on the gateway appliance. Even if we do not exhibit this ideal behavior, we expect the growth of local storage (on the gateway) to depend on working set growth, which is tied to application behavior/evolution as opposed to application data growth. This insight offers significant cost advantages for datacenters, where storage capacity sizing and provisioning (usually for peak utilization) concerns are largely mitigated. In addition, the lower capacity requirements augur well for a pure flash-based cache (SSDs).

Coherency issue: In some cases, the actual cloud volume could be shared by multiple gateways in different geo-distributed datacenters; Panzura's Global File System is an example. To provide a globally consistent view of a single file system, we need appropriate coherency mechanisms across the WAN-distributed gateways, such as global locking and cache invalidation schemes.
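The replacement-policy point above can be made concrete with a minimal LRU cache. This is a toy sketch with invented names; real gateway caches add prefetching, write-back handling, and persistence, but the miss counter shows exactly where a synchronous WAN round trip would occur.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU replacement policy. On a miss the gateway must fetch
    synchronously over the WAN, which is what prefetching tries to avoid."""

    def __init__(self, capacity, fetch_from_cloud):
        self.capacity = capacity
        self.fetch = fetch_from_cloud
        self.entries = OrderedDict()
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)    # mark most recently used
            return self.entries[key]
        self.misses += 1                     # cold/capacity miss: WAN round trip
        value = self.fetch(key)
        self.put(key, value)
        return value

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

A workload with a working set larger than the cache capacity will keep evicting and re-fetching, which is the cache-unfriendly case the text warns about.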

So far, analysts like Gartner have suggested that the caching model is best suited for minimal-footprint installations (like branch offices), file sharing workloads, data archival, and low-demand backup, primarily due to latency concerns. It is still unclear whether the requirements of key primary workloads like business applications can be satisfied by the caching model. Example vendors for this model are Nasuni, Riverbed's Whitewater appliance, and Panzura's Alto 6000 Series Cloud Controllers.

    3.4 STORAGE TIER MODEL

Figure 4 shows a tiered gateway. With this model, in contrast to the caching model, storage volumes may wholly exist in the datacenter on local storage and/or in the cloud. The gateway enables cloud storage access for enterprises as a specific tier in a multi-tier storage hierarchy, usually based on the performance characteristics of each tier.

Today, such multi-tier hierarchies already exist in enterprise datacenters, usually classified by performance characteristics, starting from fast, expensive flash-based storage down to slow, inexpensive tape. A cloud gateway would extend this hierarchy by offering another tier with flexible storage capacity and disaster protection but lower performance. Given these characteristics, this tier would fit datasets described as archival/data warehouse and cold data, as well as traditional backups. Multi-tier storage hierarchies also enable cost-reducing features like automated data migration: the ability to move data from one tier to another automatically via dataset policies.

Here are some key issues relevant to this model:


Traditional workload compatibility: Viewed from a different perspective, a tiered gateway could be thought of as a conventional enterprise storage system with a cloud gateway feature. Therefore, workloads that are appropriate for local storage systems are still applicable to tiered gateways. A key advantage of such a gateway is that the cloud storage provides effectively infinite capacity, albeit with performance, SLO, and cost limitations. That said, current offerings have limited local storage scalability (StorSimple's 7010 appliance maxes out at 20TB) and may fall short for IO-intensive workloads.

Caching issue mitigation: As mentioned before, with a write-back caching gateway we need protection mechanisms, such as local mirroring and consistent checkpoints of the cloud data, to guard against data loss due to failures (hardware or cloud connectivity). In the tiering model, due to the existence of a local storage tier, such issues are largely mitigated or completely absent.

Highly competitive landscape: With the tiering model, we expect existing enterprise storage players (e.g., EMC, HP, IBM, Hitachi Data Systems) to offer tiered gateways, as they are best equipped to enhance their value proposition with multi-tier storage.

The tiering model is best suited for data archival and data warehousing types of datasets. Example products are StorSimple and F5 Networks ARX.

Figure 4: Gateway as a storage tier (Source: Gartner)

    Figure 5: Gateway used for a remote copy (Source: Gartner)


    3.5 COPY MODEL

Figure 5 illustrates a copy cloud gateway. In this model, the gateway is similar to a traditional local NAS/SAN storage system, and customers are expected to use it that way. The performance and management expectations for this appliance are also similar to those of a traditional on-premises storage system. The unique value-add is the ability to connect to external cloud storage and perform replication/copy services from local storage to the cloud storage. These copy services would be similar or identical to those between enterprise storage systems. The main goal of such services is data protection in the event of the loss of either the on-premises storage system or the datacenter itself. A secondary goal would be asymmetric data sharing (i.e., read-only) across geographically distributed datacenters.

With this model, storage admins are required to perform cloud-storage-related configuration and to map traditional copy service notions to the equivalent ones for cloud storage. The gateway is also expected to deduplicate and compress data before transferring it to the cloud as a large stream. The copy approach is ideal for data protection use-cases (like DR) that require routine snapshot capabilities. This model has the lowest barrier to adoption and is the ideal first product offering across all gateway models. However, the limited applicability might also mean limited cost savings. Also, since the offering is geared towards disaster recovery, transferring large datasets to and from the cloud efficiently is a key issue. Not all cloud storage protocols are suited for a streaming workload; all of them currently export RPC-like, object-based protocols.

    Some key issues relevant for the copy model are:

Storage players' imminent entry: Enterprise storage vendors with current DR offerings are likely to jump onto the copy cloud gateway soon. The SLAs/SLOs required of the cloud storage to support this model are a key open question, and the role of cloud storage providers in facilitating this model is not clear. Amazon's S3 is optimized for relatively small objects (on the order of 1MB) as opposed to the large objects that might arise from a copy workload (on the order of TB). This implies more work on the gateway to split the copy workload into many smaller objects, along with the associated book-keeping.

MSE suitability: Customers in the MSE space are ideal candidates for such a gateway. With cloud-based replication and DR services, they can skip the equivalent datacenter-to-datacenter offerings from current vendors completely, for cost reasons. We expect this advantage to be a key driver for this model.

An additional use-case is data sharing between datacenters with completely non-overlapping work patterns (two datacenters on different sides of the globe), though we have not seen existing vendors tout this specific advantage. CTERA is a prototypical vendor in this space.

    3.6 HYBRID (COMBINATION) MODELS

It is easy to notice that these models are not mutually exclusive. A number of solutions from existing vendors are combinations of these different models. Typically, most gateways include a write-back cache for performance reasons, irrespective of the intended workload (backup or primary). An example is Panzura, which can be used as primary storage as well as for archival purposes. Rather than differentiating on models, vendors prefer to differentiate by offering value-added services like closer application integration; e.g., StorSimple offers MS SharePoint/Exchange server integration.


4 UNIQUE REQUIREMENTS/EXPECTATIONS OF GATEWAYS

Compared to traditional enterprise storage in datacenters, cloud gateways have unique requirements that they need to fulfill. At a high level, most of these are methods to integrate them into known storage management notions. We look into them next.

4.1 CLOUD-STORAGE SERVICE AUDITING AND CONSOLIDATION

Changing a storage environment to incorporate a cloud provider's storage requires service-plan tracking to manage user accounts, billing, access, and usage. Consolidation of user accounts for volume pricing and indirect billing helps reduce cost. Also, auditing of all network IOs to the cloud provider is key to providing the customer with the data necessary to validate the cloud provider's actual cost and to project future costs.

    4.2 ANALYTICS BUILT INTO THE GATEWAY

To manage accounts and optimize savings, report generation for metrics such as caching efficacy, bandwidth usage, and object transfer sizes is required to validate the effectiveness of cloud storage versus traditional storage. These analytics might also identify performance bottlenecks and help in provisioning the right number of gateways and cloud storage volumes.

    4.3 PROVISIONING MANAGEMENT

As is true for traditional storage, administrators should expect a gateway to offer simplified provisioning. For example, because cloud storage provides dynamic capacity expansion, creating a thin-provisioned volume from a gateway should be a simple procedure. Expanding storage is straightforward, but when a gateway releases storage that is unused (from the user's perspective), it may not release the equivalent amount from the cloud storage provider. This can be true of gateways that translate block protocols to object-based cloud storage without a direct mapping. Thus, releasing capacity may require local server agents to release unused data blocks from the gateway and, in turn, objects in the cloud.
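The block-to-object mismatch behind this can be shown with a small sketch: when fixed-size blocks are packed into larger cloud objects, an object can be deleted only once every block inside it has been freed. The function below is an illustrative model, not any product's space reclamation logic.

```python
from collections import defaultdict


def releasable_objects(freed_blocks, blocks_per_object):
    """Given block numbers the client has freed, return the cloud objects
    that can actually be deleted: only objects whose blocks were ALL freed.

    This is why releasing capacity locally does not always shrink the
    cloud footprint by the same amount.
    """
    freed_per_object = defaultdict(set)
    for blk in freed_blocks:
        freed_per_object[blk // blocks_per_object].add(blk % blocks_per_object)
    return sorted(obj for obj, freed in freed_per_object.items()
                  if len(freed) == blocks_per_object)
```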

    4.4 DR AND BACKUP INTEGRATION

Because DR and backup workloads are an important use-case for a gateway, it is appropriate to integrate the gateway with existing mechanisms for performing these operations. For example, backup applications and protocols like Microsoft's VSS (Volume Shadow Copy Service), NDMP (Network Data Management Protocol), and Symantec's OST (Open Storage Technology) are relevant here.

    4.5 FILE SYSTEM INTEGRITY PROTECTION MANAGEMENT

Cloud gateways provide the ability to share a global file system made available to geographically distributed datacenters. Some vendors, like Panzura, have touted this as one of their major features, and it definitely distinguishes gateways from other traditional storage appliances. However, maintaining file system integrity in light of sharing between clients in different datacenters requires mechanisms like global lock management and global file synchronization.
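As a rough illustration of the global lock management mentioned above, consider a toy lock service arbitrating between gateways at different sites. The class and behavior are invented for the example; real systems add leases, fencing tokens, and cache-invalidation callbacks.

```python
class GlobalLockManager:
    """Toy global lock service coordinating WAN-distributed gateways."""

    def __init__(self):
        self.owners = {}  # file path -> gateway id currently holding the lock

    def acquire(self, path, gateway):
        if self.owners.get(path, gateway) != gateway:
            return False  # another site holds the lock; caller must wait
        self.owners[path] = gateway
        return True

    def release(self, path, gateway):
        if self.owners.get(path) == gateway:
            del self.owners[path]
            # A real lock server would now send cache-invalidation callbacks
            # so other gateways drop any stale cached copies of `path`.
```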


5 SOLUTION DEPLOYMENT

In this section, we compare and contrast different deployment options for the cloud gateway functionality in an enterprise datacenter.

    5.1 LOCATION IN DATACENTER

The cloud gateway needs the best possible access to WAN connectivity within the datacenter. To a large degree, both the throughput and the latency experienced in accessing cloud storage are dictated by WAN characteristics. Of the two, throughput can be kept close to the available bandwidth by associating with the appropriate physical datacenter of the cloud storage provider's network and by sufficient parallelism in software (multiple open connections to the cloud storage provider). For latency, each extra network hop in the local datacenter before reaching the WAN connection adds to the overall latency. Therefore, the cloud gateway should be placed in the network topology at minimal distance from the datacenter's WAN connection.
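The software-parallelism point can be sketched with a thread pool that keeps several PUTs in flight at once; `put_object` stands in for whatever provider API the gateway uses, and the connection count is an illustrative tunable.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_parallel(put_object, objects, connections=8):
    """Keep WAN throughput near the available bandwidth by issuing PUTs over
    several concurrent connections instead of one serial stream.

    `put_object(name, data)` is a placeholder for the provider's upload call.
    """
    with ThreadPoolExecutor(max_workers=connections) as pool:
        futures = [pool.submit(put_object, name, data) for name, data in objects]
        return [f.result() for f in futures]  # preserves submission order
```

With a single connection, each upload pays a full WAN round trip before the next begins; overlapping them hides that per-request latency.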

5.2 MERITS/DEMERITS OF AN APPLIANCE DEPLOYMENT

Some cloud gateway vendors package their functionality in a dedicated physical appliance. This approach has many benefits: dedicated physical resources, performance isolation, fault isolation, typically better control over performance, a leaner data path, etc. Of these advantages, the ones that influence performance are prominent. Since the cloud gateway needs a write-back cache in most models, the cache can be made reliable with very little effect on performance by using specialized hardware (such as NVRAM, or high-speed interconnects to mirror contents to another node). This approach is typical of many primary storage systems. In addition, having dedicated memory and CPU resources just for the gateway functionality enhances predictability in performance.

There are many disadvantages to this approach as well. First, a dedicated physical appliance typically comes at a higher cost. Second, deploying a physical appliance is more time-consuming and expensive for the admins of a datacenter. Third, the opex components of a physical appliance, including rackspace, cooling/heating costs, and power, are not insignificant. Last but not least, for the vendors manufacturing the systems there are more dimensions to handle (suppliers/inventory control, qualification) and a longer product cycle, resulting in a longer ROI.

    5.3 MERITS/DEMERITS OF VM DEPLOYMENT

Most cloud gateway vendors offer their solution as a VM. This has been influenced largely by the market into which they are positioning their solution: the MSE/SMB market has been the focus of many startup vendors. The VM solution is ideal for such cost-constrained environments, where the higher performance of a dedicated appliance is not as important. Constrained in a VM, the cloud gateway is forced to share resources: CPU, memory, and storage devices. Moreover, the management of the VM has to be integrated with the hypervisor's management processes and tools.

A VM-based deployment model has its own unique advantages. It is possible to deploy many smaller virtual appliances on each host, such that the combined resources equal or exceed those of a physical appliance. In addition, being closer to the application VMs allows the caches in the gateway VMs to be more effective. Beyond forgoing the advantages listed for a physical appliance deployment, the key disadvantage is that performing any global optimization that entails cross-gateway communication is expensive and is avoided. For example, the cloud gateways can only perform deduplication within the I/O streams originating at their hosts and cannot deduplicate across the gateways. Therefore, some


duplicates will find their way to the cloud storage, resulting in extra costs for the admins. Also, with a VM-based deployment, performance expectations need to be appropriately calibrated.

    5.4 USE-CASES

The different architectural models enable one or more enterprise use-cases for the cloud gateways, and a single architecture could support multiple use-cases. The following are some important use-cases for cloud gateway deployments.

    5.5 CASE 1: BACKUP/COLD DATA

This is by far the most common case where cloud gateways are employed today: the gateway is used to back up data to the cloud. Typically, the backup copies consume a lot of storage because of traditional policies (a full backup every week, with incrementals every day). In datacenters today, dedicated storage appliances like Data Domain's disk-based backup systems are prevalent. With this model, there are significant overheads: raw storage costs (in spite of deduplication), storage administration costs, datacenter costs, and provisioning/planning for storage growth. In addition, backup and recovery software like Symantec or CommVault needs to be employed. Backing up to the cloud is cost-effective: less hardware in the local datacenter, minimal or no storage admins needed, and storage needs met dynamically.
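A rough back-of-the-envelope calculation shows why the traditional weekly-full/daily-incremental policy consumes so much storage. All parameters here (5% daily change rate, dedupe ratio) are illustrative assumptions, not vendor figures.

```python
def backup_footprint_gb(primary_gb, weeks, incr_change_rate=0.05, dedupe_ratio=1.0):
    """Rough storage needed for the traditional backup policy: one full
    backup per week plus six daily incrementals, retained for `weeks`."""
    fulls = weeks * primary_gb
    incrementals = weeks * 6 * primary_gb * incr_change_rate
    return (fulls + incrementals) / dedupe_ratio
```

For a 1TB primary volume retained for four weeks, this naive policy needs over 5TB of backup storage before deduplication, which is the kind of growth that makes dynamically provisioned cloud capacity attractive.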

    5.6 CASE 2: PRIMARY DATA {CIRTAS, STORSIMPLE, NASUNI}

A small number of cloud gateway vendors position their product as a one-stop solution for all storage needs. They claim that they can cache the most performance-critical working sets on their appliances (virtual or physical), enabling primary datasets to be stored through the cloud gateway. These appliances are expected to understand the data access properties of the different data entities (files, blocks, or objects) stored on them and to transfer only the appropriate ones to the cloud. In addition, they typically perform effective prefetching from the cloud to avoid WAN latencies.

These vendors are careful not to position their appliances for highly latency-sensitive tier-1 applications like OLTP. They are targeting tier-2 application content, like Exchange or SharePoint databases, whose performance requirements they feel they can satisfy through careful analysis of data access properties. Moreover, compared to OLTP datasets, tier-2 applications typically generate more data; their storage growth trends can therefore make moving them to the cloud economically viable.

    5.7 CASE 3: DISASTER RECOVERY AND COMPLIANCE COPY

With this approach, the traditional tape-based, off-site DR copy is replaced by a copy in the cloud. The datacenter is expected to have a local disk-based backup appliance for operational recovery; data is retrieved from the cloud only when access to the datacenter is completely lost. In this use-case, data is sent to the cloud continuously but hardly ever read back.

A similar use-case is keeping a copy in the cloud for compliance purposes. Sarbanes-Oxley and HIPAA regulations force the respective verticals to maintain fine-grained data for prolonged periods of time, with the ability to recover it when needed. Maintaining an off-site datacenter just for compliance reasons is expensive; a cloud copy kept with a provider offering reasonable reliability and availability guarantees is a good option to keep costs low. Amazon's S3 provides different levels of reliability and availability, with a matching cost spectrum, to enable such use-cases.


    5.8 CASE 4: REDUNDANT DATA BLOAT {NETFLIX USE}

In the rich-content space, there is a need to keep multiple copies of the same content at different resolutions. A case in point is Netflix: they need to maintain multiple copies of their online movie content at different resolutions. Coupled with the number of movies they have, this leads to a storage explosion. Not all resolutions are in use at the same time; each resolution could be appropriate for a different device on which the movies can be played. A similar case can be made for online photo repository or sharing services. In such cases, to satisfy storage growth, it makes sense to put such content on an external cloud storage service and stream it directly from there.


    6 COMPETITIVE LANDSCAPE

    6.1 CLOUD STORAGE GATEWAY ECOSYSTEM

Table 1 below lists the key products in this space, with some of their main attributes along with their differentiators.

Table 1: Cloud Gateway vendors, features, and differentiators

Vendor/Product Name | Form Factor | Use-case Focus | Block/File | Supported Clouds | Key Differentiators
Arkeia | Hardware appliance | Data protection only | File | S3, Arkeia cloud | Integrated backup/DR, source-based dedupe
Axcient | Hardware appliance | Data protection only | File | Axcient cloud | Integrated data protection and business continuity
CTERA | Hardware appliance + backup agents | Data protection, file sharing | File | S3, EMC Atmos, Rackspace, Hitachi HCP, Mezeo, Scality, Nirvanix, IBM GPFS, Dell DX/Carringo | All-in-one
Egnyte | PC agent, hardware or virtual appliance | Cloud file server, file sharing, file backup | File | Egnyte cloud only | Ease of use; centrally managed cloud file server with local edit and offline access
Gladinet | Software | Cloud desktop, cloud server | File | Mezeo, S3, AT&T Synaptic Drive, Internap, Google, Box.net, OpenStack, Nirvanix, Rackspace CloudFiles, Azure | Wide choice of storage clouds, low cost
Hitachi Data Ingestor | Hardware appliance with HA | Data protection, archiving (private cloud) | File | HDS Hitachi Content Platform only | Centrally manage and control data at the edge
MS i365 | Software, hardware | Data protection | File | i365 cloud | Range of services, Microsoft DPM integration
Nasuni | Hardware or virtual appliance | Primary NAS, data protection | File | S3 | 100% uptime SLA
Nirvanix CloudNAS | Software feature | Cloud filer, sharing | File | Nirvanix only | Free of charge, ease of use
Panzura Alto Cloud Controller | Hardware appliance (with HA) or virtual appliance | Primary, collaboration, archiving | File | S3, Limelight CDN, Microsoft Azure, AT&T Synaptic Storage, Nirvanix | Global namespace; global data replication and locking; global deduplication
Riverbed Whitewater | Hardware or virtual appliance | Data protection only | File | S3, Nirvanix, AT&T Synaptic Storage | Experience in WAN bandwidth/latency optimization
Seven10 StorFirst EAS | Software | Archiving only | File | EMC Atmos, AT&T Synaptic Storage, Dell DX6000 + others | Multi-vendor, multi-platform, and multimedia archiving
StorSimple | Hardware appliance (with HA) | Primary, secondary, data protection | Block | AT&T Synaptic Storage, S3, EMC Atmos, Microsoft Azure | Microsoft and VMware certification
TwinStrata CloudArray | Hardware or virtual appliance (with HA) | Secondary, data protection | Block | S3, EMC Atmos, Mezeo, Scality | DR anywhere, compute anywhere
EMC Atmos GeoDrive | Software feature | Cloud filer, sharing | File | Atmos | Ease of use, integration with Atmos

As can be seen, the preferred use-cases for most of these cloud gateway products are data protection and secondary storage; very few explicitly target the primary space. Also, as of now, S3 seems to be the cloud storage provider of choice, and EMC is partnering with many of these vendors to fuel Atmos cloud deployments. A notable feature is that, with few exceptions, most of them handle data at file granularity. For vendors that focus specifically on data protection, it is reasonable to operate at the level of files. However, for vendors offering cloud gateways for primary, collaboration, and sharing workloads, the rationale for file-level granularity needs to be established. In the rest of this section, we present more details of a few select products; each represents one particular type of cloud gateway architecture.

6.2 NASUNI

The Nasuni Filer is an on-premises storage device that serves as a cache to the cloud storage, where the primary copy of the data resides. It is available both as a virtual machine (on VMware or Microsoft's Hyper-V servers) and as a physical appliance. It supports NFS and CIFS, with full integration with Active Directory, DFS, and older Windows versions. Key differentiating features:

Performance with unique caching algorithms: Active data is stored in the cache for local storage performance, while relatively inactive data, typically bulk file data, is stored off-site in the cloud storage. For this off-site data, all metadata is cached. Upon a cache miss, the file requested is returned in chunks so that users can access the data without waiting for the whole file to be brought into the local cache. There are special algorithms to handle metadata versus data, so that the system remains responsive when the user is scanning directory listings and browsing folders. Being file-based


High-speed SSDs for performance acceleration: Up to 12 SSDs to hold frequently used data (front-end cache) to support multiple concurrent users. Admins can assign tiering policies so the .VMDKs get the performance of flash but ARCHIVE.PSTs stay on disk.

High availability with redundant components: RAID-5 or RAID-6 protection, hot-swappable drives, redundant power supplies and fans.

6.4 STORSIMPLE

StorSimple separates off the top two tiers of an enterprise storage array, the SSDs and fast SAS disk drives, and puts them in a 3U hardware appliance, with the rest of the array (the bulk data storage part) replaced by the cloud. It offers the Armada hybrid cloud storage appliance as a primary storage alternative to conventional block storage systems in midsized companies (~500 users) and departments within enterprises. The iSCSI-based appliance is positioned as all-in-one primary storage, archive, backup/recovery, and DR in a single box.

Four storage tiers: SSD linear (raw, tier 1); SSD deduplicated (tier 2); SAS deduplicated and compressed; cloud deduplicated, compressed, and encrypted.

Weighted Storage Layout (WSL) and BlockRank algorithm: Figures out what data is relevant to an application over a period of time and makes sure this hotspot/working-set data stays on the StorSimple appliance while colder data goes out to the cloud. Transparently moves data across tiers of storage to optimize performance and cost; for example, crossing an 85% utilization threshold causes spilling downward to a lower-cost, lower-performance tier. WSL is automatic, dynamic, and operates in real time. Data is carved into variable-length blocks. WSL works at the block level and uses the "BlockRank" to order blocks in terms of their usage patterns, frequency of use, age, reference counts, and the relationships segments have with each other, to find the right storage tier. Spilling can be controlled by a per-volume priority setting (local-preferred, normal, or cloud-preferred).

Application integration and application-specific optimization for Microsoft SharePoint and Exchange 2010: Has application-optimization plug-ins that maximize performance on a per-volume basis. Leverages Microsoft's EBS and RBS APIs with SharePoint, wherein the SQL Server database is always stored on SSD, whereas the content, including BLOBs like audio, video, and CAD drawings, can be spread over SSD, SAS drives, or the cloud. With Exchange, leverages deduplicated primary storage and the cloud to support DAG, increased mailbox quotas, and PST centralization. Can recover individual items or full mailboxes.

High availability: Offers dual controllers for enterprise-grade HA, redundant power supplies, redundant network connections, and no single point of failure. Also supports non-disruptive software upgrades. Certified by Microsoft and VMware.

Others: Concurrent inline block-level dedupe using variable-length sub-block segmentation; Cloud Snapshots for data protection and backup/recovery; Cloud Clones for off-site backup, geo-replication, DR, and tape replacement; and Cloud Bursting for compute.
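The WSL/BlockRank idea of ranking blocks and spilling past a utilization threshold can be illustrated with a toy placement function. The scoring formula below (access frequency discounted by age) is an invented stand-in, not StorSimple's actual algorithm, and the 85% threshold is taken from the description above.

```python
def assign_tier(blocks, local_capacity, utilization_threshold=0.85):
    """Toy working-set placement: score blocks by frequency and recency,
    keep the hottest locally, and spill the rest to the cloud once the
    appliance passes its utilization threshold.

    `blocks` is a list of (block_id, size, access_count, last_access_time).
    """
    now = max(b[3] for b in blocks)
    scored = sorted(blocks,
                    key=lambda b: b[2] / (1 + now - b[3]),  # frequency / age
                    reverse=True)
    budget = local_capacity * utilization_threshold
    local, cloud, used = [], [], 0
    for block_id, size, _, _ in scored:
        if used + size <= budget:
            local.append(block_id)   # hot: stays on the appliance
            used += size
        else:
            cloud.append(block_id)   # cold: spills to the cloud tier
    return local, cloud
```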

6.5 CTERA

CTERA offers a Cloud Attached Storage solution for SMBs and enterprise branch offices that combines secure cloud storage services with on-premises appliances and managed agents in an all-in-one solution for backup and recovery, shared storage, and file-based collaboration. Its products are all hardware appliances, ranging from the consumer-centric CloudPlug to the enterprise-grade C800 8-bay appliance. It supports file-based protocols such as NFS and CIFS.

Tiny form-factor offering: CloudPlug is a plug-top computing device that instantly transforms any external USB/eSATA drive into a NAS device with automatic secure cloud backup, without the need


for any user intervention or PC client software, and allows remote access, file sharing, and synchronization.

Next3 file system for thin-provisioned snapshots: Developed on top of ext3, this creates snapshots using dynamically allocated space, so there is no need to pre-allocate and waste valuable space on the disk, and unused space is automatically recovered for file system use. It works by creating a special, sparse file (that takes no space at the outset) to represent a snapshot of the filesystem. When a change is made to a block on disk, the filesystem first checks whether that block has already been saved in the most recent snapshot. If not, the affected block is moved over to the snapshot file, and a new block is allocated to replace it. Writes take a little longer due to the need to move the old block. Over time, this fragments the contiguous on-disk layout that ext3 tries to create, affecting streaming read performance.

File- and disk-level backup: Backups can be stored both locally and in the cloud. Individual file backup/restore as well as incremental, disk-level (bare-metal) backup of live servers is possible for entire-system recovery. Supports built-in and custom-created "Backup Sets," where each represents a group of files of certain types and/or located in specific folders. Block-level as well as partial-file deduplication is done.

Full remote management: All aspects of CTERA's solution can be managed remotely, with no on-site presence or intervention. Only a Web browser is required to access and configure every feature, including firmware updates, real-time monitoring, and event notifications.
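The Next3 copy-on-write scheme described above can be sketched in a few lines. This is a simplified dictionary model of the mechanism, not the actual ext3-based on-disk implementation.

```python
class CowSnapshotFS:
    """Sketch of copy-on-write snapshots: before a block is overwritten,
    its old contents move into a sparse snapshot store, unless the current
    snapshot has already captured that block."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # live filesystem blocks
        self.snapshot = None        # sparse: holds only displaced blocks

    def take_snapshot(self):
        self.snapshot = {}          # takes no space at the outset

    def write(self, addr, data):
        if self.snapshot is not None and addr not in self.snapshot:
            # First overwrite since the snapshot: preserve the old block.
            self.snapshot[addr] = self.blocks.get(addr)
        self.blocks[addr] = data    # the extra move is why writes slow down

    def read_snapshot(self, addr):
        if self.snapshot is not None and addr in self.snapshot:
            return self.snapshot[addr]
        return self.blocks.get(addr)  # unchanged blocks read from live data
```

Only overwritten blocks consume snapshot space, which is the thin-provisioning property the feature advertises.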

6.6 AMAZON AWS CLOUD GATEWAY

This beta offering from Amazon is a service connecting an on-premises software appliance with cloud-based storage to provide seamless and secure integration between on-site IT environments and AWS's cloud storage infrastructure. It is a virtual appliance running atop the VMware ESXi hypervisor on a physical machine, with 7.5GB RAM for the VM and 75GB of local disk storage (DAS or SAN). It exposes an iSCSI-compatible interface. It complements on-premises storage by preserving low-latency performance, while asynchronously uploading the data to Amazon S3.

Versioned, compressed EBS snapshots: The gateway proactively buffers writes temporarily on on-premises disks, before compressing and asynchronously uploading them to Amazon S3, where they are encrypted and stored as an Amazon EBS snapshot. Each snapshot has a unique identifier for point-in-time recovery: it is mounted as a new iSCSI volume on-premises, and the volume's data is loaded lazily in the background.
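The buffer-then-upload write path described above can be sketched with a background worker (the in-memory queue and the `uploaded` list are stand-ins for the gateway's on-disk buffer and the S3 API; this is not AWS's implementation):

```python
import queue
import threading
import zlib

# Sketch of the write path: writes are acknowledged once buffered locally,
# then compressed and uploaded asynchronously by a background worker.

buffer_q = queue.Queue()
uploaded = []  # stand-in for Amazon S3

def uploader():
    while True:
        item = buffer_q.get()
        if item is None:
            break  # shutdown sentinel
        offset, data = item
        uploaded.append((offset, zlib.compress(data)))  # compress, then "upload"
        buffer_q.task_done()

worker = threading.Thread(target=uploader)
worker.start()

def write(offset, data):
    # The application sees low local latency: the write returns as soon as
    # it is buffered; the cloud upload happens in the background.
    buffer_q.put((offset, data))

write(0, b"block zero" * 100)
write(4096, b"block one" * 100)
buffer_q.join()       # wait for the async uploads to drain
buffer_q.put(None)
worker.join()
print(len(uploaded))  # 2
```

In a real gateway the buffer would be durable (on-premises disk) so that buffered-but-not-yet-uploaded writes survive a restart.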

Backup, DR, Workload Migration Use Cases: Provides low-cost offsite backup using snapshots. Amazon S3 redundantly stores these snapshots on multiple devices across multiple facilities, quickly detecting and repairing any lost redundancy. If on-premises systems go down, users can launch Amazon EC2 compute instances, restore snapshots to new EBS volumes and get the DR environment up and running with no upfront server costs. To leverage Amazon EC2's on-demand compute capacity during peak periods, or as a more cost-effective way to run normal workloads, the Gateway can be used to move compute to the cloud by mirroring on-premises data to Amazon EC2 instances. It can upload data to S3 as EBS snapshots, from which new EBS volumes can be created using the AWS Management Console or Amazon EC2's APIs, and attached to EC2 instances.

Monitoring Metrics via Amazon CloudWatch: Provides insight into on-premises applications' throughput, latency, and internet bandwidth to S3.

Bandwidth Throttling: Can restrict the bandwidth between the gateway and the AWS cloud based on a user-specified rate for inbound and outbound traffic.
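Rate limiting of this kind is commonly implemented as a token bucket; a minimal sketch (class and parameter names are illustrative, and AWS does not document its actual mechanism):

```python
import time

# Token-bucket sketch of user-specified bandwidth throttling: tokens
# accrue at `rate` bytes/sec up to `burst`, and a transfer proceeds only
# when enough tokens are available. Names/values are illustrative.

class TokenBucket:
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens = burst
        self.last = time.monotonic()

    def consume(self, nbytes):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True   # send this chunk now
        return False      # caller should wait and retry

bucket = TokenBucket(rate=125_000, burst=64_000)  # cap uploads at ~1Mbit/s
print(bucket.consume(64_000))  # True: the burst allowance covers it
print(bucket.consume(64_000))  # False: the bucket must refill first
```

Separate buckets for inbound and outbound traffic give the independent limits the feature describes.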

Gateway-Cached Volumes: Future support, wherein only a cache of recently written and frequently accessed data will be stored locally on on-premises storage hardware, while the entire data set will be in the cloud. Fits the cloud-as-primary-storage case, with low access latency to active data only.
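The gateway-cached model reduces to a local cache (LRU here) backed by an authoritative cloud copy; a toy sketch (the cache capacity, names and dict-based "cloud" are illustrative assumptions, not AWS's design):

```python
from collections import OrderedDict

# Toy gateway-cached volume: the cloud holds the full data set; only
# recently used blocks stay on the local appliance (LRU eviction).

cloud = {}       # authoritative, effectively unbounded cloud copy
CACHE_CAP = 2    # tiny local cache, for illustration only
cache = OrderedDict()

def _evict():
    if len(cache) > CACHE_CAP:
        cache.popitem(last=False)  # drop the least recently used block

def write(block, data):
    cloud[block] = data    # the full data set always lives in the cloud
    cache[block] = data    # recently written data is cached locally
    cache.move_to_end(block)
    _evict()

def read(block):
    if block in cache:     # hot data: low-latency local hit
        cache.move_to_end(block)
        return cache[block]
    data = cloud[block]    # cold data: fetched from the cloud copy
    cache[block] = data
    _evict()
    return data

write("a", b"1")
write("b", b"2")
write("c", b"3")      # local cache is full: "a" is evicted locally
print("a" in cache)   # False -- but it is still safe in the cloud
print(read("a"))      # b'1', transparently fetched back into the cache
```

Only the active working set pays local capacity, which is exactly the trade-off the feature targets.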


generated $388m in 2010, and will grow at a CAGR of 25% to reach $1.18bn in 2014 (see Figure 7). These figures are for business-centric cloud storage and not consumer cloud storage. A significant point is that, compared to all other cloud-based services, this sector has been experiencing the highest growth.
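As a sanity check, the standard CAGR relation end = start × (1 + rate)^years reproduces the cited endpoint over roughly five compounding periods (the five-period horizon is our inference; the report states only the 2010 base and 2014 target):

```python
# Sanity check of the quoted StaaS growth figures using the CAGR relation.
start, rate = 388, 0.25          # $388m in 2010, 25% CAGR
projection = start * (1 + rate) ** 5
print(round(projection))         # 1184, i.e. roughly the $1.18bn cited
```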

Probing a bit further, we break the StaaS market into three segments: stand-alone cloud storage, online backup and archiving. As Figure 8 shows, an overwhelming portion of cloud storage objects are backup streams (online backup) and archival datasets. We expect this trend to continue at least in the near future. In light of this observation, the role of dedicated backup appliances in the datacenter is expected to diminish. More backup objects will move to the cloud, provided the throughput requirements for the backup streams are met when the target is cloud storage.

    Application Areas July 2011 April 2011

    Email Systems 39% 35%

    Customer Relationship Management (CRM) 35% 29%

    Document and Enterprise Content Management (ECM) 22% 17%

    Collaboration Tools 22% 22%

    Business Intelligence / Reporting and analytics (BI) 21% 14%

    Disaster Recovery/Failover 20% 20%

    B2B e-Commerce (Business to Business) 17% 16%

    Enterprise Resource Planning (ERP) 11% 7%

    Test and Development 11% 9%

    Supply Chain Management (SCM) 10% 6%

    Table 2: Current cloud usage (Source: ChangeWave, Corporate Cloud Computing Trends, Aug 2011)

    Figure 8: Segmentation of StaaS capacity


Under gateway potential, we see the typical Tier-2 applications, mainly primary workloads: email, collaboration, workgroup files and development/test. These workloads are characterized by applications that care about latency and throughput, but can tolerate some variation in both. From NetApp's perspective, it would be useful to understand our share of this tier to assess if cloud gateways represent a threat to an existing sales segment.

To understand the impact of cloud gateways on NetApp's revenue streams, Figure 10 shows the breakdown of our current revenue based on both products and workloads over the different

[Figure 10 data not reproducible in text form: FY11 and FY16 market size ($B), FY11 NTAP share (%), FY11-16 CAGR (%), and FAS/E-Series revenue ($B) broken down by workload segment -- file services/home directories; virtualized and virtualizing Tier-2 BP, collaboration, app dev, parts of IT and web infrastructure; HPC, FMV, VSS; big analytics (DSS/DW, web infrastructure); enterprise content depots; and active/dark archives -- with a 15.4% share noted in the open networked storage market.]

    Figure 10: NetApp share based on workloads

    Figure 9: Cloud Gateway workloads (Source: Gartner)


financial years. It is clear that the bulk of the revenue is obtained from two broad segments: a) Collaboration, App Development, Tier-2 business processing applications and web infrastructure, both virtualized and non-virtualized segments, contributing $3.6B towards total revenue; b) File services and home directories, contributing $1.4B to the total revenue. As can be seen, a significant portion of these overlap with the workloads that can potentially be satisfied with a cloud gateway (as seen in Figure 5). This is the clear potential threat that NetApp faces, to its current revenue generation model, from the emerging cloud gateway vendors.

7.2.2 Cloud gateway vendors' share

Since this market is very nascent, most cloud gateway vendors do not have more than a dozen customers. Some of the vendors, like Panzura, sell their solution exclusively to enterprises; in spite of the small number of customers, each customer might lead to bigger revenue. On the other side of the spectrum, there are vendors like CTERA who have a consumer focus and a larger number of customers. StorSimple and others like them fall in between these two extremes. In Figure 11, we have an approximate number of customers for many of these cloud gateway vendors, their focus and their investors. These numbers indicate that this market is pretty nascent. However, the emergence of Amazon's AWS gateway and other gateways from established enterprise storage system vendors might change the landscape significantly.

From this Figure, we can also see that more than $100m of VC funding has been infused into this space. Most vendors have an SMB or MSE focus, with some notable failures as well (Cirtas). There is a high-level consensus among different analysts like Gartner, ESG and 451 Research Group in their studies of this market:

a. Current vendors are mainly startups focusing on SMB/MSEs with backup/archival as the primary use case.

b. Most vendors have significant disadvantages compared to enterprise storage systems today: lack of standard high-availability/reliability features and enterprise readiness (untested filesystems). Only a few of them have HA as a default option.

    Figure 11: Cloud Gateways, customers and investors


c. Established enterprise storage vendors (like EMC or NetApp) can develop solutions that augment their existing solution portfolios, i.e., make cloud storage an extra tier as well. In addition, they are not constrained by many of the disadvantages of the smaller competitors, especially enterprise readiness. The analysts feel that once one of the established storage vendors has a solution, cloud gateways will be adopted faster in enterprise data centers.

7.3 CASE STUDIES

7.3.1 MedPlast

MedPlast provides thermoplastic and elastomer molding products and related services to the healthcare, pharmaceutical and certain consumer/industrial markets. The company has about 600 employees across five sites and is approaching $100 million in revenue. It fits clearly in the MSE space that many gateways are targeting.

MedPlast's IT department consists of just one person running all IT-related operations. Accordingly, it has a heavy emphasis on outsourcing IT processes and applications, retaining only critical systems under direct control. The company already uses SaaS and hosted applications (like Rackspace-hosted Exchange). However, it generates a lot of critical data internally, mainly manufacturing and engineering related. It used a tier-one Hitachi SAN for this data. MedPlast was using EMC's Data Domain target arrays for backup/recovery via Veeam application instances, and tape backups for off-site DR.

Upgrading the Data Domain systems to keep pace with growing data volumes, and the corresponding upgrade cycles, led MedPlast to explore cloud storage for backup/recovery. This led them to cloud gateways, as they offer both primary storage and a means to back up into cloud storage, while also eliminating tape backup in the process.

MedPlast worked with Cirtas but replaced them with StorSimple. StorSimple offered them an iSCSI interface to their applications and enterprise-grade capabilities (specifically high availability), and was a certified VMware and Microsoft partner (vendors MedPlast uses extensively). Regarding the cloud storage provider, they chose Amazon S3 because of its ability to geo-replicate for no extra charge. The StorSimple deployment at MedPlast consists of an HA pair, with 10TB of data local and 4TB in the cloud. These storage systems are used by MedPlast as primary storage for mission-critical applications and other Tier-2 apps: ERP, SharePoint and file servers. Therefore, usage of Tier-1 storage has been reduced considerably. Also, StorSimple enables simpler backup and DR operations and eliminated tape backups entirely. In addition, snapshots are also available via StorSimple for local recoveries.

In summary, MedPlast was able to solve, via StorSimple, all the storage requirements (primary, backup and DR copies) necessitated by its growth.

7.3.2 NYU Langone Medical Center

NYU Langone Medical Center comprises the NYU School of Medicine and three hospitals, and has a threefold mission: patient care, biomedical research and medical education. The storage-engineering department at the center has faced numerous challenges in recent years, which led it to evaluate cloud-based options. The department had a four-tier storage strategy that it was looking to squeeze more efficiencies out of: tier-1 was a high-performance SAN running on EMC VMAX; tier-2 was IBM XIV; tier-3 was NAS (Windows-based file server clusters mapped to SAN); and tier-4 was off-site storage for archive/retention and setup.

Key challenges included data growth on the order of tens of terabytes, such that investing in individual storage systems was no longer feasible. Also, the center is planning to move its primary datacenter (currently housed in an IBM-hosted and managed facility, where they have about 70TB of tier-1 storage) to reduce operational costs for storage and data retention.


They evaluated systems from IBM and EMC, but neither could offer performance or cost advantages for tier-4 storage. They started looking at cloud storage providers and settled on Nirvanix for the off-site cloud component, primarily based on cost. Nirvanix was able to offer storage at $0.15/GB/month, including unlimited data movement into and out of the cloud, versus the $0.87/GB/month it was paying for IBM storage. With Nirvanix, it needed a means to send data to the cloud that met its performance requirements; this effort led to investigating cloud gateway options. Key requirements included performance and scalability to hundreds of users, as well as high availability, seamless/transparent data access (over CIFS and NFS) and efficiency (dedupe and compression). They settled on Panzura's Alto Cloud Controller after evaluating a number of options. They have recently finished a pilot deployment with two 20TB controllers, with plans to move into production by the end of 2011. Panzura's selection was based on its ability to have a large front-end cache with SSDs that allowed it to scale to hundreds of users concurrently. Panzura's global namespace was a factor too. It enables their infrastructure to grow incrementally by adding gateways at remote locations with a single namespace and point of management.

The center is initially using Panzura for its research department, replacing low-end NAS units that individual researchers had purchased in the past. But it quickly realized that other potential workloads, including archival workloads (it currently archives to EMC Centera via Symantec's Enterprise Vault), can move to the gateway. It is also considering Panzura/Nirvanix to replace their tape-based backup system. According to them, Panzura's snapshots and Nirvanix's replication functions effectively remove the need for backup. In all, 100TB of local data can move to the cloud. The center is aware of Panzura's shortcomings in their essentials list: richer Active Directory integration; end-user snapshot-based recovery; and global read/write for NFS as well as CIFS. However, they believe the Panzura/Nirvanix combination could form an integral part of their next-generation datacenter buildout. Initially, Nirvanix was used only for tier-4 storage, but with a Nirvanix hNode deployed in their local datacenter, they are contemplating using Panzura/Nirvanix for both tier-4 and tier-3. Panzura would be used to front-end the hNode appliance and export a global namespace.

    7.3.3 Seneca Data

Seneca Data is an IT value-added distributor and custom systems manufacturer that focuses on building integrated systems and related services for resellers and OEMs, spanning servers, desktops/laptops and storage. Its business is split into three divisions: partnering services, engineering services and life-cycle management services.

The company has been experimenting with cloud-based technologies and services via its CLOUDeCITY (www.cloudecity.com) service, a marketplace for its 3000-4000 resellers to offer cloud services to SMBs and others looking for web-based tools and applications to enable sales, marketing and operational tasks.

Seneca believes that CLOUDeCITY offers resellers a recurring revenue model without much investment in a complex and expensive infrastructure. The portal, started in 2011, initially offered modest services: CRM (based on SugarCRM), website/email hosting and blogging tools. However, interest has grown, and there is demand for higher-functionality tools such as business-productivity and finance tools, and even specific, vertically oriented tools for areas like healthcare.

As part of this expansion, Seneca formed a partnership with CTERA to add managed storage and data-protection services based on CTERA's cloud storage gateway. Although Seneca markets an online backup service called DataMend for users that require a more customized offering, it found value in CTERA because of its simplicity and ease of use (plug-and-play functionality).

The partnership initially focused on CTERA's CloudPlug, offering shared local storage, cloud backup, snapshots and browser-based file access, aimed at consumers and small-business users. However, it has since added the CTERA C200 and C400 appliances, which offer 6TB and 12TB of local storage, respectively, as well as integration with the CTERA cloud for backup, remote access and collaboration. Seneca has plans for larger appliances from CTERA in the near future. Currently, CTERA charges a monthly fee for the CloudPlug, C200 and C400 appliances, which increases with more storage. In


addition, CTERA has workstation agents that can back up/recover Microsoft Exchange, SQL and Active Directory data. As of now, Seneca claims their customers are pushing around 30% of their locally backed-up data to the cloud.

    7.3.4 Energy-industry customer

One of Panzura's early wins was with a large energy-industry customer that deployed Panzura's Alto Cloud Controllers to create a private cloud storage alternative to off-site tape repositories. Prior to the Panzura deployment, the customer used tapes to move older seismic data to an off-site repository, which created potential data-leakage vulnerabilities since the tapes were not encrypted.

The customer creates 6PB of data per year, consisting of seismic trace files that are a few hundred terabytes in size. Prior to its Panzura hybrid cloud deployment, data-restoration jobs could potentially take weeks to accomplish using the off-site tape repository, a process that takes only a few hours using cloud storage.

Beyond backup, and more importantly, the customer is using the cloud to keep more of its data set online, and could potentially use hybrid cloud storage to extend data access to its partners. The company uses BlueArc and NetApp NAS systems to hold live data, while offloading older data sets to the off-site tape archive. Storage costs for the customer were reduced from $6m to $1.2m per year, with the hybrid appliance and private cloud storage resources eliminating the need to purchase additional NAS systems.

The backup-replacement solution in the cloud costs them $0.5/GB, versus $2/GB for data held on off-site tape. Also, to share their large data sets with partners, they currently ship the NAS systems to them; with Panzura gateways, the data may instead be shared remotely from the private cloud. They keep their data in a private cloud, as opposed to a public cloud, due to security concerns, and do not plan on changing that anytime soon. Panzura's compression capability helps reduce the size of the nearly 6TB of data it needs to transfer nightly. Also, Panzura's security certificates and key management mechanisms were essential features expected by this company.

    7.3.5 Psomas

Psomas is a 500-person civil-engineering firm based in Los Angeles, serving public and private clients in the transportation, water, site development, federal and energy markets. The company has 10 offices plus a datacenter spread across the western US, including locations in California, Arizona and Utah.

By 2010, Psomas was struggling to meet its recovery objectives through its existing tape-based backup infrastructure. Partly due to a long-standing policy to back up everything, it could not meet its backup windows. Moreover, it was running into tape-media upgrade and reliability issues, and was also finding that maintaining tape-based backup infrastructure at each remote office was increasingly difficult to justify.

Psomas's team started exploring disk-based backup alternatives, including EMC's Data Domain-based backup-to-disk with deduplication, plus online backup options from Symantec and Iron Mountain. The latter two were too expensive to begin with, and Data Domain's solution had high up-front capex costs.

Meanwhile, Psomas was exploring a new storage project from an existing vendor, Riverbed. It was one of the early beta customers for Riverbed's Whitewater product before converting into a full production customer in early 2011. It is now running the appliance in all of its locations, almost exclusively as virtual appliances, and has transitioned away from tape-based backup completely. Psomas leverages the Whitewater appliance to back up to local disks for operational recovery and uses the cloud for DR purposes. As part of this move, Psomas selectively backs up only certain data: CAD files, office documents and databases. Also, Psomas moved to a hosted Exchange service, so it no longer needs to back up email.

Psomas uses Amazon's S3 as its cloud storage backup target and currently has 12TB of backup data in the cloud (with dedupe ratios of 20:1). With this approach, Psomas does not worry about running out of


capacity and upgrading hardware. Amazon's very high reliability (eleven nines) and availability (four nines) are important assurance metrics for them.
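As a rough illustration of what these figures mean (assuming "four nines" is the conventional 99.99% availability; Amazon's "eleven nines" is usually quoted for durability, a different metric):

```python
# Downtime budget implied by 99.99% ("four nines") availability.
minutes_per_year = 365.25 * 24 * 60
downtime = minutes_per_year * (1 - 0.9999)
print(round(downtime, 1))  # ~52.6 minutes of allowed downtime per year
```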

Overall, Psomas has high confidence in its backup infrastructure, with very fast restores. Moving to Whitewater is expected to save them around $80,000 per year across capex and opex. In addition, it is exploring leveraging Amazon's EC2 for running its Autodesk CAD application to enable cloud bursting during busy periods.


8 NETAPP DIFFERENTIATION

In this section, we discuss the various advantages that NetApp enjoys in comparison to other cloud gateway vendors, advantages which enable significant opportunities.

    8.1 UNIQUE ADVANTAGES OF HAVING A NETAPP SOLUTION IN THE MARKET

NetApp exploits different types of media in its storage systems to deliver compelling value to its customers at a low TCO. This includes low-cost SATA drives that are made enterprise ready by providing reliability features in software over and above the raw disks. Cloud storage can be perceived as one such medium with unique characteristics: infinite capacity; low cost of maintenance/management; SLOs defined by cloud service providers (performance, reliability, etc.); poor performance, with varying bandwidth and high latency; an object-based interface; and global access. Just as ordinary low-cost SATA disks were made useful in the enterprise context by applying value-added features in software, cloud storage can be made enterprise-class by providing significant value above raw cloud storage.

NetApp already enables different storage tiers in the datacenter, ranging from a performance-oriented tier at or close to the host (like the flash-based Project Mercury), through a primary storage tier (FAS-based systems), to an archival tier for disk-to-disk backup based on SATA drives. Having a cloud gateway would enable cloud storage to be an extra tier in this hierarchy. With SLO-based (or policy-based) data management, NetApp's data management software can identify and move data to the appropriate tier based on data or workload properties. Hot data will move closer to the higher-performance tier and cold data will move to the slowest tier. The cloud storage tier's performance characteristics are largely governed by the provider's policies and their cost structure. For example, Amazon's price and performance vary significantly based on the regional data center picked. So, cloud storage can be an effective tier with flexible characteristics based on cost, and can enhance our SLO-based data management vision.
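The hot/cold placement described above can be sketched as a simple policy function (tier names and age thresholds are hypothetical illustrations, not NetApp's actual policy engine):

```python
# Toy SLO/policy-based tiering: place data by access temperature, with
# cloud storage as the coldest, cheapest tier. Thresholds are invented.
TIERS = ["flash", "primary", "archive", "cloud"]  # fastest -> cheapest

def place(days_since_access):
    if days_since_access < 1:
        return "flash"      # hot: host-side / flash tier (e.g. Project Mercury)
    if days_since_access < 30:
        return "primary"    # warm: FAS primary storage tier
    if days_since_access < 365:
        return "archive"    # cool: SATA disk-to-disk backup tier
    return "cloud"          # cold: cloud storage as the extra tier

print(place(0.5))   # flash
print(place(400))   # cloud
```

A real policy engine would weigh SLOs (latency, throughput, cost per GB) rather than a single age threshold, but the decision structure is the same.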

With most cloud storage providers, the cost of cloud storage is dominated by two factors: network cost of access (both the number of requ