1. Rakuten new infrastructure - Why we try to make new things
Vol.01, Oct/26/2013 @ Rakuten Technology Conference 2013
Osamu Iwasaki, Vice Group Manager, Server Platform Group / Network Administration Group, Global Infrastructure Development Department, Rakuten, Inc.
http://www.rakuten.co.jp/
2. Self introduction
Name: Osamu Iwasaki
Role: Network / Cloud Engineer & Manager
Twitter: @osamuiwasaki
Skype: osamu.iwasaki
Vice Group Manager, Server Platform Group / Network Administration Group, Global Infrastructure Development Department.
Committee member of JANOG (JApan Network Operators Group).
Project manager / designer of the Rakuten private cloud system (RIaaS) and the new data center fabric network.
3. Index
  1. Introduction / Current situation - Today's Rakuten infrastructure status
  2. Our legacy infrastructure - What are the problems
  3. Why we try to change our infrastructure - Simple / Automate / Cost reduction and tech challenge!!
  4. New infrastructure - What are the benefits
  5. Case - Use cases from the new infrastructure
  6. Future - What we are thinking about for the next step
5. Rakuten's infrastructure
Several DC locations around the Tokyo area. Each location is an active data center: cost efficiency, scalability, disaster recovery.
Huge server resources for Rakuten Ichiba services.
RIaaS, the Rakuten private cloud system -> Automation tools. -> Huge resources.
Rakuten scalable fabric network -> Easy to scale out.
6. Our traffic history (Gbps)
[Chart: traffic history in Gbps, y-axis 0 to 160, with peaks labeled Super Sales and Victory Sale]
Peak traffic during the Victory Sale exceeded 140 Gbps, which was roughly 5% or more of Japan's Internet traffic.
7. Network traffic trend from 2012/Jan (SS traffic focus) (Gbps)
[Chart: traffic in Gbps, y-axis 0 to 160, with peaks labeled Super Sales and Victory Sales]

SuperSale    2012/Jun  2012/Dec  2013/Mar  2013/Jun  2013/Sep  2013/Oct (VS)
CDN          60G       78.9G     69.1G     75.8G     73.7G     127.6G
RakutenDC    12.7G     14.2G     12.8G     12.5G     11.7G     12.9G
Total        72.7G     93.1G     81.9G     88.3G     85.4G     140.5G
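The traffic figures above can be cross-checked with a few lines of Python. This is only a sketch: the numbers are copied from the slide, while the dictionary layout and the `cdn_share` helper are names made up for this illustration. It checks that CDN plus RakutenDC matches each Total and shows how much of each peak the CDN carried.

```python
# Peak traffic per sale event in Gbps, copied from the slide's table.
# The dictionary layout and key names are assumptions made for this sketch.
PEAK_TRAFFIC_GBPS = {
    "2012/Jun": {"cdn": 60.0, "rakuten_dc": 12.7, "total": 72.7},
    "2012/Dec": {"cdn": 78.9, "rakuten_dc": 14.2, "total": 93.1},
    "2013/Mar": {"cdn": 69.1, "rakuten_dc": 12.8, "total": 81.9},
    "2013/Jun": {"cdn": 75.8, "rakuten_dc": 12.5, "total": 88.3},
    "2013/Sep": {"cdn": 73.7, "rakuten_dc": 11.7, "total": 85.4},
    "2013/Oct (VS)": {"cdn": 127.6, "rakuten_dc": 12.9, "total": 140.5},
}


def cdn_share(row):
    """Fraction of the peak traffic that was served by the CDN."""
    return row["cdn"] / (row["cdn"] + row["rakuten_dc"])


for period, row in PEAK_TRAFFIC_GBPS.items():
    # The Total row should equal CDN + RakutenDC (within rounding).
    assert abs(row["cdn"] + row["rakuten_dc"] - row["total"]) < 0.05
    print(f"{period}: CDN carried {cdn_share(row):.1%} of {row['total']} Gbps")
```

In every period the data center side stays between roughly 11 and 15 Gbps, so the growth in peak traffic is absorbed almost entirely by the CDN.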
8. PC / Feature phone / Smartphone / Tablet share by GMS
From the GMS point of view as well, mobile traffic is increasing rapidly!! Almost 50%.
9. RIaaS, our private cloud history
[Chart: monthly stacked bars from Jun to Sep of the following year, y-axis 0 to 12000; annotations: "Enhancements double for Super Sale" and "Resource transfer to new data center"]
About 1 year ago, we started from 300 VMs. Now around 10,000 VMs are running for Rakuten Ichiba services: over 30 times year over year!!!
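The "over 30 times" claim follows directly from the two VM counts on the slide. A minimal check, using the rounded figures as given (the variable names are only for this sketch):

```python
# Rounded VM counts as stated on the slide.
start_vms = 300        # "we started from 300VMs"
current_vms = 10_000   # "around 10000VMs are running"

growth = current_vms / start_vms
print(f"growth factor: {growth:.1f}x")  # prints "growth factor: 33.3x"
```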
10. Number of servers set up
[Chart: number of servers set up, y-axis 0 to 1400, series: Physical Machines and Virtualization; annotation: over 90% virtual]
Over the past 2 years, our server construction has shifted from physical to virtual.
12. Legacy network
Lots of segments, and it is hard to operate all of the networks.
16. Motivation to change
Simplification
Tooling
Automation
Cost reduction
and ... Technology challenge!!!
A challenging spirit is the most important thing!
18. Concept
[Diagram labels: Internet; Rakuten DC global network; Rakuten service network (gb/at/db-net); Subsidiary Rakuten XXX (x2); Rakuten shared infra exchange: RIaaS, Storage, Backup, BigData, etc.]
The Rakuten DC global network is the data-center-side global IP network.
Internet connectivity is provided from the Rakuten global network to each network, including the Rakuten environments.
19. Concept
[Diagram labels: Internet; Rakuten DC global network; Rakuten service network (gb/at/db-net); Subsidiary Rakuten XXX (x2); Rakuten shared infra exchange: RIaaS, Storage, Backup, etc.]
Shared services (RIaaS, Storage, Backup, etc.) are provided to each subsidiary from the shared infrastructure, as in this image.
All traffic is separated using virtualization technologies.
20. Data center network overview
[Diagram labels: Internet; Other DC / Regional DC; DC core Network A; DC core Network B; VPN-Router; AZ1-Router; AZ2-Router; Subsidiary Gateway-Router; RIaaS, Legacy, and Subsidiary segments in AZ1 and AZ2; Management network]
AZs (Availability Zones within the DC) are separated to minimize the impact of major trouble.
21. Fabric network physical architecture
[Diagram labels: Spine switches; Border Leaf switches (Layer3 and Layer2) toward L3 switches and other DCs; Leaf switches]
Spine : Leaf architecture.
Easy to operate, enhance, standardize quality, and scale out.
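Why a Spine : Leaf design is easy to scale out can be sketched in a few lines. The model below is an assumption, not the actual Rakuten topology: it uses the textbook layout in which every leaf switch has one link to every spine switch, and the switch names and counts are hypothetical.

```python
from itertools import product


def build_fabric(num_spines, num_leaves):
    """Return spines, leaves, and the (spine, leaf) links of a full-mesh fabric."""
    spines = [f"spine{i}" for i in range(1, num_spines + 1)]
    leaves = [f"leaf{i}" for i in range(1, num_leaves + 1)]
    # Assumption: every leaf connects to every spine.
    return spines, leaves, set(product(spines, leaves))


def leaf_to_leaf_paths(links, spines, src, dst):
    """List the two-hop paths src -> spine -> dst between two leaves."""
    return [(src, s, dst) for s in spines if (s, src) in links and (s, dst) in links]


spines, leaves, links = build_fabric(num_spines=2, num_leaves=4)
paths = leaf_to_leaf_paths(links, spines, "leaf1", "leaf4")
print(len(links), "links;", len(paths), "equal-length paths from leaf1 to leaf4")

# Scaling out: one more leaf adds only num_spines new links,
# and no existing link has to be touched.
_, _, bigger = build_fabric(num_spines=2, num_leaves=5)
print("links added by one more leaf:", len(bigger) - len(links))
```

Under this assumption, every pair of leaves is the same two hops apart, the number of parallel paths equals the number of spines, and capacity grows by adding leaves (more ports) or spines (more bandwidth) without redesigning the rest.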
22. Fabric network logical architecture
[Diagram labels: Internet; Routers; DC Core network; Fabric network (scalable); Shared service (e.g. Storage); RIaaS; Physical server]
Adopting Ethernet Fabric
 - Flat network structure
 - Every network path is active
 - VRF and tagged VLANs enable remote control and no more cabling
Therefore, we can provide a flexible and scalable network structure.
Simple and scalable network architecture.
23. Reduce costs, improve agility / delivery time
2011: $10,000, 6 weeks
2013: $1800, 5 days, 15 minutes
Components: enterprise storage; VLAN networks; firewall, load balancer; IDS, security, monitoring
The legacy model is high price / long delivery time.
The RIaaS model is cheaper / faster delivery time.
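The size of the improvement can be worked out from the slide's figures. This sketch assumes that $10,000 and 6 weeks belong to the 2011 legacy model and that $1800 and 5 days belong to the 2013 RIaaS model. The slide also lists "15 minutes" without saying what it measures, so it is left out here.

```python
from datetime import timedelta

# Figures from the slide; the pairing with legacy vs. RIaaS is an assumption.
legacy_cost_usd = 10_000
riaas_cost_usd = 1_800
legacy_delivery = timedelta(weeks=6)
riaas_delivery = timedelta(days=5)

cost_reduction = 1 - riaas_cost_usd / legacy_cost_usd
delivery_speedup = legacy_delivery / riaas_delivery  # timedelta / timedelta -> float

print(f"cost reduction: {cost_reduction:.0%}")        # prints "cost reduction: 82%"
print(f"delivery speedup: {delivery_speedup:.1f}x")   # prints "delivery speedup: 8.4x"
```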
24. Cost comparison: physical server vs. RIaaS
[Chart: cost of a 1U server vs. RIaaS, broken down into DC cost, Storage, and Compute; annotation: over half price down!]
RIaaS, our private cloud system, dramatically reduces our cost.
25. RIaaS: Concept Roadmap
Phases: RIaaS, RIaaS Phase2, RIaaS Phase3
Multisite BCP: RIaaS at East DC + West DC provides balance for disaster recovery.
Multi-tenant structure: RIaaS for all of the Rakuten Group, including subsidiaries.
Lean / powerful / scalable cloud service. Reinforced architecture: high-density servers, premium high-end storage, commercial hypervisor.
Speedy server construction using the RIaaS management console.
26. Database Platform in Rakuten
Shuichiro Makigaki, Datastore Platform Group, Global Infrastructure Development Department
27. Self Introduction
Joined Rakuten as a new graduate in April 2012.
Working on database and storage technology for the next-generation Rakuten infrastructure as a platform.
28. Agenda
  1. Past MySQL Problems
  2. Clustrix: Introduction, Benefits
  3. Usage in Production
  4. HA, Multiple Cluster Management
  5. And, Some Demos!
30. Past MySQL Problems
Manual sharding
Manual server management
Long lead time
Offline maintenance
90% of CPU is NOT used!
We need a new database platform for "as a Service"!
[Diagram: application servers in front of master DB servers and many slave DB servers]
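To make the "manual sharding" problem concrete, here is a minimal sketch of application-side shard routing. It is not Rakuten's actual scheme: the host names, the key format, and the CRC32-modulo rule are all assumptions chosen for illustration.

```python
import zlib

# Hypothetical master DB hosts; the slide does not name the real servers.
SHARD_HOSTS = ["db-master-0", "db-master-1", "db-master-2"]


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (CRC32 modulo shard count)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def host_for(key: str, hosts: list[str]) -> str:
    """Application-side routing: every query has to pick its DB host like this."""
    return hosts[shard_for(key, len(hosts))]


def moved_fraction(keys: list[str], old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count))
    return moved / len(keys)


keys = [f"user{i}" for i in range(10_000)]
print("user42 lives on", host_for("user42", SHARD_HOSTS))
print(f"keys that move when adding a 4th shard: {moved_fraction(keys, 3, 4):.0%}")
```

With a uniform hash, about three quarters of the keys change shard when going from 3 to 4 shards, so each capacity increase means a large data migration that the team has to plan and run by hand. That is the kind of work behind the long lead time and offline maintenance listed above.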
32. Clustrix - Introduction
Clustrix is an appliance database server.
MySQL compatible
Distributed, scalable & ACID guarantee
Automatic fault tolerance
33. Clustrix - Benefits
No manual sharding: automatic data distribution
No manual fault tolerance: automatic!
Single point VIP access
Scalable
Online schema change
Support
[Diagram: APP accesses the cluster with no sharding; Data1, Data2, Data3, ... DataZ are distributed across Server1, Server2, Server3, ... ServerX]
34. HA, Multiple Cluster Management
[Diagram labels: Production; Staging; Development; Cluster1 and Cluster2, each with multiple DB nodes; Bi-directional Replication (for BCP) between the clusters; single node clusters on RIaaS; Backup; Monitoring; NFS; GlusterFS]
35. Usage in Production
[Chart: Number of DBs, y-axis 0 to 200, series: Cluster1 and Cluster2]
[Chart: Data Size (GB), y-axis 0 to 2500, series: Cluster1 and Cluster2]
37. For Database as a Service
No lead time
Charge on demand
Charge on DB size
Private PaaS integration
Self-management tool
Demo
38. Demo1: Create a DB from the private PaaS (RPaaS)!
  1. Create an application
  2. Log in to the PaaS: rpaas login
  3. Push the application: rpaas push
44. Case1: RIaaS benefits for Super/Victory Sale
Quick delivery time!!
Physical server construction takes a long time.
RIaaS, the Rakuten private cloud system -> IaaS platform. -> Automation tools. -> Huge resources.
We can provide server resources for a Super Sale at any time, as an emergency server enhancement.
45. Case2: Layer2 extension between our data centers
[Diagram labels: Internet; GSLB; Current DC; New DC; L2 extension network; Server Migration]
Bridge the data center networks for server migration from physical to virtual without changing network settings.
47. Future plan
Fast delivery / Self service
BCP / DR infrastructure
Global expansion
48. For all: datacenter services with a self-service portal
2014 - Software-defined datacenter services with self-service VDC
[Timeline labels: 5 days, 15 minutes; 3 minutes]
The next step is the fastest delivery time, with a self-service portal for all of Rakuten.
49. BCP / DR network concept
[Diagram labels: Internet; ISP network; Internet-VPN; Public Network or Rakuten Network between Tokyo and Osaka; Otemachi DC for IX connection; East DC with Tokyo A and Tokyo B Availability Zones; West DC with Osaka Availability Zones]
We plan to expand to other locations to avoid disaster risks.