Transcript
Reliability Practices for an Advertising System on Docker/Mesos
Michael, Apr. 2014, MediaV (聚效广告)
Who Am I?
Where is our system?
Small Impression with Huge Computing
Ad requests    1 billion     20 billion+
QPS            1 million+    10,000
Latency        500 ms        10 ms
60 DevOps engineers
2000+ physical servers
100+ modules with real-time services
99.95% service availability
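To put the 99.95% figure in concrete terms, the arithmetic below converts an availability target into a yearly downtime budget. This is only a worked calculation. The slides do not say over which window availability is measured, so the one-year window is an assumption.

```python
# Downtime budget implied by an availability target.
# Assumption: availability is measured over a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525600, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return how many minutes per year a service may be unavailable."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


budget = downtime_minutes_per_year(99.95)
print(f"{budget:.1f} minutes/year, about {budget / 60:.1f} hours")
```

At 99.95% the whole system has a little over four hours of downtime per year to spend, which is why recovery cannot rely on manual on-call work alone.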
Why Container?
Why Scheduler?
• Human-caused incidents: debugging, environment changes, etc.
• Unintentional failures: bugs, crashes, OOM, memory leaks, full disks, etc.
• External causes: ad code
• On-call recovery
• Scaling services
• Resource utilization
We are in 2016
We are in 2014
2014 Q4: touch lmctfy
2015 Q1: try Docker with k8s
2015 Q2: Docker on Mesos/YARN?
2015 Q3: we are running Docker/Mesos, etc.
2015 Q4: more services, CI/release
2016 Q1: more batch jobs & LTS online
How to start?
What can Mesos bring to the team?
Typical workloads: LTS, ad hoc tasks, lightweight services
[Figure labels: Free, Free; 100%, 100%]
Resource usage distribution DEMO
Seven (SE7EN) typical problems when containerizing services with Docker
1/7
“If you run SSHD in your Docker containers, you're doing it wrong!”
https://jpetazzo.github.io/2014/06/23/docker-ssh-considered-evil/
–Jérôme Petazzoni
2/7 Where are my debug logs?
3/7 Is Docker network performance poor?
http://machinezone.github.io/research/networking-solutions-for-kubernetes/
4/7 How do we write local files? How do we persist data?
5/7 Service registration and discovery?
We’re
OR
6/7 How do we make services schedulable?
This is a big question, left to every dev engineer.
7/7 How do servers load their data?
Abandon: rsync, cp, scp, ftp
Embrace: everything API/Thrift
Marathon Framework on MESOS
Chronos Framework on MESOS
Chronos: the replacement for batch job scheduling on a distributed system
                               chronos        cron    azkaban
Distributed                    Yes            No      Half
Web UI                         Yes            No      Yes
Job history                    Yes, simple    Manual  Yes, full
Dependency                     Yes, simple    No      Yes, full
User auth                      No             No      Yes
Resource limit (cpu/mem/disk)  Yes            No      No
Debug log                      Mesos sandbox  Manual  Web UI
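The comparison credits Chronos with per-job resource limits (cpu/mem/disk). As a rough illustration, the sketch below builds a job definition of the kind Chronos accepts over its REST API. The field names (`name`, `command`, `schedule`, `epsilon`, `owner`, `cpus`, `mem`, `disk`) are my understanding of that API and should be checked against the Chronos version in use. The job name, command, owner address and resource values are hypothetical and do not come from the slides.

```python
import json


def chronos_job(name, command, start_iso, period, cpus, mem_mb, disk_mb):
    """Build a Chronos-style job definition (field names are an assumption)."""
    return {
        "name": name,
        "command": command,
        # ISO 8601 repeating interval: repeat forever from start_iso, every period.
        "schedule": f"R/{start_iso}/{period}",
        "epsilon": "PT30M",            # how late a run may still be started
        "owner": "ops@example.com",    # hypothetical contact address
        "cpus": cpus,                  # resource limits, unlike plain cron
        "mem": mem_mb,
        "disk": disk_mb,
    }


# Hypothetical daily batch job with explicit resource limits.
job = chronos_job("daily-report", "bash run_report.sh",
                  "2016-01-01T02:00:00Z", "P1D", 0.5, 512, 1024)
print(json.dumps(job, indent=2))
```

The equivalent crontab line would carry only the schedule and the command. The resource fields are what let Mesos place the job on any machine with enough free capacity.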
Things to watch out for when running Docker/Mesos in practice
Health check with Marathon on Mesos
{
  "protocol": "COMMAND",
  "command": { "value": "curl -f -X GET http://$HOST:$PORT0/health" },
  "gracePeriodSeconds": 300,
  "intervalSeconds": 60,
  "timeoutSeconds": 20,
  "maxConsecutiveFailures": 3
}

{
  "protocol": "COMMAND",
  "portIndex": 0,
  "command": { "value": "nc localhost 8119" },
  "gracePeriodSeconds": 300,
  "intervalSeconds": 60,
  "timeoutSeconds": 20,
  "maxConsecutiveFailures": 3,
  "ignoreHttp1xx": false
}
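The first health check only works if the service answers `GET /health` with a success status, because `curl -f` exits non-zero on HTTP errors. Below is a minimal sketch of such an endpoint, using only the Python standard library. The handler, the `start_server` helper and the "ok" body are illustrative assumptions. The slides do not show how the real services implement `/health`.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answer 200 on /health and 404 elsewhere."""

    def do_GET(self):
        if self.path == "/health":
            body = b"ok"
            self.send_response(200)
        else:
            body = b"not found"
            self.send_response(404)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_server(port=0):
    """Start the server in a background thread; port 0 picks a free port.

    Under Marathon the port would instead come from the PORT0 environment
    variable that the health check command refers to.
    """
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_server()
port = server.server_address[1]
# Same request the COMMAND health check issues with curl.
with urllib.request.urlopen(f"http://127.0.0.1:{port}/health") as resp:
    status = resp.status
print(status)
server.shutdown()
server.server_close()
```

A real endpoint should report failure when the service cannot do useful work, for example while its data is still loading. The 300-second grace period gives it time to reach that state before failures count.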
Marathon port resource
--resources="ports(*):[8000-9000, 31000-32000]"
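The flag above widens the port ranges an agent offers, so a service can ask for a port such as 8119 in addition to the usual 31000-32000 range. The sketch below parses that resource string and checks whether a port falls inside the offered ranges. It is an illustration of the format only, not code from Mesos or Marathon, and the helper names are made up.

```python
import re


def parse_port_ranges(resources: str):
    """Parse a 'ports(*):[a-b, c-d]' resource string into (start, end) tuples."""
    match = re.search(r"ports\(\*\):\[([^\]]*)\]", resources)
    if match is None:
        return []
    ranges = []
    for part in match.group(1).split(","):
        part = part.strip()
        if not part:
            continue  # tolerate an empty list such as 'ports(*):[]'
        start, end = part.split("-")
        ranges.append((int(start), int(end)))
    return ranges


def port_count(ranges):
    """Total number of ports offered; both ends of each range are inclusive."""
    return sum(end - start + 1 for start, end in ranges)


def in_offered_ranges(port, ranges):
    """True if the port can be allocated from the offered ranges."""
    return any(start <= port <= end for start, end in ranges)


ranges = parse_port_ranges("ports(*):[8000-9000, 31000-32000]")
print(ranges, port_count(ranges), in_offered_ranges(8119, ranges))
```

A check like this fits naturally into the port registration step: a requested port outside every offered range can never be scheduled.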
Dockerfile review rules
• Every Dockerfile must go through code review
• Everything in the codebase: code/config
• Unstable wget/curl sources are forbidden
• Port resources must be requested and registered
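Part of the wget/curl rule can be checked mechanically before a human review. The sketch below flags every Dockerfile line that calls `wget` or `curl`, so a reviewer can decide whether the source is stable. It is a hypothetical helper, not a tool mentioned in the slides, and the sample Dockerfile is invented for the demonstration.

```python
import re

# Match wget or curl as whole words anywhere in an instruction.
FETCH_PATTERN = re.compile(r"\b(wget|curl)\b")


def find_remote_fetches(dockerfile_text: str):
    """Return (line_number, line) pairs for lines that call wget or curl.

    Comment lines are skipped. Backslash-continued instructions are checked
    line by line, which is enough to point a reviewer at the right place.
    """
    findings = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        if FETCH_PATTERN.search(stripped):
            findings.append((number, stripped))
    return findings


# Invented sample: only the wget line should be reported.
sample = """FROM ubuntu:14.04
# curl is mentioned here only in a comment
RUN apt-get update && apt-get install -y openjdk-7-jre
RUN wget http://example.com/latest/tool.tar.gz
COPY app.jar /opt/app/app.jar
"""
findings = find_remote_fetches(sample)
for number, line in findings:
    print(f"line {number}: {line}")
```

A flagged line is not automatically a violation. It marks where the image depends on something outside the codebase, which is what the review rule is meant to catch.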
Q&A
ye.mikez@gmail.com
zhangye@mvad.com