GitLab Infrastructure Status Report

GitLab Infrastructure 20160621

Jul 07, 2016


sytses

Transcript
Page 1: GitLab Infrastructure 20160621

GitLab Infrastructure Status Report

Page 2: GitLab Infrastructure 20160621

We have HTTP queue time in monitoring, so we ran an experiment

What if we add more memory to workers?

Page 3: GitLab Infrastructure 20160621

This had a good impact across the board - less load in general

Page 4: GitLab Infrastructure 20160621

Especially on API timings (authorized_keys lookup timings)

This is why git-ssh is going faster, but there is still a long way to go.

Page 5: GitLab Infrastructure 20160621

Some things did not go well with the change

Redis leaves connections behind - GitLab hit the maximum number of open connections -> outage
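
A leak like this is easier to catch before it becomes an outage if the open connection count is compared against the limit. The sketch below is only an illustration, not what the team ran: it assumes the two numbers come from Redis itself (`connected_clients` from `INFO clients` and the `maxclients` setting), and the 20% threshold and the sample values are hypothetical.

```python
def connection_headroom(connected_clients: int, maxclients: int) -> float:
    """Return the fraction of the connection limit still free (0.0 to 1.0)."""
    if maxclients <= 0:
        raise ValueError("maxclients must be positive")
    return max(0.0, 1.0 - connected_clients / maxclients)


def should_alert(connected_clients: int, maxclients: int,
                 min_headroom: float = 0.2) -> bool:
    """True when less than `min_headroom` of the limit is left (hypothetical threshold)."""
    return connection_headroom(connected_clients, maxclients) < min_headroom


# Hypothetical sample values, not taken from the slides.
print(should_alert(connected_clients=9000, maxclients=10000))
print(should_alert(connected_clients=1000, maxclients=10000))
```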

Page 6: GitLab Infrastructure 20160621

Deploys - RC3 blew up in production

On a Friday at 1 AM my time.

So we built staging the next Monday. staging.gitlab.com is way smaller and less powerful than GitLab.com, but it has all the data.

Thanks @Jeroen

Done is better than perfect

Page 7: GitLab Infrastructure 20160621

Deploys - RC4 blew up in staging that very same Monday

Staging < Production

Page 8: GitLab Infrastructure 20160621

Postgres is still dying on us, or was it?

Query count monitoring allowed us to corner the problem and get it fixed in RC5

Thanks @marat!

Page 9: GitLab Infrastructure 20160621

Monitoring - improvements on how methods are measured

We are actually showing where the time is going now.

Thanks @Yorick!

Page 10: GitLab Infrastructure 20160621

Performance - no progress besides the API

Page 11: GitLab Infrastructure 20160621

Storage
● Cephfs - dev.gitlab.org has been running on cephfs for the last month
  ○ Did you notice? No? That’s good! :)
  ○ Pushing the Linux kernel to it takes 27 minutes (~1.5 GB)
  ○ Pushing the Linux kernel to GitLab.com takes anywhere from 1.5 hours to forever
● Our measurements were wrong, Cephfs gives 500/150 IOPS
● But it scaled without a hiccup up to 98 worker nodes (clients).
● We are testing behaviour when we add more nodes/OSDs, etc.
● We have a plan to move to Cephfs without downtime
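
To put the push timings above in comparable terms, here is a rough calculation of effective throughput. It assumes the repository size is about 1.5 gigabytes (decimal, 1 GB = 1000 MB) and uses the lower bound of 1 hour 30 minutes for GitLab.com; the function name is made up for this sketch.

```python
def throughput_mb_per_s(size_gb: float, minutes: float) -> float:
    """Effective push throughput in MB/s, assuming 1 GB = 1000 MB."""
    if minutes <= 0:
        raise ValueError("duration must be positive")
    return size_gb * 1000 / (minutes * 60)


cephfs = throughput_mb_per_s(1.5, 27)      # dev.gitlab.org on Cephfs
gitlab_com = throughput_mb_per_s(1.5, 90)  # GitLab.com, best case

print(f"Cephfs:     {cephfs:.2f} MB/s")
print(f"GitLab.com: {gitlab_com:.2f} MB/s (best case)")
print(f"Speedup:    {cephfs / gitlab_com:.1f}x or more")
```

Under these assumptions the Cephfs push is a bit over three times faster than the best case on GitLab.com.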

Page 12: GitLab Infrastructure 20160621

Storage - capacity

Git data - 28TB out of 49TB

Shared data - 3TB out of 4TB (we can grow this one easily-ish)
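
The capacity figures above translate into utilization as follows. This is a small sketch using only the numbers on the slide; the function and variable names are illustrative.

```python
def utilization(used_tb: float, total_tb: float) -> float:
    """Return used capacity as a percentage of the total."""
    if total_tb <= 0:
        raise ValueError("total capacity must be positive")
    return 100.0 * used_tb / total_tb


# (used, total) in TB, as reported on the slide.
volumes = {"Git data": (28, 49), "Shared data": (3, 4)}

for name, (used, total) in volumes.items():
    pct = utilization(used, total)
    print(f"{name}: {pct:.1f}% used, {total - used} TB free")
```

Git data sits at roughly 57% and shared data at 75%, which is why the shared volume is the one flagged as needing to grow.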

Page 13: GitLab Infrastructure 20160621

Other news
● What’s coming soon
  ○ Multiple mount points/shards - Thanks @Alejandro!
  ○ 2 new hires
    ■ Alex as a Production Engineer
    ■ Ahmad as a Performance Specialist
● We are talking with CI to transfer knowledge into Infrastructure.
● We are going to take over GitHost.io
● We are starting to build infrastructure monitoring that can be shipped with GitLab
● We are hiring!

That’s all folks!