A brief introduction to version control systems Tim Staley Astronomy Group Monday Seminar Southampton, November 2013 WWW: timstaley.co.uk
Jul 07, 2015
The problem No backup Manual copies Centralised VCS Distributed VCS
Aims
• Help identify problems that can be solved.
• Introduce basic concepts of version control.
• Explain why various technologies exist, and which you should choose.
When you need version control
• Complex documents, built up over time.
• Multiple collaborators (or even just multiple machines).
• Multiple versions which ‘co-evolve’.
• Reproducibility (‘snapshots’).
Four Evolutionary Stages
Stage 0: Not backing up
DON’T DO THIS
Stage 1: Manual copies
Flaws:
• Manual = fallible.
• Backup: Copies of copies.
• Labelling.
We need metadata: datestamps, annotations, attribution.
And tools: make this stuff quick and easy!
Aside: ‘Cloudy’ technologies
Trade-off: convenience vs control.
Good for:
• Small docs, frequently updated across multiple locations (e.g. to-do list).
• Basic backups of items unlikely to evolve (photos, etc).
Problems:
• Versioning is all automated: can’t choose sensible ‘checkpoints’ to mark out.
• Collaboration is still broken, unless you’re working on very simple docs.
NEED MORE METADATA
Stage Two
Centralised version control
e.g.
• ‘Concurrent Versions System’ (CVS, now defunct).
• ‘Subversion’ (SVN).
Basic concepts, 1
Record an annotated history of changesets.
• Trunk, branch
• Parents, ancestors
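One way to picture these terms is as a small graph in which each changeset points back at its parent(s). The following is a minimal sketch only, not how any particular VCS stores its history; the `Changeset` class, the `ancestors` helper, and the revision names and authors are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Changeset:
    """One recorded change, plus the annotations kept alongside it (hypothetical)."""
    rev: str                      # made-up revision identifier
    author: str                   # attribution
    message: str                  # annotation describing the change
    parents: List[str] = field(default_factory=list)


def ancestors(history: Dict[str, Changeset], rev: str) -> Set[str]:
    """Return every revision reachable by following parent links from rev."""
    seen: Set[str] = set()
    stack = list(history[rev].parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(history[current].parents)
    return seen


# r3 and r4 both have r2 as parent: the history forks into trunk and branch.
history = {
    c.rev: c
    for c in [
        Changeset("r1", "alice", "Start the paper"),
        Changeset("r2", "alice", "Add introduction", parents=["r1"]),
        Changeset("r3", "bob", "Try a new figure (branch)", parents=["r2"]),
        Changeset("r4", "alice", "Fix typos (trunk)", parents=["r2"]),
    ]
}

trunk_ancestors = ancestors(history, "r4")
branch_ancestors = ancestors(history, "r3")
```

Here the trunk tip `r4` and the branch tip `r3` share the ancestors `r1` and `r2`, but neither is an ancestor of the other.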
Basic concepts, 2
Centralised ⇔ master copy
• Repository
• Checkout
• Commit / Revision
Basic concepts, 3
Merging
In simple cases, merges are automatic!
Tree records allow us to build the new combined version.
Manual merging: When conflicts exist, we have the info and tools to manually resolve them.
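The difference between an automatic merge and a conflict can be sketched with a toy three-way merge. This is an illustration under simplifying assumptions: it compares whole files against a common ancestor, whereas real tools such as Subversion and Git typically merge line by line. The function name `three_way_merge` and the file names are hypothetical.

```python
from typing import Dict, List, Tuple


def three_way_merge(
    base: Dict[str, str], ours: Dict[str, str], theirs: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Merge two versions of a set of files that share a common ancestor.

    Returns the merged files and the names of files needing manual resolution.
    """
    merged: Dict[str, str] = {}
    conflicts: List[str] = []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:
            result = o            # both sides agree
        elif o == b:
            result = t            # only their side changed: take theirs
        elif t == b:
            result = o            # only our side changed: take ours
        else:
            conflicts.append(name)
            result = o            # placeholder; a person must decide
        if result is not None:    # None means the file was deleted
            merged[name] = result
    return merged, conflicts


base = {"paper.tex": "intro v1", "plot.py": "plot v1"}
ours = {"paper.tex": "intro v2", "plot.py": "plot v1"}
theirs = {"paper.tex": "intro v1", "plot.py": "plot v2"}

# Each side touched a different file, so the merge is automatic.
merged, conflicts = three_way_merge(base, ours, theirs)

# Both sides changed paper.tex differently, so it is flagged as a conflict.
clashing = {"paper.tex": "intro v3", "plot.py": "plot v1"}
_, clash_conflicts = three_way_merge(base, ours, clashing)
```

The common ancestor (`base`) is what makes the automatic case possible: without it there is no way to tell which side actually changed a file.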
Distributed VCS
1986 – early 2000s: Why would you make this any more complex? This works.
INTERWEBS
(See e.g. visualised history of Python, https://www.youtube.com/watch?v=cNBtDstOTmA)
Centralised doesn’t scale
• Many collaborators.
• Cannot check in half-finished work to master.
• Cannot keep track of a branch for every collaborator.
• Fall back on a hybrid: a central copy under version control, with many local, manual backups for intermediate work.
The distributed model
Stage 3: Distribute!
• Everyone has their own mirror, or clone, of the repository.
• Changes are distributed via pushes and pulls.
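The clone / push / pull idea can be sketched with a toy model in which a repository is nothing more than a set of commits. This is a deliberately simplified illustration, not the behaviour of Git or Mercurial: the `Repo` class, its methods, and the revision names are hypothetical, and it ignores ordering, branches, and conflicts entirely.

```python
from typing import Dict


class Repo:
    """A toy repository: a mapping from revision id to commit message."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.commits: Dict[str, str] = {}

    def commit(self, rev: str, message: str) -> None:
        """Record a change locally; no other repository is involved."""
        self.commits[rev] = message

    def clone(self, name: str) -> "Repo":
        """Make a full, independent copy of this repository."""
        copy = Repo(name)
        copy.commits = dict(self.commits)
        return copy

    def pull(self, other: "Repo") -> None:
        """Copy in every commit that other has and we lack."""
        for rev, message in other.commits.items():
            self.commits.setdefault(rev, message)

    def push(self, other: "Repo") -> None:
        """Send our commits to other; the mirror image of pull."""
        other.pull(self)


shared = Repo("shared")
shared.commit("r1", "Initial draft")

# Work on a clone without touching the shared copy.
laptop = shared.clone("laptop")
laptop.commit("r2", "Written on the train, offline")
offline_only = "r2" not in shared.commits

# Sync later: after the push, both repositories hold the same history.
laptop.push(shared)
```

Because every clone carries the full history, committing needs no connection to anyone else; only the push or pull does.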
Distribute!
Benefits for you:
• More flexible. Allows different workflows and collaborative behaviour, etc.
• Can commit offline, sync later.
• Talk to me later if you want the details.
So which should I use?
At this stage, Git and Mercurial are functionally equivalent, but Git has won the majority mindshare. That means better support, a better chance of collaborators using the same system, etc.
Summary
• Version control helps with:
  • Backups
  • Reproducibility
  • Comparing arbitrary historical versions.
  • Maintaining multiple live versions.
• Lots of free services and material online to help you out.
• Bit of a learning curve at first, but the payoff is large in the long run. (And now you have a head start!)
Advanced Reading
To start, google ‘git intro’, etc. Then...
• Git for Computer Scientists: http://eagain.net/articles/git-for-computer-scientists/
• Understanding Git Conceptually: http://www.sbf5.com/~cduan/technical/git/
• Understanding the Git Workflow: https://sandofsky.com/blog/git-workflow.html