For zombies…… By Zeeshan Khan
May 10, 2015
A method for centrally storing files
Keeping a record of changes
Who did what, when in the system
Covering yourself when things inevitably go wrong
Another “trendy” word combination
Something that every software developer should deal with
You can avoid using version control
But it can’t last long
You will need to collaborate eventually
It might be tricky sometimes
But you can avoid most problems
Recommendations:
Stick to basic working cycle
Learn basic working cycle commands
Practice on sandbox project
Allows a team to share code
Maintains separate “production” versions of code that are always deployable
Allows simultaneous development of different features on the same codebase
Keeps track of all old versions of files
Prevents work being overwritten
There are version control tools even for designers: Adobe Version Cue, PixelNovel Timeline
There is version control functionality embedded in: Microsoft Word, OpenOffice.org Writer
Branch - a copy of a set of files under version control which may be developed at different speeds or in different ways
Checkout - to copy the latest version of (a file in) the repository to your working copy
Commit - to copy (a file in) your working copy back into the repository as a new version
Merge - to combine multiple changes made to different working copies of the same files in the repository
Repository - a (shared) database with the complete revision history of all files under version control
Trunk - the unique line of development that is not a branch
Update - to retrieve and integrate changes made in the repository since the last update.
Working copy - your local copies of the files under version control you want to edit
Centralized (client-server model)
• CVS
• Subversion
• VSS, TFS, Vault
• ClearCase
• AccuRev
• Perforce
Distributed
• Git
• Mercurial
• Bazaar
• BitKeeper
Users commit changes to the central repository, and each commit creates
a new version that other users can check out
CENTRALIZED WORKFLOW
• Branch by Release
[Diagram: release branches Release 1 (V1.0, V1.1, V1.2) and Release 2 (V2.0, V2.1, V2.2), with branch and merge points]
• Branch by Feature / Task
[Diagram: Feature 1 and Feature 2 each branch from the Main Trunk and merge back into it]
CENTRALIZED WORKFLOW
Access the central server and ‘pull’ down the changes others have made
Make your changes, and test them
Commit (*) your changes to the central server, so other programmers can see them.
(*) Work out the merge conflicts (WinDiff, built-in tools, etc.)
[Diagram: Jeff's working copy, his Local Repository, and the Canonical Repository]
1. Commits changes to the local repository
2. Pushes changes to the canonical repository
3. Local repository is updated from canonical repository
4. Working copy is updated from local repository
Each user has a full local copy of the repository. Users commit changes locally
and, when they want to share them, push them to the shared repository
DISTRIBUTED WORKFLOW
• Simple
[Diagram: Init/Clone, Add, Commit, Push]
• Branch by Member / Features
[Diagram: Developer 1 and Developer 2 branches, Development Trunk, Main Trunk (V1.0, V1.1)]
DISTRIBUTED WORKFLOW
Each developer ‘clones’ a copy of a repository to their own machine.
The full history of the project is on their own hard drive.
Two-phase commits: you commit first to your local repository, and then push to the shared repository.
A central repository is not mandatory, but you usually have one
Examples of distributed source control systems
Git, Mercurial, Bazaar
Centralized
• Single repository
• Commit requires a connection (there is no local repository to commit to)
• Impossible to push changes directly to another user
• All history in one place
• Reintegrating a branch might be a pain
• Considered to be slower than DVCS
Distributed
• Multiple repositories
• Commit does not require a connection (commits go to the local repository)
• Possible to push changes directly to another user
• No single place is guaranteed to hold all history
• Easier branch management (especially reintegration)
• Considered to be faster than CVCS
• Speed
• Simple design
• Strong support for thousands of parallel branches
• Fully distributed
• Able to handle large projects like the Linux kernel effectively
• Ensure integrity
Snapshots of the filesystem are saved in every commit instead of saving the differences
• Fetch or clone (create a copy of the remote repository) (compare to cvs checkout)
• Modify the files in the local branch
• Stage the files (no cvs equivalent)
• Commit the files locally (no cvs equivalent)
• Push changes to remote repository (compare to cvs commit)
• Git directory: stores the metadata and
object database for your project.
• Working directory: a single checkout of
one version of the project
• Staging area (Index): a file contained in
your Git directory that stores information
about what will go into the next commit
Untracked: files in your working directory
that were not in the last snapshot and are not
in staging area.
Unmodified: tracked but not modified
(initial clone)
Modified: tracked and modified
Staged: identified for next commit
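The four file states above can be read as a small state machine. The following is a minimal sketch of that idea, not Git's implementation: the action names ("add", "edit", "commit", "remove") are illustrative labels, and the model ignores the case where a staged file is edited again and is therefore both staged and modified.

```python
from enum import Enum


class FileState(Enum):
    UNTRACKED = "untracked"
    UNMODIFIED = "unmodified"
    MODIFIED = "modified"
    STAGED = "staged"


# (current state, action) -> next state.
# The action names are illustrative labels, not Git subcommands.
TRANSITIONS = {
    (FileState.UNTRACKED, "add"): FileState.STAGED,
    (FileState.UNMODIFIED, "edit"): FileState.MODIFIED,
    (FileState.MODIFIED, "add"): FileState.STAGED,
    (FileState.STAGED, "commit"): FileState.UNMODIFIED,
    (FileState.UNMODIFIED, "remove"): FileState.UNTRACKED,
}


def next_state(state, action):
    """Return the state a file moves to, or raise if the action does not apply."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"{action!r} does not apply to a file that is {state.value}"
        ) from None


def replay(actions, start=FileState.UNTRACKED):
    """Apply a sequence of actions to a file, starting from `start`."""
    state = start
    for action in actions:
        state = next_state(state, action)
    return state


if __name__ == "__main__":
    # A new file is added, committed, then edited again.
    print(replay(["add", "commit", "edit"]).value)
```

A freshly cloned file starts as unmodified, so `replay(["edit", "add", "commit"], start=FileState.UNMODIFIED)` walks it once around the cycle and back to unmodified.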
There are four elementary object types in Git:
blob - a file.
tree - a directory.
commit - a particular state of the working directory.
tag - an annotated tag (we will ignore this one for now).
A blob is simply the content of a particular file plus some
meta-data.
A tree is an object which contains a list of blobs and/or trees with their corresponding file modes and names.
A commit is also a plain text file containing information about the author of the commit, a timestamp and references to the parent commit(s) and the corresponding tree.
All objects are compressed with the DEFLATE algorithm and stored in the git object database under .git/objects.
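As a rough illustration of the storage scheme just described, the sketch below writes objects into an in-memory dictionary that stands in for the `.git/objects` directory. The function names and the dictionary are assumptions made for this example. The header layout (`<type> <size>` followed by a NUL byte), the zlib compression, and the two-character directory prefix follow how Git stores loose objects, but packfiles are not modelled.

```python
import hashlib
import zlib


def encode_object(kind: str, body: bytes) -> bytes:
    """Prefix the body with the object header: '<kind> <size>' plus a NUL byte."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return header + body


def object_path(object_id: str) -> str:
    """Loose objects live under a directory named after the first two hex digits."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


def write_object(db: dict, kind: str, body: bytes) -> str:
    """Store a compressed object in `db` (a stand-in for the file system)."""
    raw = encode_object(kind, body)
    object_id = hashlib.sha1(raw).hexdigest()
    db[object_path(object_id)] = zlib.compress(raw)
    return object_id


def read_object(db: dict, object_id: str):
    """Decompress an object and check that it still matches its id."""
    raw = zlib.decompress(db[object_path(object_id)])
    if hashlib.sha1(raw).hexdigest() != object_id:
        raise ValueError(f"object {object_id} is corrupt")
    header, _, body = raw.partition(b"\x00")
    kind, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError(f"object {object_id} has a wrong size header")
    return kind, body


if __name__ == "__main__":
    db = {}
    blob_id = write_object(db, "blob", b"hello world\n")
    print(blob_id)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
    print(read_object(db, blob_id))
```

The printed id is the same one `git hash-object` reports for a file containing `hello world` and a trailing newline, which is a handy way to check the header format.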
Everything is check-summed before it is stored
Everything is referred to by that checksum.
The SHA-1 hash function is used to compute the checksum.
Every commit is referred to by that SHA-1 hash.
Cannot change the contents of any file or directory without Git knowing about it
SHA-1 (Secure Hash Algorithm 1) is a cryptographic hash function with
a 160-bit output, used in TLS, SSH, PGP, . . .
Every object is identified and referenced by its SHA-1 hash.
Every time Git accesses an object, it validates the hash.
Linus Torvalds: “Git uses SHA-1 in a way which has nothing at all to do with security. [...] It’s about the ability to trust your data.”
If you change only a single character in a single file, all hashes up to the commit change!
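The way a one-character change ripples up to the commit hash can be sketched as follows. This is a simplified model under stated assumptions: the tree is flat (no subdirectories), every file uses mode `100644`, entries are sorted by plain name, and the author identity and timestamp are placeholder values invented for the example.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """SHA-1 over the object header ('<kind> <size>' + NUL) and the body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(blob_ids: dict) -> bytes:
    """Serialize a flat tree: '<mode> <name>' + NUL + 20 raw hash bytes per entry."""
    out = b""
    for name in sorted(blob_ids):
        out += b"100644 " + name.encode("utf-8") + b"\x00"
        out += bytes.fromhex(blob_ids[name])
    return out


def commit_body(tree_id: str, message: str, parents=()) -> bytes:
    """Serialize a commit; the identity line below is a placeholder."""
    ident = "Example Author <author@example.com> 1431216000 +0000"
    lines = [f"tree {tree_id}"]
    lines += [f"parent {parent}" for parent in parents]
    lines += [f"author {ident}", f"committer {ident}", "", message]
    return ("\n".join(lines) + "\n").encode("utf-8")


def snapshot(files: dict, message: str):
    """Hash every file, then the tree listing them, then the commit."""
    blob_ids = {name: object_id("blob", data) for name, data in files.items()}
    tree_id = object_id("tree", tree_body(blob_ids))
    commit_id = object_id("commit", commit_body(tree_id, message))
    return blob_ids, tree_id, commit_id


if __name__ == "__main__":
    before = snapshot(
        {"README.rst": b"Hello\n", "app.py": b"print('hi')\n"}, "First commit"
    )
    after = snapshot(
        {"README.rst": b"Hello\n", "app.py": b"print('ho')\n"}, "First commit"
    )
    print("README blob unchanged:", before[0]["README.rst"] == after[0]["README.rst"])
    print("app.py blob changed:  ", before[0]["app.py"] != after[0]["app.py"])
    print("tree changed:         ", before[1] != after[1])
    print("commit changed:       ", before[2] != after[2])
```

Only the untouched file keeps its hash. The tree embeds the blob ids and the commit embeds the tree id, so both get new ids even though the commit message is identical.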
Creating a new repository:
$ git init
Cloning from an existing repository:
$ git clone https://github.com/dbrgn/fahrplan
SPECIFIC CHANGES:
$ git add *.py
$ git add README.rst
$ git commit -m 'First commit'
ALL CHANGES TO TRACKED FILES:
$ git commit -am 'First commit'
FROM STAGING AREA
$ git rm --cached file.py
FROM INDEX AND FILE SYSTEM
$ git rm file.py
Git tracks content, not files. Although there is a move command...
$ git mv file1 file2
...this is the same as...
$ mv file1 file2
$ git rm file1
$ git add file2
SHOWING STATUS:
$ git status
SHOWING LOG (ENTIRE PAGED)
$ git log
SHOWING LOG (DATE FILTERING)
$ git log --since=2.weeks
$ git log --since="2 years 1 day 3 minutes ago"
LAST COMMIT
$ git show
SPECIFIC COMMIT
$ git show 1776f5
$ git show HEAD^
UNSTAGED CHANGES
$ git diff
STAGED CHANGES
$ git diff --cached
RELATIVE TO SPECIFIC REVISION
$ git diff 1776f5
$ git diff HEAD^
CHANGE LAST COMMIT
$ git commit --amend
UNSTAGE STAGED FILE
$ git reset HEAD file.py
UNMODIFY MODIFIED FILE
$ git checkout -- file.py
REVERT A COMMIT
$ git revert 1776f5
The .gitignore file describes the files that are to be excluded from Git tracking
Blank lines or lines starting with # are ignored
Standard glob patterns work
End pattern with slash (/) to specify a directory
Negate pattern with exclamation point (!)
$ cat .gitignore
*.pyc
/doc/[abc]*.txt
.pypirc
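To make the pattern rules concrete, here is a deliberately simplified matcher built on Python's `fnmatch`. It is a sketch under assumptions, not a reimplementation of Git's matching: it does not handle `**`, escaped characters, or files inside an ignored directory, and the `!keep.pyc` line is added here only to show negation.

```python
from fnmatch import fnmatchcase


def parse_patterns(text: str):
    """Turn .gitignore text into (pattern, negated, dir_only, anchored) rules."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        negated = line.startswith("!")
        if negated:
            line = line[1:]
        dir_only = line.endswith("/")
        line = line.rstrip("/")
        # A slash at the start or in the middle ties the pattern to the
        # directory that holds the .gitignore file.
        anchored = "/" in line
        rules.append((line.lstrip("/"), negated, dir_only, anchored))
    return rules


def is_ignored(path: str, rules, is_dir: bool = False) -> bool:
    """Check a repository-relative path; the last matching rule wins."""
    parts = path.split("/")
    ignored = False
    for pattern, negated, dir_only, anchored in rules:
        if dir_only and not is_dir:
            continue
        if anchored:
            pattern_parts = pattern.split("/")
            matched = len(pattern_parts) == len(parts) and all(
                fnmatchcase(part, pat) for part, pat in zip(parts, pattern_parts)
            )
        else:
            matched = fnmatchcase(parts[-1], pattern)
        if matched:
            ignored = not negated
    return ignored


if __name__ == "__main__":
    rules = parse_patterns("# build debris\n*.pyc\n/doc/[abc]*.txt\n.pypirc\n!keep.pyc\n")
    for path in ["src/module.pyc", "doc/api.txt", "doc/notes.txt", "keep.pyc"]:
        print(path, is_ignored(path, rules))
```

Matching anchored patterns one path component at a time keeps `*` from crossing a `/`, which plain `fnmatch` on the whole path would allow.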
Remotes are other clones of the same repository
Can be local (another checkout) or remote (coworker, central server)
There are default remotes for push and pull
$ git remote -v
origin git://github.com/schacon/ticgit.git (fetch)
origin git://github.com/schacon/ticgit.git (push)
WITHOUT DEFAULT
$ git push <remote> <branch>
SETTING A DEFAULT
$ git push -u <remote> <branch>
THEN...
$ git push
FETCH & MERGE
$ git pull [<remote> <branch>]
FETCH & REBASE
$ git pull --rebase [<remote> <branch>]
-> Rebasing should be done cautiously!
Like most VCSs, Git has the ability to tag specific points in history as being important. Generally, people use this functionality to mark release points (v1.0, and so on).
Git uses two main types of tags: lightweight and annotated. A lightweight tag is very much like a branch that doesn’t change: it’s just a pointer to a specific commit. Annotated tags, however, are checksummed; contain the tagger name, e-mail, and date; have a tagging message; and can be signed and verified with GNU Privacy Guard (GPG).
LIGHTWEIGHT TAGS
$ git tag v0.1.0
ANNOTATED TAGS
$ git tag -a v0.1.0 -m 'Version 0.1.0'
Branches are "Pointers" to commits.
Any reference is actually a text file which contains nothing more than the hash of the latest commit made on the branch:
$ cat .git/refs/heads/master
57be35615e5782705321e5025577828a0ebed13d
HEAD is also a text file and contains only a pointer to the last object that was checked out:
$ cat .git/HEAD
ref: refs/heads/master
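Because both files are plain text, following HEAD to a commit hash takes only a few lines. The sketch below builds a throwaway directory that mimics the two files shown above and resolves it. The helper names are made up for this example, and it assumes loose ref files: a real repository may keep refs in a `packed-refs` file instead, which is not handled here.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: Path) -> str:
    """Follow HEAD to a commit hash, assuming refs are stored as loose files."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Symbolic ref: HEAD names a branch file that holds the hash.
        return (git_dir / head[len("ref: "):]).read_text().strip()
    # Detached HEAD: the file holds the commit hash directly.
    return head


def make_demo_git_dir(root: Path) -> Path:
    """Create a fake .git layout containing only the two files shown above."""
    git_dir = root / ".git"
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
    (git_dir / "refs" / "heads" / "master").write_text(
        "57be35615e5782705321e5025577828a0ebed13d\n"
    )
    return git_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(resolve_head(make_demo_git_dir(Path(tmp))))
```

Switching branches in this model only rewrites the one line in HEAD, and committing only rewrites the one line in the branch file.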
Scenario 1 – Interrupted workflow
You’re finished with part 1 of a new feature but you can’t continue with part 2 before part 1 is released and tested
Scenario 2 – Quick fixes
While you’re busy implementing some feature, you’re suddenly told to drop everything and fix a newly discovered bug
Branches can diverge.
Branches can be merged.
Several auto-merge strategies exist, such as fast-forward and three-way merge.
If the automatic merge fails, fix the conflicts by hand.
$ git merge <branch>
Auto-merging index.html
CONFLICT (content): Merge conflict in index.html
Automatic merge failed; fix conflicts and then commit the result.
Then mark as resolved and trigger merge commit
$ git add index.html
$ git commit
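The rule behind a three-way merge, and the reason a conflict like the one above appears, can be sketched at whole-file granularity. This is an assumption-laden simplification: Git applies the same reasoning to regions of lines inside each file, whereas this sketch compares entire file contents. The file names and contents are invented for the example.

```python
def three_way_merge(base: dict, ours: dict, theirs: dict):
    """Merge two snapshots against their common ancestor.

    Each snapshot maps a path to its content; a missing path means the
    file does not exist in that snapshot.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # both sides agree
        elif o == b:
            result = t          # only their side changed the file
        elif t == b:
            result = o          # only our side changed the file
        else:
            conflicts.append(path)  # both changed it differently
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


if __name__ == "__main__":
    base = {"index.html": "<h1>Home</h1>", "style.css": "body {}"}
    ours = {"index.html": "<h1>Welcome</h1>", "style.css": "body {}"}
    theirs = {
        "index.html": "<h1>Start</h1>",
        "style.css": "body { margin: 0 }",
        "app.js": "init()",
    }
    merged, conflicts = three_way_merge(base, ours, theirs)
    print(sorted(merged))   # ['app.js', 'style.css']
    print(conflicts)        # ['index.html']
```

The common ancestor is what lets the merge tell "they changed it" apart from "we changed it". Without the base, every difference would look like a conflict.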
Rebasing is a linear alternative to merging
Rewrites history! Never rebase published code!
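Why rebasing counts as rewriting can be seen in a toy model where a commit id is derived from its parent id and its message. That is a simplification assumed for this sketch: real Git also hashes the tree, author, and timestamps. The commit messages are made up.

```python
import hashlib


def commit_id(parent: str, message: str) -> str:
    """Toy commit id: a hash over the parent id and the message only."""
    return hashlib.sha1(f"parent {parent}\n\n{message}\n".encode("utf-8")).hexdigest()


def build_chain(base: str, messages):
    """Create one commit per message on top of `base`; return their ids in order."""
    ids, parent = [], base
    for message in messages:
        parent = commit_id(parent, message)
        ids.append(parent)
    return ids


def rebase(messages, new_base: str):
    """Replay the same changes on a different base commit."""
    return build_chain(new_base, messages)


if __name__ == "__main__":
    root = commit_id("", "initial commit")
    feature_messages = ["add login form", "fix typo"]

    feature = build_chain(root, feature_messages)      # original branch
    master_tip = build_chain(root, ["hotfix"])[-1]     # master moved on
    rebased = rebase(feature_messages, master_tip)     # replayed branch

    print("same changes:  ", len(feature) == len(rebased))
    print("any shared ids:", bool(set(feature) & set(rebased)))
```

The replayed commits carry the same changes but none of the old ids, so anyone who already pulled the original commits now has history that no longer matches.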
Often, when you’ve been working on part of your project, things are in a messy state and you want to switch branches for a bit to work on something else. The problem is, you don’t want to do a commit of half-done work just so you can get back to this point later. The answer to this issue is the stash command:
$ git stash
Stashing takes the dirty state of your working directory — that is, your modified tracked files and staged changes — and saves it on a stack of unfinished changes that you can reapply at any time.
CREATE NEW BRANCH
$ git branch iss53
$ git checkout -b iss53 master
SWITCH BRANCH
$ git checkout iss53
DELETE BRANCH
$ git branch -d iss53
SHOW ALL BRANCHES
$ git branch
  iss53
* master
  testing
SHOW LAST BRANCH COMMITS
$ git branch -v
  iss53   93b412c fix javascript issue
* master  7a98805 Merge branch 'iss53'
  testing 782fd34 add scott to the author list in the readmes
SHOW MERGED BRANCHES
$ git branch --merged
  iss53
* master
SHOW UNMERGED BRANCHES
$ git branch --no-merged
testing
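What "merged" means in the two listings above can be sketched as a reachability check: a branch counts as merged when its tip commit is an ancestor of the current branch's tip. The graph below is hypothetical. It reuses the three abbreviated ids from the listing, while `c0`, `c1`, and `c2` and all parent links are invented for the example.

```python
def ancestors(commits: dict, tip: str) -> set:
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(commits[commit])
    return seen


def merged_branches(commits: dict, branches: dict, current: str):
    """Branches whose tip is already contained in the current branch."""
    reachable = ancestors(commits, branches[current])
    return sorted(name for name, tip in branches.items() if tip in reachable)


def unmerged_branches(commits: dict, branches: dict, current: str):
    """Branches with work that the current branch does not contain yet."""
    reachable = ancestors(commits, branches[current])
    return sorted(name for name, tip in branches.items() if tip not in reachable)


# Hypothetical history: each commit id maps to the list of its parents.
COMMITS = {
    "c0": [],
    "c1": ["c0"],
    "c2": ["c1"],
    "93b412c": ["c1"],            # tip of iss53
    "7a98805": ["c2", "93b412c"],  # merge commit on master
    "782fd34": ["c0"],            # tip of testing
}
BRANCHES = {"iss53": "93b412c", "master": "7a98805", "testing": "782fd34"}

if __name__ == "__main__":
    print(merged_branches(COMMITS, BRANCHES, "master"))    # ['iss53', 'master']
    print(unmerged_branches(COMMITS, BRANCHES, "master"))  # ['testing']
```

This also explains why deleting `iss53` with `-d` is safe in this history: the branch name goes away, but every commit it pointed to is still reachable from master.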
AKA feature branches
For each feature, create a branch
Merge early, merge often
If desired, squash commits