Page 1: Git basics with notes

Introduction to Git

Surabhi Gupta

Page 2: Git basics with notes

Fast, open-source, distributed source control system.

Page 3: Git basics with notes

Client-Server vs Distributed models

[Diagram: in the client-server model, the VCS server holds Version 1, Version 2 and Version 3 while each client holds a single version; in the distributed model, every copy holds Version 1, Version 2 and Version 3]

To see what a distributed source control system looks like, let us contrast it with a client-server model. In the client-server model, you check out one snapshot: the state of a file or files at a particular point in time. In a distributed model, you check out the entire history locally.

Page 4: Git basics with notes

Advantages of Git over P4

Perforce (Client-Server) | Git (Distributed)

Version management system | Source control system

Slow due to network latency and increased dependency on server calls | Fast! Work locally, offline

Intermediate work cannot be easily saved to P4 | Various checkpoints for saving intermediate work

Difficult to experiment | Facilitates experimentation

A merger is typically responsible for merging between branches | The developer is responsible for merging their branch into master

The Perforce model is centered around being able to MANAGE branches. One can restrict branches, set up policies for checking in, etc. Since changing the history of a branch in P4 is an admin-only privilege and is virtually never done, Perforce is good at keeping an audit trail of your commits. On the other hand, git allows you to change the history of a branch completely, as we will see later on.

Why do people love Git? Almost all the work is done locally, which gives you lots of freedom when you’re doing work.

Page 5: Git basics with notes

Server for Git

❖ GitHub, Stash, CloudForge, etc. are code management and collaboration tools for Git repos

❖ They provide fine-grained control over permissions and an audit of the commit history.

❖ The distributed model of Git facilitates open source projects since individuals can easily fork off repos and merge the changes back in.

You may ask why we need a server in a distributed model. The central server is just another Git repo that everyone has access to and that the team uses to synchronize their work. It is mainly used for collaboration and is designated as the ‘source of truth’. It can be switched out with another repo easily. Distributed model advantage for open-source projects: if a repo for an open-source project is no longer being maintained by the owner but there is interest in the community to keep it alive, someone can fork it off. Over time, changes will be contributed to this location and it will become the de facto new home for the project.

Page 6: Git basics with notes

Scope of the talk

❖ Various roles require different levels of expertise in Git:

❖ Manager

❖ Software Engineer/QA Engineer

❖ Merger/Release Engineer: consumer of git scripts

❖ Develop scripts that extend git functionality: deep dive into git internals.

❖ We will cover concepts and commands that will come in handy in your day-to-day work as a developer.

❖ This talk is a road map of the Git world. Hopefully, it will whet your appetite for exploring the trails.

Roles:
Managers: usage of Git will most likely be limited to checking out branches.
Developers: require a working knowledge of git.
Mergers: consumers of git scripts, such as those for bulk merging across releases.
Tool developers: extend git functionality, which requires a deep dive into git internals.

This talk is primarily designed for a developer.

Page 7: Git basics with notes

Roadmap

❖ Content hashing

❖ Blobs to Branches

❖ Staging and committing

❖ Remotes and pull requests

❖ Merge conflicts

❖ Git resources

Roadmap for the presentation.

Page 8: Git basics with notes

Content Hashing

❖ Contents are referenced using their hashes:

sha1("blob " + fileSize + "\0" + fileContent)

echo "foobar" > foo.txt
git hash-object foo.txt = sha1("blob 7\0foobar\n")

323fae03f4606ea9991df8befbb2fca795e648fa

❖ Fun fact: Renames are not stored in the repo. They’re computed by commands such as git diff, git merge, etc.

SHA-1 is a secure hash algorithm. It is commonly used on the content of downloaded files to verify that the content is authentic.

sha1("blob 7\0foobar\n") = "323fae03f4606ea9991df8befbb2fca795e648fa"
$ echo "foobar" > foo.txt
$ git hash-object foo.txt
323fae03f4606ea9991df8befbb2fca795e648fa

This is a low-level concept, but it introduces you to the fundamental representations used by Git. It also helps you build intuition for the graph structures that we will cover in the following slides.

Renames are computed based on the similarity between the contents of a ‘deleted’ and an ‘added’ file:

$ mv force.txt fourth.txt
$ git add -A .
$ git status
renamed: force.txt -> fourth.txt
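To check the formula by hand, here is a minimal sketch. It assumes a Unix shell with git and the coreutils `sha1sum` tool on the PATH (on macOS, `shasum` would be the likely substitute). The variable names are only illustrative.

```shell
set -e
# Hash the blob header plus the content by hand: "blob <size>\0<content>".
manual=$(printf 'blob 7\0foobar\n' | sha1sum | cut -d' ' -f1)
# Ask Git for the object ID of the same seven bytes.
viagit=$(printf 'foobar\n' | git hash-object --stdin)
echo "$manual"
echo "$viagit"
```

Both lines should print the same 40-character ID, since `git hash-object` applies the same header-plus-content rule.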

Page 9: Git basics with notes

Blobs to trees

❖ A tree is an object that stores

a) blob

b) subtree

❖ Each of these contains metadata about its mode, type and name

❖ A tree object can contain objects of type “blob” or “tree”.

❖ Example modes: 100755 means it’s an executable file, 120000 specifies a symbolic link

Trees are analogous to directories on a file system. Let us build upon the notion of blobs and see how they come together to form trees.
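A tree can be inspected directly. The sketch below builds a throwaway repository (the file names and the identity are placeholders chosen for illustration) and lists the root tree with `git ls-tree`, which prints the mode, type, object ID and name of each entry.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
mkdir docs
echo "readme" > README.txt
echo "notes" > docs/notes.txt
printf '#!/bin/sh\necho hi\n' > run.sh; chmod +x run.sh
git add .; git commit -q -m "Add files"
# One line per entry of the root tree: <mode> <type> <object id> <name>
listing=$(git ls-tree HEAD)
echo "$listing"
```

On a file system that tracks the executable bit, the regular file should show mode 100644, the script 100755, and the `docs` directory should appear as a subtree with mode 040000.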

Page 10: Git basics with notes

Git Internals: Tree

blob

blob

tree

Page 11: Git basics with notes

Commit from trees

❖ A commit is a pointer to a tree

❖ It points to zero or more parent commits

❖ It also contains metadata about its:

1) Author

2) Committer

Example description of a commit object:

tree 9acd01e7390a64900bde0b9749f462c53ccb3c65
parent 770479ca34ffd3450d406228f32aa1cb1d8564a0
author Joan Doe <[email protected]> 1421112508 -0800
committer John Doe <[email protected]> 1421112508 -0800

The author is the person who originally wrote the change. The committer is whoever last applied it, for example someone who patches the commit after its creation.
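You can print such a description for any commit with `git cat-file -p`. This is a small sketch in a scratch repository; the file name, messages and identity are made up for the example.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo one > a.txt; git add a.txt; git commit -q -m "First"
echo two >> a.txt; git commit -q -am "Second"
# Dump the raw commit object: tree, parent, author, committer, then the message.
commit_obj=$(git cat-file -p HEAD)
echo "$commit_obj"
```

The `parent` line of the second commit holds the ID of the first one, which is how the history is linked together.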

Page 12: Git basics with notes

Git Internals: Commit

[Diagram: a commit and its parent commit, with nodes labeled tree, tree’, blob and blob’]

Page 13: Git basics with notes

Commits to trees

[Diagram: the parent commit and the commit each point to their own tree; nodes labeled tree, tree’, blob and blob’]

Page 14: Git basics with notes

Reuse of objects

[Diagram: parent commit and commit with their trees and blobs (tree, tree’, blob, blob’)]

Reusing blob/tree from elsewhere, or … under-the-hood object sharing

Since only blob was changed to blob’ in this commit, other git objects (trees and blobs) can be reused.
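The sharing is easy to observe, because an unchanged file keeps the same blob ID from one commit to the next. A minimal sketch, with file names chosen only for illustration:

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo "stable" > keep.txt; echo "v1" > change.txt
git add .; git commit -q -m "First"
echo "v2" > change.txt; git commit -q -am "Change one file"
# <commit>:<path> resolves to the blob stored for that path in that commit.
old_keep=$(git rev-parse HEAD~1:keep.txt);     new_keep=$(git rev-parse HEAD:keep.txt)
old_change=$(git rev-parse HEAD~1:change.txt); new_change=$(git rev-parse HEAD:change.txt)
echo "keep.txt:   $old_keep -> $new_keep"
echo "change.txt: $old_change -> $new_change"
```

Only `change.txt` gets a new blob; both commits refer to the very same object for `keep.txt`.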

Page 15: Git basics with notes

Reuse of objects within a tree

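[Diagram: a tree with entries pointing to blobs “A”, “B” and “C” and to a second “A”, one of the two “A” blobs being grayed out]

Blobs can be shared within a single tree.

The contents of the blob that is grayed out are identical to another blob. These two will therefore share a common underlying object.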

Page 16: Git basics with notes

Multiple parents

P1 P2

C

Git fundamentally forms a directed, acyclic graph.

Page 17: Git basics with notes

Multiple parents

[Diagram: commit C with parents P1 and P2; each of the three commits has its own tree and blob (T1/B1, T2/B2, T3/B3)]

Commits with multiple parents have a one-to-one relationship with trees, similar to commits with single parents

Gain familiarity with the idea of a commit having two parents.
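A commit with two parents appears as soon as two diverged branches are merged. The sketch below is one way to produce it; the branch name `topic`, the file names and the identity are assumptions made for the example.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q; git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo base > base.txt; git add .; git commit -q -m "Base"
git checkout -q -b topic
echo t > topic.txt; git add .; git commit -q -m "P2: work on topic"
git checkout -q master
echo m > main.txt; git add .; git commit -q -m "P1: work on master"
git merge -q --no-edit topic                            # creates merge commit C
# Count the header lines of the merge commit.
parent_lines=$(git cat-file -p HEAD | grep -c '^parent ')
tree_lines=$(git cat-file -p HEAD | grep -c '^tree ')
echo "parents: $parent_lines, trees: $tree_lines"
```

The merge commit lists two `parent` lines but still exactly one `tree`, matching the one-to-one relationship on the slide.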

Page 18: Git basics with notes

Branch - pointer to a commit

Master

git branch

The branch pointer moves along with HEAD as you make additional commits. The git branch command shows all the local branches.

Page 19: Git basics with notes

HEAD - pointer to the current commit

[Diagram: HEAD and Master before and after running git checkout C]

The checkout command allows you to specify any ref, such as a commit SHA, a branch name or even a relative reference such as HEAD~1.
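The difference between the two pointers can be seen in a scratch repository. In this sketch (file name, messages and identity are placeholders), master advances with each commit, while `git checkout` of an older commit moves only HEAD.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q; git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo one > f.txt; git add f.txt; git commit -q -m "C1"
first=$(git rev-parse master)
echo two >> f.txt; git commit -q -am "C2"
second=$(git rev-parse master)        # master moved forward with the commit
git checkout -q HEAD~1                # move HEAD only; master stays at C2
head_now=$(git rev-parse HEAD)
master_now=$(git rev-parse master)
git branch                            # the detached HEAD is marked with *
```

After the checkout, HEAD points at the first commit directly (a "detached HEAD") and master is untouched. `git checkout master` would reattach it.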

Page 20: Git basics with notes

All your codebase are belong to me

❖ git clone

❖ git log

[Diagram: the Server/Remote, You and a Peer each hold Version 1, Version 2 and Version 3]

Download a repo to your local machine using `git clone`. Use git branch -a to see both local and remote branches. When a branch is checked out for the first time, a local copy of the branch is created. There is nothing special about the repo hosted on the server from the perspective of git; in fact, you could set up a remote that is another git repo on your local machine and pull/push to it just like you would here.

Page 21: Git basics with notes

Our first commit

❖ echo "May the 4th" >> force.txt

❖ git status

❖ git add force.txt

❖ git diff --cached

❖ git commit -m "May the force be with you"

After creating a new file, we need to add it to the git index before we can view the diff. Use git diff --cached to see the differences between HEAD and the staging area. Use git diff to see the differences between the staging area and the unstaged changes in your working directory.
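The steps on this slide can be replayed end to end in a scratch repository. This is a sketch; the temporary directory and the identity are assumptions, and `--name-only` is used just to keep the diff output short.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo "May the 4th" >> force.txt
git status --short                              # force.txt is listed as untracked
git add force.txt
staged=$(git diff --cached --name-only)         # what the next commit will contain
git commit -q -m "May the force be with you"
after=$(git diff --cached --name-only)          # nothing is left staged
echo "staged before commit: $staged"
echo "staged after commit:  $after"
```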

Page 22: Git basics with notes

[Diagram: your local repo (You) with commits C1 to C4 and the Remote with commits C1 to C3, showing the master and remotes/master pointers]

git branch -a will show all the local and the remote branches. Master is tracking remotes/master. Master is a branch and therefore, as we make a new commit on this branch, the pointer moves forward. A tag is a pointer to a commit that stays put, while a branch pointer moves.
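Tracking, tags and pushing can be tried without a real server, since a remote is just another repository. In the sketch below a bare repository in a temporary directory stands in for the server; the names `remote.git`, `you`, `v1` and the identity are made up for the example.

```shell
set -e
work=$(mktemp -d); cd "$work"
git init -q --bare remote.git                    # stands in for the server
git clone -q remote.git you 2>/dev/null          # cloning an empty repo only warns
cd you; git symbolic-ref HEAD refs/heads/master  # name the first branch master
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo C1 > log.txt; git add log.txt; git commit -q -m "C1"
git push -q -u origin master                     # publish master and start tracking
git tag v1                                       # tag the current commit
echo C2 >> log.txt; git commit -q -am "C2"
tag_at=$(git rev-parse v1)
local_at=$(git rev-parse master)
remote_before=$(git rev-parse origin/master)
git push -q origin master
remote_after=$(git rev-parse origin/master)
git branch -a
```

After the second commit, master has moved while both the tag and origin/master still point at C1. The final push brings origin/master up to master.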

Page 23: Git basics with notes

git push

[Diagram: after git push, both You and the Remote hold commits C1 to C4, with master and origin/master pointing at C4]

You may ask, what if I made a mistake?

Page 24: Git basics with notes

What if I made a mistake?

Page 25: Git basics with notes

Undo unstaged changes

[Diagram: force.txt shown across three areas: Committed, Staging Area, Unstaged changes]

echo "new" >> force.txt

git checkout -- force.txt
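As a quick check of this slide, the sketch below makes an unstaged edit and throws it away again. The repository is a scratch one and the identity is a placeholder.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo "May the 4th" > force.txt; git add force.txt; git commit -q -m "First commit"
echo "new" >> force.txt                 # an unstaged change
dirty=$(git status --porcelain)         # reports force.txt as modified
git checkout -- force.txt               # restore the file from the index
clean=$(git status --porcelain)         # nothing left to report
cat force.txt
```

Note that the discarded edit is gone for good, since it was never staged or committed.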

Page 26: Git basics with notes

Unstage changes

[Diagram: force.txt shown across three areas (Committed, Staging Area, Unstaged changes) in a left and a right diagram]

git add force.txt

git reset HEAD force.txt

git add is actually adding the changes to the index. The add command should be interpreted as “add any new updates” rather than “add new file”. force.txt is already being tracked in the Git index; `git add` stages the new addition to the file, namely the word “new”.

Note: As mentioned previously, you can use `git diff --cached` to see the differences between HEAD and the staging area. It will output ‘+new’ for the diagram on the left and will output nothing for the right diagram. Use git diff to see the differences between the unstaged and staged (or committed, if nothing is staged) versions of the file. It will output ‘+new’ for the diagram on the right and will output nothing for the left diagram.
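The round trip between the staging area and the working directory can be sketched as follows (scratch repository, placeholder identity, and `--name-only` to keep the output short):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo "May the 4th" > force.txt; git add force.txt; git commit -q -m "First commit"
echo "new" >> force.txt
git add force.txt                                  # stage the update
staged_before=$(git diff --cached --name-only)
git reset -q HEAD force.txt                        # unstage it; the file itself is untouched
staged_after=$(git diff --cached --name-only)
unstaged_after=$(git diff --name-only)
echo "staged before reset: $staged_before"
echo "staged after reset:  $staged_after"
echo "unstaged after reset: $unstaged_after"
```

The word “new” stays in the file throughout; only the area in which Git sees the change differs.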

Page 27: Git basics with notes

Uncommit changes

[Diagram: force.txt shown across three areas (Committed, Staging Area, Unstaged changes) in a left and a right diagram]

git commit -m "Second commit"

git reset --soft HEAD^

Note: git reset --soft HEAD^ will not change your local working directory. It will merely move the changes from a committed state to a staged state. In contrast, git reset --hard HEAD^ will completely blow away all changes between your current HEAD and the reference you specify. As we saw, there are a number of checkpoints in your git workflow. If used wisely, you will never have to wonder what the last “working” state of your codebase was before you made some breaking changes.
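Here is a small sketch of both flavors of reset in a scratch repository (placeholder identity). To keep the example short, the hard reset is applied to HEAD after the soft reset, so it discards the staged change rather than a whole commit.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo "May the 4th" > force.txt; git add force.txt; git commit -q -m "First commit"
echo "new" >> force.txt; git commit -q -am "Second commit"
git reset -q --soft HEAD^                     # uncommit; the change is staged again
commits=$(git rev-list --count HEAD)
staged=$(git diff --cached --name-only)
last_soft=$(tail -n 1 force.txt)              # the file still ends with "new"
git reset -q --hard HEAD                      # now discard the staged change as well
last_hard=$(tail -n 1 force.txt)
echo "$commits commit(s); staged: $staged; soft: $last_soft; hard: $last_hard"
```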

Page 28: Git basics with notes

Typical workflow

Typically, if your team has more than one person, you wouldn’t commit to master directly. Recommended workflow:

1) Check out a private branch

2) Commit to the branch, and regularly push to remote.

3) When the work is complete, get a code review (likely via a pull request) and merge the branch into master

Also, regularly rebase over master, assuming you are working in a private branch.

Page 29: Git basics with notes

Step 1: Create a new branch

git branch bugFix

[Diagram: HEAD and master before the command; HEAD, master and the new bugFix pointer after it]

Page 30: Git basics with notes

Checkout said branch

git checkout bugFix

[Diagram: HEAD moves from master to bugFix, which is now the current branch]

Now your pointer is at bugFix. These two commands can be combined into one: git checkout -b bugFix. It is helpful to decompose a command when first learning git as it gives you a glimpse into the atomic actions being performed by git.

Page 31: Git basics with notes

Step 2: Feature development

[Diagram: Local and Remote graphs. Locally, bugFix holds commits B and C on top of A. On the remote, master has also gained commit D.]

If you want to experiment with an alternate codeline, you can easily do this in a new branch off of master:

git checkout master
git checkout -b newDirection

Let us assume that while you’ve been working on bugFix, someone else has committed their changes to the master branch, causing it to move forward. The common ancestor of bugFix and master is no longer the tip of master (diagram on the right).

Page 32: Git basics with notes

Step 3: Merge into master

[Diagram: the Remote before the merge (bugFix with B and C, master with D, both descending from A) and the Remote after the merge, with the new merge commit E]

gitk - show git graph

As we mentioned in the introduction, within the Git model it is the responsibility of the developer to merge their changes into the mainline. It would be remiss not to mention merge conflicts. If there are no conflicts, then you will be able to merge in your changes via a pull request as shown in the right diagram. However, it is recommended that you rebase on top of master, especially if there are merge conflicts. In the latter case, you will need to resolve the conflicts and then run ‘git rebase --continue’. We will explore the graphical underpinnings of rebase in a couple of slides.

Page 33: Git basics with notes

Can we do better?

[Diagram: bugFix (B, C) and master (D) both branching off of A]

We would like to modify the commit history to make it appear as if bugFix was based on commit D all along!

Page 34: Git basics with notes

Rebase to the rescue

❖ Rebase allows you to replay a series of commits on top of a new base commit.

❖ Helps keep the commit history clean

Your changes were based off of commit A. Commit D was introduced in parallel. Rebase allows you to modify commit history to make it appear as if you were working on top of D all along!

Page 35: Git basics with notes

Rebase in action

git rebase master bugFix

[Diagram: before, bugFix (B, C) and master (D) both branch off of A; after the rebase, bugFix holds B* and C* on top of D, and the original B and C are left behind]

Note that commits B and C have been supplanted by B* and C* in the right diagram. If bugFix was a shared branch, you would not want to rebase it on top of master since anyone who was working off of B or C would have the rug pulled out from under them. It is possible to recover from this by cherry-picking any changes made on top of B/C into B*/C*. However, it is best to avoid such situations altogether.
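The slide's graph can be reproduced in a scratch repository. In this sketch each commit touches its own file so that no conflicts arise; the file names and the identity are placeholders.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q; git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo A > a.txt; git add .; git commit -q -m "A"
git checkout -q -b bugFix
echo B > b.txt; git add .; git commit -q -m "B"
echo C > c.txt; git add .; git commit -q -m "C"
git checkout -q master
echo D > d.txt; git add .; git commit -q -m "D"
old_tip=$(git rev-parse bugFix)
git rebase -q master bugFix                    # replay B and C on top of D
new_tip=$(git rev-parse bugFix)
base=$(git merge-base master bugFix)           # where bugFix now forks off
ahead=$(git rev-list --count master..bugFix)   # commits on bugFix that master lacks
echo "old tip $old_tip, new tip $new_tip, $ahead commit(s) ahead of master"
```

The tip of bugFix has a new ID because the replayed commits are new objects, and the fork point is now the tip of master rather than A.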

Page 36: Git basics with notes

Merge bugFix with master

[Diagram: the rebased bugFix (B*, C*) on top of D, and the result of merging it into master with merge commit E]

Merging the rebased branch bugFix into master. This merge is typically triggered in the code management tool (GitHub, Stash, etc.) after a pull request is approved.

Note: the merge from a feature branch to the mainline (master) is usually done with an explicit “--no-ff” flag, which will create a merge commit even when a fast forward is possible. The diagram on the right explains visually how this policy keeps a one-to-one correspondence between commits on the mainline and features.
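The effect of the flag shows up when a fast forward would otherwise be possible. A minimal sketch, with placeholder file names, message and identity:

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q; git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo A > a.txt; git add .; git commit -q -m "A"
git checkout -q -b bugFix
echo B > b.txt; git add .; git commit -q -m "B"
git checkout -q master
before=$(git rev-parse master)
git merge -q --no-ff -m "Merge bugFix" bugFix   # force a merge commit
# rev-list --parents prints: <commit> <first parent> <second parent>
set -- $(git rev-list --parents -n 1 master)
first_parent=$2; second_parent=$3
echo "merge commit $1 has parents $first_parent and $second_parent"
```

Without `--no-ff`, master would simply have been moved to the tip of bugFix and no merge commit would exist.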

Page 37: Git basics with notes

Merge conflicts

❖ Situation: Conflicting modifications to a file that has changed since we checked it out

❖ Two options: merge, rebase

❖ On a private branch, it is recommended that you rebase.

❖ On a shared branch, merge is the way to go.

Let us take a moment to appreciate that a merge conflict cannot be automated away. There is no way for the source control system to know our intention.

Page 38: Git basics with notes

Changing the commit history

❖ “git commit --amend” rewrites your last commit with the current changes instead of creating a new commit

❖ Interactive rebase: git rebase -i

❖ Swiss army knife of modifying history

❖ Allows you to amend, squash, split, or skip commits as they're applied
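Amending is the simplest of these to try out. The sketch below fixes a typo in a commit message and adds a forgotten change at the same time; the messages, file name and identity are invented for the example.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo one > f.txt; git add f.txt; git commit -q -m "Frist commit"
old=$(git rev-parse HEAD)
echo more >> f.txt; git add f.txt
git commit -q --amend -m "First commit"   # replace the last commit
new=$(git rev-parse HEAD)
count=$(git rev-list --count HEAD)
echo "$count commit(s): $old was replaced by $new"
```

There is still a single commit, but it is a new object with a new ID. This is why amending a commit that others have already pulled causes the same trouble as rebasing a shared branch.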

Page 39: Git basics with notes

Many roads, one destination

❖ There are often multiple ways to accomplish a task in Git, for example:

git branch <branchName>, then git checkout <branchName> = git checkout -b <branchName>

git checkout -b <branchName> <remoteName>/<remoteBranch> ≈ git branch --track <branchName> <remoteName>/<remoteBranch> (the latter creates the tracking branch without switching to it)

git fetch, then git merge = git pull

Lots of facades: actions that can be executed using one (or a combination of) flag(s) in some command may be pulled out into their own command. If you get into a bind, there is most probably a way to recover from the situation. Do not hesitate to seek help, for example on the git-users mailing list.

Page 40: Git basics with notes

Give It a Try

Explore the topics discussed so far by creating a new Git repository. Let us assume it has one file, foo.txt, with the contents “foo bar”. Person A changes it to “foo bar bas” in the user/personA branch and creates a pull request to merge this change in. Meanwhile, person B changes the contents of foo.txt to “food bazaar”. This commit gets merged into master first. For the purposes of this exercise, person B can commit directly to master. (Keep in mind that in a real-life scenario, the conflicting change will typically be introduced by the pull request for person B getting merged into master before that of person A.) Person A’s pull request now has merge conflicts, which will need to be resolved using rebase.
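If you get stuck, here is one possible walk-through of the exercise. It is a sketch under simplifying assumptions: both people are simulated in a single local repository with one placeholder identity, there is no pull request, and the resolution “food bazaar bas” is an arbitrary choice.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q; git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Example User"; git config user.email "user@example.com"  # placeholder identity
echo "foo bar" > foo.txt; git add foo.txt; git commit -q -m "Add foo.txt"
# Person A works on a private branch.
git checkout -q -b user/personA
echo "foo bar bas" > foo.txt; git commit -q -am "Person A: add bas"
# Person B's conflicting change lands on master first.
git checkout -q master
echo "food bazaar" > foo.txt; git commit -q -am "Person B: food bazaar"
# Person A rebases; Git stops because both sides changed the same line.
git checkout -q user/personA
git rebase master >/dev/null 2>&1 || echo "rebase stopped: conflict in foo.txt"
# Resolve by hand, mark the file as resolved, then continue.
echo "food bazaar bas" > foo.txt
git add foo.txt
GIT_EDITOR=true git rebase --continue >/dev/null   # keep the original message
git log --oneline
```

After the rebase, user/personA sits directly on top of master, so merging it no longer produces a conflict.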

Page 41: Git basics with notes

Git Resources

❖ Learn by playing: http://pcottle.github.io/learnGitBranching/

❖ Atlassian tutorial: https://www.atlassian.com/git/tutorials/setting-up-a-repository/

❖ Free CodeSchool course on Git: https://www.codeschool.com/courses/git-real

❖ StackOverflow is a great resource: http://stackoverflow.com/questions/2706797/finding-what-branch-a-commit-came-from

❖ Pro Git by Scott Chacon and Ben Straub: http://git-scm.com/book/en/v2

Page 42: Git basics with notes

Closing thoughts

❖ Git is a powerful source control tool designed to maximize the efficiency of the developer. Take full advantage of it!

❖ We’ve only explored the tip of the iceberg. May the power of Git be with you.