Deep Dark-side Of Git: How Git Works Internally
SeongJae Park
Git
DVCS (Distributed Version Control System)
Made By Linus Torvalds To Manage Linux
Git
Many Projects Use Git Because It’s Awesome
Git: Learning Curve
Some People Say It Is Hard To Learn
This Time, We Will...
See How Git Works From Scratch
Just For Fun... Or To Make Friends With Git
Forget About The Complicated Commands This Time
In Short,
Git Is A Content-Addressable File System
Blob, Tree, Commit, Reference. That’s It =3
Git: Unsung Heroes Behind
● Git Looks Graceful Thanks To The Plumbing Commands It Is Built From
  ○ Those Wounded Feet Are What We Are Interested In
Why VCS?
Usual Life Of File
FileA ver 0, FileB ver 0
FileA ver 0, FileB ver 1
FileA ver 1, FileB ver 1
FileA ver 1, FileB ver 2
We Need A Version Control System
A VCS Would...
● Record Every Change Safely
● Be Able To Check Out Any Version
● Make History Easy To Read
Brute VCS: File System
Rename / Back Up Every File Whenever A Change Is Made
$ ls
foo.c
foo_20140111.c
foo_final.c
foo_realfinal.c
foo_planb.c
foo_finalfinal.c
Git vs File System
● Git: Content-Addressable File System
● Key Is The SHA-1 Hash Of The Object’s Content (Plus A Short Header), Value Is The Content
  ○ The Same Content Is Never Saved Twice
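The idea of a content-addressable store can be sketched in a few lines of Python. This is a toy under stated assumptions, not Git's implementation: the class name ContentStore and its put/get methods are made up for illustration, the store lives in a dict instead of .git/objects, and the key is the SHA-1 of the raw content only (Git also hashes a short header).

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is derived from the value."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        key = hashlib.sha1(content).hexdigest()
        self.objects[key] = content  # same content -> same key -> one entry
        return key

    def get(self, key: str) -> bytes:
        return self.objects[key]


store = ContentStore()
k1 = store.put(b"eyes, mouth\n")
k2 = store.put(b"eyes, mouth\n")        # identical content, identical key
k3 = store.put(b"eyes, nose, mouth\n")  # different content, different key
print(k1 == k2, k1 != k3, len(store.objects))  # True True 2
```

Saving the same bytes twice costs nothing extra, which is the "never saved twice" property from the slide.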
Save / Load 'test content'
$ mkdir olaf; cd olaf; git init
Initialized empty Git repository in olaf/.git/
$ echo 'test content' | git hash-object -w --stdin
d670460b4b4aece5915caf5c68d12f560a9fe3e4
$ find .git/objects/ -type f
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4
$ git cat-file -p d67046
test content
$ git cat-file -t d67046
blob
What hash-object Does
content = "test content\n"    # echo appends the trailing newline
header = "blob %d\0" % length_of(content)
store = header + content
sha1 = sha1_of(store)
dir = ".git/objects/" + sha1[0:2] + "/"
filename = sha1[2:]
write(dir + filename, zlib_compress(store))
# Save compressed header + content at sha1 path
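The steps above can be checked with a short Python sketch. The helper names (encode_object, object_id, object_path) are illustrative assumptions, not Git functions, and nothing is written to disk: the sketch only computes the id and the path where the object would be stored.

```python
import hashlib
import zlib


def encode_object(kind: str, body: bytes) -> bytes:
    """Prepend the '<type> <size>' header and a NUL byte to the raw body."""
    return f"{kind} {len(body)}".encode() + b"\0" + body


def object_id(kind: str, body: bytes) -> str:
    """SHA-1 over header + body, as hex."""
    return hashlib.sha1(encode_object(kind, body)).hexdigest()


def object_path(oid: str) -> str:
    """First two hex digits name the directory, the rest names the file."""
    return f".git/objects/{oid[:2]}/{oid[2:]}"


body = b"test content\n"   # echo appends the trailing newline
oid = object_id("blob", body)
print(oid)                 # d670460b4b4aece5915caf5c68d12f560a9fe3e4
print(object_path(oid))    # .git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4

# What would land on disk is the zlib-compressed header + body.
on_disk = zlib.compress(encode_object("blob", body))
assert zlib.decompress(on_disk) == b"blob 13\0test content\n"
```

The printed id matches the one git hash-object returned for 'test content' earlier, which suggests the header format is the only thing added before hashing.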
Version Control Using Hash Value
$ echo "eyes, mouth" > head.txt
$ git hash-object -w head.txt
a134fc2477395ee1a59664a0b660085edde63d04
$ echo "eyes, nose, mouth" > head.txt
$ git hash-object -w head.txt
6546481b73fb62d0c627812e17e355d43d6ed30e
$ git cat-file -p a134f > head.txt
$ cat head.txt
eyes, mouth
Version Control Using Hash Value
● Pros:
  ○ Light Volume
● Cons:
  ○ Only Content, No Title
  ○ Hard To Manage Multiple Files
  ○ Hard To Manage History
  ○ Hard To Remember Hash Values
tree Object
Points To Other Objects (By Hash) With A Name
“A Root tree Object Is A Snapshot”
tree
  a113f2 main.c (blob)
  b8934 olaf.c (blob)
  c9240 include (tree)
    d9b13 true_love.h (blob)
tree Object
$ mkdir favorites; echo 'fantastic' > favorites/warm_hug
$ git update-index --add head.txt favorites/warm_hug
$ git write-tree
567167268c5c71bb647ca728bdb25f388d027f57
$ git cat-file -p 56716
040000 tree 799cf15c89acb88d76321b7b1529c8a9888fb9e2    favorites
100644 blob a134fc2477395ee1a59664a0b660085edde63d04    head.txt
$ git cat-file -p 799cf
100644 blob 7cc07dcddbcf92487065d4c12011e8a12f62a1bd    warm_hug
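A tree object is stored the same way as a blob, only the body differs: one "<mode> <name>", a NUL byte, and the 20 raw bytes of the id per entry. The sketch below is a minimal illustration of that layout; tree_body and tree_id are hypothetical helper names, and the entries are copied from the cat-file output above. Comparing the printed ids with 799cf and 56716 is a quick way to check the sketch against a real repository.

```python
import hashlib


def tree_body(entries):
    """Serialize (mode, name, hex_id) entries the way a tree object stores them."""
    def sort_key(entry):
        mode, name, _ = entry
        # Directory entries are ordered as if their name ended with '/'.
        return name.encode() + b"/" if mode == "40000" else name.encode()

    body = b""
    for mode, name, hex_id in sorted(entries, key=sort_key):
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
    return body


def tree_id(entries):
    body = tree_body(entries)
    return hashlib.sha1(f"tree {len(body)}".encode() + b"\0" + body).hexdigest()


# In the stored object the directory mode is "40000";
# cat-file -p pads it to 040000 for display.
favorites = [("100644", "warm_hug", "7cc07dcddbcf92487065d4c12011e8a12f62a1bd")]
root = [
    ("100644", "head.txt", "a134fc2477395ee1a59664a0b660085edde63d04"),
    ("40000", "favorites", "799cf15c89acb88d76321b7b1529c8a9888fb9e2"),
]
print(tree_id(favorites))
print(tree_id(root))
```

Because a tree stores only ids and names, an unchanged subdirectory contributes the same 20 bytes to every snapshot that contains it.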
Internal Data Structure
tree 56716
  a134f head.txt (blob)
  799cf favorites (tree)
    7cc07 warm_hug (blob)
Version Control Using tree Object
$ echo "eyes, nose, mouth" > head.txt
$ git update-index --add head.txt
$ git write-tree
e4885f26f1d82b59c42bf1ed207fec4f60655c35
$ git cat-file -p e4885
040000 tree 799cf15c89acb88d76321b7b1529c8a9888fb9e2    favorites
100644 blob 6546481b73fb62d0c627812e17e355d43d6ed30e    head.txt
$ git cat-file -p 65464
eyes, nose, mouth
Internal Data Structure
tree 56716
  a134f head.txt (blob)
  799cf favorites (tree)
    7cc07 warm_hug (blob)
tree e4885
  65464 head.txt (blob)
  799cf favorites (tree, shared with tree 56716)
Version Control Using tree Object
● Pros:
  ○ Light Volume
  ○ Manages Multiple Files Space-Efficiently
● Cons:
  ○ Only Content, No Title (Solved By tree Object)
  ○ Hard To Manage Multiple Files (Solved By tree Object)
  ○ Hard To Manage History
  ○ Hard To Remember Hash Values
commit Object
Describes Who / When / Why A Change Was Made
Points To A tree Object, Together With The Information Above
commit Object
$ echo '1st commit' | git commit-tree 56716
d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c
$ git cat-file -p d075c
tree 567167268c5c71bb647ca728bdb25f388d027f57
author SeongJae Park <s**@gmail.com> 1401359546 +0900
committer SeongJae Park <s**@gmail.com> 1401359546 +0900

1st commit
(author / committer Lines: Who And When. Message: Why)
Version Control Using commit Object
$ echo '2nd commit' | git commit-tree e4885 -p d075c
a9cd7374ce4951ab93aac75d78a45a245e27f414
$ git cat-file -p a9cd7
tree e4885f26f1d82b59c42bf1ed207fec4f60655c35
parent d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c
author SeongJae Park <s**@gmail.com> 1401360590 +0900
committer SeongJae Park <s**@gmail.com> 1401360590 +0900

2nd commit
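A commit body is plain text: a tree line, zero or more parent lines, author and committer lines, a blank line, then the message. The sketch below builds two such bodies in a toy in-memory store and walks the parent links, roughly what a history listing does. The names put, commit_tree and log are illustrative, and the identity string is a placeholder assumption, so the resulting ids will not match the ones on the slides.

```python
import hashlib

objects = {}  # toy object store: id -> (type, body)


def put(kind: str, body: bytes) -> str:
    oid = hashlib.sha1(f"{kind} {len(body)}".encode() + b"\0" + body).hexdigest()
    objects[oid] = (kind, body)
    return oid


def commit_tree(tree, message, parents=(),
                who="A U Thor <author@example.com>",  # placeholder identity
                when="1401359546 +0900"):
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who} {when}", f"committer {who} {when}", "", message]
    return put("commit", ("\n".join(lines) + "\n").encode())


def log(oid):
    """Follow parent links from a commit (first parent only), newest first."""
    while oid:
        headers, _, message = objects[oid][1].decode().partition("\n\n")
        yield oid, message.strip()
        parents = [line.split()[1] for line in headers.splitlines()
                   if line.startswith("parent ")]
        oid = parents[0] if parents else None


first = commit_tree("567167268c5c71bb647ca728bdb25f388d027f57", "1st commit")
second = commit_tree("e4885f26f1d82b59c42bf1ed207fec4f60655c35", "2nd commit",
                     parents=[first])
for oid, message in log(second):
    print(oid[:5], message)
```

History is nothing more than this chain: each commit names its tree (the snapshot) and its parent (the previous snapshot).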
Internal Data Structure
That’s Why People Say, “A Commit Is A Snapshot”
commit a9cd7 (2nd commit)
  parent: commit d075c
  tree e4885
    65464 head.txt (blob)
    799cf favorites (tree, shared with tree 56716)
commit d075c (1st commit)
  tree 56716
    a134f head.txt (blob)
    799cf favorites (tree)
      7cc07 warm_hug (blob)
Version Control Using commit Object
● Pros:
  ○ Light Volume
  ○ Manages Multiple Files Space-Efficiently
  ○ Easy To Know Who / When / Why Made A Change
● Cons:
  ○ Only Content, No Title (Solved By tree Object)
  ○ Hard To Manage Multiple Files (Solved By tree Object)
  ○ Hard To Manage History (Solved By commit Object)
  ○ Hard To Remember Hash Values
Git References
A File Storing A SHA-1 Value
Resides In .git/refs/
Git References Using echo
$ echo "d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c" > .git/refs/heads/first
$ git log --pretty=oneline first
d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c 1st commit
$ find .git/refs/heads -type f
.git/refs/heads/first
.git/refs/heads/master
Git References Using update-ref
$ git update-ref refs/heads/master a9cd7
$ git log --pretty=oneline master
a9cd7374ce4951ab93aac75d78a45a245e27f414 2nd commit
d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c 1st commit
$ find .git/refs/heads -type f
.git/refs/heads/first
.git/refs/heads/master
$ cat .git/refs/heads/master
a9cd7374ce4951ab93aac75d78a45a245e27f414
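Since a loose reference is just a one-line file, the two approaches above can be imitated in Python. This is a simplified sketch: update_ref and read_ref are illustrative helpers, a temporary directory stands in for .git, and the real command does more (for example resolving abbreviated ids and taking locks), none of which is modelled here.

```python
import os
import tempfile


def update_ref(git_dir: str, ref: str, oid: str) -> None:
    """Point a reference at an object id by writing a one-line file."""
    path = os.path.join(git_dir, ref)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(oid + "\n")


def read_ref(git_dir: str, ref: str) -> str:
    with open(os.path.join(git_dir, ref)) as f:
        return f.read().strip()


git_dir = tempfile.mkdtemp()  # stand-in for a real .git directory
update_ref(git_dir, "refs/heads/first", "d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c")
update_ref(git_dir, "refs/heads/master", "a9cd7374ce4951ab93aac75d78a45a245e27f414")
print(read_ref(git_dir, "refs/heads/master"))
print(sorted(os.listdir(os.path.join(git_dir, "refs", "heads"))))
```

Moving a branch is therefore a tiny write: only the 40 hex digits in one file change, no objects are touched.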
Internal Data Structure
refs/heads/master -> commit a9cd7
refs/heads/first -> commit d075c
commit a9cd7 (2nd commit)
  parent: commit d075c
  tree e4885
    65464 head.txt (blob)
    799cf favorites (tree, shared with tree 56716)
commit d075c (1st commit)
  tree 56716
    a134f head.txt (blob)
    799cf favorites (tree)
      7cc07 warm_hug (blob)
Version Control Using Reference
● Pros:
  ○ Light Volume
  ○ Manages Multiple Files Space-Efficiently
  ○ Easy To Know Who / When / Why Made A Change
  ○ Easy To Point To A Snapshot
● Cons:
  ○ Only Content, No Title (Solved By tree Object)
  ○ Hard To Manage Multiple Files (Solved By tree Object)
  ○ Hard To Manage History (Solved By commit Object)
  ○ Hard To Remember Hash Values (Solved By Reference)
How Does Git Know The Current Commit?
Answer: HEAD
HEAD Points To A Reference Using The ref Format (Not A SHA-1)
$ cat .git/HEAD
ref: refs/heads/master
HEAD
$ cat .git/HEAD
ref: refs/heads/master
$ git branch
  first
* master
$ git symbolic-ref HEAD refs/heads/first
$ cat .git/HEAD
ref: refs/heads/first
$ git branch
* first
  master
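Finding the current commit is then a two-step lookup: read HEAD, and if it holds "ref: <name>", read that file in turn. A minimal sketch, assuming only loose files in a temporary directory standing in for .git; write_file and resolve are illustrative names, not Git APIs.

```python
import os
import tempfile


def write_file(git_dir: str, name: str, text: str) -> None:
    path = os.path.join(git_dir, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text + "\n")


def resolve(git_dir: str, name: str = "HEAD") -> str:
    """Follow 'ref: ...' indirections until a plain object id is reached."""
    with open(os.path.join(git_dir, name)) as f:
        value = f.read().strip()
    if value.startswith("ref: "):
        return resolve(git_dir, value[len("ref: "):])
    return value


git_dir = tempfile.mkdtemp()  # stand-in for a real .git directory
write_file(git_dir, "refs/heads/first", "d075cbd627bc3159be9c77e96b4dc44d8e9d8c4c")
write_file(git_dir, "refs/heads/master", "a9cd7374ce4951ab93aac75d78a45a245e27f414")

write_file(git_dir, "HEAD", "ref: refs/heads/master")
on_master = resolve(git_dir)
write_file(git_dir, "HEAD", "ref: refs/heads/first")  # like git symbolic-ref HEAD refs/heads/first
on_first = resolve(git_dir)
print(on_master[:5], on_first[:5])  # a9cd7 d075c
```

Switching the current branch rewrites only HEAD; the branch files and the objects stay as they are.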
Internal Data Structure
.git/HEAD -> refs/heads/first
refs/heads/master -> commit a9cd7
refs/heads/first -> commit d075c
commit a9cd7 (2nd commit)
  parent: commit d075c
  tree e4885
    65464 head.txt (blob)
    799cf favorites (tree, shared with tree 56716)
commit d075c (1st commit)
  tree 56716
    a134f head.txt (blob)
    799cf favorites (tree)
      7cc07 warm_hug (blob)
One More Thing
Cloned. Now, Fetch Or Pull?
Fetch / Pull
Fetch Or Pull To Get The Latest Code?
Fetch
● Just Fetches The Remote Repository’s Objects And References Into Git’s Internal Storage
● If You Need The Changes In Your Working Directory,
  ○ Manually Merge Them Using git-merge, Or
  ○ Check Them Out
Fetch
Refspec Describes Source / Destination
$ grep -A 2 '\[remote' .git/config
[remote "origin"]
    url = git://127.0.0.1/git/olaf.git
    fetch = +refs/heads/*:refs/remotes/origin/*
(Refspec Format: Source:Destination)
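How the refspec maps a remote branch name to a local one can be shown with a small sketch. The functions parse_refspec and map_ref are hypothetical helpers written for this illustration; they cover only the single-wildcard form shown above, with the leading "+" read as "update even when not a fast-forward".

```python
def parse_refspec(refspec: str):
    """Split '+<src>:<dst>' into (force, src, dst)."""
    force = refspec.startswith("+")
    if force:
        refspec = refspec[1:]
    src, dst = refspec.split(":", 1)
    return force, src, dst


def map_ref(refspec: str, remote_ref: str):
    """Return the local name for remote_ref, or None if the source side does not match."""
    _, src, dst = parse_refspec(refspec)
    if "*" not in src:
        return dst if remote_ref == src else None
    prefix, suffix = src.split("*", 1)
    if len(remote_ref) < len(prefix) + len(suffix):
        return None
    if not (remote_ref.startswith(prefix) and remote_ref.endswith(suffix)):
        return None
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return dst.replace("*", matched, 1)


spec = "+refs/heads/*:refs/remotes/origin/*"
print(parse_refspec(spec))                 # (True, 'refs/heads/*', 'refs/remotes/origin/*')
print(map_ref(spec, "refs/heads/master"))  # refs/remotes/origin/master
print(map_ref(spec, "refs/tags/v1"))       # None
```

The remote's branches thus end up under refs/remotes/origin/ and never overwrite the local refs/heads/ branches.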
Fetch
url = git://10.0.0.1/git/olaf.git
fetch = +refs/heads/*:refs/remotes/origin/*
[Diagram, Before Fetch]
Remote git://10.0.0.1/git/olaf.git: Two commits (The Second Has A parent), refs/heads/master, .git/HEAD
Local file:///home/sjpark/olaf: One commit, refs/heads/master, .git/HEAD
[Diagram, After Fetch]
Remote git://10.0.0.1/git/olaf.git: Unchanged
Local file:///home/sjpark/olaf: Both commits, New refs/remotes/origin/master, refs/heads/master, .git/HEAD
git merge origin/master
[Diagram, Before And After The Merge: The Two-commit Object Graph With refs/remotes/origin/master, refs/heads/first And .git/HEAD]
Pull
Pull Is Just A Fetch Followed By A Merge
Merge Conflicts May Occur…
Pull Is Sufficient For Simple Projects
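The "fetch then merge" relationship can be simulated with plain dictionaries. Everything here is a simplified assumption: repositories are dicts, commits carry only a parent link, ids are the abbreviated ones from earlier slides, and the merge step handles only the fast-forward case (a diverged history raises an error where a real merge, possibly with conflicts, would be needed). The working directory is not modelled.

```python
def fetch(local, remote, name="origin"):
    """Copy missing objects and record remote branches under refs/remotes/<name>/."""
    for oid, obj in remote["objects"].items():
        local["objects"].setdefault(oid, obj)
    for ref, oid in remote["refs"].items():
        if ref.startswith("refs/heads/"):
            branch = ref[len("refs/heads/"):]
            local["refs"][f"refs/remotes/{name}/{branch}"] = oid


def is_ancestor(objects, ancestor, oid):
    """Walk parent links from oid and report whether ancestor is reached."""
    while oid is not None:
        if oid == ancestor:
            return True
        oid = objects[oid]["parent"]
    return False


def merge_fast_forward(local, branch, other):
    """Move branch to other only when no real merge is needed."""
    ours, theirs = local["refs"][branch], local["refs"][other]
    if not is_ancestor(local["objects"], ours, theirs):
        raise RuntimeError("histories diverged: a real merge is needed")
    local["refs"][branch] = theirs


def pull(local, remote, branch="refs/heads/master"):
    fetch(local, remote)
    tracking = "refs/remotes/origin/" + branch[len("refs/heads/"):]
    merge_fast_forward(local, branch, tracking)


remote = {"objects": {"d075c": {"parent": None}, "a9cd7": {"parent": "d075c"}},
          "refs": {"refs/heads/master": "a9cd7"}}
local = {"objects": {"d075c": {"parent": None}},
         "refs": {"refs/heads/master": "d075c"}}

fetch(local, remote)
after_fetch = dict(local["refs"])  # local branch untouched, tracking ref added
pull(local, remote)
after_pull = dict(local["refs"])   # local branch now fast-forwarded
print(after_fetch)
print(after_pull)
```

Fetch alone never moves refs/heads/master; only the merge half of pull does, which is why fetch is the safer choice when the incoming changes should be inspected first.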
In Short,
Git Is A Content-Addressable File System
Blob, Tree, Commit, Reference. That’s It =3
Thank you :)
Slide-share
http://www.slideshare.net/SeongJaePark1/deep-darkside-ofgit
References
http://git-scm.com/
http://git-scm.com/book
http://www.youtube.com/watch?v=4XpnKHJAok8
http://disney.wikia.com/wiki/Frozen
This work by SeongJae Park is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported
License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/.