Tracking Huge Files with Git LFS

Mar 19, 2017

Atlassian
Transcript
Page 1: Tracking Huge Files with Git LFS

Tracking huge files with Git LFS

LARS SCHNEIDER • GIT SOLUTIONS LEAD • AUTODESK • @KIT3BUS

STEVE SMITH • DEVOPS ADVOCATE • ATLASSIAN • @TARKASTEVE

Page 2: Tracking Huge Files with Git LFS

Agenda

• The Git data model
• The problem with big files
• The Git LFS model
• Git LFS personas
• Migration

Page 3: Tracking Huge Files with Git LFS

data model

Page 4: Tracking Huge Files with Git LFS

$> git init
$> tree .git/objects
.git/objects
├── info
└── pack

2 directories

Page 5: Tracking Huge Files with Git LFS

$> touch some-file.txt
$> git add some-file.txt

Page 6: Tracking Huge Files with Git LFS

$> tree .git/objects
.git/objects
├── e6
│   └── 9de29bb2d1d6434b8b29ae775ad8c2e48c5391
├── info
└── pack

3 directories, 1 file

Object name: SHA1
Object content: zlib compressed
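
The object name on this slide can be reproduced by hand. The following is a minimal Python sketch, not part of the original deck: Git hashes a short header plus the file content with SHA-1, and the first two hex digits become the directory name. The helper names `git_blob_id` and `loose_object_path` are made up for illustration.

```python
import hashlib
import zlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object name Git assigns to a blob with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Map an object name to its location below .git/objects."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


# some-file.txt was created with `touch`, so its content is empty.
empty_id = git_blob_id(b"")
print(empty_id)
print(loose_object_path(empty_id))

# The file on disk holds the zlib-compressed header plus content.
stored = zlib.compress(b"blob 0\0")
print(zlib.decompress(stored))
```

For the empty file this yields the same `e6/9de29b…` path that `tree` prints above.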

Page 7: Tracking Huge Files with Git LFS

master

98ca9..

bab1e..

fad3d..

$ cat .git/refs/heads/master

fad3dd41d0cf3d1b6aa2d8ad0549ab2fcb1575d1

“Directed Acyclic Graph”

Page 8: Tracking Huge Files with Git LFS

master

98ca9..

bab1e..

fad3d..

$ git cat-file -p 98ca9
tree 434bb..
parent bab1e..
author Tim P <kannonboy@…> 1455209277 -0800
committer Tim P <kannonboy@…> 1455209277 -0800

My life is my commit message.

Page 9: Tracking Huge Files with Git LFS

$ git cat-file -p 434bb
100644 blob ace23..    .gitignore
100644 blob dbdbd..    README.md
040000 tree a0bc3..    app
040000 tree 33d33..    config
100755 blob b1de7..    deploy-prod.sh
100755 blob 7011e..    deploy-staging.sh

Columns: filemode, type, SHA-1, file name

master

98ca9..

bab1e..

fad3d..

434bb..

Page 10: Tracking Huge Files with Git LFS

master

98ca9..

bab1e..

fad3d..

434bb..

Page 12: Tracking Huge Files with Git LFS

98ca9..

bab1e..

fad3d..

master

Page 15: Tracking Huge Files with Git LFS

50 MB

100 MB

150 MB

98ca9..

bab1e..

fad3d..

master
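
One way to read the three sizes on this slide, sketched in Python under an assumption the deck does not spell out: a hypothetical 50 MB binary is rewritten in each of the three commits, and delta compression saves nothing, so every version is kept in full.

```python
from itertools import accumulate

# Hypothetical: one 50 MB binary asset, rewritten in each of three commits.
version_sizes_mb = [50, 50, 50]

# Every version is stored in full, so the history that each clone
# has to download grows with every commit.
history_sizes_mb = list(accumulate(version_sizes_mb))

for commit_number, total in enumerate(history_sizes_mb, start=1):
    print(f"after commit {commit_number}: {total} MB of history")
```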

Page 17: Tracking Huge Files with Git LFS

Git LFS (Large File Storage)

Page 18: Tracking Huge Files with Git LFS

dabad..

98ca9..

bab1e..

fad3d..

86753..

434bb..

LFS store

Git host

Page 19: Tracking Huge Files with Git LFS

Git host

LFS store

dabad..

98ca9..

bab1e..

fad3d..

86753..

434bb..

Page 20: Tracking Huge Files with Git LFS

LFS store

$ git push

dabad..

98ca9..

bab1e..

fad3d..

86753..

434bb..

Git host

Page 21: Tracking Huge Files with Git LFS

$ git pull

LFS store

Git host

dabad..

98ca9..

bab1e..

fad3d..

86753..

434bb..

4749d..

bdd12..

778aa..

Page 22: Tracking Huge Files with Git LFS

$ git checkout bab1e

LFS store

Git host

dabad..

98ca9..

bab1e..

fad3d..

86753..

434bb..

4749d..

bdd12..

778aa..

HEAD

Page 23: Tracking Huge Files with Git LFS

$ git cat-file -p 4749d
version https://git-lfs.github.com/spec/v1
oid sha256:325ddfb…
size 29342295
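
A pointer of this shape can be derived from the file content alone. The Python sketch below is illustrative rather than taken from the deck; the function names `make_pointer` and `parse_pointer` are hypothetical, and only the three keys shown on the slide are handled.

```python
import hashlib

LFS_SPEC_VERSION = "https://git-lfs.github.com/spec/v1"


def make_pointer(content: bytes) -> str:
    """Build the small text file that Git stores in place of the content."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        f"version {LFS_SPEC_VERSION}\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


def parse_pointer(text: str) -> dict:
    """Split a pointer into its version, oid and size fields."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    return {
        "version": fields["version"],
        "oid": fields["oid"],
        "size": int(fields["size"]),
    }


pointer_text = make_pointer(b"hello")
print(pointer_text)
print(parse_pointer(pointer_text))
```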

dabad..

98ca9..

bab1e..

fad3d..

86753..

434bb..

4749d..

bdd12..

778aa..

Page 24: Tracking Huge Files with Git LFS

$ git add

massive_video.mp4

Work tree

dev

.git/lfs/objects

Clean filter (git-lfs clean)

Index

massive_video.mp4

.git/objects

Page 25: Tracking Huge Files with Git LFS

$ git checkout

dev

.git/lfs/objects

Smudge filter (git-lfs smudge)

Work tree

massive_video.mp4

Commit tree

massive_video.mp4

.git/objects

LFS Store
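
The two filter slides can be summarised as a round trip. This is a simplified Python sketch, not how git-lfs is implemented: the dictionary `lfs_objects` stands in for `.git/lfs/objects`, the functions `clean` and `smudge` are hypothetical, and the download from the LFS store for content that is missing locally is left out.

```python
import hashlib

# Stand-in for .git/lfs/objects: maps a SHA-256 oid to the file content.
lfs_objects = {}


def clean(content: bytes) -> bytes:
    """On `git add`: keep the content aside and hand Git a small pointer."""
    oid = hashlib.sha256(content).hexdigest()
    lfs_objects[oid] = content
    pointer = (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )
    return pointer.encode("utf-8")


def smudge(pointer: bytes) -> bytes:
    """On `git checkout`: swap the pointer for the content it names."""
    prefix = "oid sha256:"
    for line in pointer.decode("utf-8").splitlines():
        if line.startswith(prefix):
            return lfs_objects[line[len(prefix):]]
    raise ValueError("not a Git LFS pointer")


video = b"\x00\x01" * 1000  # placeholder bytes for massive_video.mp4
pointer = clean(video)
print(len(video), "bytes in the work tree,", len(pointer), "bytes in Git")
print(smudge(pointer) == video)
```

Git itself only ever sees the pointer, which is why the repository stays small however large the video is.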

Page 26: Tracking Huge Files with Git LFS

.git/lfs/objects

.git/objects

Hosted repo

LFS store

git push / pull

Page 27: Tracking Huge Files with Git LFS

$ brew install git-lfs

$ git lfs install

Page 28: Tracking Huge Files with Git LFS

$ cat ~/.gitconfig

[filter "lfs"]
    clean = git-lfs clean %f
    smudge = git-lfs smudge %f
    required = true

Page 29: Tracking Huge Files with Git LFS

$ git lfs track "*.mp4"

$ cat .gitattributes

*.mp4 filter=lfs diff=lfs merge=lfs -text

Page 30: Tracking Huge Files with Git LFS

@kit3bus

Lars Schneider Autodesk Inc.

Git and Git LFS contributor

Technical Lead for Git at Autodesk

Page 31: Tracking Huge Files with Git LFS

Who are we?

• Best known for AutoCAD (2D and 3D computer-aided design)
• 33 years in business
• 4000 engineers, hundreds of products, terabytes of code and asset data

Page 32: Tracking Huge Files with Git LFS

Architecture, Engineering and Construction

Image by Dave Tyner, Autodesk Plant 3D - P&ID

Page 33: Tracking Huge Files with Git LFS

Manufacturing

Page 34: Tracking Huge Files with Git LFS

Media and Entertainment

Page 35: Tracking Huge Files with Git LFS

3D Printing

"Future of Making Things"

Image courtesy of Local Motors Inc.

Page 36: Tracking Huge Files with Git LFS

What do we use Git LFS for?

Integration Test Data (3D Models, ...)

Auxiliary Data (Documentation, Images, Videos, ...)

Build Artifacts (not recommended)

Page 37: Tracking Huge Files with Git LFS

Git LFS Usage: What have we learned?

• Developer
• Migrator
• Administrator

Page 38: Tracking Huge Files with Git LFS

Migrator

Page 39: Tracking Huge Files with Git LFS

Migration Process

1. Identify an engineer with deep code knowledge
2. Create a "demo" migration on the Git migration server
3. Iterate on the "demo" migration until repo and CI are OK
4. Ask the broader team to "play" with the "demo" migration
5. Perform the migration on the Git production server

Page 42: Tracking Huge Files with Git LFS

How to migrate?

git-svn / git-p4 / git-tfs ...

+

git filter-branch / git-lfs-migrate

Page 43: Tracking Huge Files with Git LFS

How to migrate?

git-svn / git-p4 / git-tfs ...

+

git filter-branch / git-lfs-migrate

git-p4

Page 44: Tracking Huge Files with Git LFS

Git LFS Migration Gotchas

Discard large file history

1998: code
2007: code
2016: code +++

Page 45: Tracking Huge Files with Git LFS

Git LFS Migration Gotchas

Avoid "orphaned" LFS files after history rewrite

LFS Ptr

Git Repo

LFS Storage
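
One reading of "orphaned" here, which the slide does not define: an object sits in LFS storage but no pointer in the rewritten history names it any more. Under that assumption the check is a set difference, sketched in Python below with made-up oids and a hypothetical helper `orphaned_oids`.

```python
def orphaned_oids(stored_oids, referenced_oids):
    """Return oids present in LFS storage that no pointer refers to."""
    return sorted(set(stored_oids) - set(referenced_oids))


# Made-up oids: what the LFS storage holds ...
stored = {"aaa111", "bbb222", "ccc333"}
# ... and what the pointers in the rewritten history still name.
referenced = {"aaa111", "ccc333"}

print(orphaned_oids(stored, referenced))
```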

Page 46: Tracking Huge Files with Git LFS

Developer (includes designer, tester, ...)

Page 47: Tracking Huge Files with Git LFS

Teach why "Large" files are a problem!

All history is local. Good for source files.

Problem for large files.

Page 48: Tracking Huge Files with Git LFS

What is a "problematic" file?

Files that do not compress well...

Page 49: Tracking Huge Files with Git LFS

What is a "problematic" file?

... and change frequently.

Mon Tue Wed

Page 50: Tracking Huge Files with Git LFS

What is a "problematic" file?

Rule of thumb: Files smaller than 500 KB are OK.
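
To see which files in a work tree exceed that rule of thumb, a scan along the following lines could help. It is a Python sketch only: the function `files_over_limit`, the constant `SIZE_LIMIT_BYTES` and the demo file names are invented, and the limit is taken as 500,000 bytes.

```python
import os
import tempfile

SIZE_LIMIT_BYTES = 500 * 1000  # the 500 KB rule of thumb, read as 500,000 bytes


def files_over_limit(root, limit=SIZE_LIMIT_BYTES):
    """Yield (relative path, size) for every file under root above the limit."""
    for directory, subdirs, filenames in os.walk(root):
        # Skip Git's own bookkeeping directory.
        subdirs[:] = [d for d in subdirs if d != ".git"]
        for name in filenames:
            path = os.path.join(directory, name)
            size = os.path.getsize(path)
            if size > limit:
                yield os.path.relpath(path, root), size


# Demo on a throwaway directory with one small and one large file.
with tempfile.TemporaryDirectory() as demo_root:
    with open(os.path.join(demo_root, "small.txt"), "wb") as handle:
        handle.write(b"x" * 10)
    with open(os.path.join(demo_root, "big.bin"), "wb") as handle:
        handle.write(b"x" * 600_000)
    found = list(files_over_limit(demo_root))

print(found)
```

Pointing `files_over_limit` at a real checkout gives a list of candidates for LFS tracking.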

Page 51: Tracking Huge Files with Git LFS

How to track Git LFS files?

git lfs track "*.png"

Page 53: Tracking Huge Files with Git LFS

How to track Git LFS files?

git lfs track "*.lfs.*"

e.g. /images/elephant.lfs.png

Page 54: Tracking Huge Files with Git LFS

How to track Git LFS files?

git lfs track "/big/*"

e.g. /big/elephant.png

Page 55: Tracking Huge Files with Git LFS

How to track Git LFS files?

git lfs track "/xxl.png"
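
The tracking patterns on the last few slides differ in what they match. The Python sketch below only approximates `.gitattributes` matching for simple patterns like these; it ignores `**`, trailing slashes and other rules. The function `roughly_matches` is hypothetical, and paths are written relative to the repository root without the leading slash the slides use.

```python
from fnmatch import fnmatchcase


def roughly_matches(pattern: str, path: str) -> bool:
    """Approximate .gitattributes matching for simple patterns (no `**`)."""
    if "/" not in pattern:
        # No slash: the pattern is compared with the file name at any depth.
        return fnmatchcase(path.rsplit("/", 1)[-1], pattern)
    # With a slash: the pattern is anchored at the directory that holds
    # .gitattributes, and `*` does not cross directory separators.
    pattern_parts = pattern.lstrip("/").split("/")
    path_parts = path.split("/")
    return len(pattern_parts) == len(path_parts) and all(
        fnmatchcase(part, pat) for part, pat in zip(path_parts, pattern_parts)
    )


examples = [
    ("*.png", "images/elephant.png"),
    ("*.lfs.*", "images/elephant.lfs.png"),
    ("*.lfs.*", "images/elephant.png"),
    ("/big/*", "big/elephant.png"),
    ("/big/*", "images/elephant.png"),
    ("/xxl.png", "xxl.png"),
    ("/xxl.png", "images/xxl.png"),
]
for pattern, path in examples:
    print(f"{pattern!r:12} {path!r:28} {roughly_matches(pattern, path)}")
```

The narrower the pattern, the fewer files end up in LFS by accident: `*.png` catches every PNG in the repository, while `/xxl.png` catches exactly one file.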

Page 56: Tracking Huge Files with Git LFS

How to track Git LFS files?

Rule of thumb: Fewer than 1000 files in LFS are OK.

Up to 70x speed improvement pending!

Page 57: Tracking Huge Files with Git LFS

Git LFS Gotchas

Case sensitive:
git lfs track "*.png"

Case insensitive:
git lfs track "*.[pP][nN][gG]"
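
Writing the bracketed form by hand gets tedious for longer extensions. A small helper can generate it; this Python sketch is illustrative, and the name `case_insensitive_pattern` is made up.

```python
def case_insensitive_pattern(extension: str) -> str:
    """Build a glob such as *.[pP][nN][gG] from an extension such as "png"."""
    parts = []
    for char in extension:
        if char.isalpha():
            # Letters become a two-character class covering both cases.
            parts.append(f"[{char.lower()}{char.upper()}]")
        else:
            # Digits and other characters have no case and stay as they are.
            parts.append(char)
    return "*." + "".join(parts)


print(case_insensitive_pattern("png"))
print(case_insensitive_pattern("mp4"))
```

The result can be passed to `git lfs track` in quotes, as on the slide.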

Page 58: Tracking Huge Files with Git LFS

Git LFS Gotchas

No line ending conversions on LFS files!

Page 59: Tracking Huge Files with Git LFS

Git LFS Tips & Tricks

Use the latest Git / Git LFS version!

Page 60: Tracking Huge Files with Git LFS

Git LFS Tips & Tricks

Set up your Git credential helper (or use SSH)!

Watch out for the "administrator" shell!

Page 61: Tracking Huge Files with Git LFS

Git LFS Tips & Tricks

git lfs clone <URL>

Use Git 2.9+ if your submodules contain Git LFS files.

Page 62: Tracking Huge Files with Git LFS

Git LFS Tips & Tricks

Use Git Sparse Checkout if you have too many LFS files!

Page 63: Tracking Huge Files with Git LFS

Administrator

Page 64: Tracking Huge Files with Git LFS

How to make sure Git LFS is used properly?

Configure Git LFS on all platforms!

Enterprise Config for Git: https://git.io/vi1F4

Page 65: Tracking Huge Files with Git LFS

How to make sure Git LFS is used properly?

"What happens in Git, stays in Git."

Page 66: Tracking Huge Files with Git LFS

How to make sure Git LFS is used properly?

Rewriting history can cause a lot of trouble!

Page 67: Tracking Huge Files with Git LFS

How to make sure Git LFS is used properly?

Configure file size limit on Git server!

Page 68: Tracking Huge Files with Git LFS

How to make sure Git LFS is used properly?

Configure file size limit with local Git pre-commit hooks!
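
The deck does not show such a hook, so the following Python sketch is an assumption about what one could look like. It measures the staged blob rather than the work-tree file, so a file that already goes through the LFS clean filter is measured by its small pointer. The limit and all function names are hypothetical, and edge cases such as submodule entries are not handled.

```python
import subprocess

# Hypothetical limit, borrowed from the 500 KB rule of thumb.
SIZE_LIMIT_BYTES = 500 * 1000


def run_git(*args: str) -> str:
    """Run a git command and return its standard output as text."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout


def staged_blob_sizes() -> dict:
    """Map each added or modified staged path to the size of its staged blob."""
    listing = run_git("diff", "--cached", "--name-only", "--diff-filter=AM", "-z")
    paths = [path for path in listing.split("\0") if path]
    # `:<path>` names the index entry, so an LFS-tracked file is measured
    # by its pointer, not by the large content in the work tree.
    return {path: int(run_git("cat-file", "-s", f":{path}")) for path in paths}


def oversized(sizes: dict, limit: int = SIZE_LIMIT_BYTES) -> list:
    """Return the paths whose staged size exceeds the limit."""
    return sorted(path for path, size in sizes.items() if size > limit)


def main() -> int:
    too_big = oversized(staged_blob_sizes())
    for path in too_big:
        print(f"{path}: larger than {SIZE_LIMIT_BYTES} bytes, track it with Git LFS")
    return 1 if too_big else 0
```

To try it, the script would be saved as an executable `.git/hooks/pre-commit` with a Python shebang line and `raise SystemExit(main())` at the end; a non-zero exit status makes Git abort the commit.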

Page 69: Tracking Huge Files with Git LFS

How to make sure Git LFS is used properly?

Use code reviews and limit write access to shared branches!

At least initially.

Page 70: Tracking Huge Files with Git LFS

Takeaways

• Git LFS works

• Use the latest Git/Git LFS version

• Use `git lfs clone`

• Track problematic files in Git LFS

• Reject problematic files in Git

• Keep an eye on # of tracked files