AdamHarding_Git_demo

Version Control with Git
xkcd.com/1597
adam-harding@uiowa.edu
Every project has 2 problems…
stuff.h
awesome.h
things.c
Problem 1: Progress
Project changes over time: Old? New? Both?
[Figure: three snapshots of the project files (stuff.h, awesome.h, things.c): “initial effort” (last month), “add stuff” (last week), “change stuff” (yesterday).]
• When was some feature added? Why?
• Maintain previous version while working on new?
• Fix the same bug once in old and new version?
“Many people’s version-control method of choice is to copy files into another directory (perhaps a timestamped directory, if they’re clever). This approach is very common because it is so simple, but it is also incredibly error prone.” –Pro Git
Difficult even if you are the only one!
But collaboration is totally impractical, because of…
Problem 2: Edit Wars
Multiple, simultaneous changes: Laptops? Colleagues?
[Figure: two copies of the project (stuff.h, awesome.h, things.c) are edited at the same time. One things.c contains “int main(){} int foo(){}”, the other contains “int main(){} void foo(){}”. The combined result: ???]
Can I keep both?
Different files:
• Overwrite old version of each.
  (Trivial. A computer could do it without hints!)
Same file, but lines don’t overlap:
• Edit the file to incorporate each line modification.
  (Manually? Hmm…)
Same file, same lines:
• Conflicts! For each line, I choose what to do.
  (Can the computer help? Hmm…)
Solution: VCS
Version Control System
[Figure: a Repository holds snapshots v7 (awesome.h, things.c), v8 (awesome.h, things.c), and v9 (stuff.h, awesome.h, things.c). A VCS tool sits between the repository and your files. 1: Get bits. 2: Store changes.]
Why Git:
• Basic features are really useful, and…
  …advanced features if you want them
• Robustly retains your data, but…
  …can undo almost anything if needed
• Adaptable: you, the lab, MegaCorp…
Git: Outline
• Using Git to record your project while you work
• How Git records your progress
• Using Git to organize your progress
• Sharing a project with other Git users
Commit
• Primary unit for storing work
• Exists inside a repository
• A snapshot of what your project’s files looked like when you made the
commit
• Contains: author, email address, timestamp, commit message, snapshot data, link(s) to parent commit(s)
• Identified by a unique hash of its contents
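To see these fields on a real commit, one option is `git cat-file -p`, which prints the raw commit object. The sketch below builds a throwaway repository; the identity, file contents, and scratch directory are made up for illustration.

```shell
set -eu
cd "$(mktemp -d)"                        # scratch directory, not a real project
git init -q .
git config user.name "Demo User"         # hypothetical identity
git config user.email "demo@example.com"
echo 'int main(){}' > things.c
git add things.c
git commit -q -m 'add initial functions'

commit_hash=$(git rev-parse HEAD)        # the unique hash that names the commit
commit_body=$(git cat-file -p HEAD)      # tree (the snapshot), author, committer, message
echo "$commit_hash"
echo "$commit_body"
```

The `tree` line names the snapshot; a later commit would also carry a `parent` line.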
Commit History
• Unless an “orphan”, each commit descends (as the “child”) from at least
one earlier commit (“parent”)
• Project history is all the project’s commits in order
• Yes, history is a Directed Acyclic Graph whose nodes are commits. Hold
that thought!
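The parent links can be inspected directly. A minimal sketch, assuming a scratch repository with two commits (the second commit message is hypothetical): `<commit>^` follows a commit to its parent, and the `%P` format prints a commit's parent hashes.

```shell
set -eu
cd "$(mktemp -d)"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int main(){}' > things.c
git add things.c
git commit -q -m 'add initial functions'
first=$(git rev-parse HEAD)

echo 'int foo(){}' >> things.c
git add things.c
git commit -q -m 'add foo'
second=$(git rev-parse HEAD)

parent_of_second=$(git rev-parse "$second^")          # follow the child-to-parent link
parents_of_first=$(git log -1 --format=%P "$first")   # empty: the first commit has no parent
git log --format='%h parents=[%p] %s'                 # newest commit first
```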
Repository: 3 Areas
• Working Copy
  The file tree you see on your filesystem
• Staging Area
  List of changes to the previous snapshot you will apply as the next snapshot
• Commit History
  The snapshots you stored previously
[Figure: stuff.h, awesome.h, and things.c in the Working Copy, next to the Staging Area and the Commit History.]
Demo:
Store a new project!
1) Files are ready!
(^>^) cd somefolder
[Figure: somefolder/ contains awesome.h and things.c.]
0) (Need a repository first)
(^>^) git init # Create a new, empty repository in the current directory
[Figure: somefolder/.git/ now holds the Staging Area and the Commit History.]
1) (Files still untracked)
(^>^) git status
awesome.h # untracked
things.c # untracked
# (Git sees any file, but doesn’t care about it until you say so.)
2) Add changes
(^>^) git add things.c awesome.h
(^>^) # Changes are staged.
# Git starts tracking a file the first time you add its changes!
2) (These are now tracked)
(^>^) git status
new file: awesome.h
new file: things.c
3) Commit staged changes
(^>^) git commit -m 'add initial functions'
(^>^)
(^>^) # Always write a nice commit message!
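The whole cycle above can be replayed as a script. This is a sketch under assumptions: the scratch directory, file contents, and identity are invented, and `git status --short` is used because its two-character codes (`??` untracked, `A ` staged as new) are easy to compare.

```shell
set -eu
cd "$(mktemp -d)"
mkdir somefolder && cd somefolder
echo '/* awesome */' > awesome.h
echo 'int main(){}' > things.c

git init -q .
git config user.name "Demo User"      # hypothetical identity
git config user.email "demo@example.com"

before_add=$(git status --short)      # both files untracked
git add things.c awesome.h
after_add=$(git status --short)       # both files staged as new
git commit -q -m 'add initial functions'
after_commit=$(git status --short)    # empty: working copy matches the commit

printf '%s\n---\n%s\n---\n%s\n' "$before_add" "$after_add" "$after_commit"
```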
Commit terminology
Yes, this means you “commit a commit”.
Sorry.
Repeat: 1) Files are ready
(^>^) $EDITOR things.c stuff.h
Repeat: 1) (some tracked, some new)
(^>^) git status
stuff.h # untracked (The previous commit doesn’t have this file.)
things.c # modified (The previous commit has a version of this file.)
Repeat: 2) Add changes
(^>^) git add things.c stuff.h
Repeat: 3) Commit stuff
(^>^) git commit -m 'new parms in somefunction'
Check the log
(^>^) git log # newest commit first
new parms in somefunction
add initial functions
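A sketch of the second round, assuming invented file contents: a tracked file is edited, a new file appears, both are committed, and the log then lists the newer commit first.

```shell
set -eu
cd "$(mktemp -d)"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int main(){}' > things.c
echo '/* awesome */' > awesome.h
git add things.c awesome.h
git commit -q -m 'add initial functions'

echo 'int somefunction(int a, int b);' > stuff.h   # new file, not yet tracked
echo 'int foo(){}' >> things.c                     # edit to a tracked file
status_before=$(git status --short)                # " M" = modified, "??" = untracked
git add things.c stuff.h
git commit -q -m 'new parms in somefunction'

log_subjects=$(git log --format=%s)                # one subject per line, newest first
echo "$status_before"
echo "$log_subjects"
```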
Good advice
• Use commit messages: Somebody in the future needs the help
• Separate your build artifacts: Use a build directory
• Avoid committing build artifacts: If you can get the project, you can build it!
• “Release” means you published binaries: Others can download from the website (or FTP site, etc.)
[Figure: somefolder/ now has a build/ directory next to somefolder/.git/.]
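The slides do not say how to keep build output out of commits. One common way is a `.gitignore` entry for the build directory. A minimal sketch, assuming a `build/` directory and a fake object file as stand-ins:

```shell
set -eu
cd "$(mktemp -d)"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int main(){}' > things.c
mkdir build
echo 'pretend object file' > build/things.o   # stand-in for a real build artifact

echo 'build/' > .gitignore                    # ask Git to ignore everything under build/
git add .
git commit -q -m 'add source and ignore build output'

tracked=$(git ls-files)                       # files recorded in the commit
echo "$tracked"
```

Even `git add .` skips the ignored directory, so only the source and the `.gitignore` file are committed.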
Git: Outline
• Using Git to record your project while you work
Commits
• How Git records your progress
• Using Git to organize your progress
• Sharing a project with other Git users
The DAG
History:
The DAG grows as commits arrive.
What about a new feature I want to keep separate?
Commit History
The DAG
History:
• Work on a new feature, but keep it separated until it’s ready
• I drew another “column”, but this history is just a string!
• Work continues on the stable version: a commit can have more than one child
Git: Outline
• Using Git to record your project while you work
Commits
• How Git records your progress
The DAG
• Using Git to organize your progress
• Sharing a project with other Git users
The DAG: Branches
History:
• Easy to find the history of any commit: follow links to visit every ancestor
Branches:
• We only need to store a single reference to identify each path; call it a branch ref (or just “branch”)
• Git creates “master” branch by default, and stores commits there unless told otherwise
[Figure: commits A, then B; C and P are both children of B. master points to C, feat42 points to P. Just a drawing convention! Only topology matters.]
The DAG: Branches
Branches:
Q: What are you working on right now?
A: Current location is simply what the HEAD ref is pointing to
Git appends your staged changes as a new commit onto the DAG at the current branch, then moves the branch ref to that commit.
[Figure: HEAD points to master, which is at C; feat42 is at P.]
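A small sketch of HEAD and the moving branch ref, assuming a scratch repository whose first branch is forced to be named master (the local default name may differ):

```shell
set -eu
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master    # name the first branch "master" regardless of local defaults
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int main(){}' > things.c
git add things.c
git commit -q -m 'A'
before=$(git rev-parse master)             # the commit that master points to now

current=$(git symbolic-ref --short HEAD)   # HEAD names the current branch
echo 'int foo(){}' >> things.c
git add things.c
git commit -q -m 'B'
after=$(git rev-parse master)              # master moved to the new commit

echo "HEAD -> $current"
echo "master: $before -> $after"
```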
Branch details
Commits A, B, and P are “on the feat42 branch”, so “feat42” usually means either of these, interchangeably:
• “the ref named feat42”
• “all the commits in the history of the ref named feat42”
But if somebody says you can “delete your feat42 branch”, you know they mean the ref!
Branch details
Deleting your feat42 branch is removing the branch ref!
• P is no longer in the history of ANY branch
• You can still access it, but…
• …Git will eventually garbage-collect such unreferenced commits (30 days by default)
[Figure: the feat42 ref is gone (poof); P remains, but no branch reaches it.]
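A sketch of what deleting the ref does and does not do. It assumes `git branch -D` as the delete command (the slides do not name one); `-D` is needed here because P was never merged.

```shell
set -eu
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int main(){}' > mainprog.c
git add mainprog.c
git commit -q -m 'B'

git branch feat42
git checkout -q feat42
echo 'int feature(){}' > newfeature.c
git add newfeature.c
git commit -q -m 'P: add function for feat42'
p_hash=$(git rev-parse HEAD)

git checkout -q master
git branch -q -D feat42                                           # remove only the ref
branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
p_type=$(git cat-file -t "$p_hash")                               # the commit object is still stored
echo "branches: $branches; $p_hash is still a $p_type"
```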
Branch demo
OK, same example again.
This time, note how some commands modify the branch refs.
(^>^) git branch # list the branches; '*' indicates current branch
* master
[Figure: history A, B; HEAD points to master at B.]
(^>^) git branch feat42 # create the branch 'feat42' at current location
(^>^) git branch # notice we are still on master!
  feat42
* master
[Figure: master and feat42 are both at B; HEAD still points to master.]
(^>^) git checkout feat42
(^>^) git branch
* feat42
  master
[Figure: HEAD now points to feat42.]
(^>^) $EDITOR newfeature.c
(^>^) git add newfeature.c
(^>^) git commit -m 'add function for feat42'
[Figure: new commit P follows B; feat42 and HEAD move to P; master stays at B.]
(^>^) git checkout master
[Figure: HEAD points to master at B again.]
(^>^) $EDITOR mainprog.c
(^>^) git add mainprog.c
(^>^) git commit -m 'fixed a bug'
[Figure: new commit C follows B; master and HEAD move to C; feat42 stays at P.]
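To check the resulting shape without a drawing, one option is `git merge-base`, which reports the newest commit two branches share, together with `git log --graph`. A sketch with invented file contents:

```shell
set -eu
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int main(){}' > mainprog.c
git add mainprog.c
git commit -q -m 'A'
echo '/* more */' >> mainprog.c
git commit -q -am 'B'

git branch feat42
git checkout -q feat42
echo 'int feature(){}' > newfeature.c
git add newfeature.c
git commit -q -m 'P: add function for feat42'

git checkout -q master
echo '/* bugfix */' >> mainprog.c
git commit -q -am 'C: fixed a bug'

b=$(git rev-parse master^)                   # B is the parent of C
fork_point=$(git merge-base master feat42)   # newest commit on both paths
git log --graph --oneline --all              # text drawing of the two paths
```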
Problem:
• Commit C contains a bugfix
• Commit P contains your new feature
• But your feature needs the bugfix, and
• Commit C is only in the history of master, not feat42
That is, you need the changes between B and C to be in the history of your feat42 branch:
[Figure: HEAD points to master at C; feat42 is at P; C and P both descend from B.]
Merge
The most common technique is called merging:
• Snapshot Pm will combine the image of C with the image of P
• Pm will have more than one parent, making it a merge commit
[Figure: Pm has two parents, P and C.]
Merge demo
git merge <source>
Git will merge the branch you specify into your current branch (be careful!):
(^>^) git checkout feat42
(^>^) git merge master
[Figure: merge commit Pm, with parents P and C, is added on feat42; HEAD points to feat42 at Pm; master stays at C.]
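A runnable sketch of this merge, assuming the two branches touched different files so nothing conflicts. The `-m` flag supplies the merge message so no editor opens.

```shell
set -eu
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int main(){}' > mainprog.c
git add mainprog.c
git commit -q -m 'B'
git branch feat42

echo '/* bugfix */' >> mainprog.c
git commit -q -am 'C: fixed a bug'
c=$(git rev-parse master)

git checkout -q feat42
echo 'int feature(){}' > newfeature.c
git add newfeature.c
git commit -q -m 'P: add function for feat42'

git merge -q -m 'Pm: merge master into feat42' master   # merges INTO the current branch
parents=$(git log -1 --format=%P HEAD)                  # parent hashes of Pm
set -- $parents
parent_count=$#                                         # a merge commit has more than one
echo "Pm has $parent_count parents"
```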
Merge conflicts
Notice that Git must figure out how to merge two snapshots.
Recall:
Different files:
• Overwrite old version of each.
  (Trivial. Git does this automatically.)
Same file, but lines don’t overlap:
• Edit the file to incorporate each line modification.
  (Git is very good at figuring out how to do this.)
Same file, same lines:
• Conflicts! For each line, I choose what to do.
  (Git identifies conflicting files, and marks any conflicts inside files.)
Merge conflicts
• Resolution: often interactively (meld, kdiff3, etc.)
• Simply part of life; not unique to Git.
• Not discussed further here.
• OK in small numbers, but pile up quickly.
• You can avoid many of them entirely!
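For reference only, here is a sketch of what “marks any conflicts inside files” looks like. It deliberately edits the same line on both branches; the file contents are invented, and the script stops with the repository still in its conflicted state.

```shell
set -eu
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int foo(){}' > things.c
git add things.c
git commit -q -m 'B'
git branch feat42

echo 'int foo(){ return 1; }' > things.c
git commit -q -am 'C: change foo on master'

git checkout -q feat42
echo 'void foo(){}' > things.c
git commit -q -am 'P: change foo on feat42'

# The merge exits non-zero when it needs help, so test its status explicitly.
if git merge -q -m 'merge master into feat42' master >/dev/null 2>&1; then
  merge_result=clean
else
  merge_result=conflict
fi
conflicted=$(git diff --name-only --diff-filter=U)   # files with unresolved conflicts
cat things.c                                         # both versions, separated by markers
```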
Merge conflicts
branches diverge a lot between merges == a lot of merge conflicts at the next merge
[Figure: master (A, B, C, D, E, F) and feat42 (P, Q, R) develop side by side with no merges in between.]
feat42 is done.
I bet this merge takes all day.
I wish I saw those changed lines before I edited them!
Merge conflicts
branches diverge a little between merges == fewer merge conflicts at the next merge
feat42 is done.
Easy!
Git combined most changes from master before I edited them! Merging feat42 into master will be easy!
[Figure: feat42 (P, Pm, Q, Qm, R) merged master in at Pm and Qm; master (A, B, C, D, E) moves forward to R.]
Git only had to move the master ref forward from E to R to indicate feat42 is in its history.
No merge commit! This is a “fast-forward merge”.
Discipline counts:
• Many small commits!
• Separate changes logically!
• Merge often!
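A sketch of a fast-forward, assuming a shorter history than the figure. feat42 first merges master in, so master's tip is already an ancestor of feat42. `--ff-only` makes Git refuse anything that would need a merge commit.

```shell
set -eu
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int main(){}' > mainprog.c
git add mainprog.c
git commit -q -m 'B'
git branch feat42

echo '/* bugfix */' >> mainprog.c
git commit -q -am 'C: fixed a bug'

git checkout -q feat42
echo 'int feature(){}' > newfeature.c
git add newfeature.c
git commit -q -m 'P: add function for feat42'
git merge -q -m 'Pm: merge master into feat42' master   # feat42 now contains C

feat42_tip=$(git rev-parse feat42)
commits_before=$(git rev-list --count --all)
git checkout -q master
git merge -q --ff-only feat42                           # only moves the master ref
commits_after=$(git rev-list --count --all)
master_tip=$(git rev-parse master)
echo "master is now at $master_tip"
```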
Git: Outline
• Using Git to record your project while you work
Commits
• How Git records your progress
The DAG
• Using Git to organize your progress
Branches
• Sharing a project with other Git users
Sharing
• Repositories are identified by URIs, but you can give each one a short name for convenience
• You transfer commits between a repository and its remote repositories (“remotes”)
[Figure: Alice’s repository in /home/alice/ (branch master) and the repository at https://foo.org/bar.git.]
Remotes
When you first access a project whose files are in a Git repository, you perform the following sequence:
1. create a new repo
2. add to it the remote you specified (naming it “origin” by convention)
3. copy origin’s branches and commits into your new repo (so you can easily keep your repo up-to-date)
VERY common, so Git offers “clone” as a shortcut:
(^>^) git clone https://foo.org/bar.git
[Figure: /home/alice/bar/ has a remote named origin (https://foo.org/bar.git) and its own master, copied from master at https://foo.org/bar.git.]
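The three steps and the shortcut can be compared side by side. This sketch uses a local scratch repository (`seed`, a made-up name) as a stand-in for https://foo.org/bar.git so it runs offline; the final `checkout` that creates a local master is an assumption about how to finish step 3 by hand.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the shared repository: one commit on master
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int main(){}' > things.c
git add things.c
git commit -q -m 'add initial functions'
cd ..

# The three steps by hand
mkdir manual
cd manual
git init -q .                             # 1. create a new repo
git remote add origin "$work/seed"        # 2. add the remote, named "origin"
git fetch -q origin                       # 3. copy origin's branches and commits
git checkout -q -b master origin/master   #    start a local master where origin/master is
manual_head=$(git rev-parse HEAD)
cd ..

# The shortcut
git clone -q "$work/seed" cloned
cloned_head=$(git -C cloned rev-parse HEAD)
echo "manual: $manual_head, clone: $cloned_head"
```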
Remote tracking branches
“clone” also sets up a branch ref that records where a branch pointed in the remote the last time you contacted it.
This remote tracking branch is read-only: Git moves it for you whenever you fetch from (or push to) the remote.
Here, your master branch is the tracking branch paired with origin/master.
Notice that remote tracking branches are qualified by remote names!
[Figure: /home/alice/bar/ has master and origin/master; https://foo.org/bar.git has master.]
Syncing remotes
Project members add new commits to the master branch on the origin repository…
…so you need to update your repository! This happens in 2 steps:
1. fetch the state of tracked branches from the remote, along with any new commits
2. merge origin/master into your master branch
VERY common, so Git offers “pull” as a shortcut for both steps:
(^>^) git pull origin
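A sketch of the two steps and then the shortcut, again with a local scratch repository (`origin_repo`, a made-up name) standing in for the shared one. `--ff-only` is an added safety flag, not something the slides use.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q origin_repo                    # stand-in for https://foo.org/bar.git
cd origin_repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int main(){}' > things.c
git add things.c
git commit -q -m 'add initial functions'
cd ..
git clone -q origin_repo alice

cd origin_repo                             # a project member adds a commit
echo 'int foo(){}' >> things.c
git commit -q -am 'fixed a bug'
new_tip=$(git rev-parse master)
cd ../alice

git fetch -q origin                        # 1. origin/master moves; master does not
fetched=$(git rev-parse origin/master)
local_before=$(git rev-parse master)
git merge -q --ff-only origin/master       # 2. bring master up to date
local_after=$(git rev-parse master)

cd ../origin_repo                          # another commit arrives
echo 'int bar(){}' >> things.c
git commit -q -am 'add bar'
newer_tip=$(git rev-parse master)
cd ../alice
git pull -q --ff-only origin               # both steps at once
pulled=$(git rev-parse master)
echo "master is now at $pulled"
```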
Syncing remotes
Project members add new commits to the master branch on the origin repository…
…and that means you (or maybe Bob); here’s how he did it:
1. git pull origin # Just in case Alice put new commits there! Do this often!
2. place the new commits into the desired branch (master in this case); merge, direct commit, whatever
3. git push origin # Naturally, this updates origin/master automatically
[Figure: /home/alice/bar/ and /home/bob/bar/ each have the remote origin (https://foo.org/bar.git), a master branch, and origin/master; https://foo.org/bar.git has master.]
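Bob's three steps, sketched against a local bare repository (`bar.git` in a scratch directory) that stands in for https://foo.org/bar.git. The `seed` repository and all file contents are invented for the demo.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
echo 'int main(){}' > things.c
git add things.c
git commit -q -m 'add initial functions'
cd ..
git clone -q --bare seed bar.git         # bare copy: plays the shared repository

git clone -q bar.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
git pull -q --ff-only origin             # 1. pick up anything new first
echo 'int foo(){}' >> things.c
git commit -q -am 'add foo'              # 2. a new commit on master
git push -q origin                       # 3. send it to origin
bob_master=$(git rev-parse master)
bob_tracking=$(git rev-parse origin/master)
cd ..
shared_master=$(git --git-dir=bar.git rev-parse master)
echo "shared master is now at $shared_master"
```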
Project organization
• There are multiple copies of the repository. (Your laptop, your workstation, Bob’s laptop…)
• In principle, you can transfer commits from any repository to any other repository
• You and your team choose one to be the central repository by convention
GitHub (github.com), Bitbucket (bitbucket.org), and similar services provide nice tools for organizing this.
Git: Outline
• Using Git to record your project while you work
Commits
• How Git records your progress
The DAG
• Using Git to organize your progress
Branches
• Sharing a project with other Git users
Remotes
The remainder is elaboration and convenience.
So…
How about now?
xkcd.com/1597
“If that doesn't fix it, git.txt contains the phone number of a friend of mine who
understands git. Just wait through a few minutes of 'It's really pretty simple, just think
of branches as...' and eventually you'll learn the commands that will fix everything.”
atlassian.com/git
git-scm.com/book
gitguys.com
Thanks! Questions?
icts.uiowa.edu/confluence
events.uiowa.edu/event/git_workshop
see “Version Control with Git”
Prof. Hans Johnson
Michigan Room, IMU
Monday, 1:30-3:00
Next: hands-on session in the lab
adam-harding@uiowa.edu