Introduction to Version Control with Git
CSC/ECE 517, Fall 2014
A joint project of the CSC/ECE 517 staff, including Titus Barik, Gaurav Tungatkar, Govind Menon, and Krunal Jhaveri

Local version control: RCS
• Keep many copies of files
• Error prone
• RCS stores deltas
[Diagram: a file is checked out from a local store holding Version 1, Version 2, and Version 3.]

Centralized Version Control
• If you need to work with other programmers …
[Diagram: Computer A and Computer B each check out a file from a central server holding Version 1, Version 2, and Version 3.]

File Server vs. Version-Control Server
At first glance, the client-server architecture of a version-control system looks much like a typical file server. So why do we need version control?

File-Sharing Issues
The problem is that users are stepping on each other’s toes!
Image: Version Control with Subversion

Approach 1: Lock, Modify, Unlock
1. Locking may cause administrative problems.
2. Locking may cause unnecessary serialization.
3. Locking may create a false sense of security.
Image: Version Control with Subversion

Approach 2: Copy-Modify-Merge
Sounds chaotic, but in practice, runs extremely smoothly.
Question: When is locking necessary?
Image: Version Control with Subversion

Exercise 1
• Answer these questions
  • Give one advantage of using a version-control server for source-code management over using a file server.
  • Explain how locking can cause administrative problems.
  • Explain how locking can create a false sense of security.
  • With copy-modify-merge, when is locking necessary?

Branches and Tags
Trunk: Location where main development occurs.
Branches: Location used to isolate changes to another development line (e.g., experimental features).
Tags: Snapshot of the content (e.g., RTM, service packs, EOL).
Image: http://en.wikipedia.org/wiki/Subversion_(software)

Traditional Repository Format
A Subversion repository layout, typical of older version-control systems. The folder names are just a convention and have no special meaning to the repository.
Image: Version Control with Subversion

Creating a Branch by Copying
In Subversion, a branch is implemented as a simple directory copy.
Image: Version Control with Subversion

Exercise 2
• Answer these questions about branches.
  o Suppose, in fixing a bug, you modify three lines of code in two source files. Should you create a new branch? Why or why not?
  o Which would probably be more common, branches or tags?
  o What are some of the risks of copying files in a repository? How do version-control systems minimize this risk?

Distributed Version Control
• Clients don’t check out individual files; they mirror the repository.
• What’s the advantage?
[Diagram: Computer A and Computer B each hold a file together with a full copy of the history (Version 1, Version 2, Version 3).]

Git
• Came out of the Linux project, in 2005.
• Simple design
• Strong support for non-linear development (thousands of parallel branches)
• Fully distributed
• Able to handle large projects like the Linux kernel efficiently (speed and data size)

Integrity & Checksums
• Everything is checksummed with an SHA-1 hash
  – 40-character string
  – composed of hex characters
  – calculated based on the contents of a file or directory structure in Git
• Example
  – 24b9da6552252987aa493b52f8696cd6d3b00373
  – But, you don’t have to type the whole SHA …
• Git stores everything in its database by hash, not filename

Snapshots, not Diffs
• See http://git-scm.com/book/ch1-3.html
• Every time you commit, Git takes a snapshot of your files.
• Files that have not changed are not copied; the snapshot links to the identical version already stored.
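The two slides above fit together: because an object's name is a hash of its contents, identical contents always get the same name, so an unchanged file never needs a second copy. The sketch below is one way to see this from the command line. It assumes `git` and GNU `sha1sum` are installed and that Git is using its default SHA-1 object format; the file names are hypothetical.

```shell
# Sketch: how Git derives an object ID from file contents.
# Assumes `git` and `sha1sum` (GNU coreutils) are available.

# Hash the 6 bytes "hello\n" the way Git hashes a file ("blob"):
# a header "blob <size>" plus a NUL byte, followed by the contents.
manual=$(printf 'blob 6\0hello\n' | sha1sum | cut -d' ' -f1)

# Ask Git for the object ID of the same contents (nothing is written to disk).
via_git=$(printf 'hello\n' | git hash-object --stdin)

# The same contents under two different (hypothetical) file names:
# the name is not part of what gets hashed, so the IDs match.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/a.txt"
printf 'hello\n' > "$tmp/copy-of-a.txt"
hash_a=$(git hash-object "$tmp/a.txt")
hash_copy=$(git hash-object "$tmp/copy-of-a.txt")

echo "manual:  $manual"
echo "via git: $via_git"
echo "a.txt:   $hash_a"
echo "copy:    $hash_copy"
```

All four printed values should be the same 40-character hex string. In everyday use you rarely type the full string: Git accepts any prefix that is unambiguous within the repository.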
Almost all ops are local
• browse history
• commit

3 States of a File in Git
• Modified
• Staged
• Committed
[Diagram: three areas: working directory, staging area, and git directory (repository). Arrows: check out the project, stage files, commit.]

File Status Lifecycle
[Diagram: four states: untracked, unmodified, modified, staged. Arrows: add the file, edit the file, stage the file, remove the file.]

Checking Status
• To check the status of your files:
    $ git status
    # On branch master
    nothing to commit (working directory clean)
• Creating new files
    $ vim README
    $ git status
    # On branch master
    # Untracked files:
    #   (use "git add <file>..." to include in what will be committed)
    #
    #       README
    nothing added to commit but untracked files present (use "git add" to track)

Checking status, cont.
• Begin to track the file:
    $ git add README
• The file is now tracked:
    $ git status
    # On branch master
    # Changes to be committed:
    #   (use "git reset HEAD <file>..." to unstage)
    #
    #       new file:   README
    #
• For more info: http://git-scm.com/book/ch2-2.html

Remotes
• On a project, you may be working with several remote repositories.
• “origin” is the default name for the server you cloned your repository from
    $ git clone git://github.com/schacon/ticgit.git
    Initialized empty Git repository in /private/tmp/ticgit/.git/
    remote: Counting objects: 595, done.
    remote: Compressing objects: 100% (269/269), done.
    remote: Total 595 (delta 255), reused 589 (delta 253)
    Receiving objects: 100% (595/595), 73.31 KiB, done.
    Resolving deltas: 100% (255/255), done.
    $ cd ticgit
    $ git remote
    origin
• http://git-scm.com/book/ch2-5.html

Pulling, pushing to remote
• Fetch new data from a remote (this does not merge it into your work):
    $ git fetch [remote-name]
  • E.g., git fetch origin
• Push your master branch to origin:
    $ git push origin master

Common Workflow using Git
• Centralized workflow …
• http://git-scm.com/book/ch5-1.html
• Integration-manager workflow …
• Common use cases: http://git-scm.com/book/ch5-2.html

Pull requests
• After you’ve finished a project, you need to notify the maintainer.
• This is done via a pull request.
• You say which repository to pull from, and
• give a summary of your changes.
• http://git-scm.com/book/ch5-2.html

Guidelines for Commits
• What happens if you
  • download a repo in a zip file,
  • do your project, then
  • save it with a single commit?
  (Think of someone else trying to merge your changes with another programmer’s changes.)
    Your code:        … a = a + b …
    Repository code:  … a = c …
• Is the difference because
  • you changed a = c to a = a + b,
  • or because someone else changed a = a + b to a = c while you were working on your project?

Guidelines for Commits
• Which is worse,
  • downloading the repo as a zip file, and being scrupulously careful to make multiple commits with reasonable commit comments, or
  • downloading the repo with its commit history, but committing your whole project in one commit?
• Why?
• Of course, you shouldn’t do either!

Guidelines for Commits
• In your work, save the commit history.
• Each commit should be on one topic.
• A commit comment should be 1 line,
  • certainly no more than one sentence.

Exercise 3
Visit https://github.ncsu.edu/grmenon/versionControl
Clone the repository using the HTTPS clone URL
    $ git clone [https_url]       # clone an existing repo
    $ git branch                  # list branches

Exercise 3, cont.
    $ git branch
    $ git branch [unityId]        # create a new branch from the current HEAD
    $ git branch
    $ git checkout [unityId]      # switch to that branch

Exercise 3, cont.
Add a new file (you can take a look at the test file and create a new test)
    $ git status
    $ git add [filename]
    $ git status
    $ git commit -m "Commit message"
    $ git push origin [unityId]   # push to the branch on the remote

Exercise 3, cont.
[Pause here, until next week.]
    $ git fetch origin            # update your remote-tracking branches
    $ git merge origin/master     # merge any new changes on master into the current branch

Exercise 3, cont.
    $ git checkout master
    $ git merge [unityId]         # merge changes from your branch back into master

Exercise 3, cont.
[Diagram: two branches, each shown locally and on the remote (origin). Master branch: Commit 1, Commit 2. Your new branch: Commit 1, Commit 1.5.]

Exercise 3, cont.
[Diagram: your new branch, locally and on the remote (origin): Commit 1, Commit 1.5, Commit 2.]

Exercise 3, cont.
[Diagram: master branch, locally and on the remote (origin): Commit 1, Commit 1.5, Commit 2.]
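To rehearse the Exercise 3 steps without touching the course repository, the sequence can be sketched end to end in a throwaway sandbox. This is only an illustration under stated assumptions: a local bare repository stands in for the server, a second clone plays the teammate who adds Commit 2, and `unityid`, `mine`, `teammate`, and the file names are hypothetical placeholders. The final push of master is also an assumption, added because the last diagram shows the remote master matching the local one.

```shell
# Sketch of the Exercise 3 workflow in a throwaway sandbox.
# Assumes `git` is installed; all names below are placeholders.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Commit identity used only inside this sandbox.
export GIT_AUTHOR_NAME="Student" GIT_AUTHOR_EMAIL="student@example.com"
export GIT_COMMITTER_NAME="Student" GIT_COMMITTER_EMAIL="student@example.com"

# Stand-in remote whose default branch is master.
git init --quiet --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# Clone it and publish Commit 1 on master.
git clone --quiet origin.git mine
cd mine
git symbolic-ref HEAD refs/heads/master
echo "project notes" > README
git add README
git commit --quiet -m "Add README"
git push --quiet origin master

# Create a personal branch, switch to it, and add a new file (Commit 1.5).
git branch unityid
git checkout --quiet unityid
echo "a new test" > mytest.txt
git status --short                  # lists mytest.txt as untracked (??)
git add mytest.txt
git commit --quiet -m "Add a new test file"
git push --quiet origin unityid

# Meanwhile, a teammate (second clone) publishes Commit 2 on master.
cd "$sandbox"
git clone --quiet origin.git teammate
cd teammate
echo "more notes" >> README
git commit --quiet -am "Extend README"
git push --quiet origin master

# Back in your clone: fetch, then merge the new master into your branch.
cd "$sandbox/mine"
git fetch --quiet origin
git merge --quiet --no-edit origin/master
git push --quiet origin unityid

# Merge your branch back into master and publish it.
git checkout --quiet master
git merge --quiet unityid           # fast-forward: local master had no new commits
git push --quiet origin master

git --no-pager log --oneline --graph
```

The teammate's change and yours touch different files, so both merges complete without conflicts. If the same lines had been edited on both sides, `git merge` would stop and ask you to resolve the conflict before committing.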