Version Control Systems: SVN and Git
How do VCS support SW development teams?
CS 435/535
The College of William and Mary

Agile manifesto
We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:
• Individuals and interactions over processes and tools
• Working software over comprehensive documentation
• Customer collaboration over contract negotiation
• Responding to change over following a plan
That is, while there is value in the items on the right, we value the items on the left more.
What is needed?
Chapter 3 Agile software development

Plan-driven and agile specification
[Figure. Plan-based development: Requirements engineering, Requirements specification, Design and implementation, Requirements change requests. Agile development: Requirements engineering, Design and implementation.]

The extreme programming release cycle
[Figure: Select user stories for this release; Break down stories to tasks; Plan release; Develop/integrate/test software; Release software; Evaluate system.]

The Scrum process
[Figure: Outline planning and architectural design; Sprint cycle (Assess, Select, Develop, Review); Project closure.]

Scrum benefits
• The product is broken down into a set of manageable and understandable chunks.
• Unstable requirements do not hold up progress.
• The whole team has visibility of everything, and consequently team communication is improved.
• Customers see on-time delivery of increments and gain feedback on how the product works.
• Trust between customers and developers is established, and a positive culture is created in which everyone expects the project to succeed.

Software Engineering is Team Work
Remember SVN from CS 301?
• Enabling technology for productivity
  • must support parallelization
  • must support communication
    • Documentation as preserved communication
  • must support management of tasks & people
    • What needs to be done? When? By whom?
    • What has been done? By whom? What does it support?

Version Control Systems
Centralized
• CVS (1990)
• SVN (2000)
Distributed
• BitKeeper (1997)
• Git (2005)
• Bazaar (2005)
• Mercurial (2005)
More VCS at http://en.wikipedia.org/wiki/Comparison_of_revision_control_software

Version Control Systems
• Version control system
  • supports concurrent software development on a shared code base
  • keeps track of changes
  • integrates versions / recognizes conflicts
  • allows for recovery and documentation of changes
• Common setup:
  • IDE as front end, VCS as back end (shared, persistent storage)

Subclipse: Eclipse plugin for SVN
http://subclipse.tigris.org/update_1.8.x

EGit: Eclipse plugin for Git
http://download.eclipse.org/egit/updates
http://eclipsesource.com/blogs/tutorials/egit-tutorial/
http://wiki.eclipse.org/EGit/User_Guide#Overview

Centralized vs Distributed Version Control Systems
Centralized Architecture:
Image from http://git-scm.com/book/en/Getting-Started-About-Version-Control

Distributed Version Control Systems
Distributed Architecture:
Image from http://git-scm.com/book/en/Getting-Started-About-Version-Control

Centralized vs Distributed VCS
• What are the pros & cons?
• Software engineering is largely about scalability:
  • Project size in # of developers
    • about 10
    • up to 100
    • more than 100

Workflows: Centralized
• Small teams
• Typical workflow for SVN and CVS
• Repository is a single point of failure
Image from http://git-scm.com/book/en/Distributed-Git-Distributed-Workflows

Workflows: Integration - Manager
• Supported by CVS and SVN using branches
• More easily supported by distributed version control systems
Image from http://git-scm.com/book/en/Distributed-Git-Distributed-Workflows

Workflows: Director and Lieutenants
• Supported by CVS and SVN using branches
• More easily supported by distributed version control systems
• Generally used by huge projects (e.g., Linux kernel)
Image from http://git-scm.com/book/en/Distributed-Git-Distributed-Workflows

Versions, Revisions, and Snapshots
• CVS: each commit generates a new version of each modified file
• SVN: each commit generates a new state of the file system tree, called a revision
• Git: same as SVN in that each commit captures the whole tree; but instead of saving deltas, it saves the changed files in full plus references to the unchanged ones

Git follows idea of a file system with snapshots
(Excerpt: Pro Git, Chapter 1, Getting Started)
Every time you commit, or save the state of your project in Git, it basically takes a picture of what all your files look like at that moment and stores a reference to that snapshot. To be efficient, if files have not changed, Git doesn't store the file again, just a link to the previous identical file it has already stored. Git thinks about its data more like Figure 1.5.
Figure 1.5: Git stores data as snapshots of the project over time.
This is an important distinction between Git and nearly all other VCSs. It makes Git reconsider almost every aspect of version control that most other systems copied from the previous generation. This makes Git more like a mini filesystem with some incredibly powerful tools built on top of it, rather than simply a VCS.

SVN et al:
(Excerpt: Pro Git, Chapter 1, Getting Started)
[...] think of the information they keep as a set of files and the changes made to each file over time, as illustrated in Figure 1.4.
Figure 1.4: Other systems tend to store data as changes to a base version of each file.
Git doesn't think of or store its data this way. Instead, Git thinks of its data more like a set of snapshots of a mini filesystem.

Versions, Revisions, and Snapshots
• SVN and Git both version the whole tree: SVN numbers each state with a global revision number; Git identifies each snapshot by a commit hash
Image from http://svnbook.red-bean.com/en/1.7/svn.basic.in-action.html

Operations and states (CVS and SVN)
[Figure: Workspace and Repository, connected by Checkout and Commit.]

Operations and states (Git)
[Figure: Workspace, Staging area (Index), and Repository, connected by Checkout, Stage, and Commit.]

Operations and commands - Git
http://osteele.com/posts/2008/05/commit-policies

Operations and commands
Operation        CVS       SVN       Git
Init             init      create    init
Import           import    import    commit
Checkout         checkout  checkout  clone
Checkout branch  checkout  checkout  checkout
Commit/Checkin   commit    commit    commit, push
Update           update    update    fetch, pull

Operations and commands - SVN+Eclipse
[Screenshots]

Workflows and issues
• Workflow: 1) get code base 2) make changes 3) deliver changes
• Issue: Read/write access to remote repository
  • Protected: user authentication, registration; account/password necessary in communication; IDE stores/uses account/password for convenience
• Issue: Conflicts
  • Changes do not fit together,
    automatically recognized at some level of granularity (same file, same method, same line of code)
  • Automatically recognized, manually fixed
• Issue: Documentation / Communication
  • What changed, how trustworthy are the changes, what needs to be changed as a consequence
  • Finding the right historical version to undo some changes

Tagging
• Useful for marking specific points in history, in particular: releases
• Two types: lightweight vs annotated
• Annotated: full objects in the Git database; checksummed; contain tagger name, email, date, and tagging message; can be signed & verified

    $ git tag -a v1.4 -m 'my version 1.4'
    $ git show v1.4
    tag v1.4
    Tagger: Scott Chacon <schacon@gee-mail.com>
    Date:   Mon Feb 9 14:45:11 2009 -0800

    my version 1.4
    commit 15027957951b64cf874c3557a0f3547bd83b3ff6
    Merge: 4a447f7... a6b4c97...
    Author: Scott Chacon <schacon@gee-mail.com>
    Date:   Sun Feb 8 19:02:46 2009 -0800

        Merge branch 'experiment'

That shows the tagger information, the date the commit was tagged, and the annotation message before showing the commit information.

Branching
• CVS: simple process for creating branches on the repository
• SVN: has no internal concept of a branch; branches are managed as copies of a directory.
• Git: very simple process for creating local and remote branches

Merging
[Figure: numbered revisions 1 to 7; a branch splits off the main line (Branch) and is later joined back into it (Merge).]

Branching in SVN
[Screenshots]

Branching in Git
• Branches are lightweight, movable pointers to commits
• The default branch is master (the counterpart of SVN's trunk)
[Figures: Initial layout for three commits; New branch pointer (iss53); New commit on the branch; Hot fix branch on master.]
Images from http://git-scm.com/book/en/Git-Branching-Basic-Branching-and-Merging

Merging in Git
Images from http://git-scm.com/book/en/Git-Branching-Basic-Branching-and-Merging

Branching & Merging in Git
• Key concept, well supported
• Local workflow
  • people create a branch for any issue / task / assignment they deal with, sometimes called a "topic" branch
  • optional: rebase instead of merge to obtain a linear history
    • only recommended for the local repository
• Remote repository: merge with master
• For integration manager with blessed repository: pull request

What is missing so far?
• Documentation of problems, bug reports
• Work assignments: who does what, and by when

Issue tracking
GitHub Issue Tracker
• Filter by open and closed issues, assignees, labels, and milestones.
• Sort by issue age, number of comments, and update time.
• Milestones / labels

GitHub Workflow: Code review & Pull request
• A pull request starts a conversation around proposed changes. Additional commits may be added to the branch before it is merged into master.
• Pull Request = Code + Issue + Code Comments

Software Engineering is Team Work
• Enabling technology for productivity
  • must support parallelization
  • must support communication
    • Documentation as preserved communication
  • must support management of tasks & people
    • What needs to be done? When? By whom?
    • What has been done? By whom?
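
Appendix sketch: snapshots and branch pointers
The snapshot idea from "Versions, Revisions, and Snapshots" and the pointer idea from "Branching in Git" can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not how Git is actually implemented: the class TinyRepo and its methods are hypothetical names made up for illustration, commits are numbered by list index rather than identified by hash, and real Git hashes and stores its objects differently. It only shows that unchanged file content is stored once and that creating a branch copies a pointer, not files.

```python
import hashlib


class TinyRepo:
    """Toy model of snapshot-based storage (hypothetical, for illustration only)."""

    def __init__(self):
        self.blobs = {}      # content hash -> file content (each content stored once)
        self.commits = []    # commit index -> {path: content hash}
        self.branches = {}   # branch name -> index of the commit it points to

    def commit(self, files, branch="master"):
        """Record a snapshot of all files and advance the branch pointer."""
        snapshot = {}
        for path, content in files.items():
            digest = hashlib.sha1(content.encode("utf-8")).hexdigest()
            # Unchanged content hashes to an existing key, so nothing new is stored.
            self.blobs.setdefault(digest, content)
            snapshot[path] = digest
        self.commits.append(snapshot)
        self.branches[branch] = len(self.commits) - 1
        return self.branches[branch]

    def branch(self, name, start="master"):
        """Creating a branch only copies a pointer; no files are duplicated."""
        self.branches[name] = self.branches[start]


repo = TinyRepo()
repo.commit({"README": "hello", "main.c": "int main() { return 0; }"})
repo.commit({"README": "hello, world", "main.c": "int main() { return 0; }"})
repo.branch("iss53")  # branch name borrowed from the slide figure

print(len(repo.blobs))                                     # 3
print(repo.branches["iss53"] == repo.branches["master"])   # True
```

Two commits of two files each produce only three stored contents, because main.c did not change between them; the second snapshot simply references the content already stored.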