VERSION CONTROL SYSTEMS

TABLE OF CONTENTS

  • What is a Version control system?
  • Why we use VCS?
  • CVS
  • SVN
  • Git

What is a Version control system?

VCS

  • The management of changes to computer programs and other collections of information. Changes are usually identified by a number or letter code, termed the "revision number", "revision level", or simply "revision".
  • Also known as revision control or source control
  • A system that records changes to a file or set of files over time so that you can recall specific versions later.

Why we use VCS

  • We need a logical way to organize and control revisions of our source code
  • It allows us to revert files back to a previous state, revert the entire project back to a previous state, compare changes over time, see who last modified something that might be causing a problem, who introduced an issue and when, and more.
  • Using a VCS also generally means that if you screw things up or lose files, you can easily recover. In addition, you get all this for very little overhead.

Source-management models

  • Atomic operations
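    • An operation is atomic if the repository is left in a consistent state even when the operation is interrupted.
    • The commit operation is usually the most critical in this sense. Commits tell the revision control system to make a group of changes final and available to all users.
    • Not all revision control systems have atomic commits (e.g. CVS does not)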

Source-management models (2)

  • File locking
    • The simplest method of preventing "concurrent access" problems involves locking files so that only one developer at a time has write access to the central "repository" copies of those files.
    • Once one developer "checks out" a file, others can read that file, but no one else may change it until that developer "checks in" the updated version (or cancels the checkout).

Source-management models (3)

  • Version merging
    • Most version control systems allow multiple developers to edit the same file at the same time
    • The system may provide facilities to merge further changes into the central repository
  • Baselines, labels and tags
    • Most tools use only one of these similar terms (baseline, label, tag) to refer to the action of identifying a snapshot ("label the project")
    • Some snapshots are more significant than others (e.g. releases, branches)

Source-management models (4)

  • Traditional revision control systems use a centralized model where all the revision control functions take place on a shared server.

Local VCS

  • Some people's version-control method of choice is to just copy files into another directory
  • Error prone: it is easy to forget which directory you're in and accidentally write to the wrong file
  • To deal with this issue, we can use a local VCS, which keeps all the changes to files under revision control in a simple database

Distributed revision control

  • P2P approach, as opposed to the client-server approach of centralized systems
  • No canonical, reference copy of the codebase exists by default, only working copies
  • Common operations (such as commits, viewing history, and reverting changes) are fast, because there is no need to communicate with a central server

Distributed revision control (2)

  • Each working copy effectively functions as a remote backup of the codebase and of its change history
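
The point about local operations can be tried out with a short sketch. It assumes Git is installed; the scratch directory, the file name notes.txt and the "Demo User" identity are made up for illustration. No remote is ever configured, yet committing, counting history and reading an old version all work.

```shell
set -eu
work=$(mktemp -d)                 # scratch directory, hypothetical location
cd "$work"
git init -q demo && cd demo
git config user.name "Demo User"  # placeholder identity for the demo
git config user.email "demo@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "second draft" > notes.txt
git commit -q -am "Revise notes"

# No server is involved: the list of remotes is empty,
# but the full history lives in the local repository.
remotes=$(git remote)
commit_count=$(git rev-list --count HEAD)
old_version=$(git show HEAD~1:notes.txt)
echo "remotes: '$remotes', commits: $commit_count, previous content: $old_version"
```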

Common terminology

  • Baseline
    • An approved revision of a document or source file from which subsequent changes can be made
  • Checkout
    • To create a local working copy from the repository
  • Branch
    • A set of files under version control may be branched or forked at a point in time so that, from that time forward, two copies of those files may develop at different speeds or in different ways independently of each other

Common terminology (2)

  • Commit
    • Also known as check-in: to write or merge the changes made in the working copy back to the repository
  • Conflict
    • Occurs when different people make changes to the same code, and the system is unable to resolve the changes automatically.
    • A user must resolve the conflict by combining the changes, or by selecting one change in favour of the other
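
A conflict and its resolution can be reproduced locally. This is a minimal sketch, assuming Git is installed; the file settings.txt, the branch feature/blue and the identity are hypothetical. Two branches change the same line, the merge stops, and the conflict is resolved by selecting one change in favour of the other.

```shell
set -eu
work=$(mktemp -d)                 # scratch directory, hypothetical location
cd "$work"
git init -q demo && cd demo
git config user.name "Demo User"  # placeholder identity for the demo
git config user.email "demo@example.com"

echo "color = red" > settings.txt
git add settings.txt
git commit -q -m "Add settings"
base=$(git rev-parse --abbrev-ref HEAD)   # default branch name, whatever it is

# First developer changes the line on a branch.
git checkout -q -b feature/blue
echo "color = blue" > settings.txt
git commit -q -am "Use blue"

# Second developer changes the same line on the original branch.
git checkout -q "$base"
echo "color = green" > settings.txt
git commit -q -am "Use green"

# Git cannot pick a winner, so the merge stops with a conflict.
if git merge feature/blue >/dev/null 2>&1; then
    merge_result="merged"
else
    merge_result="conflict"
fi
conflicted=$(git diff --name-only --diff-filter=U)   # files still unmerged

# Resolve by keeping one of the two changes, then conclude the merge.
echo "color = blue" > settings.txt
git add settings.txt
git commit -q -m "Merge feature/blue, keep blue"
final=$(cat settings.txt)
echo "$merge_result in $conflicted, resolved to: $final"
```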

Common terminology (3)

  • Trunk
    • The unique line of development that is not a branch (also known as baseline, mainline, or master)
  • Merge
    • Also known as integration: an operation in which two sets of changes are applied to a file or set of files.
    • This can happen in several cases. A user, working on a set of files, updates or syncs their working copy with changes made, and checked into the repository, by others

Common terminology (4)

  • Merge (2)
    • A user tries to check in files that have been updated by others since the files were checked out, and the revision control software automatically merges the files
    • A branch is created, the code in the files is independently edited, and the updated branch is later incorporated into a single, unified trunk
    • A set of files is branched, a problem (bug) that existed before the branching is fixed in one branch, and the fix is then merged into the other branch
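
The branch-then-merge case can be sketched as follows, assuming Git is installed; the file names, the branch feature/report and the identity are invented for the example. The two lines of development touch different files, so the merge needs no user intervention.

```shell
set -eu
work=$(mktemp -d)                 # scratch directory, hypothetical location
cd "$work"
git init -q demo && cd demo
git config user.name "Demo User"  # placeholder identity for the demo
git config user.email "demo@example.com"

echo "core" > core.txt
git add core.txt
git commit -q -m "Add core"
trunk=$(git rev-parse --abbrev-ref HEAD)   # default branch name, whatever it is

# Work continues independently on a branch ...
git checkout -q -b feature/report
echo "report" > report.txt
git add report.txt
git commit -q -m "Add report"

# ... and on the trunk.
git checkout -q "$trunk"
echo "core v2" > core.txt
git commit -q -am "Update core"

# Incorporate the branch; the changes do not overlap, so Git merges automatically.
git merge -q -m "Merge feature/report" feature/report
merge_parents=$(git log -1 --format=%P)    # a merge commit has two parents
echo "files after merge: $(git ls-files | tr '\n' ' ')"
```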

Common terminology (5)

  • Pull, push
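    • Copy revisions from one repository into another. Pull is initiated by the receiving repository, while push is initiated by the source. Fetch is sometimes used as a synonym for pull; in Git, fetch only downloads the revisions, and pull is a fetch followed by a merge into the working copy.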
  • Resolve
    • User intervention to deal with a conflict between different changes to the same code
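
Push, fetch and pull can be tried without a server by letting a local bare repository stand in for one. This is a sketch under that assumption; shared.git, the clones alice and bob, and the file greeting.txt are hypothetical names. It shows that, in Git, a fetch leaves the working copy untouched while a pull updates it.

```shell
set -eu
work=$(mktemp -d)                 # scratch directory, hypothetical location
cd "$work"
git init -q --bare shared.git     # stands in for the remote server

# Alice clones the (still empty) repository, commits and pushes.
git clone -q shared.git alice
cd alice
git config user.name "Alice"      # placeholder identities for the demo
git config user.email "alice@example.com"
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"
git push -q origin HEAD
cd ..

# Bob clones and gets Alice's first commit.
git clone -q shared.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"

# Alice pushes a second revision.
cd ../alice
echo "hello again" > greeting.txt
git commit -q -am "Extend greeting"
git push -q origin HEAD

# Bob fetches: the new revision is downloaded, the working copy is unchanged.
cd ../bob
git fetch -q origin
after_fetch=$(cat greeting.txt)

# Bob pulls: the fetched revision is merged into the working copy.
git pull -q --ff-only
after_pull=$(cat greeting.txt)
echo "after fetch: $after_fetch / after pull: $after_pull"
```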

Concurrent Versions System

  • Keeps track of all work and all changes in a set of files, and allows several developers (potentially widely separated in space and time) to collaborate.
  • Each user edits files within their own "working copy" of the project and sends (checks in) their modifications to the server.
  • To avoid the possibility of conflicts, the server only accepts changes made to the most recent version of a file.

Concurrent Versions System (2)

  • Therefore we should constantly keep our working copy up-to-date by incorporating other people's changes on a regular basis
  • Has many successors, e.g. Subversion

Subversion

  • Commits as true atomic operations (interrupted commit operations in CVS would cause repository inconsistency or corruption)
  • The system maintains versioning for directories, renames, and file metadata
  • Users can move and/or copy entire directory-trees very quickly, while retaining full revision history
  • Branching is a cheap operation, independent of file size
  • Does not distinguish between a branch and a directory

Subversion (2)

    Limitations and problems
  • Renames of files and directories are implemented as a copy followed by a delete rather than as a true rename, which is a known source of problems
  • Lacks some repository-administration and management features; for example, there is no simple built-in way to permanently delete selected records from the history
  • Stores additional copies of data on the local machine
    • This can become an issue with very large projects or files, or if developers work on multiple branches simultaneously

Git

    Fundamentals
  • Conceptually, most other systems store information as a list of file-based changes.
  • They think of the information kept as a set of files and the changes made to each file over time
  • Git thinks about its data more like a stream of snapshots
  • If a file has not changed, Git does not store it again, just a link to the previous identical file it has already stored.
  • More like a mini filesystem with some incredibly powerful tools built on top of it
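
The snapshot idea can be observed directly. In this sketch (Git is assumed to be installed; a.txt, b.txt and the identity are hypothetical), two commits are made and only b.txt changes between them. The unchanged file resolves to the same stored object in both snapshots, while the changed file gets a new one.

```shell
set -eu
work=$(mktemp -d)                 # scratch directory, hypothetical location
cd "$work"
git init -q demo && cd demo
git config user.name "Demo User"  # placeholder identity for the demo
git config user.email "demo@example.com"

echo "stable" > a.txt
echo "v1" > b.txt
git add a.txt b.txt
git commit -q -m "Snapshot 1"

echo "v2" > b.txt
git commit -q -am "Snapshot 2"

# Object IDs of each file as recorded in the two snapshots.
a_old=$(git rev-parse HEAD~1:a.txt)
a_new=$(git rev-parse HEAD:a.txt)
b_old=$(git rev-parse HEAD~1:b.txt)
b_new=$(git rev-parse HEAD:b.txt)

echo "a.txt: $a_old -> $a_new"    # same ID: stored once, referenced twice
echo "b.txt: $b_old -> $b_new"    # different IDs: a new object was stored
```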

Design goals

  • Speed
  • Simple design
  • Strong support for thousands of parallel branches
  • Fully distributed
  • Able to handle large projects (like the Linux kernel) efficiently

The Three States

    Git has three main states that your files can reside in:
  • Committed
    • Data is safely stored in your local database
  • Modified
    • You have changed the file but have not committed it to your database yet
  • Staged
    • You have marked a modified file in its current version to go into your next commit snapshot
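
The three states can be followed with `git status --porcelain`, which prints a two-letter code per file (first column: staged changes, second column: unstaged changes). A minimal sketch, assuming Git is installed; the identity and scratch directory are made up, and README.rst is reused from the command examples.

```shell
set -eu
work=$(mktemp -d)                 # scratch directory, hypothetical location
cd "$work"
git init -q demo && cd demo
git config user.name "Demo User"  # placeholder identity for the demo
git config user.email "demo@example.com"

echo "v1" > README.rst
git add README.rst
git commit -q -m "First commit"
committed=$(git status --porcelain)   # empty: the file matches the last commit

echo "v2" > README.rst
modified=$(git status --porcelain)    # " M README.rst": changed, not yet staged

git add README.rst
staged=$(git status --porcelain)      # "M  README.rst": marked for the next commit

git commit -q -m "Second commit"
echo "committed: '$committed' | modified: '$modified' | staged: '$staged'"
```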

Common commands

Add new file

	$ git add README.rst
							
Commit changes

	$ git commit -m 'First commit'
							
Show log

	$ git log -20 --oneline
							

Common commands (2)

Create new branch

$ git branch feature/my-func                    # create the branch only
$ git checkout -b feature/my-func master        # or: create it from master and switch to it
						
Switch branch

$ git checkout feature/my-func
						
Show all branches

$ git branch --all
* feature/my-func
  master
						

Common commands (3)

Push to remote

$ git push origin branch_name
						
Pull

$ git pull
						
Fetch

$ git fetch --all --prune
						
office@e-dojo.it

EXERCISE

Problem 1: Register and create repository

Create an account on GitHub (if you don't have one) and create a new project.
NOTE: Don't forget the README file!

Problem 2: My first Git repo

Using the repository you just created, push all the course projects you have created so far to GitHub.

EXERCISE (2)

Problem 3: Branching

Create new functionality for at least 2 of the course projects you have created so far. Create new branches in your GitHub repo and push your changes.

Problem 4: Pull requests

Add the branches you created to a single PR, request a peer review, and merge if the changes are acceptable.

EXERCISE (3)

Problem 5: Collaboration

Choose one of the repositories created by your teammates and start contributing to it.
Challenge: Add missing tests to at least 2 of the projects in the repo.

EXERCISE (4)