In software engineering, version control (also known as revision control, source control, or source code management) is a class of systems responsible for managing changes to computer programs, documents, large web sites, or other collections of information.
- Wikipedia
Version control systems (VCS)
… systems responsible for managing changes …
In an ideal world, things develop linearly:
Every new version is an improvement upon the previous version.
Everyone knows what everyone else is doing.
In the end, things are simply finished.
digraph {
rankdir=LR
Mon[label="Monday's\n improvements"] [fixedsize=circle]
Tue[label="Tuesday's\n improvements"] [fixedsize=circle]
Wed[label="Wednesday's\n improvements"] [fixedsize=circle]
Mon -> Tue
Tue -> Wed
}
In the real world, things develop non-linearly:
A new version can be anything between
a complete catastrophe and
a major breakthrough.
People do not know what others are doing
Sometimes we are simply fixing earlier mistakes …
digraph {
rankdir=LR
Mon[label="Monday's\n improvements"] [fixedsize=circle]
Tue[label="Tuesday's\n mistakes"] [fixedsize=circle]
Wed[label="Wednesday's\n corrections"] [fixedsize=circle]
Mon -> Tue
Tue -> Wed
}
Going back to an earlier version
Sometimes, it is easier to simply backtrack to an earlier version …
digraph {
rankdir=LR
Mon[label="Monday's\n improvements"] [fixedsize=circle]
Tue[label="Tuesday's\n mistakes"] [fixedsize=circle]
Wed[label="Wednesday's\n improvements"] [fixedsize=circle]
Mon -> Tue
Mon -> Wed
}
Where is this earlier version?
CTRL + Z
my_file.txt, my_file.txt.old, …
My project/
2020-08-12/
2020-08-13/
…
Daily home directory backup
Challenges and obstacles
Prone to mistakes
CTRL + Z has limits, overwritten/deleted files, human/hardware error
How much to save?
Individual files? Everything? How much space is required?
How to organize versions?
What is the difference between different versions?
Overall, difficult to manage!
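The question of what differs between two manually kept copies can be answered mechanically. The following is a minimal sketch using Python's standard `difflib` module; the function name `describe_changes` is hypothetical, and the default file names simply reuse the `my_file.txt` / `my_file.txt.old` naming from the earlier slide.

```python
import difflib


def describe_changes(old_text: str, new_text: str,
                     old_name: str = "my_file.txt.old",
                     new_name: str = "my_file.txt") -> str:
    """Return a unified diff between two versions of a text file."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    # unified_diff yields nothing at all when the two inputs are identical.
    return "".join(diff)


if __name__ == "__main__":
    old = "alpha\nbeta\ngamma\n"
    new = "alpha\nBETA\ngamma\n"
    print(describe_changes(old, new))
```

This only compares two files that someone has already located by hand. It does not say which copy is newer, who changed it, or why.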
What about the granularity?
digraph {
rankdir=LR
subgraph cluster1 {
t1a [label="Component A\n improvement"] [fixedsize=circle]
t1b [label="Component B\n mistake"] [fixedsize=circle]
t1c [label="Component C\n improvement"] [fixedsize=circle]
label="Monday's changes"
}
subgraph cluster2 {
t2a [label="Component A\n improvement"] [fixedsize=circle]
t2b [label="Component B\n correction"] [fixedsize=circle]
t2c [label="Component C\n mistake"] [fixedsize=circle]
label="Tuesday's changes"
}
subgraph cluster3 {
t3a [label="Component A\n mistake"] [fixedsize=circle]
t3b [label="Component B\n improvement"] [fixedsize=circle]
t3c [label="Component C\n correction"] [fixedsize=circle]
label="Wednesday's changes"
}
t1a -> t2a
t1b -> t2b
t1c -> t2c
t2a -> t3a
t2b -> t3b
t2c -> t3c
}
This compounds the problems!
How does VCS solve this?
Stores the history using snapshots (commits)
Each snapshot represents the project at a given point in time
Manages snapshots and associated metadata
Naming (tags), comments, dates, authors, etc.
Easy to move between different snapshots
Can handle different degrees of granularity
Can handle multiple development paths (branches)
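The ideas in the list above (snapshots, metadata, tags, moving between snapshots) can be illustrated with a toy model. This is a sketch for intuition only and is not how Git or any other real VCS is implemented; `Snapshot`, `ToyRepository`, and all method names are hypothetical, and the author name and dates in the usage example are made up.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Snapshot:
    files: Dict[str, str]         # path -> file content at this point in time
    message: str                  # comment describing the change
    author: str
    date: str                     # date string supplied by the caller
    parent: Optional[str] = None  # id of the previous snapshot, if any

    def snapshot_id(self) -> str:
        # Derive the id from content and metadata: same input, same id.
        payload = json.dumps(
            [self.files, self.message, self.author, self.date, self.parent],
            sort_keys=True,
        )
        return hashlib.sha1(payload.encode("utf-8")).hexdigest()[:10]


class ToyRepository:
    def __init__(self) -> None:
        self.snapshots: Dict[str, Snapshot] = {}
        self.tags: Dict[str, str] = {}    # human-readable name -> snapshot id
        self.head: Optional[str] = None   # id of the most recent snapshot

    def commit(self, files: Dict[str, str], message: str,
               author: str, date: str) -> str:
        snap = Snapshot(dict(files), message, author, date, parent=self.head)
        sid = snap.snapshot_id()
        self.snapshots[sid] = snap
        self.head = sid
        return sid

    def tag(self, name: str, sid: str) -> None:
        self.tags[name] = sid

    def checkout(self, ref: str) -> Dict[str, str]:
        # Accept either a tag name or a raw snapshot id.
        sid = self.tags.get(ref, ref)
        return dict(self.snapshots[sid].files)

    def log(self) -> List[Tuple[str, str, str, str]]:
        # Walk from the newest snapshot back to the first one.
        entries = []
        sid = self.head
        while sid is not None:
            snap = self.snapshots[sid]
            entries.append((sid, snap.date, snap.author, snap.message))
            sid = snap.parent
        return entries


if __name__ == "__main__":
    repo = ToyRepository()
    mon = repo.commit({"a.txt": "v1"}, "Monday's improvements", "alice", "2020-08-10")
    repo.commit({"a.txt": "v2 (broken)"}, "Tuesday's mistakes", "alice", "2020-08-11")
    repo.tag("last-good", mon)
    print(repo.checkout("last-good"))  # the files as they were on Monday
    for entry in repo.log():
        print(entry)
```

The toy keeps a single line of history. Branches would correspond to several named heads that each point at a different snapshot.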
Comparing and joining
VCS makes it easy to compare different snapshots
Named revisions, comments, time information, author information
Diff tools
Search tools
Bisection search
VCS also allows the joining (merging) of different snapshots
Easy to experiment with ideas
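Bisection search over a history can be sketched as a plain binary search. The sketch below assumes that the first snapshot is known to be good, the last one is known to be bad, and that once a snapshot is bad all later ones are bad too. The function `first_bad` and the `is_bad` callback are hypothetical names. Git offers the same idea through its `git bisect` command, but this code does not call Git.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(snapshots: Sequence[T], is_bad: Callable[[T], bool]) -> int:
    """Return the index of the first bad snapshot.

    Assumes snapshots[0] is good, snapshots[-1] is bad, and that every
    snapshot after the first bad one is also bad.
    """
    good, bad = 0, len(snapshots) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(snapshots[mid]):
            bad = mid    # the problem was introduced at mid or earlier
        else:
            good = mid   # the problem was introduced after mid
    return bad


if __name__ == "__main__":
    history = list(range(100))  # stand-in for 100 snapshots
    print(first_bad(history, lambda version: version >= 37))  # -> 37
```

With 100 snapshots the test function runs at most 7 times, compared with up to 98 checks when walking through the history one snapshot at a time.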
Collaboration
One of the primary functions of VCS is to allow collaboration
Usual setup: server (remote) + multiple clients
People work locally and send (push) the changes to the server
VCS keeps track of what has been done and by whom
Safer since mistakes can be easily remedied
The contributions of several people can be merged
Backup
A VCS functions as a backup
Locally, the system maintains a copy of each file
Usually only the changes or the files that have changed are stored
Globally, lost files can be recovered from the server
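One way to store only the files that have changed is to key each file by a hash of its content, so that an unchanged file is stored once no matter how many snapshots contain it. The following is a minimal sketch under that assumption. `ObjectStore`, `snapshot`, and `restore` are hypothetical names, and real systems add compression, delta encoding, and on-disk formats that are omitted here.

```python
import hashlib
from typing import Dict


class ObjectStore:
    """Keeps each distinct file content exactly once, keyed by its hash."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}  # content hash -> content

    def put(self, content: bytes) -> str:
        key = hashlib.sha256(content).hexdigest()
        # Identical content maps to the same key, so it is not stored twice.
        self.objects.setdefault(key, content)
        return key


def snapshot(store: ObjectStore, files: Dict[str, bytes]) -> Dict[str, str]:
    """Record a snapshot as a manifest: path -> content hash."""
    return {path: store.put(content) for path, content in files.items()}


def restore(store: ObjectStore, manifest: Dict[str, str]) -> Dict[str, bytes]:
    """Rebuild the files of a snapshot from its manifest."""
    return {path: store.objects[key] for path, key in manifest.items()}


if __name__ == "__main__":
    store = ObjectStore()
    first = snapshot(store, {"a.txt": b"unchanged", "b.txt": b"old"})
    second = snapshot(store, {"a.txt": b"unchanged", "b.txt": b"new"})
    print(len(store.objects))     # 3 stored objects for 4 file versions
    print(restore(store, first))  # the earlier versions are still recoverable
```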
Integration
VCSs such as Git have been integrated with several services
Services such as GitHub can do almost everything for you
Store history, distribute, testing / continuous integration, bug reports, milestones, website, …
Practical use cases
What are the practical use cases for VCS?
Source code
Many VCSs are designed for managing source code
Manage deployment (production, development, testing, etc.)
Manage published versions (v0.1 etc)
Manage (experimental) features
Bug hunting
LaTeX files
Track which version of a manuscript has been
submitted,
revised and/or
accepted
Collaboration between several authors
HPC: batch files and data
Track different versions of your batch scripts
Easy to check the used configuration afterwards
Track input and output files
Limited to smallish files
Introduction to Git - Fall 2020 Lecture 1: Why use version control? Slides: https://hackmd.io/@hpc2n-git-2020/L1-motivation