# FreeBSD Git Questions/Concerns
## Background
This Q&A document is intended to (collect and) answer common questions that come up in the context of the FreeBSD Git Working Group's investigation into workflow and process changes that would be involved in a migration to Git as the canonical repository for FreeBSD (src, doc, and ports).
Nothing in here is intended to be prescriptive (i.e., describing decisions that have been made); rather, the goal is to avoid having the same questions and responses brought up multiple times on various mailing lists.
Ed Maste wrote most of this document, with input from several others. Please feel free to add comments on this HackMD document (by clicking on the 💬 to the right of a paragraph), or email either the freebsd-git@ mailing list or emaste@ with comments, questions, or concerns.
## What's wrong with Subversion?
Subversion has served FreeBSD well, and the Subversion developers have been good to the FreeBSD project (with respect to maintenance, implementing valuable features, etc.).
That said, today FreeBSD downstream consumers and general open source contributors almost universally use git, as do a sizeable fraction of FreeBSD committers. Having to learn an unfamiliar version control system and different processes is an additional hurdle for today's new developers. (Of course a similar argument can be made about existing FreeBSD developers already familiar with Subversion, having to learn Git -- but the group of developers new to FreeBSD grows over time.)
The current [svn→git GitHub mirror](https://github.com/freebsd/freebsd) works well enough but imposes additional hurdles for bringing work developed in Git back into the canonical Subversion repository and precludes the use of Git for certain operations (such as MFCs and vendor source updates). These operations can be performed only by Subversion committers.
Also, as LLVM developers pointed out in their own Subversion to Git migration plan, many tools are being built on top of git with support for other version control systems added later, or poorly, or not at all.
Finally, Subversion can be very slow depending on the operation being performed and the state of the network between the developer and the repository.
## I heard Git is not a Version Control System. How then could we use it?
This argument is misplaced. The question should not be whether Git fits into the traditional VCS box, but rather whether it meets the project's requirements for creating reproducible releases and keeping a running history of the major lines of development. Git can do those things, in addition to allowing easier and more productive collaboration. Finally, notions of what is or is not an acceptable list of features for a VCS have evolved over time. Git and other DVCSs have caused a sea change in thinking in this area.
A specific feature of Git that's often cited is its ability to rewrite history without recording metadata about the rewrite (you can't easily see the "actual" history of a branch). This is mitigated in two ways. First, the source-of-truth repo can refuse to accept these changes. Second, Git does keep the recent history of branches locally (via the reflog). It will not preserve that history indefinitely, though, because in a Git world these rewrites are intended to refine bodies of work that are then submitted elsewhere. The history of changes is more logical than physical, as changes curated in this way are often logical changes. The mechanics of how the logical changes were created are viewed as an uninteresting detail.
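As a rough illustration of both mitigations, the sketch below sets up a throwaway "central" repository that rejects non-fast-forward pushes via the `receive.denyNonFastForwards` setting, then shows that the pre-rewrite commit is still visible in the developer's local reflog. The directory names and the identity values are placeholders chosen for this example, not anything the project has decided on.

```shell
#!/bin/sh
set -e
# Placeholder identity so the example runs without global Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

base=$(mktemp -d)
git init -q --bare "$base/central.git"
# Server-side policy: reject any push that is not a fast-forward.
git -C "$base/central.git" config receive.denyNonFastForwards true

git clone -q "$base/central.git" "$base/dev" 2>/dev/null
cd "$base/dev"
echo "one" > file.txt
git add file.txt
git commit -q -m "first"
git push -q origin HEAD 2>/dev/null
published=$(git rev-parse HEAD)

# Rewrite the published commit locally, then try to force it upstream.
git commit -q --amend -m "first (rewritten)"
if git push -q --force origin HEAD 2>/dev/null; then
    push_result=accepted
else
    push_result=rejected
fi
echo "forced push was $push_result"

# The pre-rewrite commit is still recorded in the local reflog.
previous=$(git rev-parse 'HEAD@{1}')
echo "reflog still has: $previous"
```

The reflog is local to each clone and its entries expire over time, so it is a safety net for the developer rather than a permanent record.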
## What's with the funny revision hashes? I want revision numbers.
Revision hashes are a detail of Git's implementation that's exposed to the end user. They uniquely identify the tree's file content and metadata and the history that led to that state.
Revision numbers can be simulated via `git rev-list --count HEAD`, which counts the number of commits since the initial commit. This is calculated at run time, so is not instantaneous.
`rev-list --count` is relative to the branch (not repo-wide) and so is not unique across branches. The tuple *(branch, rev-list count)* is unique, and is consistent between clones.
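A small sketch of this behaviour, using a throwaway repository: two branches that each add one commit on top of the same base report the same count while having different hashes, which is why the branch name has to be part of the identifier. The branch names `topic-a` and `topic-b` and the identity values are placeholders for the example.

```shell
#!/bin/sh
set -e
# Placeholder identity so the example runs without global Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

repo=$(mktemp -d)
cd "$repo"
git init -q

# Three commits on the initial branch.
for i in 1 2 3; do
    echo "change $i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done
base_branch=$(git rev-parse --abbrev-ref HEAD)
base_count=$(git rev-list --count HEAD)

# Two branches that each add one commit on top of the same base.
for b in topic-a topic-b; do
    git checkout -q -b "$b" "$base_branch"
    echo "$b" > "$b.txt"
    git add "$b.txt"
    git commit -q -m "work on $b"
done

count_a=$(git rev-list --count topic-a)
count_b=$(git rev-list --count topic-b)
hash_a=$(git rev-parse topic-a)
hash_b=$(git rev-parse topic-b)

echo "base=$base_count topic-a=$count_a topic-b=$count_b"
```

Here the base branch reports 3 and both topic branches report 4, even though `hash_a` and `hash_b` differ.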
There's a [Phabricator review](https://reviews.freebsd.org/D20462) open for adding this count to uname.
This is an area under active development in Git; see for example this [Microsoft blog post on Generations and Graph Algorithms](https://devblogs.microsoft.com/devops/supercharging-the-git-commit-graph-iii-generations/). With a little work we could implement a Git front-end command to immediately report a suitable proxy for the revision number.
## Are Git clones large (on-disk)?
Because a git clone includes all history (by default) it is somewhat larger than a Subversion checkout (which by default includes an unmodified copy of every file). That said, the difference is minimal.
Ed gave a Git status update at MeetBSD 2018, and reported the following results:
| VCS | Total Size | .git/.svn size |
| --- | ---------- | -------------- |
| Git | 3.9GB | 2.3GB |
| SVN | 3.3GB | 1.6GB |
With defaults for both cases the Git clone was a little under 20% larger than the Subversion checkout.
Git also supports multiple *worktree*s sharing the same *.git* directory, so a developer with two working trees could actually achieve a disk space savings of a little under 20% with Git. Developers with many working trees will see greater savings.
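A minimal sketch of the worktree arrangement, using a throwaway repository: the second working tree gets its own checked-out files, but its `.git` is just a small file pointing back at the shared repository. The branch name `stable` and the directory names are placeholders for this example.

```shell
#!/bin/sh
set -e
# Placeholder identity so the example runs without global Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

base=$(mktemp -d)
git init -q "$base/src"
cd "$base/src"
echo "hello" > README
git add README
git commit -q -m "initial"

# Add a second working tree on a new branch (the name is hypothetical).
git worktree add -b stable "$base/src-stable" >/dev/null 2>&1

# Both trees are backed by the single object store in src/.git.
git worktree list
trees=$(git worktree list | wc -l | tr -d ' ')
echo "working trees: $trees"
```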
## Do I have to have all of the history locally?
No. Git supports *shallow clones* which include a configurable amount of history. Ed tested this with git version 2.21.0.
A shallow clone can be created with `git clone --depth 1`. Commit hashes are maintained (i.e., are the same as a full clone), and `git push` works.
It is possible to combine shallow cloning with a clone of a single branch, for an even smaller clone: `git clone -b <branch> --depth 1`.
**TODO:** What does `git rev-list HEAD --count` show for a shallow clone?
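One way to investigate the TODO above is to try it on a small throwaway repository, as sketched below. In this experiment the shallow clone has the same HEAD hash as the full repository, but `git rev-list --count HEAD` only counts the commits that are actually present, so a depth-1 clone reports 1. Note the `file://` URL: Git ignores `--depth` when cloning from a plain local path. Directory names and identity values are placeholders.

```shell
#!/bin/sh
set -e
# Placeholder identity so the example runs without global Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

base=$(mktemp -d)
git init -q "$base/full"
cd "$base/full"
for i in 1 2 3 4 5; do
    echo "change $i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done
full_head=$(git rev-parse HEAD)
full_count=$(git rev-list --count HEAD)

# Shallow clone with only the most recent commit.
git clone -q --depth 1 "file://$base/full" "$base/shallow"
shallow_head=$(git -C "$base/shallow" rev-parse HEAD)
shallow_count=$(git -C "$base/shallow" rev-list --count HEAD)

echo "full:    $full_head ($full_count commits)"
echo "shallow: $shallow_head ($shallow_count commits)"
```

If this holds for the real repository as well, a count-based revision number would not be reliable in a shallow clone.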
## What about `$FreeBSD$` tags?
Git does not support expansion of `$` tags (`$Id$`, `$FreeBSD$`, `$Date$`, etc.) on the server. It does support [keyword expansion](https://git-scm.com/book/en/v2/Customizing-Git-Git-Attributes) on the client side, expanding them on checkout and squashing them upon commit.
Because the support is implemented client-side and is optional, we would likely need to deprecate their use: individual developers could have them expanded locally, but standard processes (such as release engineering) would not require them to be expanded. Whether or when we expand the keywords is a policy question, not a tools question.
Having the expansion optionally happen on checkout means that the problems with diff collisions which are seen with svn are not seen with git.
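The following sketch shows the client-side mechanism using Git's built-in `ident` attribute, which only handles `$Id$` and expands it to the blob's object ID. Expanding something like `$FreeBSD$` would need a custom clean/smudge filter, which is not shown here. The file name `hello.c` and the identity values are placeholders.

```shell
#!/bin/sh
set -e
# Placeholder identity so the example runs without global Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

repo=$(mktemp -d)
cd "$repo"
git init -q

# Opt in to ident expansion for C sources.
echo '*.c ident' > .gitattributes
printf '/* $Id$ */\nint main(void) { return 0; }\n' > hello.c
git add .gitattributes hello.c
git commit -q -m "add hello.c"

# Expansion happens on checkout, so force a fresh checkout of the file.
rm hello.c
git checkout -q -- hello.c

echo "working tree: $(head -n 1 hello.c)"
# The stored blob keeps the collapsed form of the keyword.
echo "repository:   $(git cat-file -p HEAD:hello.c | head -n 1)"
```

Because the repository always stores the collapsed `$Id$`, the expanded text never shows up in diffs.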
## If we move to git can we still have commit mail?
Yes. Commit mail can be generated with a commit hook, as with Subversion. The exact approach taken would vary depending on the hosting model.
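As a rough sketch of the hook approach, the example below installs a hypothetical `post-receive` hook in a throwaway bare repository. To keep it runnable anywhere, the hook appends each message to a file named `commit-mail.txt` (an invented name) instead of handing it to a mailer; a real deployment would pipe the text to something like `sendmail`, and would probably report every commit in a push rather than only the tip of each ref.

```shell
#!/bin/sh
set -e
# Placeholder identity so the example runs without global Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

base=$(mktemp -d)
git init -q --bare "$base/central.git"
mkdir -p "$base/central.git/hooks"

cat > "$base/central.git/hooks/post-receive" <<'EOF'
#!/bin/sh
# Git feeds one "<old-id> <new-id> <ref-name>" line per updated ref on stdin.
outbox="$PWD/commit-mail.txt"   # stand-in for piping to a mailer
while read -r old new ref; do
    # Skip ref deletions, where the new id is all zeros.
    [ -n "$(printf '%s' "$new" | tr -d 0)" ] || continue
    {
        echo "Subject: git commit: $ref"
        echo
        git log -1 --stat --format='Author: %an <%ae>%n%n%s' "$new"
        echo
    } >> "$outbox"
done
EOF
chmod +x "$base/central.git/hooks/post-receive"

# A developer clone pushes one commit, which triggers the hook.
git clone -q "$base/central.git" "$base/dev" 2>/dev/null
cd "$base/dev"
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
git push -q origin HEAD 2>/dev/null

cat "$base/central.git/commit-mail.txt"
```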
## Does git have an appropriate license?
The standard Git client is licensed under the GNU General Public License, v2 (GPLv2). Various Git libraries and other implementations exist under a variety of copyfree, copyleft, and proprietary licenses.
[Game of Trees](http://gameoftrees.org/) is a copyfree VCS from some OpenBSD developers. It is compatible with Git repos and is now in the FreeBSD ports tree.
[OpenGit](https://github.com/khanzf/opengit) shows promise as a BSDL tool, but is a very early work in progress and has not been updated recently.
[go-git](https://github.com/src-d/go-git) needs only a simple 3-line wrapper to support cloning the ports tree.
## Can I check out only part of the tree?
[Partial Clone](https://www.git-scm.com/docs/partial-clone) support is in progress.
A partial clone should be possible via a command like `git clone <URL> --filter=sparse:path=/bin/ls freebsd-bin-ls`, but note that it requires server-side support which is generally not yet available.
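For experimenting with the mechanism locally, the sketch below uses the `blob:none` filter (not the `sparse:path` form mentioned above) against a throwaway "server" repository that has been told to honour filters through `uploadpack.allowFilter`. It assumes a reasonably recent Git on both ends. The result is a clone that has commits and trees but has not yet downloaded any file contents. Directory and file names are placeholders.

```shell
#!/bin/sh
set -e
# Placeholder identity so the example runs without global Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

base=$(mktemp -d)
git init -q "$base/server"
cd "$base/server"
mkdir -p bin/ls
echo "ls source" > bin/ls/ls.c
echo "readme" > README
git add .
git commit -q -m "initial import"

# Server-side opt-in; without it the filter request is ignored.
git config uploadpack.allowFilter true

# Blobless partial clone: commits and trees are transferred, file contents are not.
git clone -q --no-checkout --filter=blob:none "file://$base/server" "$base/partial"

# Count objects the partial clone knows about but has not downloaded.
missing=$(git -C "$base/partial" rev-list --objects --missing=print HEAD | grep -c '^?' || true)
echo "objects not yet downloaded: $missing"
```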
## Have other projects migrated from Subversion to Git?
Many projects have made the transition, and FreeBSD is currently one of very few large, well-established and active open source projects using Subversion.
See the [LLVM project's Git transition](https://llvm.org/docs/Proposals/GitHubMove.html) for a detailed example of a Git migration in a project with broadly similar size, rate of change, history, and community.
There's a Dr. Dobb's article on [Atlassian's](http://www.drdobbs.com/architecture-and-design/migrating-from-subversion-to-git-and-the/240009175) move from Subversion to Git.
Many Apache projects have migrated to Git, for example [Maven](https://cwiki.apache.org/confluence/display/MAVEN/Git+Migration).
In addition to these specific examples, a Google search shows dozens (if not hundreds) of projects that have migrated to Git over the past 15 years. Many of these projects are open source.
## Does Git mandate a Linux-style hierarchical model?
No. Linux's workflow is but one of many ways to use Git.
## Does Git track file renames and copies?
Not really; renames, copies and moves are inferred as necessary (e.g. in `git log`). We must determine how much of an issue this is in practice, which may have a different answer for different repositories (src vs doc vs ports).
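To see the inference at work, the sketch below renames a file in a throwaway repository. The commit itself records only the old and new tree contents; `-M` asks Git to detect the rename from content similarity when producing the diff, and `--follow` lets `git log` continue across it. File names and identity values are placeholders.

```shell
#!/bin/sh
set -e
# Placeholder identity so the example runs without global Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

repo=$(mktemp -d)
cd "$repo"
git init -q

printf 'line 1\nline 2\nline 3\n' > old_name.c
git add old_name.c
git commit -q -m "add old_name.c"

git mv old_name.c new_name.c
git commit -q -m "rename to new_name.c"

# Rename detection at diff time: reported as R<similarity> old new.
status=$(git diff -M --name-status HEAD~1 HEAD)
echo "$status"

# History of the new path, followed back across the inferred rename.
followed=$(git log --follow --format=%s -- new_name.c | wc -l | tr -d ' ')
# The same query without --follow stops at the rename commit.
unfollowed=$(git log --format=%s -- new_name.c | wc -l | tr -d ' ')
echo "with --follow: $followed commits, without: $unfollowed"
```

Detection depends on how similar the old and new contents are, so a rename combined with heavy edits in the same commit may not be recognised.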
## What is the recommended workflow to develop with Git?
Good question! There are multiple approaches we could take with a migration to Git, and one of the working group's tasks is to evaluate (prototype/test) several approaches and make a recommendation. One of the explicit tasks of the working group is to ensure that the random problems from unrestrained git usage are avoided by the workflow, or the pitfalls are documented to allow proper tradeoffs to be made depending on the situation.
As was done for the Subversion migration, a Git primer will be created. Like the developer's guide article on Subversion, it will document how to use Git in FreeBSD's workflow.
## Can Git help me find out where a bug/regression/panic was introduced?
Yes, see the [git bisect](https://git-scm.com/docs/git-bisect) command. It can be used to choose and check out revisions for the binary search (leaving the build and test to the user), or can operate fully automatically (with a scripted build and test).
It is also able to bisect only changes that affect a subtree (e.g., bisecting only changes that touch files under *sys/* if looking for a kernel regression).
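The sketch below exercises both features on a throwaway repository: ten commits under a `sys/` directory, with a fake regression (the string `BUG`) introduced in the sixth. `git bisect run` drives the search with a test command that exits 0 for good revisions and non-zero for bad ones, and the trailing `-- sys/` limits the search to commits touching that subtree. The file name and the test command are invented for the example; a real run would build and test the tree instead.

```shell
#!/bin/sh
set -e
# Placeholder identity so the example runs without global Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir sys

# Ten commits; the "regression" lands in commit 6.
for i in 1 2 3 4 5 6 7 8 9 10; do
    if [ "$i" -ge 6 ]; then
        echo "rev $i BUG" > sys/kern.c
    else
        echo "rev $i" > sys/kern.c
    fi
    git add sys/kern.c
    git commit -q -m "commit $i"
done
good=$(git rev-list --max-parents=0 HEAD)   # the root commit

# HEAD is known bad, the root commit is known good; only consider sys/.
git bisect start HEAD "$good" -- sys/ >/dev/null

# Scripted test: succeeds (good) only when BUG is absent.
git bisect run sh -c '! grep -q BUG sys/kern.c' >/dev/null

first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"
```

With ten commits the search needs only a handful of test runs, which is where bisect pays off on a history as long as FreeBSD's.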
# Please add new questions below
## Use H2 (##) for the questions
## What about checkout by date?
## Any requirements on minimal git version?