6.7 Git Tools - Subtree Merging
Now that you’ve seen the difficulties of the submodule system, let’s look at an alternative way to solve the same problem. When Git merges, it looks at what it has to merge together and then chooses an appropriate merging strategy. If you’re merging two branches, Git uses the recursive strategy (Git 2.34 and later default to the closely related ort strategy instead). If you’re merging more than two branches, Git picks the octopus strategy. Git chooses these for you because the recursive strategy can handle complex three-way merge situations (for example, more than one common ancestor) but can only merge two branches. The octopus merge can handle multiple branches but is more cautious in order to avoid difficult conflicts, so it’s the default whenever you merge more than two branches.
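To see the automatic choice in action, here is a minimal sketch that merges two branches at once in a throwaway repository. The branch names (topic_a, topic_b), file names, and identity values are made up for illustration; the only thing it relies on is that naming more than one branch on the git merge command line produces a single merge commit with one parent per participant.

```shell
set -eu
# Throwaway repository; the identity values below are placeholders.
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q .
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "base"

# Two hypothetical topic branches, each adding its own file.
for topic in topic_a topic_b; do
  git checkout -q -b "$topic" master
  echo "$topic" > "$topic.txt"
  git add "$topic.txt"
  git commit -q -m "add $topic"
done

# Give master a commit of its own so it is a real participant in the merge.
git checkout -q master
echo more > more.txt
git add more.txt
git commit -q -m "more work on master"

# Naming two branches at once makes Git fall back to the octopus strategy.
git merge -q -m "merge both topics" topic_a topic_b

# Count the parents recorded on the resulting merge commit.
parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents: $parents"
```

The merge commit records master plus both topic branches as parents, which is something a two-branch strategy cannot produce in one step.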
However, there are other strategies you can choose as well. One of them is the subtree merge, and you can use it to deal with the subproject issue. Here you’ll see how to do the same rack embedding as in the last section, but using subtree merges instead.
The idea of the subtree merge is that you have two projects, and one of the projects maps to a subdirectory of the other one and vice versa. When you specify a subtree merge, Git is smart enough to figure out that one is a subtree of the other and merge appropriately — it’s pretty amazing.
You first add the Rack application to your project. You add the Rack project as a remote reference in your own project and then check it out into its own branch:
$ git remote add rack_remote firstname.lastname@example.org:schacon/rack.git
$ git fetch rack_remote
warning: no common commits
remote: Counting objects: 3184, done.
remote: Compressing objects: 100% (1465/1465), done.
remote: Total 3184 (delta 1952), reused 2770 (delta 1675)
Receiving objects: 100% (3184/3184), 677.42 KiB | 4 KiB/s, done.
Resolving deltas: 100% (1952/1952), done.
From email@example.com:schacon/rack
 * [new branch]      build      -> rack_remote/build
 * [new branch]      master     -> rack_remote/master
 * [new branch]      rack-0.4   -> rack_remote/rack-0.4
 * [new branch]      rack-0.9   -> rack_remote/rack-0.9
$ git checkout -b rack_branch rack_remote/master
Branch rack_branch set up to track remote branch refs/remotes/rack_remote/master.
Switched to a new branch "rack_branch"
Now you have the root of the Rack project in your rack_branch branch and your own project in the master branch. If you check out one and then the other, you can see that they have different project roots:
$ ls
AUTHORS  KNOWN-ISSUES  Rakefile  contrib  lib
COPYING  README        bin       example  test
$ git checkout master
Switched to branch "master"
$ ls
README
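If you would rather compare the two roots without switching branches, git ls-tree can list the top-level tree of any branch. The sketch below is only an illustration: it builds a tiny local repository named rack-upstream as a stand-in for the real Rack remote, and the file contents and the myproject directory name are invented.

```shell
set -eu
cd "$(mktemp -d)"
# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for the upstream Rack repository.
git init -q rack-upstream
( cd rack-upstream
  git symbolic-ref HEAD refs/heads/master
  mkdir lib
  echo "module Rack; end" > lib/rack.rb
  echo "Rack" > README
  echo "task :default" > Rakefile
  git add .
  git commit -q -m "initial rack import" )

# Your own project, with only a README at its root.
git init -q myproject
cd myproject
git symbolic-ref HEAD refs/heads/master
echo "My project" > README
git add README
git commit -q -m "initial commit"

# Same steps as in the text, pointed at the local stand-in.
git remote add rack_remote ../rack-upstream
git fetch -q rack_remote
git checkout -q -b rack_branch rack_remote/master
git checkout -q master

# List each branch's root tree without touching the working directory.
rack_root=$(git ls-tree --name-only rack_branch)
master_root=$(git ls-tree --name-only master)
echo "rack_branch root:"; echo "$rack_root"
echo "master root:";      echo "$master_root"
```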
You want to pull the Rack project into your master project as a subdirectory. You can do that in Git with git read-tree. You’ll learn more about read-tree and its friends in Chapter 9, but for now know that it reads the root tree of one branch into your current staging area and working directory. You just switched back to your master branch, and now you pull the rack_branch branch into the rack subdirectory of the master branch of your main project:
$ git read-tree --prefix=rack/ -u rack_branch
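As a rough end-to-end check of this step, the following sketch repeats the import against a local stand-in for Rack and then commits the result. The rack-upstream repository, its files, and the commit messages are assumptions made for the demo; only the git read-tree invocation is taken from the text.

```shell
set -eu
cd "$(mktemp -d)"
# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for the upstream Rack repository.
git init -q rack-upstream
( cd rack-upstream
  git symbolic-ref HEAD refs/heads/master
  mkdir lib
  echo "module Rack; end" > lib/rack.rb
  echo "Rack" > README
  git add .
  git commit -q -m "initial rack import" )

# Your own project.
git init -q myproject
cd myproject
git symbolic-ref HEAD refs/heads/master
echo "My project" > README
git add README
git commit -q -m "initial commit"

git remote add rack_remote ../rack-upstream
git fetch -q rack_remote
git checkout -q -b rack_branch rack_remote/master
git checkout -q master

# Read the root tree of rack_branch into rack/ in the index (--prefix)
# and check the files out into the working directory (-u).
git read-tree --prefix=rack/ -u rack_branch
git commit -q -m "Import Rack under rack/"

# The Rack files are now tracked under rack/ on master.
git ls-files rack
```

Because read-tree only copies a tree, the new commit on master has a single parent: none of the history of rack_branch is attached to it.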
When you commit, it looks like you have all the Rack files under that subdirectory — as though you copied them in from a tarball. What gets interesting is that you can fairly easily merge changes from one of the branches to the other. So, if the Rack project updates, you can pull in upstream changes by switching to that branch and pulling:
$ git checkout rack_branch
$ git pull
Then, you can merge those changes back into your master branch. You can use git merge -s subtree and it will work fine, but Git will also merge the histories together, which you probably don’t want. To pull in the changes and prepopulate the commit message, use the --squash and --no-commit options as well as the -s subtree strategy option:
$ git checkout master
$ git merge --squash -s subtree --no-commit rack_branch
Squash commit -- not updating HEAD
Automatic merge went well; stopped before committing as requested
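The update cycle can be sketched against a local stand-in as well. Two assumptions differ from the commands above and are worth calling out. First, Git 2.9 and later refuse to merge branches that share no commits unless you pass --allow-unrelated-histories, so the sketch adds that flag. Second, it uses git pull --ff-only so the pull cannot prompt for a reconciliation mode. The upstream change here only adds a new file (a made-up CHANGELOG); because the squashed import records no common ancestor, upstream edits to files you already imported may surface as conflicts.

```shell
set -eu
cd "$(mktemp -d)"
# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for the upstream Rack repository.
git init -q rack-upstream
( cd rack-upstream
  git symbolic-ref HEAD refs/heads/master
  mkdir lib
  echo "module Rack; end" > lib/rack.rb
  echo "Rack" > README
  git add .
  git commit -q -m "initial rack import" )

# Your own project, with Rack imported under rack/ as described earlier.
git init -q myproject
cd myproject
git symbolic-ref HEAD refs/heads/master
echo "My project" > README
git add README
git commit -q -m "initial commit"
git remote add rack_remote ../rack-upstream
git fetch -q rack_remote
git checkout -q -b rack_branch rack_remote/master
git checkout -q master
git read-tree --prefix=rack/ -u rack_branch
git commit -q -m "Import Rack under rack/"

# Upstream publishes a new file.
( cd ../rack-upstream
  echo "0.9: new release" > CHANGELOG
  git add CHANGELOG
  git commit -q -m "add changelog" )

# Bring rack_branch up to date with the remote.
git checkout -q rack_branch
git pull -q --ff-only

# Squash the upstream changes into the rack/ subdirectory of master.
git checkout -q master
git merge --squash -s subtree --no-commit --allow-unrelated-histories rack_branch
git commit -q -m "Update Rack from upstream"
git ls-files rack
```

The resulting commit has a single parent, so the Rack history stays out of your master branch while the new upstream file lands under rack/.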
All the changes from your Rack project are merged in and ready to be committed locally. You can also do the opposite: make changes in the rack subdirectory of your master branch and then merge them into your rack_branch branch later to submit them to the maintainers or push them upstream.
To get a diff between what you have in your rack subdirectory and the code in your rack_branch branch (to see if you need to merge them), you can’t use the normal diff command. Instead, you must run git diff-tree with the branch you want to compare to:
$ git diff-tree -p rack_branch
Or, to compare what is in your rack subdirectory with what the master branch on the server was the last time you fetched, you can run:
$ git diff-tree -p rack_remote/master
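One caveat when trying these commands: given a single commit, git diff-tree compares that commit with its parents. A variation that names both sides explicitly is sketched below, using the master:rack syntax to refer to the tree of the rack subdirectory on master. The stand-in repository, file contents, and the "local tweak" line are invented for the demo.

```shell
set -eu
cd "$(mktemp -d)"
# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for the upstream Rack repository.
git init -q rack-upstream
( cd rack-upstream
  git symbolic-ref HEAD refs/heads/master
  mkdir lib
  echo "module Rack; end" > lib/rack.rb
  echo "Rack" > README
  git add .
  git commit -q -m "initial rack import" )

# Your own project, with Rack imported under rack/.
git init -q myproject
cd myproject
git symbolic-ref HEAD refs/heads/master
echo "My project" > README
git add README
git commit -q -m "initial commit"
git remote add rack_remote ../rack-upstream
git fetch -q rack_remote
git checkout -q -b rack_branch rack_remote/master
git checkout -q master
git read-tree --prefix=rack/ -u rack_branch
git commit -q -m "Import Rack under rack/"

# Right after the import, the two trees are identical: no output.
before=$(git diff-tree -p rack_branch master:rack)

# Change a file inside rack/ on master and compare again.
echo "local tweak" >> rack/README
git commit -q -am "Tweak Rack README locally"
after=$(git diff-tree -p rack_branch master:rack)
echo "$after"
```

Substituting rack_remote/master for rack_branch compares the subdirectory against the last fetched state of the server in the same way.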