Git - Blog Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency. Reset Mon, 11 Jul 2011 00:00:00 +0000 <p>One of the topics that I didn't cover in depth in the Pro Git book is the <code>reset</code> command. Most of the reason for this, honestly, is that I never strongly understood the command beyond the handful of specific use cases that I needed it for. I knew what the command did, but not really how it was designed to work.</p> <p>Since then I have become more comfortable with the command, largely thanks to <a href="">Mark Dominus's article</a> re-phrasing the content of the man-page, which I always found very difficult to follow. After reading that explanation of the command, I now personally feel more comfortable using <code>reset</code> and enjoy trying to help others feel the same way.</p> <p>This post assumes some basic understanding of how Git branching works. If you don't really know what HEAD and the Index are on a basic level, you might want to read chapters 2 and 3 of this book before reading this post.</p> <h2>The Three Trees of Git</h2> <img src="/images/reset/trees.png"/><br/> <p>The way I now like to think about <code>reset</code> and <code>checkout</code> is through the mental frame of Git being a content manager of three different trees. By 'tree' here I really mean "collection of files", not specifically the data structure. (Some Git developers will get a bit mad at me here, because there are a few cases where the Index doesn't exactly act like a tree, but for our purposes it is easier - forgive me).</p> <p>Git as a system manages and manipulates three trees in its normal operation. 
Each of these is covered in the book, but let's review them.</p> <table id="threetrees"> <tr> <th class="title" colspan="2">Tree Roles</th> </tr><tr> <th>The HEAD</th><td>last commit snapshot, next parent</td> </tr><tr> <th>The Index</th><td>proposed next commit snapshot</td> </tr><tr> <th>The Working Directory</th><td>sandbox</td> </tr> </table> <h3 class="subtitle"> The HEAD <small>last commit snapshot, next parent</small> </h3> <p> The HEAD in Git is the pointer to the current branch reference, which is in turn a pointer to the last commit you made or the last commit that was checked out into your working directory. That also means it will be the parent of the next commit you do. It's generally simplest to think of it as <b>HEAD is the snapshot of your last commit</b>. </p> <p>In fact, it's pretty easy to see what the snapshot of your HEAD looks like. Here is an example of getting the actual directory listing and SHA checksums for each file in the HEAD snapshot:</p> <pre> $ cat .git/HEAD ref: refs/heads/master $ cat .git/refs/heads/master e9a570524b63d2a2b3a7c3325acf5b89bbeb131e $ git cat-file -p e9a570524b63d2a2b3a7c3325acf5b89bbeb131e tree cfda3bf379e4f8dba8717dee55aab78aef7f4daf author Scott Chacon <> 1301511835 -0700 committer Scott Chacon <> 1301511835 -0700 initial commit $ git ls-tree -r cfda3bf379e4f8dba8717dee55aab78aef7f4daf 100644 blob a906cb2a4a904a152... README 100644 blob 8f94139338f9404f2... Rakefile 040000 tree 99f1a6d12cb4b6f19... lib </pre> <h3 class="subtitle"> The Index <small>next proposed commit snapshot</small> </h3> <p> The Index is your proposed next commit. Git populates it with a list of all the file contents that were last checked out into your working directory and what they looked like when they were originally checked out. It's not technically a tree structure, it's a flattened manifest, but for our purposes it's close enough. 
When you run <code>git commit</code>, that command only looks at your Index by default, not at anything in your working directory. So, it's simplest to think of it as <b>the Index is the snapshot of your next commit</b>. </p> <pre> $ git ls-files -s 100644 a906cb2a4a904a152e80877d4088654daad0c859 0 README 100644 8f94139338f9404f26296befa88755fc2598c289 0 Rakefile 100644 47c6340d6459e05787f644c2447d2595f5d3a54b 0 lib/simplegit.rb </pre> <h3 class="subtitle"> The Working Directory <small>sandbox, scratch area</small> </h3> <p> Finally, you have your working directory. This is where the content of files are placed into actual files on your filesystem so they're easily edited by you. <b>The Working Directory is your scratch space, used to easily modify file content.</b> </p> <pre> $ tree . ├── README ├── Rakefile └── lib └── simplegit.rb 1 directory, 3 files </pre> <h2>The Workflow</h2> <p>So, Git is all about recording snapshots of your project in successively better states by manipulating these three trees, or collections of contents of files.</p> <center><img width="400px" src="/images/reset/workflow.png"/></center><br/> <p>Let's visualize this process. Say you go into a new directory with a single file in it. We'll call this V1 of the file and we'll indicate it in blue. 
Now we run <code>git init</code>, which will create a Git repository with a HEAD reference that points to an unborn branch (aka, <i>nothing</i>).</p> <center><img width="500px" src="/images/reset/ex2.png"/></center><br/> <p>At this point, only the <b>Working Directory</b> tree has any content.</p> <p>Now we want to commit this file, so we use <code>git add</code> to take the content in our Working Directory and populate our Index with it.</p> <center><img width="500px" src="/images/reset/ex3.png"/></center><br/> <p>Then we run <code>git commit</code> to take what the Index looks like now and save it as a permanent snapshot pointed to by a commit, which HEAD is then updated to point at.</p> <center><img width="500px" src="/images/reset/ex4.png"/></center><br/> <p>At this point, all three of the trees are the same. If we run <code>git status</code> now, we'll see no changes because they're all the same.</p> <p>Now we want to make a change to that file and commit it. We will go through the same process. First we change the file in our working directory.</p> <center><img width="500px" src="/images/reset/ex5.png"/></center><br/> <p>If we run <code>git status</code> right now we'll see the file in red as "changed but not updated" because that entry differs between our Index and our Working Directory. Next we run <code>git add</code> on it to stage it into our Index.</p> <center><img width="500px" src="/images/reset/ex6.png"/></center><br/> <p>At this point if we run <code>git status</code> we will see the file in green under 'Changes to be Committed' because the Index and HEAD differ - that is, our proposed next commit is now different from our last commit. Those are the entries we will see as 'to be Committed'. 
Finally, we run <code>git commit</code> to finalize the commit.</p> <center><img width="500px" src="/images/reset/ex7.png"/></center><br/> <p>Now <code>git status</code> will give us no output because all three trees are the same.</p> <p>Switching branches or cloning goes through a similar process. When you checkout a branch, it changes <b>HEAD</b> to point to the new commit, populates your <b>Index</b> with the snapshot of that commit, then checks out the contents of the files in your <b>Index</b> into your <b>Working Directory</b>.</p> <h2>The Role of Reset</h2> <p>So the <code>reset</code> command makes more sense when viewed in this context. It directly manipulates these three trees in a simple and predictable way. It does up to three basic operations.</p> <h3 class="subtitle"> Step 1: Moving HEAD <small>killing me --soft ly</small> </h3> <p> The first thing <code>reset</code> will do is move what HEAD points to. Unlike <code>checkout</code> it does not move what branch HEAD points to, it directly changes the SHA of the reference itself. This means if HEAD is pointing to the 'master' branch, running <code>git reset 9e5e6a4</code> will first of all make 'master' point to <code>9e5e6a4</code> before it does anything else. </p> <center><img width="500px" src="/images/reset/reset-soft.png"/></center><br/> <p>No matter what form of <code>reset</code> with a commit you invoke, this is the first thing it will always try to do. If you add the flag <code>--soft</code>, this is the <b>only</b> thing it will do. With <code>--soft</code>, <code>reset</code> will simply stop there.</p> <p>Now take a second to look at that diagram and realize what it did. It essentially undid the last commit you made. When you run <code>git commit</code>, Git will create a new commit and move the branch that <code>HEAD</code> points to up to it. 
When you <code>reset</code> back to <code>HEAD~</code> (the parent of HEAD), you are moving the branch back to where it was without changing the Index (staging area) or Working Directory. You could now do a bit more work and <code>commit</code> again to accomplish basically what <code>git commit --amend</code> would have done.</p> <h3 class="subtitle"> Step 2: Updating the Index <small>having --mixed feelings</small> </h3> <p>Note that if you run <code>git status</code> now you'll see in green the difference between the Index and what the new HEAD is.</p> <p>The next thing <code>reset</code> will do is to update the Index with the contents of whatever tree HEAD now points to so they're the same.</p> <center><img width="500px" src="/images/reset/reset-mixed.png"/></center><br/> <p>If you specify the <code>--mixed</code> option, <code>reset</code> will stop at this point. This is also the default, so if you specify no option at all, this is where the command will stop.</p> <p>Now take another second to look at THAT diagram and realize what it did. It still undid your last <code>commit</code>, but also <i>unstaged</i> everything. You rolled back to before you ran all your <code>git add</code>s <i>AND</i> <code>git commit</code>. </p> <h3 class="subtitle"> Step 3: Updating the Working Directory <small>math is --hard, let's go shopping</small> </h3> <p>The third thing that <code>reset</code> will do is to then make the Working Directory look like the Index. If you use the <code>--hard</code> option, it will continue to this stage.</p> <center><img width="500px" src="/images/reset/reset-hard.png"/></center><br/> <p>Finally, take yet a third second to look at <i>that</i> diagram and think about what happened. 
You undid your last commit, all the <code>git add</code>s, <i>and</i> all the work you did in your working directory.</p> <p>It's important to note at this point that this is the only way to make the <code>reset</code> command dangerous (ie: not working directory safe). Any other invocation of <code>reset</code> can be pretty easily undone, the <code>--hard</code> option cannot, since it overwrites (without checking) any files in the Working Directory. In this particular case, we still have <b>v3</b> version of our file in a commit in our Git DB that we could get back by looking at our <code>reflog</code>, but if we had not committed it, Git still would have overwritten the file.</p> <h3>Overview</h3> <p> That is basically it. The <code>reset</code> command overwrites these three trees in a specific order, stopping when you tell it to. </p> <ul> <li>#1) Move whatever branch HEAD points to <small>(stop if <code>--soft</code>)</small> <li>#2) THEN, make the Index look like that <small>(stop here unless <code>--hard</code>)</small> <li>#3) THEN, make the Working Directory look like that </ul> <p>There are also <code>--merge</code> and <code>--keep</code> options, but I would rather keep things simpler for now - that will be for another article.</p> <p>Boom. You are now a <code>reset</code> master.</p> <h2>Reset with a Path</h2> <p> Well, I lied. That's not actually all. If you specify a path, <code>reset</code> will skip the first step and just do the other ones but limited to a specific file or set of files. This actually sort of makes sense - if the first step is to move a pointer to a different commit, you can't make it point to <i>part</i> of a commit, so it simply doesn't do that part. However, you can use <code>reset</code> to update part of the Index or the Working Directory with previously committed content this way. </p> <p>So, assume we run <code>git reset file.txt</code>. 
Since you did not specify a commit SHA or a branch that points to one, and you provided no reset option, Git assumes you are typing the shorthand for <code>git reset --mixed HEAD file.txt</code>, which will:</p> <ul> <li><strike>#1) Move whatever branch HEAD points to <small>(stop if <code>--soft</code>)</small></strike> <li>#2) THEN, make the Index look like that <small><strike>(stop here unless <code>--hard</code>)</strike></small> </ul> <p>So it essentially just takes whatever <code>file.txt</code> looks like in HEAD and puts that in the Index.</p> <center><img width="500px" src="/images/reset/reset-path1.png"/></center><br/> <p>So what does that do in a practical sense? Well, it <i>unstages</i> the file. If we look at the diagram for that command vs what <code>git add</code> does, we can see that it is simply the opposite. This is why the output of the <code>git status</code> command suggests that you run this to unstage a file.</p> <center><img width="500px" src="/images/reset/reset-path2.png"/></center><br/> <p>Instead of letting Git assume we meant "pull the data from HEAD", we could name a specific commit to pull that file version from to populate our Index by running something like <code>git reset eb43bf file.txt</code>.</p> <center><img width="500px" src="/images/reset/reset-path3.png"/></center><br/> <p>So what does that mean? That functionally does the same thing as if we had reverted the content of the file to <b>v1</b>, run <code>git add</code> on it, then reverted it back to <b>v3</b> again. If we run <code>git commit</code>, it will record a change that reverts that file back to <b>v1</b>, even though we never actually had it in our Working Directory again.</p> <p>It's also pretty interesting to note that like <code>git add --patch</code>, the <code>reset</code> command will accept a <code>--patch</code> option to unstage content on a hunk-by-hunk basis. 
So you can selectively unstage or revert content.</p> <h2>A fun example</h2> <p>I may use the term "fun" here a bit loosely, but if this doesn't sound like fun to you, you may drink while doing it. Let's look at how to do something interesting with this newfound power - squashing commits.</p> <p>If you have this history and you're about to push and you want to squash down the last N commits you've done into one awesome commit that makes you look really smart (vs a bunch of commits with messages like "oops.", "WIP" and "forgot this file") you can use <code>reset</code> to quickly and easily do that (as opposed to using <code>git rebase -i</code>).</p> <p>So, let's take a slightly more complex example. Let's say you have a project where the first commit has one file, the second commit added a new file and changed the first, and the third commit changed the first file again. The second commit was a work in progress and you want to squash it down.</p> <center><img width="500px" src="/images/reset/squash-r1.png"/></center><br/> <p>You can run <code>git reset --soft HEAD~2</code> to move the HEAD branch back to an older commit (the first commit you want to keep):</p> <center><img width="500px" src="/images/reset/squash-r2.png"/></center><br/> <p>And then simply run <code>git commit</code> again:</p> <center><img width="500px" src="/images/reset/squash-r3.png"/></center><br/> <p> Now you can see that your reachable history, the history you would push, now looks like you had one commit with the one file, then a second that both added the new file and modified the first to its final state. </p> <h2>Check it out</h2> <p>Finally, some of you may wonder what the difference between <code>checkout</code> and <code>reset</code> is. Well, like <code>reset</code>, <code>checkout</code> manipulates the three trees and it is a bit different depending on whether you give the command a file path or not. So, let's look at both examples separately. 
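</p>
<p>Before comparing the two commands, it may help to recap the three <code>reset</code> steps as a toy model in Python. This is only a rough sketch: the <code>Repo</code> class, its dictionaries and the commit names <code>c1</code> and <code>c2</code> are invented for illustration and say nothing about how Git actually stores its data.</p>

```python
# Toy model of the three trees. All names are hypothetical, not Git internals.
class Repo:
    def __init__(self):
        self.commits = {}                  # maps commit id to snapshot {path: content}
        self.branches = {"master": None}   # maps branch name to commit id
        self.head = "master"               # HEAD names the current branch
        self.index = {}                    # proposed next commit
        self.workdir = {}                  # files on disk

    def add(self, path):
        # Stage one file: copy it from the working directory into the index.
        self.index[path] = self.workdir[path]

    def commit(self, commit_id):
        # Snapshot the index, then move the current branch up to the new commit.
        self.commits[commit_id] = dict(self.index)
        self.branches[self.head] = commit_id

    def reset(self, commit_id, mode="mixed"):
        # Step 1: move the branch HEAD points to (every mode does this).
        self.branches[self.head] = commit_id
        if mode == "soft":
            return
        # Step 2: make the index look like that commit (mixed and hard).
        self.index = dict(self.commits[commit_id])
        if mode == "hard":
            # Step 3: overwrite the working directory without checking.
            self.workdir = dict(self.index)


def demo():
    results = []
    for mode in ("soft", "mixed", "hard"):
        repo = Repo()
        repo.workdir["file.txt"] = "v1"
        repo.add("file.txt")
        repo.commit("c1")
        repo.workdir["file.txt"] = "v2"
        repo.add("file.txt")
        repo.commit("c2")
        repo.workdir["file.txt"] = "v3"    # edited but never staged
        repo.reset("c1", mode)
        results.append((repo.branches["master"],
                        repo.index["file.txt"],
                        repo.workdir["file.txt"]))
    return results


print(demo())
```

<p>Each tuple holds the commit the branch points to, the Index content and the Working Directory content after resetting back to the first commit. In this sketch only the <code>hard</code> mode loses the unstaged <b>v3</b> edit, which matches the point above that it is the one mode that is not working directory safe.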
</p> <h3>git checkout [branch]</h3> <p>Running <code>git checkout [branch]</code> is pretty similar to running <code>git reset --hard [branch]</code> in that it updates all three trees for you to look like <code>[branch]</code>, but there are two important differences. </p> <p>First, unlike <code>reset --hard</code>, <code>checkout</code> is working directory safe in this invocation. It will check to make sure it's not blowing away files that have changes to them. Actually, this is a subtle difference, because it will update all of the working directory except the files you've modified if it can - it will do a trivial merge between what you're checking out and what's already there. In this case, <code>reset --hard</code> will simply replace everything across the board without checking. </p> <p>The second important difference is how it updates HEAD. Where <code>reset</code> will move the branch that HEAD points to, <code>checkout</code> will move HEAD itself to point to another branch.</p> <p>For instance, if we have two branches, 'master' and 'develop' pointing at different commits, and we're currently on 'develop' (so HEAD points to it) and we run <code>git reset master</code>, 'develop' itself will now point to the same commit that 'master' does.</p> <p>On the other hand, if we instead run <code>git checkout master</code>, 'develop' will not move, HEAD itself will. HEAD will now point to 'master'. So, in both cases we're moving HEAD to point to commit A, but <i>how</i> we do so is very different. <code>reset</code> will move the branch HEAD points to, <code>checkout</code> moves HEAD itself to point to another branch.</p> <center><img width="500px" src="/images/reset/reset-checkout.png"/></center><br/> <h3>git checkout [branch] file</h3> <p>The other way to run <code>checkout</code> is with a file path, which like <code>reset</code>, does not move HEAD. 
It is just like <code>git reset [branch] file</code> in that it updates the index with that file at that commit, but it also overwrites the file in the working directory. Think of it like <code>git reset --hard [branch] file</code> - it would be exactly the same thing, it is also not working directory safe and it also does not move HEAD. The only difference is that <code>reset</code> with a file name will not accept <code>--hard</code>, so you can't actually run that.</p> <p>Also, like <code>git reset</code> and <code>git add</code>, <code>checkout</code> will accept a <code>--patch</code> option to allow you to selectively revert file contents on a hunk-by-hunk basis.</p> <h2>Cheaters Gonna Cheat</h2> <p>Hopefully now you understand and feel more comfortable with the <code>reset</code> command, but are probably still a little confused about how exactly it differs from <code>checkout</code> and could not possibly remember all the rules of the different invocations.</p> <p>So to help you out, I've created something that I pretty much hate, which is a table. However, if you've followed the article at all, it may be a useful cheat sheet or reminder. The table shows each class of the <code>reset</code> and <code>checkout</code> commands and which of the three trees it updates.</p> <p>Pay especial attention to the 'WD Safe?' 
column - if it's red, really think about it before you run that command.</p> <table class="rdata"> <tr> <th></th> <th>head</th> <th>index</th> <th>work dir</th> <th>wd safe</th> </tr> <tr class="level"> <th>Commit Level</th> <td colspan="4">&nbsp;</td> </tr> <tr class="even"> <th class="cmd">reset --soft [commit]</th> <td class="yes">REF</td> <td class="no">NO</td> <td class="no">NO</td> <td class="yes-wd">YES</td> </tr> <tr class="odd"> <th class="cmd">reset [commit]</th> <td class="yes">REF</td> <td class="yes">YES</td> <td class="no">NO</td> <td class="yes-wd">YES</td> </tr> <tr class="even"> <th class="cmd">reset --hard [commit]</th> <td class="yes">REF</td> <td class="yes">YES</td> <td class="yes">YES</td> <td class="no-wd">NO</td> </tr> <tr class="odd"> <th class="cmd">checkout [commit]</th> <td class="yes">HEAD</td> <td class="yes">YES</td> <td class="yes">YES</td> <td class="yes-wd">YES</td> </tr> <tr class="level"> <th>File Level</th> <td colspan="4">&nbsp;</td> </tr> <tr class="even"> <th class="cmd">reset (commit) [file]</th> <td class="no">NO</td> <td class="yes">YES</td> <td class="no">NO</td> <td class="yes-wd">YES</td> </tr> <tr class="odd"> <th class="cmd">checkout (commit) [file]</th> <td class="no">NO</td> <td class="yes">YES</td> <td class="yes">YES</td> <td class="no-wd">NO</td> </tr> </table> <p>Good night, and good luck.</p> Notes Wed, 25 Aug 2010 00:00:00 +0000 <p>One of the cool things about Git is that it has strong cryptographic integrity. If you change any bit in the commit data or any of the files it keeps, all the checksums change, including the commit SHA and every commit SHA since that one. However, that means that in order to amend the commit in any way, for instance to add some comments on something or even sign off on a commit, you have to change the SHA of the commit itself.</p> <p>Wouldn&#39;t it be nice if you could add data to a commit without changing its SHA? 
If only there existed an external mechanism to attach data to a commit without modifying the commit message itself. Happy day! It turns out there exists just such a feature in newer versions of Git! As we can see from the Git 1.6.6 release notes where this new functionality was first introduced:</p> <pre><code>* &quot;git notes&quot; command to annotate existing commits. </code></pre> <p>Need any more be said? Well, maybe. How do you use it? What does it do? How can it be useful? I&#39;m not sure I can answer all of these questions, but let&#39;s give it a try. First of all, how does one use it? </p> <p>Well, to add a note to a specific commit, you only need to run <code>git notes add [commit]</code>, like this:</p> <pre><code>$ git notes add HEAD </code></pre> <p>This will open up your editor to write your note. You can also use the <code>-m</code> option to provide the note right on the command line:</p> <pre><code>$ git notes add -m &#39;I approve - Scott&#39; master~1 </code></pre> <p>That will add a note to the first parent of the last commit on the master branch. Now, how to view these notes? The easiest way is with the <code>git log</code> command.</p> <pre><code>$ git log master commit 0385bcc3bc66d1b1ec07346c237061574335c3b8 Author: Ryan Tomayko &lt;; Date: Tue Jun 22 20:09:32 2010 -0700 yield to run block right before accepting connections commit 06ca03a20bb01203e2d6b8996e365f46cb6d59bd Author: Ryan Tomayko &lt;; Date: Wed May 12 06:47:15 2010 -0700 no need to delete these header names now Notes: I approve - Scott </code></pre> <p>You can see the notes appended automatically in the log output. You can only have one note per commit in a namespace though (I will explain namespaces in the next section), so if you want to add more to a commit that already has a note, you have to edit the existing one instead. 
You can either do this by running:</p> <pre><code>$ git notes edit master~1 </code></pre> <p>Which will open a text editor with the existing note so you can edit it:</p> <pre><code>I approve - Scott # # Write/edit the notes for the following object: # # commit 06ca03a20bb01203e2d6b8996e365f46cb6d59bd # Author: Ryan Tomayko &lt;; # Date: Wed May 12 06:47:15 2010 -0700 # # no need to delete these header names now # # kidgloves.rb | 2 -- # 1 files changed, 0 insertions(+), 2 deletions(-) ~ ~ ~ &quot;.git/NOTES_EDITMSG&quot; 13L, 338C </code></pre> <p>Sort of weird, but it works. If you just want to add something to the end of an existing note, you can run <code>git notes append SHA</code>, but only in newer versions of Git (I think 1.7.1 and above).</p> <h2>Notes Namespaces</h2> <p>Since you can only have one note per commit, Git allows you to have multiple namespaces for your notes. The default namespace is called &#39;commits&#39;, but you can change that. Let&#39;s say we&#39;re using the &#39;commits&#39; notes namespace to store general comments but we want to also store bugzilla information for our commits. We can also have a &#39;bugzilla&#39; namespace. Here is how we would add a bug number to a commit under the bugzilla namespace:</p> <pre><code>$ git notes --ref=bugzilla add -m &#39;bug #15&#39; 0385bcc3 </code></pre> <p>However, now you have to tell Git to specifically look in that namespace:</p> <pre><code>$ git log --show-notes=bugzilla commit 0385bcc3bc66d1b1ec07346c237061574335c3b8 Author: Ryan Tomayko &lt;; Date: Tue Jun 22 20:09:32 2010 -0700 yield to run block right before accepting connections Notes (bugzilla): bug #15 commit 06ca03a20bb01203e2d6b8996e365f46cb6d59bd Author: Ryan Tomayko &lt;; Date: Wed May 12 06:47:15 2010 -0700 no need to delete these header names now Notes: I approve - Scott </code></pre> <p>Notice that it also will show your normal notes. 
You can actually have it show notes from all your namespaces by running <code>git log --show-notes=*</code> - if you have a lot of them, you may want to just alias that. Here is what your log output might look like if you have a number of notes namespaces:</p> <pre><code>$ git log -1 --show-notes=* commit 0385bcc3bc66d1b1ec07346c237061574335c3b8 Author: Ryan Tomayko &lt;; Date: Tue Jun 22 20:09:32 2010 -0700 yield to run block right before accepting connections Notes: I approve of this, too - Scott Notes (bugzilla): bug #15 Notes (build): build successful (8/13/10) </code></pre> <p>You can also switch the current namespace you&#39;re using so that the default for writing and showing notes is not &#39;commits&#39; but, say, &#39;bugzilla&#39; instead. If you export the variable <code>GIT_NOTES_REF</code> to point to something different, then the <code>--ref</code> and <code>--show-notes</code> options are not necessary. For example:</p> <pre><code>$ export GIT_NOTES_REF=refs/notes/bugzilla </code></pre> <p>That will set your default to &#39;bugzilla&#39; instead. It has to start with the &#39;refs/notes/&#39; though.</p> <h2>Sharing Notes</h2> <p>Now, here is where the general usability of this really breaks down. I am hoping that this will be improved in the future and I put off writing this post because of my concern with this phase of the process, but I figured it has interesting enough functionality as-is that someone might want to play with it.</p> <p>So, the notes (as you may have noticed in the previous section) are stored as references, just like branches and tags. This means you can push them to a server. However, Git has a bit of magic built in to expand a branch name like &#39;master&#39; to what it really is, which is &#39;refs/heads/master&#39;. Unfortunately, Git has no such magic built in for notes. So, to push your notes to a server, you cannot simply run something like <code>git push origin bugzilla</code>. 
Git will do this:</p> <pre><code>$ git push origin bugzilla error: src refspec bugzilla does not match any. error: failed to push some refs to &#39;; </code></pre> <p>However, you can push anything under &#39;refs/&#39; to a server, you just need to be more explicit about it. If you run this it will work fine:</p> <pre><code>$ git push origin refs/notes/bugzilla Counting objects: 3, done. Delta compression using up to 2 threads. Compressing objects: 100% (2/2), done. Writing objects: 100% (3/3), 263 bytes, done. Total 3 (delta 0), reused 0 (delta 0) To * [new branch] refs/notes/bugzilla -&gt; refs/notes/bugzilla </code></pre> <p>In fact, you may want to just make that <code>git push origin refs/notes/*</code> which will push all your notes. This is what Git does normally for something like tags. When you run <code>git push origin --tags</code> it basically expands to <code>git push origin refs/tags/*</code>.</p> <h2>Getting Notes</h2> <p>Unfortunately, getting notes is even more difficult. Not only is there no <code>git fetch --notes</code> or something, you have to specify both sides of the refspec (as far as I can tell).</p> <pre><code>$ git fetch origin refs/notes/*:refs/notes/* remote: Counting objects: 12, done. remote: Compressing objects: 100% (8/8), done. remote: Total 12 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (12/12), done. From * [new branch] refs/notes/bugzilla -&gt; refs/notes/bugzilla </code></pre> <p>That is basically the only way to get them into your repository from the server. Yay. If you want to, you can set up your Git config file to automatically pull them down though. If you look at your <code>.git/config</code> file you should have a section that looks like this:</p> <pre><code>[remote &quot;origin&quot;] fetch = +refs/heads/*:refs/remotes/origin/* url = </code></pre> <p>The &#39;fetch&#39; line is the refspec of what Git will try to do if you run just <code>git fetch origin</code>. 
It contains the magic formula of what Git will fetch and store local references to. For instance, in this case it will take every branch on the server and give you a local branch under &#39;remotes/origin/&#39; so you can reference the &#39;master&#39; branch on the server as &#39;remotes/origin/master&#39; or just &#39;origin/master&#39; (it will look under &#39;remotes&#39; when it&#39;s trying to figure out what you&#39;re doing). If you change that line to <code>fetch = +refs/heads/*:refs/remotes/manamana/*</code> then even though your remote is named &#39;origin&#39;, the master branch from your &#39;origin&#39; server will be under &#39;manamana/master&#39;. </p> <p>Anyhow, you can use this to make your notes fetching easier. If you add multiple <code>fetch</code> lines, it will do them all. So in addition to the current <code>fetch</code> line, you can add a line that looks like this:</p> <pre><code> fetch = +refs/notes/*:refs/notes/* </code></pre> <p>Which says also get all the notes references on the server and store them as though they were local notes. Or you can namespace them if you want, but that can cause issues when you try to push them back again.</p> <h2>Collaborating on Notes</h2> <p>Now, this is where the main problem is. Merging notes is super difficult. This means that if you pull down someone&#39;s notes, you edit any note in a namespace locally and the other developer edits any note in that same namespace, you&#39;re going to have a hard time getting them back in sync. When the second person tries to push their notes it will look like a non-fast-forward just like a normal branch update, but unlike a normal branch you can&#39;t just run <code>git pull</code> and then try again. You have to check out your notes ref as if it were a normal branch, which will look ridiculously confusing and then do the merge and then switch back. 
It is do-able, but probably not something you really want to do.</p> <p>Because of this, it&#39;s probably best to namespace your notes or better just have an automated process create them (like build statuses or bugzilla artifacts). If only one entity is updating your notes, you won&#39;t have merge issues. However, if you want to use them to comment on commits within a team, it is going to be a bit painful.</p> <p>So far, I&#39;ve heard of people using them to have their ticketing system attach metadata automatically or have <a href="">a system</a> attach associated mailing list emails to commits they concern. Other people just use them entirely locally without pushing them anywhere to store reminders for themselves and whatnot.<br> Probably a good start, but the ambitious among you may come up with something else interesting to do. Let me know!</p> Pro Git Zh Wed, 9 Jun 2010 00:00:00 +0000 <p>The amazing <a href="">Chunzi</a>, in addition to translating and helping to coordinate the translation of the Chinese version of Pro Git, has just sent me a draft epub of the book in Chinese.</p> <p>As he says:</p> <p>&quot;下载地址:</p> <p>此书基本完成翻译,但仍有大半尚未审阅。今心血来潮,做了 epub 版本,可在 iPad 上阅读。生成代码在。</p> <p>既然是开源图书,所以优先级比较低。后续更新时再作新版。&quot;</p> <p>I have also uploaded the file to S3 in addition to his link. If you want Pro Git in Chinese on your iPad, you can download it <a href="">here</a>.</p> <p>祝你好运!</p> Pro Git On Kindle Sun, 6 Jun 2010 00:00:00 +0000 <p>When Pro Git was first released, I asked about being able to get it on my Kindle. In fact, one of the very first people to read the book was <a href="">@adelcambre</a> with a mobi file I generated myself. It was horrible looking, because I didn&#39;t do it very well, but it did work. My editor at Apress wanted to get a professional Kindle version produced, but wasn&#39;t sure if it was going to get done anytime soon. </p> <p>Well, I just randomly found out that it did in fact finally get done. 
About 9 months after it was first published, I saw on my Amazon referral account that I had sold a Kindle version, which confused me. However, I went to Amazon and there it was:</p> <p><a href=""><img border="0" src=",BottomRight,-3,34_AA300_SH20_OU01_.jpg"></a><img src="" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p> <p>I assume it looks much, much better than the version I originally did for Andy. (I should probably get myself a copy). So, if you feel like reading it on the Kindle, <a href="">have at it</a>. There have also been a number of people who have asked me how they can support the book without getting a dead-tree version, so here&#39;s your chance. You can even get a Kindle reader for your computer, so you can search through it and whatnot. Enjoy!</p> <p>[editor&#39;s note: since this post was published, the complete contents of the book have been made available as <a href=""><code>.mobi</code> and <code>.epub</code> e-books</a>; you may still buy the Kindle version if you want to financially support the authors].</p> Progit For The Ipad Mon, 17 May 2010 00:00:00 +0000 <p>The awesome guys at <a href="">Media Temple</a> have converted the Pro Git book into ePub format that looks great on the iBook reader on the iPad.</p> <p>You can download it <a href="">here</a> and just drop the ePub file onto iTunes to upload it into your iPad. For reading Pro Git on the road, this is a great format and MT did an amazing job at making it look good on that platform.</p> <p>Check it out - here is Pro Git in your bookshelf:</p> <p><img src=""/></p> <p>The title page:</p> <p><img src=""/></p> <p>And a chapter with illustrations:</p> <p><img src=""/></p> <p>Sweet - thanks guys!</p> <p><em>Update:</em> I&#39;ve added a Stanza Atom catalog feed for those of you who want to load the epub book onto your Android running <a href="">Aldiko</a> or iPhone running <a href="">Stanza</a>. 
To download the book, simply add a custom catalog with this URL:</p> <pre> </pre> Progit Cliffnotes Thu, 22 Apr 2010 00:00:00 +0000 <p>Jason Meridth has gone through the whole Pro Git book and compiled a list of notes of some useful tips and whatnot in the book. If you have the print version, he marks which pages all the tips were found on. </p> <p>I thought it might be a useful companion, you can find it <a href="">here</a>.</p> Environment Sun, 11 Apr 2010 00:00:00 +0000 <p>One of the things that people that come from the Subversion world tend to find pretty cool about Git is that there is no <code>.svn</code> directory in every subdirectory of your project, but instead just one <code>.git</code> directory in the root of your project. Actually, it&#39;s even better than that. The <code>.git</code> directory does not even need to actually be within your project. Git allows you to tell it where your <code>.git</code> directory is, and there are a couple of ways to do that.</p> <p>Let&#39;s say you have your project and want to move the <code>.git</code> directory somewhere else. First let&#39;s see what happens when we move our <code>.git</code> directory without telling Git.</p> <pre><code>$ git log --oneline 6e948ec my second commit fda8c93 my initial commit $ mv .git /opt/myproject.git $ git log fatal: Not a git repository (or any of the parent directories): .git </code></pre> <p>Well, since Git can&#39;t find a <code>.git</code> directory, it appears that you are simply in a directory that is not controlled by Git. However, it&#39;s pretty easy to tell Git to look elsewhere by providing the <code>--git-dir</code> option to any Git call:</p> <pre><code>$ git --git-dir=/opt/myproject.git log --oneline 6e948ec my second commit fda8c93 my initial commit </code></pre> <p>However, you probably don&#39;t want to do that for every Git call, as that is a lot of typing. 
You could create a shell alias, but you can also export an environment variable called <code>GIT_DIR</code>.</p> <pre><code>$ export GIT_DIR=/opt/myproject.git $ git log --oneline 6e948ec my second commit fda8c93 my initial commit </code></pre> <p>There are a number of ways to customize Git functionality via specific environment variables. You can also tell Git where your working directory is with <code>GIT_WORK_TREE</code>, so you can run the Git commands from any directory you are in, not just the current working directory. To see this, first we&#39;ll change a file and then change directories and run <code>git status</code>.</p> <pre><code>$ echo &#39;test&#39; &gt;&gt; README $ git status --short M README </code></pre> <p>OK, but now if we change working directories, we&#39;ll get weird output.</p> <pre><code>$ cd /tmp $ git status --short D README ?? .ksda.1F5/ ?? aprKhGx02 ?? qlm.log ?? qlmlog.txt ?? smsi02122 </code></pre> <p>Now Git is comparing your last commit to what is in your current working directory. However, you can tell it where your real Git working directory is without being in it, either with the <code>--work-tree</code> option or by exporting the <code>GIT_WORK_TREE</code> variable:</p> <pre><code>$ git --work-tree=/tmp/myproject status --short M README $ export GIT_WORK_TREE=/tmp/myproject $ git status --short M README </code></pre> <p>Now you&#39;re doing operations on a Git repository outside of your working directory, which you&#39;re not even in.</p> <p>The last interesting variable you can set is your staging area. 
That is normally in the <code>.git/index</code> file, but again, you can set it somewhere else, so that you can have multiple staging areas that you can switch between if you want.</p> <pre><code>$ export GIT_INDEX_FILE=/tmp/index1 $ git add README $ git status # On branch master # Changes to be committed: # (use &quot;git reset HEAD &lt;file&gt;...&quot; to unstage) # # modified: README # </code></pre> <p>Now we have the README change staged in the new index file. If we switch back to our original index, we can see that the file is no longer staged:</p> <pre><code>$ export GIT_INDEX_FILE=/opt/myproject.git/index $ git status # On branch master # Changed but not updated: # (use &quot;git add &lt;file&gt;...&quot; to update what will be committed) # (use &quot;git checkout -- &lt;file&gt;...&quot; to discard changes in working directory) # # modified: README # no changes added to commit (use &quot;git add&quot; and/or &quot;git commit -a&quot;) </code></pre> <p>This is not quite as useful in day-to-day work, but it is pretty cool for building arbitrary trees and whatnot. We&#39;ll explore how to use that to do neat things in a future post when we talk more about some of the lower level Git plumbing commands.</p> Replace Wed, 17 Mar 2010 00:00:00 +0000 <p>In another of my series of &quot;seemingly hidden Git features&quot;, I would like to introduce <code>git replace</code>. Now, documentation exists for <code>git replace</code>, but it is rather unclear on what you would actually use it for (or even what it really does), so let&#39;s go through a couple of examples as it really is quite powerful.</p> <p>The <code>replace</code> command basically will take one object in your Git database and for most purposes replace it with another object. 
This is most commonly useful for replacing one commit in your history with another one.</p> <p>For example, say you want to split your history into one short history for new developers and one much longer and larger history for people interested in data mining. You can graft one history onto the other by <code>replace</code>ing the earliest commit in the new line with the latest commit on the older one. This is nice because it means that you don&#39;t actually have to rewrite every commit in the new history, as you would normally have to do to join them together (because the parentage affects the SHAs).</p> <p>Let&#39;s try this out. Let&#39;s take an existing repository, split it into two repositories, one recent and one historical, and then we&#39;ll see how we can recombine them via <code>replace</code> without modifying the recent repository&#39;s SHA values.</p> <p>We&#39;ll use a simple repository with five simple commits:</p> <pre><code>$ git log --oneline ef989d8 fifth commit c6e1e95 fourth commit 9c68fdc third commit 945704c second commit c1822cf first commit </code></pre> <p>We want to break this up into two lines of history. One line goes from commit one to commit four - that will be the historical one. 
The second line will just be commits four and five - that will be the recent history.</p> <p><center><img src="/images/replace1.png"></center></p> <p>Well, creating the historical history is easy, we can just put a branch in the history and then push that branch to the master branch of a new remote repository.</p> <pre><code>$ git branch history c6e1e95 $ git log --oneline --decorate ef989d8 (HEAD, master) fifth commit c6e1e95 (history) fourth commit 9c68fdc third commit 945704c second commit c1822cf first commit </code></pre> <p><center><img src="/images/replace2.png"></center></p> <p>Now we can push the new history branch to the master branch of our new repository:</p> <pre><code>$ git remote add history $ git push history history:master Counting objects: 12, done. Delta compression using up to 2 threads. Compressing objects: 100% (4/4), done. Writing objects: 100% (12/12), 907 bytes, done. Total 12 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (12/12), done. To * [new branch] history -&gt; master </code></pre> <p>OK, so our history is published. Now the harder part is truncating our current history down so it&#39;s smaller. We need an overlap so we can replace a commit in one with an equivalent commit in the other, so we&#39;re going to truncate this to just commits four and five (so commit four overlaps).</p> <pre><code>$ git log --oneline --decorate ef989d8 (HEAD, master) fifth commit c6e1e95 (history) fourth commit 9c68fdc third commit 945704c second commit c1822cf first commit </code></pre> <p>I think it&#39;s useful in this case to create a base commit that has instructions on how to expand the history, so other developers know what to do if they hit the first commit in the truncated history and need more. So, what we&#39;re going to do is create an initial commit object as our base point with instructions, then rebase the remaining commits (four and five) on top of it. 
To do that, we need to choose a point to split at, which for us is the third commit, which is <code>9c68fdc</code> in SHA-speak. So, our base commit will be based off of that tree. We can create our base commit using the <code>commit-tree</code> command, which just takes a tree and will give us a brand new, parentless commit object SHA back.</p> <pre><code>$ echo &#39;get history from blah blah blah&#39; | git commit-tree 9c68fdc^{tree} 622e88e9cbfbacfb75b5279245b9fb38dfea10cf </code></pre> <p><center><img src="/images/replace3.png"></center></p> <p>OK, so now that we have a base commit, we can rebase the rest of our history on top of that with <code>git rebase --onto</code>. The <code>--onto</code> argument will be the SHA we just got back from <code>commit-tree</code> and the rebase point will be the third commit (<code>9c68fdc</code> again):</p> <pre><code>$ git rebase --onto 622e88 9c68fdc First, rewinding head to replay your work on top of it... Applying: fourth commit Applying: fifth commit </code></pre> <p><center><img src="/images/replace4.png"></center></p> <p>OK, so now we&#39;ve re-written our recent history on top of a throw away base commit that now has instructions in it on how to reconstitute the entire history if we wanted to. 
Now let&#39;s see what those instructions would be (here is where <code>replace</code> finally comes into play).</p> <p>So to get the history data after cloning this truncated repository, one would have to add a remote for the historical repository and fetch:</p> <pre><code>$ git remote add history git:// $ git fetch history From git:// * [new branch] master -&gt; history/master </code></pre> <p>Now the collaborator would have their recent commits in the &#39;master&#39; branch and the historical commits in the &#39;history/master&#39; branch.</p> <pre><code>$ git log --oneline master e146b5f fifth commit 81a708d fourth commit 622e88e get history from blah blah blah $ git log --oneline history/master c6e1e95 fourth commit 9c68fdc third commit 945704c second commit c1822cf first commit </code></pre> <p>To combine them, you can simply call <code>git replace</code> with the commit you want to replace and then the commit you want to replace it with. So we want to replace the &#39;fourth&#39; commit in the master branch with the &#39;fourth&#39; commit in the &#39;history/master&#39; branch:</p> <pre><code>$ git replace 81a708d c6e1e95 </code></pre> <p>Now, if you look at the history of the <code>master</code> branch, it looks like this:</p> <pre><code>$ git log --oneline e146b5f fifth commit 81a708d fourth commit 9c68fdc third commit 945704c second commit c1822cf first commit </code></pre> <p>Cool, right? Without having to change all the SHAs upstream, we were able to replace one commit in our history with an entirely different commit and all the normal tools (<code>bisect</code>, <code>blame</code>, etc) will work how we would expect them to.</p> <p><center><img src="/images/replace5.png"></center></p> <p>Interestingly, it still shows <code>81a708d</code> as the SHA, even though it&#39;s actually using the <code>c6e1e95</code> commit data that we replaced it with. 
Even if you run a command like <code>cat-file</code>, it will show you the replaced data:</p> <pre><code>$ git cat-file -p 81a708d tree 7bc544cf438903b65ca9104a1e30345eee6c083d parent 9c68fdceee073230f19ebb8b5e7fc71b479c0252 author Scott Chacon &lt;; 1268712581 -0700 committer Scott Chacon &lt;; 1268712581 -0700 fourth commit </code></pre> <p>Remember that the actual parent of <code>81a708d</code> was our placeholder commit (<code>622e88e</code>), not <code>9c68fdce</code> as it states here.</p> <p>The other cool thing is that this is kept in our references:</p> <pre><code>$ git for-each-ref e146b5f14e79d4935160c0e83fb9ebe526b8da0d commit refs/heads/master c6e1e95051d41771a649f3145423f8809d1a74d4 commit refs/remotes/history/master e146b5f14e79d4935160c0e83fb9ebe526b8da0d commit refs/remotes/origin/HEAD e146b5f14e79d4935160c0e83fb9ebe526b8da0d commit refs/remotes/origin/master c6e1e95051d41771a649f3145423f8809d1a74d4 commit refs/replace/81a708dd0e167a3f691541c7a6463343bc457040 </code></pre> <p>This means that it&#39;s easy to share our replacement with others, because we can push this to our server and other people can easily download it. This is not that helpful in the history grafting scenario we&#39;ve gone over here (since everyone would be downloading both histories anyhow, so why separate them?) but it can be useful in other circumstances. I&#39;ll cover some other interesting scenarios in another post - I think this is probably enough to process for now.</p> Bundles Wed, 10 Mar 2010 00:00:00 +0000 <p>The scenario is thus: you need to sneakernet a <code>git push</code>. Maybe your network is down and you want to send changes to your co-workers. Perhaps you&#39;re working somewhere onsite and don&#39;t have access to the local network for security reasons. Maybe your wireless/ethernet card just broke. 
Maybe you don&#39;t have access to a shared server for the moment, you want to email someone updates and you don&#39;t want to transfer 40 commits via <code>format-patch</code>.</p> <p>Enter <code>git bundle</code>. The <code>bundle</code> command will package up everything that would normally be pushed over the wire with a <code>git push</code> command into a binary file that you can email or sneakernet around, then unbundle into another repository.</p> <p>Let&#39;s see a simple example. Let&#39;s say you have a repository with two commits:</p> <pre><code>$ git log commit 9a466c572fe88b195efd356c3f2bbeccdb504102 Author: Scott Chacon &lt;; Date: Wed Mar 10 07:34:10 2010 -0800 second commit commit b1ec3248f39900d2a406049d762aa68e9641be25 Author: Scott Chacon &lt;; Date: Wed Mar 10 07:34:01 2010 -0800 first commit </code></pre> <p>If you want to send that repository to someone and you don&#39;t have access to a repository to push to, or simply don&#39;t want to set one up, you can bundle it.</p> <pre><code>$ git bundle create repo.bundle master Counting objects: 6, done. Delta compression using up to 2 threads. Compressing objects: 100% (2/2), done. Writing objects: 100% (6/6), 441 bytes, done. Total 6 (delta 0), reused 0 (delta 0) </code></pre> <p>Now you have a file named <code>repo.bundle</code> that has all the data needed to re-create the repository. You can email that to someone else, or put it on a USB drive and walk it over.</p> <p>Now on the other side, say you are sent this <code>repo.bundle</code> file and want to work on the project.</p> <pre><code>$ git clone repo.bundle -b master repo Initialized empty Git repository in /private/tmp/bundle/repo/.git/ $ cd repo $ git log --oneline 9a466c5 second commit b1ec324 first commit </code></pre> <p>I had to specify <code>-b master</code> because otherwise it couldn&#39;t find the HEAD reference for some reason, but you may not need to do that. 
The point is, you have now cloned directly from a file, rather than from a remote server.</p> <p>Now let&#39;s say you do three commits on it and want to send the new commits back via a bundle on a usb stick or email.</p> <pre><code>$ git log --oneline 71b84da last commit - second repo c99cf5b fourth commit - second repo 7011d3d third commit - second repo 9a466c5 second commit b1ec324 first commit </code></pre> <p>First we need to determine the range of commits we want to include in the bundle. The easiest way would have been to drop a branch when we started, so we could say <code>start_branch..master</code> or <code>master ^start_branch</code>, but if we didn&#39;t we can just list the starting SHA explicitly:</p> <pre><code>$ git log --oneline master ^9a466c5 71b84da last commit - second repo c99cf5b fourth commit - second repo 7011d3d third commit - second repo </code></pre> <p>So we have the list of commits we want to include in the bundle, let&#39;s bundle em up. We do that with the <code>git bundle create</code> command, giving it a filename we want our bundle to be and the range of commits we want to go into it.</p> <pre><code>$ git bundle create commits.bundle master ^9a466c5 Counting objects: 11, done. Delta compression using up to 2 threads. Compressing objects: 100% (3/3), done. Writing objects: 100% (9/9), 775 bytes, done. Total 9 (delta 0), reused 0 (delta 0) </code></pre> <p>Now we will have a <code>commits.bundle</code> file in our directory. If we take that and send it to our partner, she can then import it into the original repository, even if more work has been done there in the meantime.</p> <p>When she gets the bundle, she can inspect it to see what it contains before she imports it into her repository. 
The first command is the <code>bundle verify</code> command that will make sure the file is actually a valid Git bundle and that you have all the necessary ancestors to reconstitute it properly.</p> <pre><code>$ git bundle verify ../commits.bundle The bundle contains 1 ref 71b84daaf49abed142a373b6e5c59a22dc6560dc refs/heads/master The bundle requires these 1 ref 9a466c572fe88b195efd356c3f2bbeccdb504102 second commit ../commits.bundle is okay </code></pre> <p>If the bundler had created a bundle of just the last two commits they had done, rather than all three, the original repository would not be able to import it, since it is missing requisite history. The <code>verify</code> command would have looked like this instead:</p> <pre><code>$ git bundle verify ../commits-bad.bundle error: Repository lacks these prerequisite commits: error: 7011d3d8fc200abe0ad561c011c3852a4b7bbe95 third commit - second repo </code></pre> <p>However, our first bundle is valid, so we can fetch in commits from it. If you want to see what branches are in the bundle that can be imported, there is also a command to just list the heads:</p> <pre><code>$ git bundle list-heads ../commits.bundle 71b84daaf49abed142a373b6e5c59a22dc6560dc refs/heads/master </code></pre> <p>The <code>verify</code> sub-command will tell you the heads, too, as will a normal <code>git ls-remote</code> command, which you may have used for debugging before. The point is to see what can be pulled in, so you can use the <code>fetch</code> or <code>pull</code> commands to import commits from this bundle. 
Here we&#39;ll fetch the &#39;master&#39; branch of the bundle to a branch named &#39;other-master&#39; in our repository:</p> <pre><code>$ git fetch ../commits.bundle master:other-master From ../commits.bundle * [new branch] master -&gt; other-master </code></pre> <p>Now we can see that we have the imported commits on the &#39;other-master&#39; branch as well as any commits we&#39;ve done in the meantime in our own &#39;master&#39; branch.</p> <pre><code>$ git log --oneline --decorate --graph --all * 8255d41 (HEAD, master) third commit - first repo | * 71b84da (other-master) last commit - second repo | * c99cf5b fourth commit - second repo | * 7011d3d third commit - second repo |/ * 9a466c5 second commit * b1ec324 first commit </code></pre> <p>So, <code>git bundle</code> can be really useful for doing network-y, share-y operations when you don&#39;t have the proper network or shared repository to do so.</p> Rerere Mon, 8 Mar 2010 00:00:00 +0000 <p>One of the things I didn&#39;t touch on at all in the book is the <code>git rerere</code> functionality. This also came up recently during one of my trainings, and I realize that a lot of people probably could use this, so I wanted to let you all know about it.</p> <p>The <a href=""><code>git rerere</code></a> functionality is a bit of a hidden feature (Git actually has a lot of cool hidden features, if you haven&#39;t figured that out yet). The name stands for &quot;reuse recorded resolution&quot; and as the name implies, it allows you to ask Git to remember how you&#39;ve resolved a hunk conflict so that the next time it sees the same conflict, Git can automatically resolve it for you.</p> <p>There are a number of scenarios in which this functionality might be really handy. One of the examples that is mentioned in the documentation is if you want to make sure a long lived topic branch will merge cleanly but don&#39;t want to have a bunch of intermediate merge commits. 
With <code>rerere</code> turned on you can merge occasionally, resolve the conflicts, then back out the merge. If you do this continuously, then the final merge should be easy because <code>rerere</code> can just do everything for you automatically.</p> <p>This same tactic can be used if you want to keep a branch rebased so you don&#39;t have to deal with the same rebasing conflicts each time you do it. Or if you want to take a branch that you merged and fixed a bunch of conflicts and then decide to rebase it instead - you likely won&#39;t have to do all the same conflicts again.</p> <p>The other situation I can think of is where you merge a bunch of evolving topic branches together into a testable head occasionally. If the tests fail, you can rewind the merges and re-do them without the topic branch that made the tests fail without having to re-resolve the conflicts again.</p> <p>To enable the rerere functionality, you simply have to run this config setting:</p> <pre><code>$ git config --global rerere.enabled true </code></pre> <p>You can also turn it on by creating the <code>.git/rr-cache</code> directory in a specific repository, but I think the config setting is clearer, and it can be done globally.</p> <p>Now let&#39;s see a simple example. If we have a file that looks like this:</p> <pre><code>#! /usr/bin/env ruby def hello puts &#39;hello world&#39; end </code></pre> <p>and in one branch we change the word &#39;hello&#39; to &#39;hola&#39;, then in another branch we change the &#39;world&#39; to &#39;mundo&#39;.</p> <p><img src="/images/rerere1.png"></p> <p>When we merge the two branches together, we&#39;ll get a merge conflict:</p> <pre><code>$ git merge i18n-world Auto-merging hello.rb CONFLICT (content): Merge conflict in hello.rb Recorded preimage for &#39;hello.rb&#39; Automatic merge failed; fix conflicts and then commit the result. </code></pre> <p>You should notice the new line <code>Recorded preimage for FILE</code> in there. 
Otherwise it should look exactly like a normal merge conflict. At this point, <code>rerere</code> can tell us some stuff. Normally, you might run <code>git status</code> at this point to see what all conflicted:</p> <pre><code>$ git status # On branch master # Unmerged paths: # (use &quot;git reset HEAD &lt;file&gt;...&quot; to unstage) # (use &quot;git add &lt;file&gt;...&quot; to mark resolution) # # both modified: hello.rb # </code></pre> <p>However, <code>git rerere</code> will also tell you what it has recorded the pre-merge state for with <code>git rerere status</code>:</p> <pre><code>$ git rerere status hello.rb </code></pre> <p>And <code>git rerere diff</code> will show the current state of the resolution - what you started with to resolve and what you&#39;ve resolved it to.</p> <pre><code>$ git rerere diff --- a/hello.rb +++ b/hello.rb @@ -1,11 +1,11 @@ #! /usr/bin/env ruby def hello -&lt;&lt;&lt;&lt;&lt;&lt;&lt; - puts &#39;hello mundo&#39; -======= +&lt;&lt;&lt;&lt;&lt;&lt;&lt; HEAD puts &#39;hola world&#39; -&gt;&gt;&gt;&gt;&gt;&gt;&gt; +======= + puts &#39;hello mundo&#39; +&gt;&gt;&gt;&gt;&gt;&gt;&gt; i18n-world end </code></pre> <p>Also (and this isn&#39;t really related to <code>rerere</code>), you can use <code>ls-files -u</code> to see the conflicted files and the before, left and right versions:</p> <pre><code>$ git ls-files -u 100644 39804c942a9c1f2c03dc7c5ebcd7f3e3a6b97519 1 hello.rb 100644 a440db6e8d1fd76ad438a49025a9ad9ce746f581 2 hello.rb 100644 54336ba847c3758ab604876419607e9443848474 3 hello.rb </code></pre> <p>Anyhow, so now you resolve it to just be &quot;puts &#39;hola mundo&#39;&quot; and you can run the <code>rerere diff</code> command again to see what rerere will remember:</p> <pre><code>$ git rerere diff --- a/hello.rb +++ b/hello.rb @@ -1,11 +1,7 @@ #! 
/usr/bin/env ruby def hello -&lt;&lt;&lt;&lt;&lt;&lt;&lt; - puts &#39;hello mundo&#39; -======= - puts &#39;hola world&#39; -&gt;&gt;&gt;&gt;&gt;&gt;&gt; + puts &#39;hola mundo&#39; end </code></pre> <p>So that basically says, when I see a hunk conflict that has &#39;hello mundo&#39; on one side and &#39;hola world&#39; on the other, resolve it to &#39;hola mundo&#39;.</p> <p>Now we can mark it as resolved and commit it:</p> <pre><code>$ git add hello.rb $ git commit Recorded resolution for &#39;hello.rb&#39;. [master 68e16e5] Merge branch &#39;i18n&#39; </code></pre> <p>You can see that it &quot;Recorded resolution for FILE&quot;.</p> <p><img src="/images/rerere2.png"></p> <p>Now, let&#39;s undo that merge and then rebase it on top of our master branch instead.</p> <pre><code>$ git reset --hard HEAD^ HEAD is now at ad63f15 i18n the hello </code></pre> <p>Our merge is undone. Now let&#39;s rebase the topic branch.</p> <pre><code>$ git checkout i18n-world Switched to branch &#39;i18n-world&#39; $ git rebase master First, rewinding head to replay your work on top of it... Applying: i18n one word Using index info to reconstruct a base tree... Falling back to patching base and 3-way merge... Auto-merging hello.rb CONFLICT (content): Merge conflict in hello.rb Resolved &#39;hello.rb&#39; using previous resolution. Failed to merge in the changes. Patch failed at 0001 i18n one word </code></pre> <p>Now, we got the same merge conflict like we expected, but check out the <code>Resolved FILE using previous resolution</code> line. If we look at the file, we&#39;ll see that it&#39;s already been resolved:</p> <pre><code>$ cat hello.rb #! /usr/bin/env ruby def hello puts &#39;hola mundo&#39; end </code></pre> <p>Also, <code>git diff</code> will show you how it was automatically re-resolved:</p> <pre><code>$ git diff diff --cc hello.rb index a440db6,54336ba..0000000 --- a/hello.rb +++ b/hello.rb @@@ -1,7 -1,7 +1,7 @@@ #! 
/usr/bin/env ruby def hello - puts &#39;hola world&#39; - puts &#39;hello mundo&#39; ++ puts &#39;hola mundo&#39; end </code></pre> <p><img src="/images/rerere3.png"></p> <p>You can also recreate the conflicted file state with the <code>checkout</code> command:</p> <pre><code>$ git checkout --conflict=merge hello.rb $ cat hello.rb #! /usr/bin/env ruby def hello &lt;&lt;&lt;&lt;&lt;&lt;&lt; ours puts &#39;hola world&#39; ======= puts &#39;hello mundo&#39; &gt;&gt;&gt;&gt;&gt;&gt;&gt; theirs end </code></pre> <p>That might be a new command to you as well, the <code>--conflict</code> option to <code>git checkout</code>. You can actually have <code>checkout</code> do a couple of things in this situation to help you resolve conflicts. Another interesting value for that option is &#39;diff3&#39;, which will give you left, right and common to help you resolve the conflict manually:</p> <pre><code>$ git checkout --conflict=diff3 hello.rb $ cat hello.rb #! /usr/bin/env ruby def hello &lt;&lt;&lt;&lt;&lt;&lt;&lt; ours puts &#39;hola world&#39; ||||||| puts &#39;hello world&#39; ======= puts &#39;hello mundo&#39; &gt;&gt;&gt;&gt;&gt;&gt;&gt; theirs end </code></pre> <p>Anyhow, then you can re-resolve it by just running <code>rerere</code> again:</p> <pre><code>$ git rerere Resolved &#39;hello.rb&#39; using previous resolution. $ cat hello.rb #! /usr/bin/env ruby def hello puts &#39;hola mundo&#39; end </code></pre> <p>Magical re-resolving! 
Then you can add and continue the rebase to complete it.</p> <pre><code>$ git add hello.rb $ git rebase --continue Applying: i18n one word </code></pre> <p>So, if you do a lot of re-merges, or want to keep a topic branch up to date with your master branch without a ton of merges, or you rebase often or any of the above, turn on <code>rerere</code> to help your life out a bit.</p> Smart Http Thu, 4 Mar 2010 00:00:00 +0000 <p>When I was done writing Pro Git, the only transfer protocols that existed were the <code>git://</code>, <code>ssh://</code> and basic <code>http://</code> transports. I wrote about the basic strengths and weaknesses of each in <a href="/book/ch4-1.html">Chapter 4</a>. At the time, one of the big differences between Git and most other VCSs was that HTTP was not a commonly used protocol - that&#39;s because it was read-only and very inefficient. Git would simply use the webserver to ask for individual objects and packfiles that it needed. It would ask for an entire big packfile even if it needed only one object from it.</p> <p>As of the release of version 1.6.6 at the end of last year, however, Git can now use the HTTP protocol just about as efficiently as the <code>git</code> or <code>ssh</code> versions (thanks to the amazing work by Shawn Pearce, who also happened to have been the technical editor of Pro Git). Amusingly, it has been given very little fanfare - the release notes for 1.6.6 state only this:</p> <pre><code>* &quot;git fetch&quot; over http learned a new mode that is different from the traditional &quot;dumb commit walker&quot;. </code></pre> <p>Which is a huge understatement, given that I think this will become the standard Git protocol in the very near future. I believe this because it&#39;s both efficient and can be run either secure and authenticated (https) or open and unauthenticated (http). 
It also has the huge advantage that most firewalls have those ports (80 and 443) open already and normal users don&#39;t have to deal with <code>ssh-keygen</code> and the like. Once most clients have updated to at least v1.6.6, <code>http</code> will have a big place in the Git world.</p> <h2>What is "Smart" HTTP?</h2> <p>Before version 1.6.6, when you cloned or fetched over HTTP, Git clients would basically just do a series of GETs to grab individual objects and packfiles on the server from bare Git repositories, since they know the layout of the repo. This functionality is documented fairly completely in <a href="">Chapter 9</a>. Conversations over this protocol used to look like this:</p> <pre><code>$ git clone Initialized empty Git repository in /private/tmp/simplegit-progit/.git/ got ca82a6dff817ec66f44342007202690a93763949 walk ca82a6dff817ec66f44342007202690a93763949 got 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7 Getting alternates list for Getting pack list for Getting index for pack 816a9b2334da9953e530f27bcac22082a9f5b835 Getting pack 816a9b2334da9953e530f27bcac22082a9f5b835 which contains cfda3bf379e4f8dba8717dee55aab78aef7f4daf walk 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7 walk a11bef06a3f659402fe7563abf99ad00de2209e6 </code></pre> <p>It is a completely passive server, and if the client needs one object in a packfile of thousands, the server cannot pull the single object out; the client is forced to request the entire packfile.</p> <p><img src="/images/smarthttp1.png"></p> <p>In contrast, the smarter protocols (<code>git</code> and <code>ssh</code>) would instead have a conversation with the <code>git upload-pack</code> process on the server which would determine the exact set of objects the client needs and build a custom packfile with just those objects and stream it over.</p> <p>The new clients will now send a request with an extra GET parameter that older servers will simply ignore, but servers running the smart CGI will recognize and switch modes to a 
multi-POST mode that is similar to the conversation that happens over the <code>git</code> protocol. Once this series of POSTs is complete, the server knows what objects the client needs and can build a custom packfile and stream it back.</p> <p><img src="/images/smarthttp2.png"></p> <p>Furthermore, in the olden days if you wanted to push over http, you had to set up a DAV-based server, which was rather difficult and also pretty inefficient compared to the smarter protocols. Now you can push over this CGI, which again is very similar to the push mechanisms for the <code>git</code> and <code>ssh</code> protocols. You simply have to authenticate via an HTTP-based method, like basic auth or the like (assuming you don&#39;t want your repository to be world-writable).</p> <p>The rest of this article will explain setting up a server with the &quot;smart&quot;-http protocol, so you can test out this cool new feature. This feature is referred to as &quot;smart&quot; HTTP vs &quot;dumb&quot; HTTP because it requires having the Git binary installed on the server, where the previous incarnation of HTTP transfer required only a simple webserver. It has a real conversation with the client, rather than just dumbly pushing out data.</p> <h2>Setting up Smart HTTP</h2> <p>So, Smart-HTTP is basically just enabling the new CGI script that is provided with Git called <a href=""><code>git-http-backend</code></a> on the server. This CGI will read the path and headers sent by the revamped <code>git fetch</code> and <code>git push</code> binaries, which have learned to communicate in a specific way with a smart server. If the CGI sees that the client is smart, it will communicate smartly with it; otherwise it will simply fall back to the dumb behavior (so it is backward compatible for reads with older clients).</p> <p>To set it up, it&#39;s best to walk through the instructions on the <a href=""><code>git-http-backend</code></a> documentation page. 
Basically, you have to install Git v1.6.6 or higher on a server with an Apache 2.x webserver (it has to be Apache, currently - other CGI servers don&#39;t work, last I checked). Then you add something similar to this to your httpd.conf file:</p> <pre><code>SetEnv GIT_PROJECT_ROOT /var/www/git
SetEnv GIT_HTTP_EXPORT_ALL
ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/
</code></pre> <p>Then you&#39;ll want to make writes be authenticated somehow, possibly with an Auth block like this:</p> <pre><code>&lt;LocationMatch &quot;^/git/.*/git-receive-pack$&quot;&gt;
        AuthType Basic
        AuthName &quot;Git Access&quot;
        Require group committers
        ...
&lt;/LocationMatch&gt;
</code></pre> <p>That is all that is really required to get this running. Now you have a smart http-based Git server that can do anonymous reads and authenticated writes with clients that have upgraded to 1.6.6 and above. How awesome is that? The <a href="">documentation</a> goes over more complex examples, like making it work with GitWeb and accelerating the dumb fallback reads, if you&#39;re interested.</p> <h2>Rack-based Git Server</h2> <p>If you&#39;re not a fan of Apache or you&#39;re running some other web server, you may want to take a look at an app that I wrote called <a href="">Grack</a>, which is a <a href="">Rack</a>-based application for Smart-HTTP Git. <a href="">Rack</a> is a generic webserver interface for Ruby (similar to WSGI for Python) that has adapters for a ton of web servers. It basically replaces <code>git http-backend</code> for non-Apache servers that can&#39;t run it.</p> <p>This means that I can write the web handler independent of the web server and it will work with any web server that has a Rack handler. This currently means any FCGI server, Mongrel (and EventedMongrel and SwiftipliedMongrel), WEBrick, SCGI, LiteSpeed, Thin, Ebb, Phusion Passenger and Unicorn. 
Even cooler, using <a href="">Warbler</a> and JRuby, you can generate a WAR file that is deployable in any Java web application server (Tomcat, Glassfish, Websphere, JBoss, etc).</p> <p>So, if you don&#39;t use Apache and you are interested in a Smart-HTTP Git server, you may want to check out Grack. At <a href="">GitHub</a>, this is the adapter we&#39;re using to eventually implement Smart-HTTP support for all the GitHub repositories. (It&#39;s currently a tad bit behind, but I&#39;ll be starting up on it again as soon as I get it into production at GitHub - send pull requests if you find any issues.)</p> <p>Grack is about half as fast as the Apache version for simple ref-listing stuff, but we&#39;re talking tenths of a second. For most clones and pushes, the data transfer will be the main time-sink, so the load time of the app should be negligible.</p> <h2>In Conclusion</h2> <p>I think HTTP based Git will be a huge part of the future of Git, so if you&#39;re running your own Git server, you should really check it out. If you&#39;re not, GitHub and, I&#39;m sure, other hosts will soon be supporting it - upgrade your Git client to 1.7ish soon so you can take advantage of it when it happens.</p> Undoing Merges Tue, 2 Mar 2010 00:00:00 +0000 <p>I would like to start writing more here about general Git tips, tricks and upcoming features. There has actually been a lot of cool stuff that has happened since the book was first published, and a number of interesting things that I didn&#39;t get around to covering in the book. I figure if I start blogging about the more interesting stuff, it should serve as a pretty handy guide should I ever start writing a second edition.</p> <p>For the first such post, I&#39;m going to cover a topic that was asked about at a training I did recently. The question was about a workflow where long running branches are merged occasionally, much like the <a href="">Large Merging</a> workflow that I describe in the book. 
They asked how to unmerge a branch, either permanently or in a way that allows you to merge it in again later.</p> <p>You can actually do this a number of ways. Let&#39;s say you have history that looks something like this:</p> <p><img src="/images/unmerge1.png"></p> <p>You have a couple of topic branches that you have developed and then integrated together by a series of merges. Now you want to revert something back in the history, say &#39;C10&#39; in this case.</p> <p>The first way to solve the problem could be to rewind &#39;master&#39; back to C3 and then merge the remaining two lines back in again. This requires that anyone you&#39;re collaborating with knows how to handle rewound heads, but if that&#39;s not an issue, this is a perfectly viable solution. This is basically how the &#39;pu&#39; branch is handled in the Git project itself.</p> <pre><code>$ git checkout master
$ git reset --hard [sha_of_C3]
$ git merge jk/post-checkout
$ git merge db/push-cleanup
</code></pre> <p>Once you rewind and remerge, you&#39;ll instead have a history that looks more like this:</p> <p><img src="/images/unmerge2.png"></p> <p>Now you can go back and work on that newly unmerged line and merge it again at a later point, or perhaps ignore it entirely.</p> <h2>Reverting a Merge</h2> <p>However, what if you didn&#39;t find this out until later, or perhaps you or one of your collaborators have done work after this merge series? What if your history looks more like this:</p> <p><img src="/images/unmerge3.png"></p> <p>Now you either have to revert one of the merges, or go back, remerge and then cherry-pick the remaining changes again (C10 and C11 in this case), which is confusing and difficult, especially if there are a lot of commits after those merges.</p> <p>Well, it turns out that Git is actually pretty good at reverting an entire merge. 
Although you&#39;ve probably only used the <code>git revert</code> command to revert a single commit (if you&#39;ve used it at all), you can also use it to revert merge commits.</p> <p>All you have to do is specify the merge commit you want to revert and the parent line you want to keep. Let&#39;s say that we want to revert the merge of the <code>jk/post-checkout</code> line. We can do so like this:</p> <pre><code>$ git revert -m 1 [sha_of_C9] Finished one revert. [master 88edd6d] Revert &quot;Merge branch &#39;jk/post-checkout&#39;&quot; 1 files changed, 0 insertions(+), 2 deletions(-) </code></pre> <p>That will introduce a new commit that undoes the changes introduced by merging in the branch in the first place - sort of like a reverse cherry pick of all of the commits that were unique to that branch. Pretty cool.</p> <p><img src="/images/unmerge4.png"></p> <p>However, we&#39;re not done.</p> <h2>Reverting the Revert</h2> <p>Let&#39;s say now that you want to re-merge that work again. If you try to merge it again, Git will see that the commits on that branch are in the history and will assume that you are mistakenly trying to merge something you already have.</p> <pre><code>$ git merge jk/post-checkout Already up-to-date. </code></pre> <p>Oops - it did nothing at all. Even more confusing is if you went back and committed on that branch and then tried to merge it in, it would only introduce the changes <em>since</em> you originally merged.</p> <p><img src="/images/unmerge5.png"></p> <p>Gah. Now that&#39;s really a strange state and is likely to cause a bunch of conflicts or confusing errors. What you want to do instead is revert the revert of the merge:</p> <pre><code>$ git revert 88edd6d Finished one revert. 
[master 268e243] Revert &quot;Revert &quot;Merge branch &#39;jk/post-checkout&#39;&quot;&quot; 1 files changed, 2 insertions(+), 0 deletions(-) </code></pre> <p><img src="/images/unmerge6.png"></p> <p>Cool, so now we&#39;ve basically reintroduced everything that was in the branch that we had reverted out before. Now if we have more work on that branch in the meantime, we can just re-merge it.</p> <pre><code>$ git merge jk/post-checkout Auto-merging test.txt Merge made by recursive. test.txt | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) </code></pre> <p><img src="/images/unmerge7.png"></p> <p>So, I hope that&#39;s helpful. This can be particularly useful if you have a merge-heavy development process. In fact, if you work mostly in topic branches before merging for integration purposes, you may want to use the <code>git merge --no-ff</code> option so that the first merge is not a fast forward and can be reverted out in this manner.</p> <p>Until next time.</p> Translate This Wed, 19 Aug 2009 00:00:00 +0000 <p>One of the things I love about Git is how easy it is to fork and contribute. More than once on GitHub I&#39;ve put up some content and a handful of helpful souls go about translating it into another language. Well, it&#39;s happened again with Pro Git. Since I released it under the Creative Commons license and put the code <a href="">up on GitHub</a>, several people have forked and started translating the book into other languages such as <a href="/book/de">German</a>, <a href="/book/zh">中文</a>, <a href="/book/ja">Japanese</a> and <a href="/book/ru">Russian</a>.</p> <p>I&#39;ve been merging this work into <a href="">my main repository</a> to make it easier for people who want to help and now I&#39;ve started publishing these translations on the main website. 
At the bottom of the page in the footer is now a list of ongoing translations (none of them are complete yet) that link to the translated content.</p> <p>If you know another language and would like to make this Git resource available to others in that language, please either jump into helping with one of the ongoing translations or start a new one - I&#39;ll begin publishing it once it gets going.</p> <p>The process to help with translations is to fork <a href="">my main content repository</a>, clone your new fork, make some changes and then send me a pull request through GitHub. Actually, even if you don&#39;t send me a pull request, I tend to check out my network graph every few days and pull in things that look good.</p> <p>There have so far been more than 30 people who have contributed translated material, errata or formatting fixes for my book since I put it up here - thanks to everyone! However, special thanks for all the help in translating the book so far to Mike Limansky (ru), Victor Portnov (ru), Egor Levichev (ru), Sven Fuchs (de), marcel (de), Fritz Thielemann (de), Yu Inao (ja), chunzi (zh) and DaNmarner (zh). Also huge thanks to Shane (duairc) for all the PDF creation software work. Keep it up!</p> The Gory Details Tue, 28 Jul 2009 00:00:00 +0000 <p>First of all, thanks to everyone for spreading the word about this book and this site - I got <em>way</em> more attention on the first day live than I expected I might. More than 10,000 individuals visited the site in the first 24 hours of it being online for over 38k pageviews. We got on Hacker News, Reddit and Delicious Popular aggregators, not to mention the Twitterverse.</p> <p>In the comments of the initial post here, a user asked me about the &quot;gory details&quot; of writing this book. Specifically about &quot;what tools you used to create the book and its figures&quot;. 
So here it is.</p> <p>Somewhere else I read that someone liked that I used Markdown for writing the book, as you can download the Markdown source for the book <a href="">at GitHub</a>. Well, the entire writing process was unfortunately not done in Markdown. At Apress most of the editing and review process is still MS Word centric. Since I threw hissy fits at Word for the first few chapters, the very nice people at Apress allowed me to write the remainder of the book in Markdown initially, and the technical reviews were done via a Git repository on GitHub using the Markdown source.</p> <p>So, for most of the book, the process was: I would write the first draft of each chapter in Markdown, and two reviewers would add comments inline. Then I would fix whatever they commented on and move the text into Word to submit it to the copy editor. The copy editor would review the Word document and let the technical editor have another pass, then I would fix up anything they commented on. Finally I would get the chapter back in PDF form and do a final pass. Then I took that text and put it back in Markdown to generate HTML for the website.</p> <p>Fun, huh?</p> <p>For the diagrams, I always use OmniGraffle. I personally think it&#39;s one of the most amazing pieces of software ever created - I love it. I think normally an Apress designer would take whatever the author sketched and redo them, but in this case we actually just used the diagrams that I made. 
I just added the .graffle file to the <a href="">GitHub repo</a> if you&#39;re interested in it.</p> Moved To Github Pages Wed, 11 Feb 2009 00:00:00 +0000 <p>This is the first post on the Pro Git book website, which contains the full content of the book published by Apress and a blog for me to share Git tips and book news with everybody.</p> <p>I&#39;m incredibly excited to get this book published - it has been a really long time in the making, and I&#39;m glad Apress let me publish the content under a Creative Commons license, so I can share it online as well. If you find the content helpful, please support this kind of open documentation by buying a print version of the book at <a href="">Amazon</a>.</p> <p>The print version of the book is going to the presses soon and should be shipping around the end of August. I was first contacted by Apress at the very end of November of last year and had my first draft of the first chapter in on December 15. Since then, it has been a whirlwind of writing, reviewing, rewriting and re-reviewing of 9 long chapters. That&#39;s about 8 months from start to finish and has been one of the most monumental side-projects I&#39;ve ever done. I&#39;ve learned a ton about the publishing and book-writing process and also about Git.</p> <p>I hope you enjoy the book and I hope that it helps you to learn one of the most amazing tools you can add to your development arsenal.</p>