Git
Chapters ▾ 2nd Edition

10.6 Git Internals - Přenosové protokoly

Přenosové protokoly

Git can transfer data between two repositories in two major ways: the “dumb” protocol and the “smart” protocol. Tato část se ve stručnosti zaměří na to, jak tyto dva základní protokoly fungují.

Hloupý protokol

If you’re setting up a repository to be served read-only over HTTP, the dumb protocol is likely what will be used. This protocol is called “dumb” because it requires no Git-specific code on the server side during the transport process; the fetch process is a series of HTTP GET requests, where the client can assume the layout of the Git repository on the server.

Note

The dumb protocol is fairly rarely used these days. It’s difficult to secure or make private, so most Git hosts (both cloud-based and on-premises) will refuse to use it. It’s generally advised to use the smart protocol, which we describe a bit further on.

Let’s follow the http-fetch process for the simplegit library:

$ git clone http://server/simplegit-progit.git

První věcí, kterou příkaz udělá, je stažení souboru info/refs. Tento soubor se zapisuje příkazem update-server-info. To je také důvod, proč ho je nutné zapnout jako zásuvný modul post-receive, aby přenos dat prostřednictvím protokolu probíhal správně:

=> GET info/refs
ca82a6dff817ec66f44342007202690a93763949     refs/heads/master

Now you have a list of the remote references and SHA-1s. Next, you look for what the HEAD reference is so you know what to check out when you’re finished:

=> GET HEAD
ref: refs/heads/master

You need to check out the master branch when you’ve completed the process. At this point, you’re ready to start the walking process. Protože je vaším výchozím bodem objekt revize ca82a6, jak jste zjistili v souboru info/refs, začnete vyzvednutím tohoto objektu:

=> GET objects/ca/82a6dff817ec66f44342007202690a93763949
(179 bytes of binary data)

You get an object back – that object is in loose format on the server, and you fetched it over a static HTTP GET request. Objekt můžete rozbalit, extrahovat záhlaví a prohlédnout si obsah revize:

$ git cat-file -p ca82a6dff817ec66f44342007202690a93763949
tree cfda3bf379e4f8dba8717dee55aab78aef7f4daf
parent 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
author Scott Chacon <schacon@gmail.com> 1205815931 -0700
committer Scott Chacon <schacon@gmail.com> 1240030591 -0700

changed the version number

Next, you have two more objects to retrieve – cfda3b, which is the tree of content that the commit we just retrieved points to; and 085bb3, which is the parent commit:

=> GET objects/08/5bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
(179 bytes of data)

Tím získáte další objekt revize. Načtěte objekt stromu:

=> GET objects/cf/da3bf379e4f8dba8717dee55aab78aef7f4daf
(404 - Not Found)

Oops – it looks like that tree object isn’t in loose format on the server, so you get a 404 response back. There are a couple of reasons for this – the object could be in an alternate repository, or it could be in a packfile in this repository. Git nejprve zjistí, zda jsou k dispozici alternativní repozitáře:

=> GET objects/info/http-alternates
(empty file)

If this comes back with a list of alternate URLs, Git checks for loose files and packfiles there – this is a nice mechanism for projects that are forks of one another to share objects on disk. Protože však seznam v tomto případě neobsahuje žádné alternativní repozitáře, váš objekt musí být v balíčkovém souboru. Chcete-li zjistit, jaké balíčkové soubory jsou na serveru dostupné, pomůže vám soubor objects/info/packs, který obsahuje jejich seznam (rovněž generován příkazem update-server-info):

=> GET objects/info/packs
P pack-816a9b2334da9953e530f27bcac22082a9f5b835.pack

There is only one packfile on the server, so your object is obviously in there, but you’ll check the index file to make sure. To je rovněž užitečné, máte-li na serveru více balíčkových souborů. Zjistíte tak, který z nich obsahuje hledaný objekt:

=> GET objects/pack/pack-816a9b2334da9953e530f27bcac22082a9f5b835.idx
(4k of binary data)

Now that you have the packfile index, you can see if your object is in it – because the index lists the SHA-1s of the objects contained in the packfile and the offsets to those objects. Váš objekt se v tomto souboru nachází, a proto neváhejte a stáhněte celý balíčkový soubor:

=> GET objects/pack/pack-816a9b2334da9953e530f27bcac22082a9f5b835.pack
(13k of binary data)

Stáhli jste objekt stromu, a můžete tak pokračovat v procházení revizí. They’re all also within the packfile you just downloaded, so you don’t have to do any more requests to your server. Git provede checkout pracovní kopie větve master, na niž ukazovala reference HEAD, kterou jste stáhli na začátku.

Chytrý protokol

The dumb protocol is simple but a bit inefficient, and it can’t handle writing of data from the client to the server. The smart protocol is a more common method of transferring data, but it requires a process on the remote end that is intelligent about Git – it can read local data, figure out what the client has and needs, and generate a custom packfile for it. Existují dvě sady procesů pro přenos dat: jeden pár pro upload dat a jeden pár pro jejich stahování.

Upload dat

To upload data to a remote process, Git uses the send-pack and receive-pack processes. Proces send-pack se spouští na klientovi a připojuje se k procesu receive-pack na straně vzdáleného serveru.

SSH

Řekněme například, že ve svém projektu spustíte příkaz git push origin master a origin je definován jako URL používající protokol SSH. Git spustí proces send-pack, který iniciuje spojení se serverem přes SSH. Na vzdáleném serveru se pokusí spustit příkaz prostřednictvím volání SSH:

$ ssh -x git@server "git-receive-pack 'simplegit-progit.git'"
00a5ca82a6dff817ec66f4437202690a93763949 refs/heads/master□report-status \
	delete-refs side-band-64k quiet ofs-delta \
	agent=git/2:2.1.1+github-607-gfba4028 delete-refs
0000

The git-receive-pack command immediately responds with one line for each reference it currently has – in this case, just the master branch and its SHA-1. The first line also has a list of the server’s capabilities (here, report-status, delete-refs, and some others, including the client identifier).

Each line starts with a 4-character hex value specifying how long the rest of the line is. Your first line starts with 00a5, which is hexadecimal for 165, meaning that 165 bytes remain on that line. Následující řádek je 0000 – touto kombinací server označuje konec seznamu referencí.

Now that it knows the server’s state, your send-pack process determines what commits it has that the server doesn’t. Pro každou referenci, která bude tímto odesláním aktualizována, sdělí proces send-pack tuto informaci procesu receive-pack. For instance, if you’re updating the master branch and adding an experiment branch, the send-pack response may look something like this:

0076ca82a6dff817ec66f44342007202690a93763949 15027957951b64cf874c3557a0f3547bd83b3ff6 \
	refs/heads/master report-status
006c0000000000000000000000000000000000000000 cdfdb42577e2506715f8cfeacdbabc092bf63e8d \
	refs/heads/experiment
0000

Git sends a line for each reference you’re updating with the line’s length, the old SHA-1, the new SHA-1, and the reference that is being updated. The first line also has the client’s capabilities. The SHA-1 value of all '0’s means that nothing was there before – because you’re adding the experiment reference. Pokud byste mazali referenci, viděli byste pravý opak: samé nuly na pravé straně.

Next, the client sends a packfile of all the objects the server doesn’t have yet. Na závěr procesu server oznámí, zda se akce zdařila, nebo nezdařila:

000eunpack ok
HTTP(S)

This process is mostly the same over HTTP, though the handshaking is a bit different. The connection is initiated with this request:

=> GET http://server/simplegit-progit.git/info/refs?service=git-receive-pack
001f# service=git-receive-pack
00ab6c5f0e45abd7832bf23074a333f739977c9e8188 refs/heads/master□report-status \
	delete-refs side-band-64k quiet ofs-delta \
	agent=git/2:2.1.1~vmg-bitmaps-bugaloo-608-g116744e
0000

That’s the end of the first client-server exchange. The client then makes another request, this time a POST, with the data that send-pack provides.

=> POST http://server/simplegit-progit.git/git-receive-pack

The POST request includes the send-pack output and the packfile as its payload. The server then indicates success or failure with its HTTP response.

Stahování dat

When you download data, the fetch-pack and upload-pack processes are involved. Klient iniciuje proces fetch-pack, který vytvoří připojení k procesu upload-pack na straně vzdáleného serveru a dojedná, která data budou stažena.

SSH

If you’re doing the fetch over SSH, fetch-pack runs something like this:

$ ssh -x git@server "git-upload-pack 'simplegit-progit.git'"

After fetch-pack connects, upload-pack sends back something like this:

00dfca82a6dff817ec66f44342007202690a93763949 HEAD□multi_ack thin-pack \
	side-band side-band-64k ofs-delta shallow no-progress include-tag \
	multi_ack_detailed symref=HEAD:refs/heads/master \
	agent=git/2:2.1.1+github-607-gfba4028
003fe2409a098dc3e53539a9028a94b6224db9d6a6b6 refs/heads/master
0000

Informace se nápadně podobají těm, jimiž odpovídá proces receive-pack, liší se však schopnosti. In addition, it sends back what HEAD points to (symref=HEAD:refs/heads/master) so the client knows what to check out if this is a clone.

At this point, the fetch-pack process looks at what objects it has and responds with the objects that it needs by sending “want” and then the SHA-1 it wants. It sends all the objects it already has with “have” and then the SHA-1. At the end of this list, it writes “done” to initiate the upload-pack process to begin sending the packfile of the data it needs:

003cwant ca82a6dff817ec66f44342007202690a93763949 ofs-delta
0032have 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
0009done
0000
HTTP(S)

The handshake for a fetch operation takes two HTTP requests. The first is a GET to the same endpoint used in the dumb protocol:

=> GET $GIT_URL/info/refs?service=git-upload-pack
001e# service=git-upload-pack
00e7ca82a6dff817ec66f44342007202690a93763949 HEAD□multi_ack thin-pack \
	side-band side-band-64k ofs-delta shallow no-progress include-tag \
	multi_ack_detailed no-done symref=HEAD:refs/heads/master \
	agent=git/2:2.1.1+github-607-gfba4028
003fca82a6dff817ec66f44342007202690a93763949 refs/heads/master
0000

This is very similar to invoking git-upload-pack over an SSH connection, but the second exchange is performed as a separate request:

=> POST $GIT_URL/git-upload-pack HTTP/1.0
0032want 0a53e9ddeaddad63ad106860237bbf53411d11a7
0032have 441b40d833fdfa93eb2908e52742248faf0ee993
0000

Again, this is the same format as above. The response to this request indicates success or failure, and includes the packfile.

Protocols Summary

This section contains a very basic overview of the transfer protocols. The protocol includes many other features, such as multi_ack or side-band capabilities, but covering them is outside the scope of this book. We’ve tried to give you a sense of the general back-and-forth between client and server; if you need more knowledge than this, you’ll probably want to take a look at the Git source code.