Recently my friend Jonathan and I had the opportunity to attend GitHub’s Advanced Git training, which Jim Van Fleet pulled together. Whilst being blown away by the in-depth knowledge Tim and Adam demonstrated, I managed to wrap my head around a few concepts and walk away with my brain intact.
Git Hash Architecture
When troubleshooting Git issues, it’s helpful to understand how things work under the hood; knowing Git only as “magical” isn’t much help. If you have a good mental model of how Git does its business behind the scenes, you can better eliminate potential causes for a particular issue. You will also have a better feel for where to look when things go wrong. Tim started the day off with a graphic similar to the following:
The flow of this graphic begins at the top with a commit. Each commit is simply a small text object (stored zlib-compressed, like every Git object) with some metadata: usually the tree, parent, author, and committer, followed by the commit message.
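To make the commit layout concrete, here is a minimal Python sketch that parses the decompressed text of a commit into its fields. The sample commit, its hashes, and the name parse_commit are made up for illustration; real commits can also carry multi-line headers (such as signatures), which this sketch does not handle.

```python
# Hypothetical commit text, in the shape `git cat-file -p <commit>` prints.
# The hashes and identity below are placeholders, not real objects.
SAMPLE_COMMIT = (
    "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    "parent 0123456789abcdef0123456789abcdef01234567\n"
    "author Jane Doe <jane@example.com> 1700000000 +0000\n"
    "committer Jane Doe <jane@example.com> 1700000000 +0000\n"
    "\n"
    "Add README\n"
)


def parse_commit(text: str) -> dict:
    """Split a commit into header fields and the message.

    Headers and message are separated by the first blank line.
    A commit may have zero, one, or several parent lines, so
    parents are collected into a list.
    """
    header_text, _, message = text.partition("\n\n")
    commit = {"parents": [], "message": message}
    for line in header_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)
        else:
            commit[key] = value
    return commit


if __name__ == "__main__":
    for field, value in parse_commit(SAMPLE_COMMIT).items():
        print(field, "=", repr(value))
```

A root commit simply has no parent line, and a merge commit has two or more, which is why the sketch keeps parents in a list.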
The tree is a snapshot of a directory as it existed at that commit, mirroring your folder structure on disk. The hash on the commit’s tree line is where you can find the tree. Inside the tree are references to additional trees (subdirectories), blobs (files), and occasionally other objects, such as submodule commits, that are needed to make up the state of the filesystem at that commit.
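As a rough illustration of what sits inside a tree, the Python sketch below builds and then re-reads a tree body in the layout Git uses: each entry is the mode, a space, the name, a NUL byte, and the 20 raw bytes of the referenced object’s SHA-1. The function names and the file and directory names are my own, and the object IDs are only used as placeholders.

```python
from typing import List, Tuple

Entry = Tuple[str, str, str]  # (mode, name, object id as 40 hex chars)


def build_tree_body(entries: List[Entry]) -> bytes:
    """Serialize entries as '<mode> <name>' + NUL + 20 raw SHA-1 bytes."""
    body = b""
    for mode, name, object_id in entries:
        body += mode.encode() + b" " + name.encode() + b"\0"
        body += bytes.fromhex(object_id)
    return body


def parse_tree_body(body: bytes) -> List[Entry]:
    """Walk the body entry by entry and recover (mode, name, object id)."""
    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\0", space)
        mode = body[pos:space].decode()
        name = body[space + 1:nul].decode()
        object_id = body[nul + 1:nul + 21].hex()
        entries.append((mode, name, object_id))
        pos = nul + 21
    return entries


if __name__ == "__main__":
    # 100644 marks a regular file (a blob); 40000 marks a subdirectory (a tree).
    # The IDs are placeholders chosen for the example.
    entries = [
        ("100644", "hello.txt", "3b18e512dba79e4c8300dd08aeb37f8e728b8dad"),
        ("40000", "src", "4b825dc642cb6eb9a060e54bf8d69288fbee4904"),
    ]
    for mode, name, object_id in parse_tree_body(build_tree_body(entries)):
        print(mode, name, object_id)
```

Notice that the filename lives in the tree entry, not in the object it points to.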
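A blob is the zlib-compressed content of a single file. Git stores snapshots rather than changes: every version of a file you commit gets its own blob holding the full content, and files with identical content share one blob. (Packfiles may later store some objects as deltas to save space, but that is a storage optimization.) A blob has no metadata attached beyond a short type-and-size header, not even a filename; names live in the tree that points at it. It is really just a blob of compressed data, so opening it up in your text editor will display gibberish.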
The hashed filename is the SHA-1 of the object: a short header followed by the contents. The first two characters of the hash are the name of the folder under .git/objects that the object is stored in, while the remaining 38 characters make up the filename.
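Here is a small Python sketch of how a blob’s ID and on-disk location can be derived, as far as I understand it: Git hashes a header of the form "blob <size>" plus a NUL byte followed by the file content, and zlib-compresses that same byte string for storage. The function names are my own.

```python
import hashlib
import zlib
from typing import Tuple


def git_blob_object(content: bytes) -> Tuple[str, bytes]:
    """Return (object id, compressed bytes) for a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    store = header + content
    object_id = hashlib.sha1(store).hexdigest()
    return object_id, zlib.compress(store)


def loose_object_path(object_id: str) -> str:
    """First two hex characters name the folder, the other 38 the file."""
    return ".git/objects/{}/{}".format(object_id[:2], object_id[2:])


if __name__ == "__main__":
    object_id, compressed = git_blob_object(b"hello world\n")
    print(object_id)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
    print(loose_object_path(object_id))
    # Decompressing shows the header and content that were hashed.
    print(zlib.decompress(compressed))
```

The ID should match what `echo 'hello world' | git hash-object --stdin` prints, which is a handy way to check the sketch against a real Git install.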
I have never written an API of any depth, though I have consumed plenty and have considered what is involved in doing so. I recall reading an article a couple of years ago about Microsoft doing usability studies on its C# APIs in Visual Studio. That article was the first time I saw usability going beyond consumers and users of websites and apps and into the realm of programmers developing against an API.
In a nutshell, an onionskin API is an API with multiple layers. In Git there are two layers, commonly referred to as the porcelain and the plumbing. The porcelain commands are meant to solve 80% of everyone’s problems. Some developers will never need any more than this, while others will need a custom solution beyond what is provided.
For this remaining 20% there are the plumbing commands. The plumbing consists of the low-level functionality that is used to compose the porcelain, and it allows a user of Git to compose their own additional layers on top.
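To give a feel for the layering, here is a hedged Python sketch that strings together plumbing commands to do roughly what git add followed by git commit would do for a single file. It assumes git is installed and on your PATH; the helper names, the file name, and the identity values are invented for the example, and real porcelain does a good deal more (hooks, parents, the reflog, and so on).

```python
import os
import shutil
import subprocess
import tempfile


def git(repo, *args, stdin=None):
    """Run one git command inside repo and return its trimmed output."""
    env = dict(
        os.environ,
        # Placeholder identity so commit-tree works without any git config.
        GIT_AUTHOR_NAME="Example",
        GIT_AUTHOR_EMAIL="example@example.com",
        GIT_COMMITTER_NAME="Example",
        GIT_COMMITTER_EMAIL="example@example.com",
    )
    result = subprocess.run(
        ["git", *args], cwd=repo, input=stdin, env=env,
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def commit_with_plumbing(repo, path, content, message):
    """Create a first commit containing one file, using only plumbing."""
    # 1. Store the content as a blob in the object database.
    blob = git(repo, "hash-object", "-w", "--stdin", stdin=content)
    # 2. Stage it: record (mode, blob, path) in the index.
    git(repo, "update-index", "--add", "--cacheinfo", "100644", blob, path)
    # 3. Write the index out as a tree object.
    tree = git(repo, "write-tree")
    # 4. Wrap the tree in a commit object (no parent: this is a root commit).
    commit = git(repo, "commit-tree", tree, "-m", message)
    # 5. Point the current branch at the new commit.
    git(repo, "update-ref", "HEAD", commit)
    return commit


if __name__ == "__main__":
    if shutil.which("git") is None:
        print("git is not installed; skipping the demo")
    else:
        with tempfile.TemporaryDirectory() as repo:
            git(repo, "init", "-q")
            commit = commit_with_plumbing(
                repo, "hello.txt", "hello world\n", "First commit via plumbing")
            print(git(repo, "cat-file", "-p", commit))
```

The last line prints the new commit with git cat-file -p, another plumbing command, so you can see the tree, author, and committer lines described earlier.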
For example, git-flow is a set of commands layered on top of Git: extra porcelain that packages a documented and proven workflow.
Git has approximately 145 commands with around 1000 command-line switches. Maybe 15-20 of those commands are porcelain.
Hub: GitHub CLI Commands
My favorite piece of this is git pull-request (hub pull-request if you haven’t aliased hub to git). This command opens your default Git text editor and lets you compose your pull request right there. The first line becomes the title; skip a line, and the remaining text becomes the body. Save and exit, and hub will create the pull request on github.com without you ever needing to leave the comfort of your editor.
At Skookum, we use GitHub for all of our projects, and learning more about the tools we use daily is always a fascinating exercise that better equips us for the task at hand. I would highly recommend attending a GitHub training session. Also, Pro Git by Scott Chacon is available for your reading pleasure online for free.
Now, go forth! Increase your git-fu! And amaze your friends.