Modeling Git Internals in Alloy, Part 2: Commits and Tags
Brian Hicks, April 10, 2023
Last week, we started modeling Git’s internals in Alloy. We added blobs (to store content) and trees (to organize it into a filesystem). We ended up with this model:
abstract sig Object {}
sig Blob extends Object {}
sig Tree extends Object {
  children: set Object
}
fact "trees cannot refer to themselves" {
  no t: Tree | t in t.^children
}
… which produces instances that look like this:
Today, we'll add commits and tags to this model!
Commits
Going back to the Git Internals – Git Objects chapter of the Git book, we can take a tree hash we produced in the last post and make a commit with git commit-tree:
$ git commit-tree 3ee29075 -m 'Commit message'
8cc0d4f4ddfde6efa9a8fced667d4d51574a36ec
We can add more commits (and history) by repeating this command, but specifying a parent ID for each subsequent commit:
$ git commit-tree 3ee29075 -m 'Second commit' -p 8cc0d4
bc8d9d27a206d0e933be3e445c82cbef09da54d1
$ git commit-tree 3ee29075 -m 'Third commit' -p bc8d9d
844bcca25118c27b0322aacd49edb73d8fac827f
Then we can view the lineage of the most recent commit with git log:
$ git log 844bcc
commit 844bcca25118c27b0322aacd49edb73d8fac827f
Author: Brian Hicks <brian@brianthicks.com>
Date:   Fri Mar 3 12:36:14 2023 -0600

    Third commit

commit bc8d9d27a206d0e933be3e445c82cbef09da54d1
Author: Brian Hicks <brian@brianthicks.com>
Date:   Fri Mar 3 12:35:49 2023 -0600

    Second commit

commit 8cc0d4f4ddfde6efa9a8fced667d4d51574a36ec
Author: Brian Hicks <brian@brianthicks.com>
Date:   Fri Mar 3 12:32:37 2023 -0600

    Commit message
But… would it work to commit a blob hash, or does it only work with trees? The book doesn't say, so I tried it, and it looks like the hash you pass as the first argument to git commit-tree must be a tree. If you try to make a commit based on a blob, git won't let you:
$ git cat-file -p 3ee29075
100644 blob 39528abd81b13b2731d47f86206351a61f1e6484 hello-alloy.txt
100644 blob 9b4b40c2bca67e781930105fa190b9b90235cfe5 hello-blob.txt
$ git cat-file -p 39528a
Hey, Alloy!
$ git commit-tree 39528a -m 'Are you able to commit a blob?'
fatal: 39528abd81b13b2731d47f86206351a61f1e6484 is not a valid 'tree' object
So it looks like a commit has to have a tree, a message, and zero or more parents (you can have more than one; that's how merge commits work). All of this is confirmed by man git-commit-tree! We’ll leave messages out of our model because they don't matter for any properties we’d care about, but otherwise we’ll add this to our model:
sig Commit extends Object {
  parent: set Commit,
  tree: one Tree,
}
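If you'd rather not wait for the visualizer to stumble onto a merge commit, you can ask Alloy for one directly. Here's a minimal sketch, assuming the model so far. The predicate name mergeCommit and the scope of 4 are my own choices for illustration, and the fact is left unnamed just to keep the snippet short.

```alloy
abstract sig Object {}

sig Blob extends Object {}

sig Tree extends Object {
  children: set Object
}

// Same constraint as the named fact in the model, written without a name.
fact {
  no t: Tree | t in t.^children
}

sig Commit extends Object {
  parent: set Commit,
  tree: one Tree
}

// Hypothetical predicate (not part of the original model):
// some commit has more than one parent, like a merge commit.
pred mergeCommit {
  some c: Commit | #(c.parent) > 1
}

// Search for an instance with at most 4 objects in total.
run mergeCommit for 4
```

A scope of 4 leaves room for a commit, two parents, and the tree they all point to.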
Finding mismatches between Git’s model and ours
Let's look at the instances Alloy produces and see if any of them feel off. To start, we get relatively normal-looking instances, such as two commits with the same tree:
But we also get some wilder instances. For example, it looks like our model allows trees to have commits as children:
I'm not sure whether that’d be allowed, but it’s easy to verify by asking Git to add a commit to the staging area:
$ git update-index --add --cacheinfo 100644 8cc0d4 commits-are-stageable.txt
fatal: git update-index: --cacheinfo cannot add 8cc0d4
Nope, doesn't work. That's fine. We’ll just update our definition of Tree to say that trees can't have commits as children. Since we’re dealing with sets here, we can write “all objects besides commits” as Object - Commit, which makes the new definition of Tree look like this:
sig Tree extends Object {
  children: set Object - Commit
}
That is not all of the weirdness taken care of, although: we additionally get commits that are their very own dad and mom, or cycles of commits who’re one another’s dad and mom:
Like final time, that is technically doable: if yow will discover simply the correct content material for the commit messages and timber, you may conceivably get a decide to consult with itself. Like earlier than, although, that is prone to break git in some terrible methods (segfaults!) If we had been modeling Git to attempt to discover bugs or safety vulnerabilities, I might say we should always enable this. However, as earlier than, we’re attempting to learn the way that is supposed to work, so let’s disallow it in the identical method we disallowed timber being their very own mother or father:
fact "commits can't be their own parent" {
  no c: Commit | c in c.^parent
}
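Because the fact uses the transitive closure ^parent, it should rule out longer cycles too, not just a commit that lists itself as a parent. Here's a sketch of how you might double-check that for the two-commit case. The assertion name noMutualParents is hypothetical, Tree is cut down to a bare signature to keep the snippet short, and the fact is left unnamed.

```alloy
abstract sig Object {}

// Reduced to a bare signature for this sketch; the real Tree has children.
sig Tree extends Object {}

sig Commit extends Object {
  parent: set Commit,
  tree: one Tree
}

// Same constraint as the fact in the model, written without a name.
fact {
  no c: Commit | c in c.^parent
}

// Hypothetical assertion: no two distinct commits are each other's parents.
assert noMutualParents {
  no disj a, b: Commit | a in b.parent and b in a.parent
}

// Look for a counterexample with at most 5 objects in total.
check noMutualParents for 5
```

If a and b were each other's parents, a would be reachable from itself through ^parent, which the fact forbids, so I'd expect Alloy to report no counterexample within this scope.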
With commits done, we have only one more object type to model: the tag. Tags are like commits, but instead of pointing to a tree and parent they point to a commit, and you can move them later (as opposed to everything else we've seen so far, which is immutable). Here's how we'd model that:
sig Tag extends Object {
  commit: one Commit,
}
Running the model like this shows that we've implicitly allowed trees to contain tags (because now Object - Commit includes Tag), which we didn't mean. We could say Object - Commit - Tag, but at this point I think it would be better to rephrase Tree.children to contain only what we want:
sig Tree extends Object {
  children: set Blob + Tree,
}
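To record the intent behind this change, you could add an assertion saying what a tree may contain and have Alloy check it. This is only a sketch: the assertion name treesHoldOnlyBlobsAndTrees is my own, and the acyclicity facts are omitted because they don't affect this property.

```alloy
abstract sig Object {}

sig Blob extends Object {}

sig Tree extends Object {
  children: set Blob + Tree
}

sig Commit extends Object {
  parent: set Commit,
  tree: one Tree
}

sig Tag extends Object {
  commit: one Commit
}

// Hypothetical assertion: every child of a tree is a blob or another tree,
// so commits and tags can never show up inside a tree.
assert treesHoldOnlyBlobsAndTrees {
  all t: Tree | t.children in Blob + Tree
}

// Look for a counterexample with at most 5 objects in total.
check treesHoldOnlyBlobsAndTrees for 5
```

With the field declared this way the assertion follows directly from the declaration, so the check mostly serves as a guard in case someone widens Tree.children again later.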
Now we can get tags on commits. Yay!
We have now reached the end of the first part of our Git-modeling journey: we have all the objects! (There are also refs, though, which work like tags but aren’t stored with the git objects. You can read more about those in the Git Internals – Git References chapter of the Git book.)
Here's the model we’re ending with:
abstract sig Object {}
sig Blob extends Object {}
sig Tree extends Object {
  children: set Blob + Tree,
}
fact "trees cannot refer to themselves" {
  no t: Tree | t in t.^children
}
sig Commit extends Object {
  parent: set Commit,
  tree: one Tree,
}
fact "commits can't be their own parent" {
  no c: Commit | c in c.^parent
}
sig Tag extends Object {
  commit: one Commit,
}
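To see the whole thing working together, you can ask Alloy for an instance where a tag points at a commit that has some history behind it. A minimal sketch, assuming the final model: the predicate name taggedHistory and the scope of 5 are my own, and the facts are written without names to keep the snippet compact.

```alloy
abstract sig Object {}

sig Blob extends Object {}

sig Tree extends Object {
  children: set Blob + Tree
}

sig Commit extends Object {
  parent: set Commit,
  tree: one Tree
}

sig Tag extends Object {
  commit: one Commit
}

// The two acyclicity facts from the model, written without names.
fact { no t: Tree | t in t.^children }
fact { no c: Commit | c in c.^parent }

// Hypothetical predicate: some tag points to a commit with at least one parent.
pred taggedHistory {
  some t: Tag | some t.commit.parent
}

// Search for an instance with at most 5 objects in total.
run taggedHistory for 5
```

The smallest instance I'd expect needs one tag, two commits, and one tree, which fits within the scope.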
From here, our next step is to model the operations we can take on this model to check whether the properties we wrote earlier actually hold when we use Git’s commands. Stay tuned!