Page 32 - MSDN Magazine, August 2017
locates the blob object in the .git\objects folder and updates its date-modified time (Git never overwrites objects that already exist in the repo; it updates the last-modified date to delay this newly added object from being considered for garbage collection). Otherwise, it uses the first two characters of the SHA-1 string as the directory name in .git\objects and the remaining 38 characters to name the blob file before zlib-compressing it and writing its contents. In my example, Git would create a folder in .git\objects called 5a and then write the blob object into that folder as a file with the name b2f8a4323abafb10abb68657d9d39f1a775057.
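The naming and compression steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `loose_object_path` and `encode_loose_object` are hypothetical, forward slashes stand in for the Windows path separators, and nothing is written to disk. The date-modified refresh for objects that already exist is not reproduced here.

```python
import hashlib
import zlib
from typing import Tuple


def loose_object_path(sha1_hex: str) -> str:
    """Split a 40-character SHA-1 into a 2-character folder and a 38-character file name."""
    return ".git/objects/" + sha1_hex[:2] + "/" + sha1_hex[2:]


def encode_loose_object(contents: bytes) -> Tuple[str, bytes]:
    """Return (relative path, zlib-compressed bytes) for a blob with these contents."""
    # Header: "blob", a space, the content length in bytes, then a null byte.
    header = b"blob " + str(len(contents)).encode("ascii") + b"\x00"
    payload = header + contents
    sha1_hex = hashlib.sha1(payload).hexdigest()
    return loose_object_path(sha1_hex), zlib.compress(payload)


path, data = encode_loose_object(b"Hello")
print(path)                   # folder named after the first two hex characters
print(zlib.decompress(data))  # round-trips to header + contents
```

Decompressing the stored bytes gives back the header followed by the file contents, which is why the object ID can always be recomputed from the object itself.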
When Git creates a blob object in this manner, you might be surprised that one expected file property is conspicuously missing from the blob object: the file name! That’s by design, however. Recall that Git is a content-addressable file system and, as such, it manages SHA-1-named blob objects, not files. Each blob object is normally referenced by at least one tree object, and tree objects in turn are normally referenced by commit objects. Ultimately, Git tree objects express the folder structure of the files you stage. But Git doesn’t create those tree objects until you issue a commit. Therefore, you can conclude that if Git uses only the index to prepare a commit object, it also must capture the file-path references for each blob in the index, and that’s exactly what it does. In fact, even if two blobs have the same SHA-1 value, as long as each maps to a different file name or different path/file value, each will appear as a separate entry in the index.
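As a toy model of that last point, the index can be pictured as a mapping from file path to blob ID. The sketch below uses a plain dictionary and a hypothetical `stage` helper; the real index is a binary file that stores more fields per entry, so treat this only as an illustration of why identical contents under two paths produce two entries.

```python
import hashlib
from typing import Dict


def blob_sha1(contents: bytes) -> str:
    """SHA-1 of the blob header plus the contents."""
    header = b"blob " + str(len(contents)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + contents).hexdigest()


def stage(index: Dict[str, str], path: str, contents: bytes) -> None:
    """Record (or overwrite) the entry for this path; the blob itself carries no name."""
    index[path] = blob_sha1(contents)


index: Dict[str, str] = {}
stage(index, "docs/a.txt", b"Hello")
stage(index, "docs/copy/b.txt", b"Hello")  # same contents, different path

print(len(index))                # two index entries
print(len(set(index.values())))  # one distinct blob ID shared by both
```

Because the path lives in the entry rather than in the blob, both entries can point at the same stored object.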
Git also saves file metadata with each blob object it writes to the index, such as the file’s create and modified dates. Git leverages this information to efficiently detect changes to files in your working directory using file-date comparisons and heuristics rather than brute-force recomputing the SHA-1 values for each file in the working directory. Such a strategy speeds up the information you see in the Team Explorer Changes pane, or when you issue the
[Figure 3 diagram: the Working Directory, the Index, and the DAG, with the actions Stage (Add), Commit, Clone, Pull, Merge, and Switch Branch (Check Out)]
Figure 3 Primary Git Actions That Update the Index (Green) and Git Actions That Rely on What the Index Contains (Red)
Let’s create a new file in the working directory to see what happens to it as it’s written to the index. As soon as you stage that file, Git creates a header using this string-concatenation formula:
blob{space}{file-length in bytes}{null-termination character}
Git then concatenates the header to the beginning of the file contents. Thus, for a text file containing the string “Hello,” the header + file contents would generate a string that looks like this (keep in mind there’s a null character before the letter “H”):
blob 5Hello
To see that more clearly, here’s the hexadecimal version of that string:
62 6C 6F 62 20 35 00 48 65 6C 6C 6F
Git then computes a SHA-1 hash for the string:
5ab2f8a4323abafb10abb68657d9d39f1a775057
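The header formula and the hashing step can be reproduced with Python’s standard library. This is a minimal sketch, assuming the file contents are available as raw bytes; `blob_payload` and `blob_sha1` are hypothetical helper names, not part of Git.

```python
import hashlib


def blob_payload(contents: bytes) -> bytes:
    """Prefix the contents with the header: 'blob', space, byte length, null byte."""
    header = b"blob " + str(len(contents)).encode("ascii") + b"\x00"
    return header + contents


def blob_sha1(contents: bytes) -> str:
    """SHA-1 hex digest of header + contents, which Git uses as the blob's ID."""
    return hashlib.sha1(blob_payload(contents)).hexdigest()


payload = blob_payload(b"Hello")
print(payload.hex(" ").upper())  # hex dump of header + contents
print(blob_sha1(b"Hello"))       # 40-character object ID
```

If Git is installed, the printed ID can be cross-checked from a shell with `printf 'Hello' | git hash-object --stdin`. Note that the length in the header counts bytes, so a trailing newline in the file changes both the header and the resulting hash.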
Git next inspects the existing index to determine if an entry for that folder\file name already exists with the same SHA-1. If so, it
porcelain Git status command. Once armed with an index entry for a working-directory file along with its associated metadata, Git is said to “track” the file because it can readily compare its copy of the file with the copy that remains in the working directory. Technically, a tracked file is one that also exists in the working directory and is to be included in the next commit. This is in contrast to untracked files, of which there are two types: files that are in the working directory but not in the index, and files that are explicitly designated as not to be tracked (see the Index Extensions section). To summarize, the index gives Git the power to determine which files are tracked, which are not tracked,
and which should not be tracked. To better understand the specific contents of the index, let’s use a concrete example by starting
Figure 4 Viewing the History in Order to See What Visual Studio Does When You Create a New Project