Inside Git’s Brain: How Commits Are Stored and Linked

What Really Happens When You Hit Commit!

May 24, 2025

We’ve all been there: working on a feature, writing some amazing code, and shipping it. After some time, BOOM! Something breaks.

Now we need to undo our changes and revert to a previous stable state, hoping this will fix the issue.

But “revert” strikes fear among developers (at least I used to fear it). I always thought that I would surely mess things up.

But here’s the truth: the only way to conquer that fear is to understand how Git actually works.
Not just the commands, but the WHY and HOW behind them.

What is inside Git?

Git has 4 types of objects stored in .git/objects:

Blob – file content (no filename, just raw data)
Tree – directory structure (lists blobs + sub-trees + names)
Commit – points to a tree and to its parent commit(s), and contains metadata (author, message, timestamp)
Tag – an annotated tag that points to another object (usually a commit), used for marking versions

But how does all this create such an intricate system? Let’s understand it.

Suppose we have the following directory —

repo/
├── a.txt
├── b.txt
└── subdir/
    └── c.txt

Each file (a.txt, b.txt, c.txt) is stored as a blob.
Each directory (subdir/) is stored as a tree, which lists the blobs (and other trees).
The root directory is also a tree, listing a.txt, b.txt, and the subdir tree.
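The nesting of trees can be sketched in Python as well. This is a rough model, assuming a SHA-1 repository and plain non-executable files (mode 100644); the file contents and the helper names object_id and tree_id are made up for illustration.

```python
import hashlib

def object_id(kind: str, body: bytes) -> bytes:
    """Hash '<kind> <size>' + NUL + body and return the 20 raw SHA-1 bytes."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).digest()

def tree_id(directory: dict) -> bytes:
    """Hash a nested dict: bytes values are files, dict values are sub-directories."""
    entries = []
    for name, value in directory.items():
        if isinstance(value, dict):
            # Directories are sorted as if their name ended in "/".
            sort_key = (name + "/").encode()
            entries.append((sort_key, b"40000", name.encode(), tree_id(value)))
        else:
            entries.append((name.encode(), b"100644", name.encode(), object_id("blob", value)))
    entries.sort(key=lambda entry: entry[0])
    # Each entry is "<mode> <name>", a NUL byte, then the child's raw object ID.
    body = b"".join(mode + b" " + name + b"\0" + oid for _, mode, name, oid in entries)
    return object_id("tree", body)

repo = {"a.txt": b"A\n", "b.txt": b"B\n", "subdir": {"c.txt": b"C\n"}}
print(tree_id(repo).hex())
```

The root tree's ID depends on the subdir tree's ID, which depends on the blob for c.txt, so changing any file changes every tree above it.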

We ran the following commands:

git init
git add .
git commit -m "Initial Commit"

We initialized git in this repo
We added all the files to git
We created a commit (let’s call it commit A)

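We can picture this commit and its trees as a graph: commit A points to the root tree, which points to the blobs for a.txt and b.txt and to the subdir tree, which in turn points to the blob for c.txt.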

Now, let’s do the following operation —

Change "a.txt"
git add a.txt
git commit -m "Changed file a"

How does the graph look now?

Graph after the next commit (Commit B)

A new commit B was created with commit A as its parent. Like every commit, B is just a pointer to a tree snapshot, in this case a new root tree.
Only the changed file (a.txt) got a new blob; the new tree reuses the existing blobs (and the subdir tree) for everything else.
Git doesn’t store a diff for a.txt; the new blob contains the file’s entire content.
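Here is a small Python model of that sharing. It is only a sketch: the store dictionary stands in for .git/objects, the IDs are a simplified hash rather than Git's real object format, and the file contents are invented.

```python
import hashlib

store = {}  # object ID -> (kind, payload); a stand-in for .git/objects

def put(kind, payload):
    """Store an object under a content-derived ID (simplified, not Git's format)."""
    oid = hashlib.sha1(repr((kind, payload)).encode()).hexdigest()
    store[oid] = (kind, payload)
    return oid

def write_tree(directory):
    entries = []
    for name, value in sorted(directory.items()):
        child = write_tree(value) if isinstance(value, dict) else put("blob", value)
        entries.append((name, child))
    return put("tree", tuple(entries))

def commit(directory, parent, message):
    return put("commit", (write_tree(directory), parent, message))

files = {"a.txt": "one\n", "b.txt": "two\n", "subdir": {"c.txt": "three\n"}}
commit_a = commit(files, None, "Initial Commit")
before = set(store)

files["a.txt"] = "one, edited\n"
commit_b = commit(files, commit_a, "Changed file a")

new_objects = sorted(store[oid][0] for oid in set(store) - before)
print(new_objects)  # ['blob', 'commit', 'tree']
```

Commit B adds only three objects: a blob for the new a.txt, a new root tree, and the commit itself. The blobs for b.txt and c.txt and the subdir tree hash to the same IDs as before, so they are reused.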

How does git know the exact lines/characters that changed?

In the above example, we changed a.txt. If we do a git diff, it will show all the exact things that changed.

Git stores the blobs of different versions of the file.

a.txt → v1 → blob1
a.txt → v2 → blob2

Now the whole thing comes down to calculating the difference between these blobs, and for that, git uses some amazing and clever algorithms for comparison.

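By default, Git uses the Myers algorithm to compute line-level diffs. Other algorithms, such as patience and histogram, can be selected with options like --diff-algorithm=histogram and often give more readable diffs when blocks of code move around. Rename detection is a separate step, based on how similar the contents of two files are.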
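The idea that a diff is computed from two full versions, not stored, can be tried out in Python. One loud caveat: Python's difflib uses its own matching algorithm, not Git's Myers implementation, so this only illustrates the principle. The two blob contents are made up.

```python
import difflib

blob1 = "alpha\nbeta\ngamma\n"           # a.txt, version 1
blob2 = "alpha\nBETA\ngamma\ndelta\n"    # a.txt, version 2

# The diff is derived on demand from the two complete contents.
diff = list(difflib.unified_diff(
    blob1.splitlines(keepends=True),
    blob2.splitlines(keepends=True),
    fromfile="a/a.txt",
    tofile="b/a.txt",
))
print("".join(diff))
```

Lines starting with "-" exist only in the old blob, lines starting with "+" only in the new one, and unmarked lines are shared context.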

Now that we are acquainted with the underlying mechanism that git is composed of, let’s look at working with it.

Lifecycle of a change in git

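Every change in Git moves through three areas: the working directory, then the index, then the commit history that HEAD points to. We’ll look at them starting from HEAD.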

HEAD

A pointer to the latest commit on the current branch.
It’s what the repo is based on right now.
It’s like - "What version of the repo am I currently looking at?"

HEAD can be moved around using commands like checkout, reset, revert

INDEX (Staging Area)

A middle layer between your work and Git’s history.
It’s what gets included in the next commit.

When we do git add a.txt, Git copies the current version of a.txt into the index.

A commit is made from whatever is in the index, not necessarily what’s in the working directory. That’s why we need to “add” every change we want in the next commit.

Working Directory

What we actually see in the code editor.
These are files on disk, not in Git yet.
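A toy Python model of the three areas shows why an edit that was never added does not end up in the commit. The dictionaries and the add and commit functions are simplifications I made up for illustration; they are not how Git stores the index.

```python
working_dir = {"a.txt": "v1", "b.txt": "v1"}   # files on disk
index = {}                                      # staging area
commits = []                                    # history; the last entry is what HEAD sees

def add(path):
    """Copy the current working-directory version of a file into the index."""
    index[path] = working_dir[path]

def commit(message):
    """Snapshot the index (not the working directory)."""
    commits.append({"message": message, "snapshot": dict(index)})

add("a.txt")
add("b.txt")
commit("Initial Commit")

working_dir["a.txt"] = "v2"
working_dir["b.txt"] = "v2"
add("a.txt")                 # only a.txt is staged
commit("Changed file a")

head = commits[-1]["snapshot"]
print(head)   # {'a.txt': 'v2', 'b.txt': 'v1'}
```

The edit to b.txt is still sitting in the working directory, but the commit only recorded what was in the index.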

Running Commands

We will look at how the commit graph changes on running different commands.

This is how our initial commit graph looks —

The history is A ← B ← C, and HEAD is at C, the tip of the main branch.

Git Checkout

Let’s run the following commands and see how the above graph changes —

1. The current branch is main, and the current commit is C

2. git checkout -b feature (creates a new branch named feature at commit C and switches to it)

3. git commit -m "new changes" (Commit D)

4. git commit -m "new changes again" (Commit E)

How the graph changes with checkout and commits
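In a rough Python model, branches are just named pointers and HEAD names the current branch. The dictionaries and function names below are illustrative, not Git internals.

```python
parents = {"A": None, "B": "A", "C": "B"}   # commit -> parent
branches = {"main": "C"}                     # branch name -> commit it points to
head = "main"                                # HEAD names the current branch

def checkout_new_branch(name):
    """Model of git checkout -b: new pointer at the current commit, then switch to it."""
    global head
    branches[name] = branches[head]
    head = name

def commit(new_id):
    """Create a commit on top of the current branch and advance only that branch."""
    parents[new_id] = branches[head]
    branches[head] = new_id

checkout_new_branch("feature")
commit("D")
commit("E")
print(branches)   # {'main': 'C', 'feature': 'E'}
```

Creating the branch copies no files and no commits. The main pointer stays at C while feature moves forward to D and then E.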

Git Revert

Let’s do the revert now —

git revert C

What does normal revert do?

A new commit C' is added.
C' undoes changes introduced in C, but keeps history.

This stays simple as long as commit C is not a merge commit. If it is, it’s an entirely different story that I might cover in the next part.
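A simplified Python sketch of what C' contains: work out what C changed relative to its parent B, then apply the opposite on top of the current snapshot. Real Git does this per changed region using its merge machinery; this file-level version, with made-up snapshots and function names, only shows the idea.

```python
def changes(old, new):
    """Map each differing path to (before, after); None means the file is absent."""
    paths = set(old) | set(new)
    return {p: (old.get(p), new.get(p)) for p in paths if old.get(p) != new.get(p)}

def revert(target_parent, target, current):
    """Return the snapshot for C': current with the target commit's changes undone."""
    result = dict(current)
    for path, (before, after) in changes(target_parent, target).items():
        if current.get(path) != after:
            # The file changed again after the target commit; a human must decide.
            raise ValueError(f"conflict in {path}")
        if before is None:
            result.pop(path, None)      # the target added this file, so remove it
        else:
            result[path] = before       # restore the older content
    return result

snap_b = {"a.txt": "v2", "b.txt": "v1"}
snap_c = {"a.txt": "v3", "b.txt": "v1", "new.txt": "hi"}

snap_c_prime = revert(snap_b, snap_c, current=snap_c)
print(snap_c_prime == snap_b)   # True
```

The files in C' match B again, yet the history reads A, B, C, C'. Nothing was removed from it.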

Git Reset

Let’s do the infamous reset now —

git reset --hard HEAD~1

Let’s understand each part of the command first —

Reset

This moves the current branch, and HEAD with it, to a different commit.
Depending on the mode (--soft, --mixed, --hard), it can also change the state of:
- Index (staging area)
- Working directory

So, git reset <target> says:

“Pretend the last commit(s) never happened — go back to this <target>.”

--hard

--soft → only the HEAD changes → keeps staging area + working dir as-is
--mixed (the default) → HEAD + Index changes → resets staging area, but keeps working dir.
--hard → HEAD + Index + Working Dir changes → everything goes back to the target commit; files are overwritten.

So --hard means:

Reset HEAD, wipe the staging area, and make the files on disk match the target commit.
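The three modes differ only in how many of the three areas they touch, which a few lines of Python can model. This is a sketch with a made-up state dictionary; among other things it ignores untracked files.

```python
def reset(state, target_snapshot, target_commit, mode="mixed"):
    """Simplified git reset on state = {'head', 'index', 'working_dir'}."""
    if mode not in ("soft", "mixed", "hard"):
        raise ValueError(f"unknown mode: {mode}")
    state["head"] = target_commit                      # every mode moves HEAD
    if mode in ("mixed", "hard"):
        state["index"] = dict(target_snapshot)         # staging area reset
    if mode == "hard":
        state["working_dir"] = dict(target_snapshot)   # files on disk overwritten
    return state

snap_b = {"a.txt": "v2"}
snap_c = {"a.txt": "v3"}

def fresh():
    """State right after commit C, with nothing modified."""
    return {"head": "C", "index": dict(snap_c), "working_dir": dict(snap_c)}

for mode in ("soft", "mixed", "hard"):
    s = reset(fresh(), snap_b, "B", mode)
    print(mode, s["head"], s["index"]["a.txt"], s["working_dir"]["a.txt"])
# soft  B v3 v3
# mixed B v2 v3
# hard  B v2 v2
```

With --soft and --mixed the content of C is still on disk, so nothing is lost. Only --hard throws away file changes.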

~ (tilde)

This is the commit ancestry operator.

HEAD~1 means → “The first parent of the current commit.”
HEAD~2 means → “The grandparent (two steps back).”

In our above example —

HEAD~0 → C

HEAD~1 → B

HEAD~2 → A
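Resolving HEAD~n is just a walk along first parents, as this small Python sketch shows. The parents mapping mirrors the A, B, C history above; the function name ancestor is mine.

```python
parents = {"A": [], "B": ["A"], "C": ["B"]}   # commit -> list of parents

def ancestor(commit, n):
    """Follow the first parent n times, like <commit>~n."""
    for _ in range(n):
        if not parents[commit]:
            raise ValueError("history does not go back that far")
        commit = parents[commit][0]
    return commit

print([ancestor("C", n) for n in range(3)])   # ['C', 'B', 'A']
```

For a merge commit, which has more than one parent, ~ still follows only the first one.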

Now, all together, git reset --hard HEAD~1 means:

HEAD and main move back to B.
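C is no longer reachable from main. If nothing else references it, it will eventually be garbage-collected; until then it can still be found with git reflog.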
Working dir & index are forcibly set to the state at B.
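What "not referenced" means can be sketched in Python as a reachability check: start at every branch tip, walk the parents, and see which commits are never visited. The data structures are illustrative, and the sketch ignores tags and the reflog, which also count as references in real Git.

```python
parents = {"A": [], "B": ["A"], "C": ["B"]}
branches = {"main": "C"}

def reachable(tips):
    """All commits that can be reached by walking parents from the given tips."""
    seen, stack = set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

branches["main"] = "B"   # effect of git reset --hard HEAD~1 on the branch pointer
unreachable = set(parents) - reachable(branches.values())
print(unreachable)   # {'C'}
```

The commit object for C still exists in this model. It is simply no longer part of any branch's history.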

Git Merge

Suppose we have a commit graph where main is at C, feature is at E, and B is the last commit the two branches have in common.

Now, we want to merge the branch “feature” into “main”. We do the following —

git checkout main
git merge feature

The above graph will look like this now —

Git creates a new merge commit M.
M has two parents: C (from main) and E (from feature).
The histories of both branches are now joined.
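In a toy Python model, the merge commit is nothing more than a commit with two entries in its parent list. The commit D below is an assumed intermediate commit on feature, and the sketch ignores fast-forward merges.

```python
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["D"]}
branches = {"main": "C", "feature": "E"}

def merge(current, other, merge_id="M"):
    """Create a merge commit whose parents are both branch tips; advance current."""
    parents[merge_id] = [branches[current], branches[other]]
    branches[current] = merge_id

def history(commit):
    """Every commit reachable from the given one, following all parents."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

merge("main", "feature")
print(parents["M"])                        # ['C', 'E']
print(sorted(history(branches["main"])))   # ['A', 'B', 'C', 'D', 'E', 'M']
```

After the merge, main's history contains the commits of both branches, while the feature pointer itself stays at E.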

What’s inside a merge commit?

Git compares the two branches (C and E) and finds their common ancestor (B).
It performs a three-way merge:
- Base = B
- Ours = C (the branch we’re on)
- Theirs = E (the branch we're merging in)
It auto-resolves changes where possible and creates a new snapshot = M.
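The steps above can be sketched in Python. This version works on whole files, whereas Git merges within files region by region, and the snapshots, the intermediate commit D, and the function names are all assumptions made for illustration.

```python
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["D"]}

def ancestors(commit):
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def merge_base(x, y):
    """Common ancestors that are not themselves ancestors of another common ancestor."""
    common = ancestors(x) & ancestors(y)
    return sorted(c for c in common
                  if not any(c != o and c in ancestors(o) for o in common))

def three_way_merge(base, ours, theirs):
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o               # both sides agree
        elif o == b:
            result = t               # only theirs changed it
        elif t == b:
            result = o               # only ours changed it
        else:
            conflicts.append(path)   # both changed it differently
            continue
        if result is not None:       # None means the file was deleted
            merged[path] = result
    return merged, conflicts

snapshots = {
    "B": {"a.txt": "v1", "b.txt": "v1"},
    "C": {"a.txt": "v2", "b.txt": "v1"},
    "E": {"a.txt": "v1", "b.txt": "v2", "c.txt": "new"},
}
(base,) = merge_base("C", "E")
merged, conflicts = three_way_merge(snapshots[base], snapshots["C"], snapshots["E"])
print(base, merged, conflicts)
```

The base is what lets the merge tell "we changed it" apart from "they changed it". With only the two tips, every difference would look like a conflict.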

Merge Conflicts

If both branches changed the same line or area, Git will pause and show conflicts like:

Auto-merging a.txt
CONFLICT (content): Merge conflict in a.txt

We must resolve manually and then do:

git add a.txt
git commit    # creates the merge commit
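While a conflict is unresolved, Git writes both versions into the file, separated by marker lines. The Python sketch below reproduces that layout and adds a small check for leftover markers before staging. The helper names are hypothetical, and Git does not run such a check for us: git add simply marks the file as resolved.

```python
def render_conflict(ours: str, theirs: str, theirs_label: str = "feature") -> str:
    """Lay out two competing versions of a region with Git-style conflict markers."""
    return (
        "<<<<<<< HEAD\n"
        f"{ours}"
        "=======\n"
        f"{theirs}"
        f">>>>>>> {theirs_label}\n"
    )

def is_resolved(text: str) -> bool:
    """True if no line still starts with a conflict marker."""
    markers = ("<<<<<<< ", "=======", ">>>>>>> ")
    return not any(line.startswith(markers) for line in text.splitlines())

conflicted = "intro\n" + render_conflict("main's line\n", "feature's line\n")
print(conflicted)
print(is_resolved(conflicted))                # False
print(is_resolved("intro\nmerged line\n"))    # True
```

Resolving the conflict means editing the file so that it contains the content we actually want and none of the marker lines.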


Summary

Git stores data as objects: blob (file content), tree (directories), commit (snapshot + metadata), and tag (reference for versioning).
Every file change creates a new blob; Git doesn't store diffs, it stores full snapshots.
Diffing is done by comparing blobs using algorithms like Myers and Histogram.
Three key areas in Git:
- HEAD → points to the latest commit
- INDEX (Staging Area) → holds changes ready to commit
- Working Directory → actual files we’re editing
Revert creates a new commit that undoes changes from a previous commit without altering history.
Reset moves HEAD back to an earlier commit:
- --soft: only updates HEAD
- --mixed: resets index too
- --hard: resets everything, including the working directory
Merge combines histories from two branches and creates a merge commit with two parents.
Three-way merge uses:
- Base = common ancestor
- Ours = current branch
- Theirs = merging branch
Conflicts must be resolved manually when the same lines are changed on both branches.

Wrapping Up

As software engineers, we use Git multiple times a day, committing, pushing, pulling, and checking out branches.

But the moment we need to do something like revert or reset, many of us hesitate. Why? Because we’re not used to how these commands work under the hood.

By understanding how Git is structured internally (blobs, trees, commits, and the graph that links them), we stop relying on memorization and start using Git intuitively.

It becomes a tool we control, not something we’re afraid to break.

Feel free to let me know in the comments if you want me to cover some more operations in git in the same detail as I did here.

As always, thanks for reading this. And I will be back with another interesting topic next week.

Stay tuned!