
I feel that having a default merge strategy to squash and merge all commits in a branch is a version control anti-pattern. This discourages thoughtful and frequent commits that express the intent of a change, because all the commits are just smashed together anyway, so why bother? I think context and intent are lost when looking through the git history of large smashed commits.

I prefer using a commit-message hook to automatically prepend a Jira ticket number to each commit, so when you look at the history you'll see multiple commits grouped together with the same ticket prefix, but each commit still retains its own intent. Knowing that commits will not be squashed encourages devs to make meaningful commits. I still advocate for cleaning up and squashing your own commits as you see fit with an interactive rebase before your branch is merged. Having discrete commits can also help when running git bisect to find when a bug was introduced, so you identify the specific commit instead of an entire merged feature.
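A minimal sketch of the ticket-prefixing idea, assuming the ticket key can be read from the branch name (for example `ISSUE-123-issue-description`). Git's `pre-commit` hook runs before the commit message exists, so the rewriting itself would normally live in a `prepare-commit-msg` or `commit-msg` hook. The function name and the regex are illustrative assumptions, not any particular team's setup.

```python
import re

# Assumption: ticket keys look like "PROJ-123" and appear in the branch name.
TICKET_RE = re.compile(r"([A-Z][A-Z0-9]*-\d+)")


def prepend_ticket(branch: str, message: str) -> str:
    """Return the commit message prefixed with the branch's ticket key, if any."""
    match = TICKET_RE.search(branch)
    if match is None:
        return message  # no ticket in the branch name: leave the message alone
    ticket = match.group(1)
    if message.startswith(ticket):
        return message  # already prefixed, e.g. when amending a commit
    return f"{ticket} {message}"
```

A real hook script would read the current branch (for instance with `git symbolic-ref --short HEAD`) and rewrite the message file whose path git passes as the hook's first argument.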



That's why I prefer feature branches with merge commits.

Your dev branch is clean because each merge-commit is a single commit per task.

So you can see which tasks were merged, in which order, and which files as a whole were changed in each task.

If you need to look at specifics for debugging, code review, or any other reason, you can look through the feature branch commit by commit to see what was changed and why.

It's best of both worlds.

Similarly merging dev into master/main. You get a release by release view of what files were changed in a single merge commit.


Completely agree with parent and grand-parent. For me, a nice clean commit history is an investment in future maintenance.

Following the rules described above means that you get a lot of context on why code was changed:

* the individual commit, which should contain a description of the change if the change is not self-explanatory or not 'intuitive'; each individual commit should consist of only 1 functional change

* the commits surrounding it, the feature branch (clearly visible because of the merge commits)

* the issue number on the commit itself and possibly in the merge commit

I've found all this context very informative for projects that are in maintenance mode and still need changes from time to time. Obviously, the higher the quality of the commit history, the higher the quality of the information you get out of it.

Meaning: if you put a rename with an impact over the whole code base (because the original name just happened to bother you that day) together with a bugfix in the same commit and the commit message has the very informative text 'fix', and the referenced issue mentions 'add support for blah' (but the commit obviously does not implement anything related to 'blah'), then... well, yeah, then how you organise your commit history does not really matter.


> * the issue number on the commit itself and possibly in the merge commit

Absolutely this too!

My process is to create a feature branch named "ISSUE-123-issue-description"

The benefit of this is all changes are tracked (and tested) against a specific issue in bug management software.

It also prevents people making small / unrelated changes or fixes in association with another task. If these are grouped in together in a single and unrelated task they won't be trackable or testable.


The merge commit always refers back to the PR id, so even if it was squashed you can still look up the play-by-play by looking up the PR. I wonder if this could be represented in git by tagging the squashed branch with the PR id with a description that has the merge commit id, and the merge commit description that references the tag. It would be nice if something like this could be standardized so it works across systems.


Something people don't get about commits is they have multiple purposes. A lot of this is due to still entrenched assumptions and practices from older, inferior version control systems.

There are at least two types of commit in git: a savepoint and a version.

A savepoint is what happens during development on a branch. Git makes it super easy to make many, many savepoints throughout the day. These help you as a developer because it gives you something to fall back on if you make a mistake. But most of them should never be exposed to anyone not directly working on the branch.

A version is what you share with others. A version is a fully working version of the software that can be reasonably checked out and put through a release process at any time. Usually a version will be unit tested but not subject to the same rigorous tests as a release.

There is a direct analogy here with database transactions. Just replace version with transaction.

Often while working you will find it's possible to write the version commit right away. This is usually for more trivial fixes or in some cases when a commit is required for something like a database migration (when things need to be deployed in stages). Other times you will need to make several savepoints before you get to a new version. This is what rebase is for. Many of those savepoints don't belong on the master branch as they are often fixing stuff you haven't even committed to master yet.

Git has a few tools to help you defer rebasing until later. In particular you can make fixup and squash commits. These will be normal savepoint commits, but they will be labelled in a way that later you can issue an "autosquash" command to automatically rebase these into version commits.
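To make the fixup idea concrete, here is a toy model of what `git rebase -i --autosquash` does with commits created by `git commit --fixup=<commit>`. It is a simplified sketch: it matches on the exact subject line only, ignores `squash!` commits, and works on plain strings rather than a real repository. The function name is hypothetical.

```python
def autosquash(subjects):
    """Fold 'fixup! <subject>' entries into the commit they name.

    Takes subjects oldest-first and returns (subject, folded_fixups) pairs,
    keeping the original order of the non-fixup commits.
    """
    prefix = "fixup! "
    result = []  # list of [subject, folded_fixups] entries
    for subject in subjects:
        if subject.startswith(prefix):
            target = subject[len(prefix):]
            for entry in result:
                if entry[0] == target:
                    entry[1].append(subject)  # fold into the earlier commit
                    break
            else:
                result.append([subject, []])  # no target found: keep as-is
        else:
            result.append([subject, []])
    return [(subject, folded) for subject, folded in result]
```

For example, the savepoint history `["Add parser", "Add CLI", "fixup! Add parser"]` collapses to two version commits, with the fixup folded into "Add parser".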


There's also nothing wrong with leaving "savepoints" type commits in a branch. Sometimes "I stopped here and took a break" is still useful information to have later on.

Git provides a DAG and you can use a --no-ff merge to build your "version commit" from the sub history of its "savepoints". You can follow one parent of the merge to the next "version commit" or you can follow the other parent through the intermediate "savepoints" that built it step by step.

You can use --first-parent today for most git operations to get "clean views" no matter how complex the DAG web is beyond it. I think a lot of these debates would "go away" if more people and user interfaces defaulted to --first-parent and "drill down" navigation rather than firehose of the complete graph and confusing (but pretty) "subway diagrams".
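The two views can be illustrated on a toy commit graph. This is a sketch under stated assumptions: the commit ids are made up, and the functions only mimic what `git log --first-parent` and `git log M^1..M^2` report. They are not how git implements them.

```python
# Toy graph: commit id -> list of parents, first parent first.
# All ids are hypothetical; "s*" are savepoints, "M*" are --no-ff merges.
GRAPH = {
    "A": [],
    "B": ["A"],
    "s1": ["B"],
    "s2": ["s1"],
    "M1": ["B", "s2"],   # first parent is the mainline commit
    "s3": ["M1"],
    "M2": ["M1", "s3"],
}


def first_parent_history(graph, tip):
    """Clean view: follow only the first parent from the tip."""
    history = []
    current = tip
    while current is not None:
        history.append(current)
        parents = graph[current]
        current = parents[0] if parents else None
    return history


def reachable(graph, start):
    """All commits reachable from start, including start itself."""
    seen, stack = set(), [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(graph[commit])
    return seen


def merged_commits(graph, merge):
    """Drill-down view: commits a merge brought in via its second parent."""
    first, second = graph[merge][:2]
    mainline = reachable(graph, first)
    brought_in = []
    current = second
    while current is not None and current not in mainline:
        brought_in.append(current)
        parents = graph[current]
        current = parents[0] if parents else None
    return brought_in
```

Here `first_parent_history(GRAPH, "M2")` lists only the version commits, while `merged_commits(GRAPH, "M1")` lists the savepoints behind one of them.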


> There's also nothing wrong with leaving "savepoints" type commits in a branch. Sometimes "I stopped here and took a break" is still useful information to have later on.

I don't think they should be on an eternal branch. I seriously doubt it is ever useful to know that some developer took a break at 11:21 three months ago. This is just noise and makes bisecting impossible, which is the entire point of keeping any history.

> Git provides a DAG and you can use a --no-ff merge to build your "version commit" from the sub history of its "savepoints". You can follow one parent of the merge to the next "version commit" or you can follow the other parent through the intermediate "savepoints" that built it step by step.

You could, but it's thoroughly nonstandard and requires knowledge and careful use of tools to filter out all the noise. Be nice and filter out the noise in a rebase.

> You can use --first-parent today for most git operations to get "clean views" no matter how complex the DAG web is beyond it. I think a lot of these debates would "go away" if more people and user interfaces defaulted to --first-parent and "drill down" navigation rather than firehose of the complete graph and confusing (but pretty) "subway diagrams".

Maybe, but this goes down the route of a blessed workflow. It would require all the tooling to agree on the workflow as important information could easily (even maliciously) be hidden if the workflow wasn't followed. It reminds me of the joke about emacs: an OS that lacks a good editor. Git is the SCM that lacks a good version control system.


> I don't think they should be on an eternal branch. I seriously doubt it is ever useful to know that some developer took a break at 11:21 three months ago. This is just noise and makes bisecting impossible, which is the entire point of keeping any history.

As someone who has had to do deep code archeology, "this was finished before coffee kicked in and is suspect" or "this was written before lunch and the coder may have been hangry" can be really interesting information to have.

git bisect supports --first-parent and bisecting a noisy history is not just possible, but often faster with --first-parent. When you find the merge commit that introduced the regression you branch that commit and run git bisect --first-parent in that branch for additional drilldown into which "sub-commit" of the merge introduced the problem. (And you can do that into additional layers if you've got deep merge commits.)
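The two-level drilldown can be sketched as two binary searches: one over the first-parent chain, then one over the commits behind the offending merge. This is a toy model with made-up commit ids and a made-up set of bad commits. In a real repository the first step would correspond to something like `git bisect start --first-parent` in a Git version that supports it.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit in an oldest-to-newest list.

    Assumes the newest commit is bad and that every commit after
    the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # the culprit is mid or earlier
        else:
            lo = mid + 1    # the culprit is after mid
    return commits[lo]


# Hypothetical history, oldest first.
MAINLINE = ["A", "B", "M1", "M2"]          # first-parent chain
SIDE = {"M1": ["s1", "s2"], "M2": ["s3", "s4", "s5"]}
BAD = {"s4", "s5", "M2"}                   # regression introduced in s4


def find_culprit():
    merge = first_bad(MAINLINE, lambda c: c in BAD)       # coarse pass
    return merge, first_bad(SIDE[merge], lambda c: c in BAD)  # drill down
```

The coarse pass tests only the merge commits, so savepoints that do not build never need to be checked unless they sit behind the offending merge.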

> You could, but it's thoroughly nonstandard and requires knowledge and careful use of tools to filter out all the noise.

It doesn't require that much "care" to use --first-parent as a default in your git commands. You can even set it as a default in your git config for relevant commands (like git log, git praise, git bisect), or just add simple aliases for them. It's pretty standard, doesn't take much knowledge, and you can pass it around with a couple of quick git config commands. That's assuming, of course, that remembering what is pretty much the only option involved, --first-parent, is too hard.

Also, I don't know what "care" has to do with it: forget to do it and you see a lot of "noise". Noise isn't dangerous. Annoying maybe, but it's definitely not dangerous to see extra noise when you wanted a cleaner view.

Outside of the command line, sure, there aren't a lot of great UI tools that take a --first-parent centric approach to git. But it doesn't need all of them to "agree" (because again, if a tool shows you too much noise, that's not dangerous, that's just annoying); just one good --first-parent based drilldown UI would do a lot to make people more comfortable with thinking about the git log in two dimensions instead of trying so hard to squeeze git into the one dimension of CVS or SVN. I think it's mostly a matter of aesthetics and what "sells": the subway diagrams of the DAG look pretty in screenshots but are rarely a great user experience in practice. (So much so that everyone keeps wanting to smash git into a single dimension of code history because they find it too "noisy".) Rather than "declutter" with rebases, a --first-parent / drill-down-oriented UX would do wonders for the git ecosystem, especially for junior developers who are uncomfortable at the command line, who likely shouldn't be trusted with rebases, and who would have a much better time all around if you told them "don't sweat your individual commits, they'll roll up into a cleaner merge commit at PR time".


Care has to be taken that the trunk branch actually is the first parent in each merge commit. Maybe it's unlikely to go wrong in practice, but it's certainly possible to have a merge where the parents are the "wrong way around" thus messing up your first parent strategy.

Also, "git praise"? Is that really a thing now? Talk about not understanding programmer humour.


In most cases where people are using a PR system as the primary integration point, I've not seen a single PR system that has a problem with sometimes getting the parents in merges backwards. The only time I've seen that is junior developers making merges they shouldn't have been (and are the same developers I would never trust with rebase, even just in their own branches) and there is a way to rebase merge parents if you really want to pull out your rebase fu for something.

> Also, "git praise"? Is that really a thing now? Talk about not understanding programmer humour.

git praise has been a standard git alias for git blame for several years now. I'd prefer if git had followed most other VCSes and named it git annotate rather than making it a micro-aggression out of the box, but yeah one person's micro-aggression that makes a papercut in daily workflows is another person's punch down "humour", I guess. I'm glad you seem to enjoy it, I don't appreciate it.


I have to disagree with this as it relies on the assumption that every commit on a branch is logical and descriptive. In my experience a lot of PRs will have small commits that have poor names as they go through a review process. If you merge this using a regular merge commit or by rebasing the commits on the target branch this creates a lot of noise for those who look at the commit history.

In my opinion it is best to squash all commits into one before rebasing it on top of the target branch. During this process any information that is considered important for the history can be preserved by leaving it in the commit body.


> I have to disagree with this as it relies on the assumption that every commit on a branch is logical and descriptive. In my experience a lot of PRs will have small commits that have poor names as they go through a review process.

There's your problem. Code reviews should not allow such commits to pass through.


> Code reviews should not allow such commits to pass through.

Are you suggesting that your code review process has a stage for combing through commit messages? What does this look like? If the third commit message of 15 isn't up to par what happens?


You ask them to fix the commit message. Every git GUI should support that by now, so it should be a 1 minute fix, even for junior devs.


> I have to disagree with this as it relies

No, you didn't read the comment fully, or you only disagree with part of it. Because, you clearly missed this part:

> I still advocate for cleaning up and squashing your own commits as you see fit with an interactive rebase before your branch is merged.

If you do that, you don't end up with 'small, poorly named commits'. Or if you do, you have a lazy programmer / an idiot programmer in the team.

Which certainly happens, but, they ruin everything. You can't start shooting down processes, languages, tools, or anything else in the programmer space __just__ because some moron who abuses it ends up in a bad place. You need to show that a tool / feature / process / hook / etc turns otherwise fine, capable programmers into idiots in order to advocate for its abolishment. Not the other way around, or you end up with a blunt rock and a club and are then debating that they're holding the club at the wrong end.


As someone who carefully crafts my git history, I hate it when somebody smashes my work.


I agree. I do not use squash; I prefer to have a feature branch and leave it alone after a merge.

That said, some of my workmates squash when accepting pull requests on GitLab/GitHub, as a general workflow suggested by such tools and in contexts where trunk-based development is not feasible.


> I still advocate for cleaning up and squashing your own commit

Completely agree. And I suspect the increasing frequency of squash-merging is mainly to avoid having to do the work of cleaning up and commenting individual commits in a longer sequence.

I can see this both ways, it really is faster and easier to squash. And you’re right, it really does bury some context and functionally makes large changes harder to read or bisect or revert or modify.

One benefit to squash merging that you might have overlooked is that it can encourage frequent (and messy) committing, knowing that the churn will disappear without having to work hard to clean it up. This does, in a way, make the git workflow more appealing and easier to manage for more people.


> One benefit to squash merging that you might have overlooked is that it can encourage frequent (and messy) committing, knowing that the churn will disappear without having to work hard to clean it up.

I've noticed the opposite. Developers who know all of their work will be smashed into one commit at the end tend to not commit as frequently, and the commits they do make are just checking in all of their work at intervals. It is more of a process of saving state. It doesn't matter how frequently they commit if all the commits will become one.


IME this depends on whether people make large or small PRs in a repo.

If people make small PRs, committing to mainline as they go, then squashing each PR fits well.


I've seen this argument come up a few times, and the best suggestion I've heard which could make both camps happy is to add the notion of commit groups. You could view a pr in history as a single commit group, or see each individual commit for the full context.


Are commit groups an existing feature or a wish-list item? I hadn't heard of them before, and a DuckDuckGo search only turns up a blog post discussing the desirability of such a feature in git.


You can get something somewhat similar with "always create merge commit" workflows (no fast forward) and changing your tooling to look at the first-parent-history by default. This view will have one commit per merge, but you can choose to follow the second+ parent history for a given commit to see what went into it.


Or just include a group name in the commit message. No need for empty merge requests in your history. Since most people use issue tracking systems, just prepend the issue number to each commit message in the group.
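As a rough sketch of how such prefixes could be turned back into groups, the snippet below buckets commit subjects by a leading issue key. The key format (`ISSUE-1`) and the function name are assumptions for illustration. The input could come from something like `git log --format=%s`.

```python
import re
from collections import defaultdict

# Assumption: subjects start with a key such as "ISSUE-1", followed by a
# colon or whitespace.
PREFIX_RE = re.compile(r"^([A-Z][A-Z0-9]*-\d+)[:\s]")


def group_by_issue(subjects):
    """Map each issue key to the commit subjects that carry it."""
    groups = defaultdict(list)
    for subject in subjects:
        match = PREFIX_RE.match(subject)
        key = match.group(1) if match else "(no issue)"
        groups[key].append(subject)
    return dict(groups)
```

Unlike a merge commit, this grouping depends entirely on every commit message following the convention, which is why it is usually enforced by a hook.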


> This discourages thoughtful and frequent commits that express the intent of a change because all the commits are just smashed together anyway so why bother.

This is only the case if said squashing just bundles commits without context or consistent logic. If merges to a mainline branch consist of feature branches whose pull requests were already approved after a couple of iterations, then the end result is a cleaner commit with its history thoroughly audited. In practice it's equivalent to a fast-forward merge of a single-commit feature branch that just happened to be nearly lined up with mainline.


Agreed. This is when you believe that your program should at the very least compile (or pass tests) at any point in the history. In this case a commit must be a consistent and related set of changes.

In other words, a commit to us is sort of like an "atomic" change, something that cannot be split or else more or less bad things happen.

I have trouble conceiving a better way to use Git when you really care about the readability of your history. In some cases I don't care about readability, though. On hobby projects I sometimes use Git more like a file transfer and synchronization tool. In this case I don't give a huck about what the history looks like.

Just like with code, the more readable this history is (in terms of what features/fixes are in there at some point in time), the better.


> This is when you believe that your program should at the very least compile (or pass tests) at any point in the history.

I only expect that at merge commits, which I can see with `git log --merges`.


Why would you? Linux (and any other C or Rust open source project I have worked on) compiles and works at any commit.


Most of the time, that's what I expect. However, sometimes when proceeding step-by-step through a large refactor or a large feature addition, the codebase may be left temporarily in an incomplete state.

My preference under such circumstances is to favor clarity of commit history, and leave the step-by-step commits intact — with the requirement that they always be located in a feature branch behind a merge commit.


I mean, it does? If you rigidly turn every feature branch into a single commit, that means that also applies to feature branches that would be thoughtfully crafted into multiple clean commits. (Note: that is not the same as random fixup commits with code review iterations.)



