I have been contributing code for 10+ years, and I have worked on teams that did rebase and others that did not.
Not once have I ever debugged a problem that benefited from rebase vs merge. Fundamentally, I do not debug off git history. Not once has git history helped me debug beyond looking at the blame + offending PR and diff.
Can someone tell me when they were fixing a problem and they were glad that they rebased? Bc I can't.
Debugging from git history is a separate question from merge vs rebase. Debugging from history can be done with non-rebased merges, with rebased merges, and with squashed commits, without any noticeable difference. Pass `--first-parent` to git-log and git-bisect in the first two cases and it's virtually identical.
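To make the `--first-parent` point concrete, here is a toy model in Python. It is a sketch of the idea, not git's implementation, and the commit names and the `parents` mapping are made up for illustration. Following only the first parent of each commit gives a straight mainline even though the graph contains a merge.

```python
# Toy commit graph: each commit maps to its list of parents.
# For a merge commit, the first parent is the branch that was merged INTO.
# All commit names here are hypothetical.
parents = {
    "A": [],
    "B": ["A"],
    "f1": ["B"],       # feature branch commit
    "f2": ["f1"],      # feature branch commit
    "C": ["B"],
    "M": ["C", "f2"],  # merge of the feature branch into main
    "D": ["M"],
}


def first_parent_history(tip, parents):
    """Walk back from tip following only first parents."""
    history = []
    current = tip
    while current is not None:
        history.append(current)
        commit_parents = parents[current]
        current = commit_parents[0] if commit_parents else None
    return history


def full_history(tip, parents):
    """Collect every commit reachable from tip through any parent."""
    seen = []
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.append(commit)
        stack.extend(parents[commit])
    return seen


print(first_parent_history("D", parents))
print(sorted(full_history("D", parents)))
```

The first walk never visits `f1` or `f2`, so the merge `M` behaves like a single squashed commit on the mainline, while the second walk still reaches the feature commits when you want the detail.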
My preference for rebasing comes from delivering stacked PRs: when you're working on a chain of individually reviewable changes, every commit is a clean, atomic, deliverable patch. git-format-patch works well with this model. GitHub is a pain to use this way but you can do it with some extra scripts and setting a custom "base" branch.
The reason in that scenario to prefer rebasing over "merging in master" is that every merge from master into the head of your stack is a stake in the ground: you can't push changes to parent commits anymore. But the whole point of stacked diffs is that I want to be able to identify different issues while I work, which belong to different changes. I want to clean things up as I go, without bothering reviewers with irrelevant changes. "Oh this README could use a rewrite; let me fix that and push it all the way up the chain into its own little commit," or "Actually now that I'm here, let me update dependencies and ensure we're on latest before I apply my changes". IME, an ideal PR is 90% refactors and "prefactors" which don't change semantics, all the way up to "implemented functionality behind a feature flag", and 10% actual changes which change the semantics. Having an editable history that you can "keep bringing with you" is indispensable.
Debugging history isn't really related. Other than that this workflow allows you to create a history of very small, easily testable, easily reviewable, easily revertible commits, which makes debugging easier. But that's a downstream effect.
I have worked on several codebases where it was enforced that the commit be rebased off of whatever the main branch was, all units of work squashed to a single commit, and only "working" code be checked into the main branch. This gives you a really good linear history, and when you're disciplined about writing good final commit messages and tagging them to a ticket, it means bisecting to find challenging bugs later becomes tractable, as each commit nominally should work and should be ready to deploy for testing. I've personally solved a number of challenging regressions this way.
I can give you an example of when I am glad I rebased. There have been many times I have been working on a feature that was going to take some time to finish. In that case my general workflow is to rebase against main every day or two. It lets me keep track of changes and handle conflicts early and makes the eventual merge much simpler. As for debugging I’ve never personally had to do this, but I imagine git bisect would probably work better with rebased, squashed commits.
I used hg (mercurial) before git. Every time I see someone make an argument like yours I think "only because git's merge/branch model is bad and so you need hacks to make it acceptable".
Git won, which is why I've been using it for more than 10 years, but that doesn't mean it was ever the best. It was just the most popular, and so the rest of the ecosystem makes it worth accepting the flaws (code review tools and CI systems both have much better git support; these are two critical things that will work against you if you use anything else).
They kind of spoke to it. Rebasing to bring in changes from main to a feature branch which is a bit longer running keeps all your changes together.
All the commits for your feature get placed on top of the commits you brought in from main. When you are putting together your PR you can more easily squash your commits together and fix up your commit history before putting it out for review.
It is a preference thing for sure, but I fall into the atomic, self-contained commits camp, and rebase workflows make that much cleaner in my opinion. I have worked with both on large teams and I like rebase more, but each has its own tradeoffs.
Bisect is one of those things where if you're on a certain kind of project, it's really useful, and if you're not on that kind of project you never need it.
If the contributor count is high enough (or you're otherwise in a role for which "contribution" is primarily adjusting others' code), or the behaviors that get reported in bugs are specific and testable, then bisect is invaluable.
If you're in a project where buggy behavior wasn't introduced so much as grew (e.g. the behavior evolved A -> B -> C -> D -> E over time and a bug is reported due to undesirable interactions between released/valuable features in A, C, and E), then bisecting to find "when did this start" won't tell you much that is useful. If you often have to write bespoke test scripts to run in bisect (e.g. because "test for presence of bug" is a process that involves restarting/orchestrating lots of services and/or debugging by interacting with a GUI), then you have to balance the time spent writing those with the time it'd take for you to figure out the causal commit by hand. If you're in a project where you're personally familiar with roughly what was released when, or where the release process/community is well-connected, it's often better to promote practices like "ask in Slack/the mailing list whether anyone has made changes to ___ recently, whoever pipes up will help you debug" rather than "everyone should be really good at bisect". Those aren't mutually exclusive, but they both do take work to install in a community and thus have an opportunity cost.
This and many other perennial discussions about Git (including TFA) have a common cause: people assume that criticisms/recommendations for how to use Git as a release coordinator/member of a disconnected team of volunteers apply to people who use Git who are members of small, tightly-coupled teams of collaborators (e.g. working on closed-source software).
Git bisect is a wonder, especially combined with its ability to potentially do the success/fail testing on its own (with the help of some command you provide).
It is a tragedy that more people don't know about it.
I manage a maintained fork and periodically rebase our changes on top of upstream.
In this case, rebasing is nice because our changes stay in a contiguous block at the top (vs merging which would interleave them), so it's easy for me and others to see exactly where our fork diverges.
Git has gotten pretty smart about that if you enable `rerere` (`git config rerere.enabled true`): once you resolve a conflict, if you get the same conflict again it automatically resolves it the same way. Works for both rebase and merge.
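The idea of reusing a recorded resolution can be sketched with a small Python cache. This is only a toy under stated assumptions: the `ResolutionCache` class and its keying scheme are hypothetical, and it is not how `git rerere` actually stores or matches conflicts.

```python
import hashlib


class ResolutionCache:
    """Toy cache of conflict resolutions, keyed by the conflicting text."""

    def __init__(self):
        self._resolutions = {}

    @staticmethod
    def _key(ours, theirs):
        # Identify a conflict by the content of both sides.
        digest = hashlib.sha1()
        digest.update(ours.encode())
        digest.update(b"\0")
        digest.update(theirs.encode())
        return digest.hexdigest()

    def record(self, ours, theirs, resolution):
        """Remember how this exact conflict was resolved by hand."""
        self._resolutions[self._key(ours, theirs)] = resolution

    def replay(self, ours, theirs):
        """Return the recorded resolution, or None if the conflict is new."""
        return self._resolutions.get(self._key(ours, theirs))


cache = ResolutionCache()
# First time: the conflict is resolved manually and the result is recorded.
cache.record("timeout = 30\n", "timeout = 60\n", "timeout = 60\n")

# Same conflict again (e.g. on the next rebase): the resolution is reused.
print(cache.replay("timeout = 30\n", "timeout = 60\n"))
# A different conflict still needs a human.
print(cache.replay("timeout = 30\n", "timeout = 90\n"))
```

The point of the sketch is that the lookup depends only on what the conflict looks like, which is why the same recorded answer can serve both a merge and a later rebase.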
I've used git bisect on a repo whose commit graph is at least 20-wide at some points. In the two cases I used it, it identified the individual commit. I didn't think very hard about it. It was the first time I used bisect. Maybe I got lucky.
This may be outdated because git’s defaults have improved a lot over the years. I first used git on a team in 2011. As I recall, there were various commands like git log -p that would show nothing for a merge commit. So without extra knowledge of the git flags you would not find what you were looking for if it was in a side path of the merge history. This caused a lot of confusion at times. We switched to a rebase approach because linear history is easier for people to use.
To answer your question directly, if somewhat glibly, I’m glad I rebased every time I go looking for something in the history because I don’t have to think about the history as a graph. It’s easier.
More to your point, there are times when blame on a line does not show the culprit. If you move code, or do anything else to that line, then you have to keep searching. Sometimes it’s easier to look at the entire patch history of a file. If there is a way to repeatedly/recursively blame on a line, that’s cool and I’d love to know about it.
I now manage two junior engineers and I insist that they squash and rebase their work. I’ve seen what happens if they don’t. The merges get tangled and crazy, they include stuff from other branches they didn’t mean to, etc. The squash/rebase flow has been a way to make them responsible for what they put into the history, in a way that is simple enough that they got up to speed and own it.
In fact, for searching how a file got to the state it is I prefer that when PRs are merged, they are merged and not rebased. I want the commit shas to be the same.
Rebasing on main loses provenance.
If you want a clean history, do it in the PR, before merging it. That way the PR is the single unit of work.
Merging a PR with rebase doesn't lose provenance. You can just keep all the commits in the PR branch. But even if you squash the branch into a single commit and merge (which these tools automate and many people do), it still doesn't lose provenance. The provenance is the PR itself. The PR is connected to a work item in the ticketing system. The git history preserves all the relevant info.
The main benefit I've found is when there is work happening concurrently in multiple feature branches at once (e.g. by different people). Rebase-merging greatly simplifies dealing with merge conflicts as you only have a simple diff against a single branch to deal with. The more work you have in progress at once the more important this becomes.
If you haven't used git bisect to find a regression, you should try it.
You can write a test (outside of source control) and run `git bisect good` on a good commit and `git bisect bad` on bad one and it'll do a binary search (it's up to you to rerun your test each time and tell git whether that's a good or a bad commit). Rather quickly, it'll point you to the commit that caused the regression.
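The binary search described above can be sketched in a few lines of Python. This is a toy model over a made-up linear history, not git's own algorithm for general commit graphs; `find_first_bad`, the commit names, and the position of the regression are all hypothetical. The `is_bad` callable plays the role of the test you rerun at each step (or of the command you would hand to `git bisect run`).

```python
def find_first_bad(commits, is_bad):
    """Return the first bad commit and the commits tested along the way.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and once the bug appears it stays.
    """
    good, bad = 0, len(commits) - 1
    tested = []
    while bad - good > 1:
        mid = (good + bad) // 2
        tested.append(commits[mid])
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad], tested


# Hypothetical linear history of 100 commits; the regression lands in c63.
commits = [f"c{i}" for i in range(100)]
first_bad, tested = find_first_bad(commits, lambda c: int(c[1:]) >= 63)
print(first_bad, "found after testing", len(tested), "commits")
```

Because the range halves on every step, a hundred commits need only a handful of test runs, which is why this stays practical even when each run is a slow manual check.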
If you rebase, that commit will be part of a block of commits all from the same author, all correlated with the same feature (and likely in the same PR). Now you know who you need to talk to about it. If you merge, you can still start with the author of the commit that git bisect found, but it's more likely to be interleaved with nearby commits in such a way that when it was under development, it had a different predecessor. That's a recipe for bugs that get found later than they otherwise would've.
If you're not using git history to debug, you're probably not really aware of which problems would've turned out differently if the history was handled differently. If you do, you'll catch cases where one author or another would've caught the bug before merging to main, had they bothered to rebase, but instead the bug was only visible after both authors thought they were done.
Once we had a slowdown in our application that went unaddressed for a couple of months. I used git bisect to binary search across a bunch of commits, running a perf test on each. Every commit being a "good" historical commit made that much easier, and I found the offending commit fast.
I have often been happy to have a clean linear history when asking myself things like "does build X.Y.Z include this buggy change I found in commit abcdefg?". With a history full of merges, where a commit from 1st of January might be merged only on the 20th of July, this gets MUCH harder to answer.
This is especially true if you have multiple repos and builds from each one, such that you can't just checkout the commit for build X.Y.Z and easily check if the code contains that or not (you'd have to track through dependency builds, checkout those other dependencies, possibly repeat for multiple levels). If the date of a commit always reflects the date it made it into the common branch, a quick git log can tell you the basic info a lot of the time.
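The "does build X.Y.Z include this commit?" question is a reachability check on the commit graph. Here is a minimal Python sketch with a hypothetical history and hypothetical build tags; with git itself, `git merge-base --is-ancestor <commit> <tag>` answers the same question.

```python
def is_ancestor(commit, tip, parents):
    """True if commit is reachable from tip by following parent links."""
    stack = [tip]
    seen = set()
    while stack:
        current = stack.pop()
        if current == commit:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents[current])
    return False


# Hypothetical history: "bug" was committed early on a side branch but only
# merged (in "M") after build v1.2.0 was cut from commit "C".
parents = {
    "A": [],
    "bug": ["A"],
    "B": ["A"],
    "C": ["B"],
    "M": ["C", "bug"],
}
builds = {"v1.2.0": "C", "v1.3.0": "M"}

for name, tip in builds.items():
    print(name, "contains bug:", is_ancestor("bug", tip, parents))
```

In this made-up graph the buggy commit is older than `C` by date, yet v1.2.0 does not contain it. That is exactly the case where commit dates mislead in a merge-heavy history and where a linear history lets a plain log answer the question.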
I use `git rebase` all the time, along with `git add -p`, `git diff` and other tools. It helps me to maintain logical commit history.
- Reshuffle commits into a more logical order.
- Edit commit subjects if I notice a mistake.
- Squash (merge) commits. Often, for whatever reason pieces of a fix end up in separate commits and it's useful to collect and merge them.
I'd like to make every commit perfect the first time but I haven't managed to do that yet. git-rebase really helps me clean things up before pushing or merging a branch.
I avoid rebase like the plague (perhaps because of my early experiences with it). I used to get continuous conflicts for the same commits again and again, and the store and replay kinda helped with it but not always. Merge always worked for me (once I resolve conflicts, that's the end of it). Now I always merge main into my feature branch and then merge it back to main when ready. Does it pollute the history? Maybe, but I've never looked. It does not matter to our team.
Allow me (today) to be that person to propose checking out Jujutsu instead [0]. Not only does it have the superpower of atomic commits (reviewers will love you, peers will hate 8 small PRs that are chained together ;-)), but it's also more consistent than git and works perfectly well as a drop-in replacement.
In fact, I've been using Jujutsu for ~2 years as a drop-in and nobody complained (outside of the 8 small PRs chained together). Git is great as a backend, but Jujutsu shines as a frontend.
The key thing to point out is that jujutsu is a rebase-based workflow, and no one who uses jujutsu ever worries about rebasing (they may not even be aware of it). It's a good demonstration of a tool that got rebase right, unlike git.
Pre-jujutsu, I never rebased unless my team required it. Now I do it all the time.
Pre-jj, I never had linear history, unless the team required it. Now most of my projects have linear history.
You don't have to chain 8 PRs together, Github tries really hard to hide this from you but you can in fact review one commit at a time, which means you don't need to have a stack of 8 PRs that cascade into each other.
What is the problem with submodules? I like to use them because it means the code I need from another repo remains the same until I update it. No unexpected breaking changes.
And if there is a major problem, `jj op log` + `jj op restore` fix it. This is the major superpower of jj. Before, I needed to nuke the git repo after bad rebases (no chance in hell I could figure out how to undo a massive 20+ step bad rebase).
I used to think like this, but then I realized: jj-mode.el exists[0] and you can still use Magit since it's still a git repo underneath. Seriously, don't let this hold you back.
I wish rebase was taught as the default - I blame the older inferior version control software. It’s honestly easier to reason about a rebase than a merge since it’s so linear.
Understanding of local versus origin branch is also missing or mystical to a lot of people and it’s what gives you confidence to mess around and find things out
The end result of a git rebase is arguably superior. However, I don't do it, because the process of running git rebase is a complete hassle. git merge is one-shot, whereas git rebase replays commits one-by-one.
Replaying commits one-by-one is like a history quiz. It forces me to remember what was going on a week ago when I did commit #23 out of 45. I'm grateful that git stores that history for me when I need it, but I don't want it to force me to interact with the history. I've long since expelled it from my brain, so that I can focus on the current state of the codebase. "5 commits ago, did you mean to do that, or can we take this other change?" I don't care, I don't want to think about it.
Of course, this issue can be reduced by the "squash first, then rebase" approach. Or judicious use of "git commit --amend --no-edit" to reduce the number of commits in my branch, therefore making the rebase less of a hassle. That's fine. But what if I didn't do that? I don't want my tools to judge me for my workflow. A user-friendly tool should non-judgmentally accommodate whatever convenient workflow I adopted in the past.
If git says, "oops, you screwed up by creating 50 lazy commits, now you need to put in 20 minutes figuring out how to cleverly combine them into 3 commits, before you can pull from main!" then I'm going to respond, "screw you, I will do the next-best easier alternative". I don't have time for the judgement.
> "oops, you screwed up by creating 50 lazy commits, now you need to put in 20 minutes figuring out how to cleverly combine them into 3 commits, before you can pull from main!"
You can also just squash them into 1, which will always work with no effort.
Then rebase is not your problem; all your other practices are. Long-lived feature branches with lots of unorganized commits with low cohesion.
Sometimes it's OK to work like this, but asking git not to be judgmental is like saying your Roomba should accommodate you by not asking you to empty its dust bag.
I always do long lived feature branches, and rarely have issues. When I hear people complain about it, I question their workflow/competence.
Lots of commits is good. The thing I liked about mercurial is you could squash, while still keeping the individual commits. And this is also why I like jj - you get to keep the individual commits while eliminating the noise it produces.
You can make long lived feature branches work with rebase, you just have to regularly rebase along the way.
I had a branch that lived for more than a year and ended up with 800+ commits on it. I rebased along the way, and predictably the final merge was smooth and easy.
1) because git rerere remembers the resolutions to the ..
2) small conflicts when rebasing the long lived branch on the main branch
If instead I had delayed any rebasing until the long-lived branch was done, I'd have had no idea of the scale of the conflicts, and the task could have been very, very different.
Granted, in some cases there would be no or very few conflicts, and then both approaches (long-lived branch with or without rebases along the way) would be similar.
While it is a bit of a pain, it can be made a lot easier with the --keep-base option. This article is a great example https://adamj.eu/tech/2022/03/25/how-to-squash-and-rebase-a-... of how to make rebasing with merge conflicts significantly easier. Like you said though, it's not super user-friendly but at least there are options out there.
This seems crazy to me as a self-admitted addict of “git commit --amend --no-edit && git push --force-with-lease”.
I don’t think the tool is judgmental. It’s finicky. It requires more from its user than most tools do. Including bending over to make your workflow compliant with its needs.
I don't mind rebasing a single commit, but I hate it when people rebase a list of commits, because that makes commits which never existed before, have probably never been tested, and generally never will be.
I've had failures while git bisecting, hitting commits that clearly never compiled, because I'm probably the first person to ever check them out.
Sometimes it feels like the least-bad alternative.
e.g. I'm currently working on a substantial framework upgrade to a project - I've pulled every dependency/blocker out that could be done on its own and made separate PRs for them, but I'm still left with a number of logically independent commits that by their nature will not compile on their own. I could squash e.g. "Update core framework", "Fix for new syntax rules" and "Update to async methods without locking", but I don't know that reviewers and future code readers are better served by that.
It seems to me the "Not Rocket Science" invariant is upheld if you just require all PRs to be fast-forward changes. Which I guess is an argument in support of rebase, but a clean merge counts too. If the test suite passes on the PR branch, it'll pass on main, because that's what main will be afterward. Ideally you don't even test the same commit hash twice.
If you have expensive e2e tests, then you might want to keep a 'latest' tag on main that's only updated when those pass.
Oh, that's why. I barely used any VCS before Git, so I was always puzzled about the "weird" opinions on this topic. I'm still puzzled by the fact that some people seem to reject entirely the idea of rewriting history, even locally before you have pushed/published it anywhere.
Sometimes people look sort of "superstitious" to me about Git. I believe this is caused by learning Git through web front-ends such as Github, GitLab, Gitea etc., that don't tell you the entire truth; desktop GUI clients also let the users only see Git through their own, more-or-less narrow "window".
TBH, sometimes Git can behave in ways you don't expect, like seeing conflicts when you thought there wouldn't be (but up to now never things like choosing the "wrong" version when doing merges, something I did fear when I started using it a ~decade ago).
However one usually finds an explanation after the fact. Something I've learned is that Git is usually right, and forcing it to do things is a good recipe to mess things up badly.
Rebase your local history, merge collaborative work. It helps to just relabel rebase as "rewrite history". That makes it more clear that it's generally not acceptable to force push your rewritten history upstream. I've seen people trying to force push their changes and overwrite the remote history. If you need to force push, you probably messed up. Maybe OK on your own pull request branches assuming nobody else is working on them. But otherwise a bad idea.
I tend to rebase my unpushed local changes on top of upstream changes. That's why rebase exists. So you can rewrite your changes on top of upstream changes and keep life simple for consumers of your changes when they get merged. It's a courtesy to them. When merging upstream changes gets complicated (lots of conflicts), falling back to merging gives you more flexibility to fix things.
The resulting pull requests might get a bit ugly if you merge a lot. One solution is squash merging when you finally merge your pull request. This has as the downside that you lose a lot of history and context. The other solution is to just accept that not all change is linear and that there's nothing wrong with merging. I tend to bias to that.
If your changes are substantial, conflict resolution caused by your changes tends to be a lot easier for others if they get lots of small commits, a few of which may conflict, rather than one enormous one that has lots of conflicts. That's a good reason to avoid squash merges. Interactive rebasing is something I usually find too tedious to bother with, but some people really like it, and it can be a good middle ground.
It's not that one is better than the other. It's really about how you collaborate with others. These tools exist because in large OSS projects, like Linux, where they have to deal with a lot of contributions, they want to give contributors the tools they need to provide very clean, easy to merge contributions. That includes things like rewriting history for clarity and ensuring the history is nice and linear.
Maybe I'm old, but I still think a repository should be a repository: sitting on a server somewhere, receiving clean commits with well written messages, running CI. And a local copy should be a local copy: sitting on my machine, allowing me to make changes willy-nilly, and then clean them up for review and commit. That's just a different set of operations. There's no reason a local copy should have the exact same implementation as a repository, git made a wrong turn in this, let's just admit it.
I agree but I think git got the distributed (ie all nodes the same) part right. I also think what you say doesn't take it far enough.
I think it should be possible to assign different instances of the repository different "roles" and have the tooling assist with that. For example. A "clean" instance that will only ever contain fully working commits and can be used in conjunction with production and debugging. And various "local" instances - per feature, per developer, or per something else - that might be duplicated across any number of devices.
You can DIY this using raw git with tags, a bit of overhead, and discipline. Or the github "pull" model facilitates it well. But either you're doing extra work or you're using an external service. It would be nice if instead it was natively supported.
This might seem silly and unnecessary but consider how you handle security sensitive branches or company internal (proprietary) versus FOSS releases. In the latter case consider the difficulty of collaborating with the community across the divide.
> I still think a repository should be a repository: sitting on a server somewhere, receiving clean commits with well written messages, running CI. And a local copy should be a local copy: sitting on my machine, allowing me to make changes willy-nilly, and then clean them up for review and commit
This is one way to see things and work and git supports that workflow. Higher-level tooling tailored for this view (like GitHub) is plentiful.
> There's no reason a local copy should have the exact same implementation as a repository
...Except to also support the many git users who are different from you and in different contexts. Bending git's API to your preferences would make it less useful, harder to use, or not even suitable at all for many others.
> git made a wrong turn in this, let's just admit it.
Nope. I prefer my VCS decentralized and flexible, thank you very much. SVN and Perforce are still there for you.
Besides, it's objectively wrong calling it "a wrong turn" if you consider the context in which git was born and got early traction: Sharing patches over e-mail. That is what git was built for. Had it been built your way (first-class concepts coupled to p2p email), your workflow would most likely not be supported and GitHub would not exist.
If you are really as old as you imply, you are showing your lack of history more than your age.
I've heard people say before that it is easier to reason about a linear history, but I can't think of a situation where this would let me solve a problem more easily. All I can think of is a lot of downsides. Can you give an example where it helps?
If this was the main strategy used even for public/shared branches, then everyone would have to deal with changing, conflicting histories all the time.
I've had recent interns who've struggled with rebase and they've never known anything but Git. Never understood why that was given they seem ok with basic commits and branching. I would agree that rebase is easier to reason about than merging yet I'm still needing to give what feels like a class on it.
The fact that people have a harder time understanding rebase is evidence that rebase is harder to reason about. Whether you update your understanding based on that evidence is up to you. If I have to pick between merge and rebase, I would generally pick merge. It seems to cause fewer conflicts with long-lived branches. Commits maintain their identity so each one has to be conflict-resolved at most once.
However, even better for me (and my team) is squash on PR resolve.
git rebase squash as a single commit on a single main branch is the one true way.
I know a lot of people want to maintain the history of each PR, but you won't need it in your VCS.
You should always be able to roll back main to a real state. Having incremental commits between two working stages creates more confusion during incidents.
If you need to consult the work history of transient commits, that can live in your code review software with all the other metadata (such as review comments and diagrams/figures) that never make it into source control.
> I know a lot of people want to maintain the history of each PR, but you won't need it in your VCS.
I strongly disagree. Losing this discourages swarming on issues and makes bisect worse.
> You should always be able to roll back main to a real state. Having incremental commits between two working stages creates more confusion during incidents.
If you only use merge commits this shouldn't be any more difficult. You just need to make sure you specify that you want to use the first parent when doing reverts.
Merging merge requests as merge commits (rather than fast-forwarding them) gives the same granularity in the main branch, while preserving the option to have bisect dive inside the original MR to actually find the change that made the interesting change in behavior.
But they have, with pull requests. When you merge a pull request it is done via the "subtree" merge strategy, which preserves partial commits and also does not flatten them.
This is one of the few hills I will die on. After working on a team that used Phabricator for a few years and going back to GitHub when I joined a new company, it really does make life so much nicer to just rebase -> squash -> commit a single PR to `main`
What was stopping you from squash -> merge -> push two new changesets to `main`? Isn't your objection actually to the specifics of the workflow that was mandated by your employer as opposed to anything inherent to merge itself?
> You should always be able to roll back main to a real state.
Well there's your problem. Why are you assuming there are non-working commits in the history with a merge based workflow? If you really need to make an incremental commit at a point where the build is broken you can always squash prior to merge. There's no reason to conflate "non-working commits" and "merge based workflow".
Why go out of the way to obfuscate the pathway the development process took? Depending on the complexity of the task the merge operation itself can introduce its own bugs as incompatible changes to the source get reconciled. It's useful to be able to examine each finished feature in isolation and then again after the merge.
> with all the other metadata (such as review comments and diagrams/figures) that never make it into source control.
I hate that all of that is omitted. It can be invaluable when debugging. More generally I personally think the tools we have are still extremely subpar compared to what they could be.
> I know a lot of people want to maintain the history of each PR, but you won't need it in your VCS.
Having worked on a maintenance team for years, this is just wrong. You don't know what someone will or won't need in the future. Those individual commits have had extra context that has been a massive help for me all sorts of times.
I'm fine with manually squashing individual "fix typo"-style commits, but just squashing the entire branch removes too much.
When your PR build takes more than an hour you'll think twice before creating multiple PRs for multiple related commits (e.g. refactoring+feature) when working on a single issue.
I completely agree. It also forces better commit messages, because "maintaining the history of each PR" is forced into prose written by the person responsible for the code instead of hand-waving it away into "just check the commits" -- no thanks.
I never understood why rebase is such a staple in the git world. For me, losing historical data, like which branch my work was done on, is a real issue.
In the same vein, the fact that commits don't record the branch they were created on as metadata is a real pain point. It's always a mess to find which commits were made for which feature/bugfix in a gitflow process...
I'll probably look into automatically appending the current branch name to my commit messages, but that will only work for me, not for other contributors...
Ideally you only rebase your own commits on your own feature branch, just before merging. Having a clean commit history before merging makes the main branch/trunk more readable.
Also (and especially) it makes it way easier to revert a single feature if all the commits relevant to that feature are already grouped.
For your issue about not knowing which branch the commits are from: that's why I love merge commits and the tree representation (I personally use 'tig', but git log also has a tree representation, and GUI tools always have it too).
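To make the "revert a single feature" point concrete, here is a small sketch on a scratch repo. The branch `feature/search` and the file names are invented for the example. It assumes the feature was merged with a merge commit, so one `git revert -m 1` backs out the whole group.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Two commits that together make up one feature.
git checkout -q -b feature/search
echo "index" > search_index.txt
git add search_index.txt
git commit -q -m "Add search index"
echo "ui" > search_ui.txt
git add search_ui.txt
git commit -q -m "Add search UI"

# Merge with an explicit merge commit so the feature stays grouped.
git checkout -q main
git merge -q --no-ff -m "Merge feature/search" feature/search

# Revert the merge relative to its first parent (the mainline).
git revert --no-edit -m 1 HEAD

# Tree view of the history, then the files left after the revert.
git log --graph --oneline
ls
```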
> Surely a better approach is to record the complete ancestry of every check-in but then fix the tool to show a "clean" history in those instances where a simplified display is desirable and edifying
From your link. The actual issue that people ought to be discussing in this comment section imo.
Why do we advocate destroying information/data about the dev process when in reality we need to solve a UI/display issue?
The amount of times in the last 15ish years I've solved something by looking back at the history and piecing together what happened (eg. refactor from A to B as part of a PR, then tweak B to eventually become C before getting it merged, but where there are important details that only resulted because of B, and you don't realize they are important until 2 years later) is high enough that I consider it very poor practice to remove the intermediate commits that actually track the software development process.
Isn't this just `--first-parent`? I think that should probably be the default in git. Maybe the only way this will happen is with a new SCM.
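For anyone who hasn't tried it, a quick sketch of what the flag does on a throwaway repo (branch and file names are placeholders): the full history keeps every intermediate commit, while the `--first-parent` view shows one entry per merge.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# A feature branch with messy intermediate commits.
git checkout -q -b feature
echo "a" > a.txt
git add a.txt
git commit -q -m "WIP: first attempt"
echo "b" > a.txt
git commit -q -am "Fix first attempt"

git checkout -q main
git merge -q --no-ff -m "Merge feature" feature

echo "Full history:"
git log --oneline main
echo "Mainline only:"
git log --oneline --first-parent main
```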
But the git authors are adamant that there's no convention for linearity, and somehow extended that to why there shouldn't be a "theirs" merge strategy to mirror "ours" (writing it out it makes even less sense, since "theirs" is what you'd want in a first-parent-linear repo, not "ours").
Which branch your work was done on is noise, not signal. There is absolutely zero signal lost by rebasing, and it prunes a lot of noise. If your branch somehow carries information, that information should be in your commit message.
I disagree. Without this info, I can't easily tell whether a commit is part of a feature or a simple hotfix. I need to rely on the committer to include the info in the commit message, which is almost never the case.
Every commit message starts with the ticket number of whatever issue tracking system you're using. If you're not using issue tracking with a system large enough for multiple devs, you've got a much bigger problem.
It's just that "gitflow" is unnecessarily complex (for most applications). With rebase you can work more or less as with "patches" and a single master, like many projects did in the '90s, just much more comfortably and securely.
> Warning - Because changing your commit history can make things difficult for everyone else using the repository, it's considered bad practice to rebase commits when you've already pushed to a repository.
Also branches that are write-only by a single person by consensus. E.g. "personal" PR branches that are not supposed to be modified by anyone but owner.
- the web tooling must react properly to this (as GH does mostly)
- comments done at the commit level are complicated to track
- and given the reach of tools like GH, people (even experienced ones) shooting themselves in the foot with this will most likely generate a decent amount of support work for the teams behind these tools
"fetched this branch" needs to include "started reviewing the PR", and probably other cases; it does mean switching modes for devs who usually rebase privately.
It's funny because I learned git on the job and we exclusively used rebase when I was learning my git fundamentals. I wouldn't say merging scares me, but it's never a tool I reach for.
> The response is often hesitation or outright fear. I get it. Rebase has a reputation for destroying work, and the warnings you see online don’t help.
The best method I found to stop being terrified of destructive operations in git when I first learned it was literally "cp -r $original-repo $new-test-repo && go-to-town". Don't know what will happen when you run `git checkout -- $file` or whatever? Copy the entire directory, run the command, look at what happens, then decide if you want to run that in your "real" repository.
Sound stupid maybe, but if it works, it works. Been using git for something like a decade now, and I'm no longer afraid of destructive git operations :)
Yeah, that works for the normal "let's try out what happens when I do this" cases, but it can get messy, depending on what you're trying out. That's why I always recommend that beginners literally "cp -r" the entire directory instead, together with the git repository, so they can feel freer to experiment and not be afraid of losing anything.
I guess it's actually more of a mental "divider" than anything, it tends to relax people more when they can literally see that their old stuff is still there, and I think git branches can "scare" people in that way.
Granted, this is about people very new to git, not people who understand what is/isn't destructive, and that just because a file isn't on disk doesn't mean git doesn't know exactly what it is.
> Granted, this is about people very new to git, not people who understand what is/isn't destructive, and that just because a file isn't on disk doesn't mean git doesn't know exactly what it is.
I've been using git almost exclusively since 2012 and feel very comfortable with everything it does and where the sharp edges are. Despite that, I still regularly use the cp -r method when doing something even remotely risky. The reason being, that I don't want to have to spend time unwinding git if I mess something up. I have the understanding and capability of doing so, but it's way easier to just cp -r and then rm -rf && cp -r again if I encounter something unexpected.
Two example situations where I do this:
1. If I'm rebasing or merging with commits that have a moderate to high risk of merge conflicts that could be complicated. I might get 75% through and then hit that one commit where there's a dozen spots of merge conflict and it isn't straightforwardly clear which one I want (usually because I didn't write them). It's usually a lot easier to just rm -rf the copy and start over in a clean cp -r after looking through the PR details or asking the person who wrote the code, etc.
2. If there are uncommitted files in the repo that I don't want to lose. I routinely slap personal helper scripts or Makefiles or other things on top of repos to ease my workflow, and those don't ever get committed. If they are non-trivial then I usually try to keep a copy of them somewhere else in case I need to restore, but I'm not always super disciplined about that. The cp -r method helps a lot.
There are more scenarios but those are the big two that come to mind.
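A minimal sketch of the cp -r approach, using made-up directory and file names (`project`, `scratch`, `my-helper.sh`): the risky commands run only in the copy, so uncommitted edits and untracked helper scripts in the original are never touched.

```shell
set -eu
work=$(mktemp -d)
mkdir "$work/project" && cd "$work/project"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "uncommitted edit" >> notes.txt   # work that was never committed
echo "helper" > my-helper.sh           # untracked personal script

# Copy the whole directory, .git included, and experiment in the copy.
cp -r "$work/project" "$work/scratch"
cd "$work/scratch"
git checkout -- notes.txt   # discards the uncommitted edit (in the copy only)
git clean -fdq              # deletes untracked files (in the copy only)

# The original is exactly as we left it.
cd "$work/project"
cat notes.txt
ls my-helper.sh

# Throw the experiment away.
rm -rf "$work/scratch"
```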
In my experience some of the trickiest situations are around gitignore file updates, CRLF conversion, case [in]sensitivity, etc., where clones and branches are less useful as a testing ground.
Whoa. Well, if it really works for you. The thing is, git has practically zero "destructive" commands: you can almost always (unless you ran the garbage collector aggressively) return to the previous state of anything committed to it. `git reflog` is a good starting point.
I think I've seen someone code a more user-friendly `git undo` front end for it.
TL;DR: people feel safer when they can see that their original work is safe. While just making a new branch and playing around there is safe in 99% of cases, people are more willing to experiment when you isolate what they want to keep.
In order for that to work you need some level of confidence that rebase doesn't mess with your branch. Rebase has a reputation for "rewriting history".
The fastest way to eliminate fear is to practice. I had the team go through it one day. They didn't get a choice. I locked us on a screen share until everyone was comfortable with how rebasing works. The call lasted maybe 90 minutes. You just have to decide one day that you (or the team) will master this shit, spend a few hours doing it, and move on.
Rebase is a super power but there are a few ground rules to follow that can make it go a lot better. Doing things across many smaller commits can make rebase less painful downstream. One of the most important things is to learn that sometimes a rebase is not really feasible. This isn't a sign that your tools are lacking. This is a sign that you've perhaps deviated so far that you need to reevaluate your organization of labor.
Rebasing replays your commits on top of the current main branch, as if you’d just created your branch today. The result is a clean, linear history that’s easier to review and bisect when tracking down bugs.
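A small sketch of that replay on a scratch repo (file and branch names are placeholders): `feature` forks from an older `main`, `main` moves on, and after the rebase the feature commit sits directly on top of the current `main` with no merge commit.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Branch off, then let main move ahead in the meantime.
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git checkout -q main
echo "update" > main.txt
git add main.txt
git commit -q -m "Update main"

# Replay the feature commit on top of the current main.
git checkout -q feature
git rebase -q main

git log --oneline --graph feature
```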
The article discusses why contributors should rebase their feature branches (pull request).
The reason they give is for clean git history on main.
The more important reason is to ensure the PR branch actually works if merged into current main. If I add my change onto main, does it then build, pass all tests, etc? What if my PR branch is old, and new commits have been added onto main that I don't have in my PR branch? Then I can merge and break main. That's why you need to update your PR branch to include the newer commits from main (and the "update" could be a rebase or a merge from main or possibly something else).
The downside of requiring contributors to rebase their PR branch is (1) people are confused about rebase and (2) if your repository has many contributors and frequent merges into main, then contributors will need to frequently rebase their PR branch, and after each rebase their PR checks need to re-run, which can be time consuming.
My preference with Github is to squash merge into main[1] to keep clean git history on main. And to use merge queue[2], which effectively creates a temp branch of main+PR, runs your CI checks, and then the PR merge succeeds into main only if checks pass on the temp branch. This approach keeps super clean history on main, where every commit includes a specific PR number, and more importantly minimizes friction for contributors by reducing frequent PR rebases on large/busy repos. And it ensures main is never broken (as far as your CI checks can catch issues). There's also basically no downside for very small repos either.
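A local imitation of that flow, purely as a sketch: this is not how GitHub implements its merge queue, the `run_checks` function is a stand-in for real CI, and the branch names `pr-branch` and `queue-candidate` are invented. The idea is just "build main+PR on a temp branch, test it, and only then move main".

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# A PR branch with a couple of work-in-progress commits.
git checkout -q -b pr-branch
echo "draft" > feature.txt
git add feature.txt
git commit -q -m "WIP"
echo "new feature" > feature.txt
git commit -q -am "Finish feature"

# Meanwhile, main has moved on.
git checkout -q main
echo "other work" > other.txt
git add other.txt
git commit -q -m "Other work on main"

# Stand-in for CI: passes only if both changes are present together.
run_checks() { test -f feature.txt && test -f other.txt; }

# Temp branch = current main + squashed PR.
git checkout -q -b queue-candidate main
git merge -q --squash pr-branch
git commit -q -m "Add feature (squashed PR)"

if run_checks; then
  git checkout -q main
  git merge -q --ff-only queue-candidate   # main advances only on success
else
  git checkout -q main                     # main is left untouched
fi
git branch -q -D queue-candidate

git log --oneline main
```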
I don't understand why I would push my local branch to origin. This will (potentially) require multiple push -f. I prefer to share the work in a state I consider complete.
As an alternative, just create a new branch! `git branch savepoint-pre-rebase`. That's all. This is extremely cheap (just copy a reference to a commit) and you are free to play all you want.
You are a little more paranoid? `git switch -c test-rebase` and work over the new branch.
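Spelled out as a sketch on a scratch repo (only `savepoint-pre-rebase` comes from the comment above, the other names are placeholders): take the savepoint, rebase, and if you don't like the result, point the branch back at the savepoint.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git checkout -q main
echo "update" > main.txt
git add main.txt
git commit -q -m "Update main"

# Savepoint: just another name for the current commit.
git checkout -q feature
before=$(git rev-parse HEAD)
git branch savepoint-pre-rebase

git rebase -q main
echo "after rebase:  $(git rev-parse --short HEAD)"

# Changed our mind: go back to exactly where we were.
git reset -q --hard savepoint-pre-rebase
echo "after restore: $(git rev-parse --short HEAD)"
```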
Yeah, deleting your local clone and starting over should normally not be necessary, unless you really mess things up badly.
The "local backup branch" is not really needed either because you can still reference `origin/your-branch` even after you messed up a rebase of `your-branch` locally.
Even if you force-pushed and overwrote `origin/your-branch` it's most likely still possible to get back to the original state of things using `git reflog`.
For amateurs at Git, recovery branches/tags are probably easier to switch back to than digging through reflog. Particularly if you're interacting with Git via some GUI that hides reflog away as some advanced feature.
GitHub is not Git, but I find that the Squash and Merge functionality in GitHub's Pull Request system means I no longer need to worry about rebasing or squashing my commits locally before merging.
At work though it is still encouraged to rebase, and I have sometimes forgotten to squash and then had to abort, or just suck it up and resolve conflicts from my many local commits.
Squash merges are a hacky solution to the git bisect problem that was solved correctly by --first-parent 20 years ago. There are fully employed software developers working on important stuff that literally were never alive in a world where squash merges were needed.
Don't erase history. Branch to a feature branch, develop in as many commits as you need, then merge to main, always creating a merge commit. Oftentimes, those commit messages that you're erasing with a squash are the most useful documentation in the entire project.
Maintaining linear history is arguably more work. But excessively non-linear history can be so confusing to reason over.
Linear history is like reality: One past and many potential futures. With non-linear history, your past depends on "where you are".
    ----- M -----+--- P
                /
    ----- D ---+
Say I'm at commit P (for present). I got married at commit M and got a dog at commit D. So I got married first and got a Dog later, right? But if I go back in time to commit D where I got the dog, our marriage is not in my past anymore?! Now my wife is sneezing all the time. Maybe she has a dog allergy. I go back in time to commit D but can't reproduce the issue. Guess the dog can't be the problem.
> So I got married first and got a Dog later, right?
No. In one reality, you got married with no dog, and in another reality you got a dog and didn't marry. Then you merged those two realities into P.
Going "back in time to commit D" is already incorrect phrasing, because you're implying linear history where one does not exist. It's more like you're switching to an alternate past.
I don't really agree that it's harder to reason over in the sense that it's hard to understand the consequences, but I also agree that a linear history is superior for troubleshooting, just like another comment pointed out that single squashed commits onto a main branch makes it easier to troubleshoot because you go from a working state to a non-working state between two commits.
There are other tricky time issues with staging/prod parallel branching models too: the most recent merge (to prod) contains older content, so time slips... Maybe for most people it's obvious, but it caused me confusion a few times when comparing various docker images.
> the most recent merge (to prod) contains older content
Can't that also happen with a rebase? Isn't it an (all too easy to make) error any time you have conflicting changes from two different branches that you have to resolve? Or have I misunderstood your scenario?
Until you commit to trunk, you haven't gotten the dog. You're proposing getting a dog. The action of getting the dog happens at merge to trunk. That's when history is created.
You omitted the merge commit. M is taken so let's go with R. You jump back to M to confirm that the symptoms really don't predate the marriage. Then you jump to R to reproduce and track down the underlying cause of the bad interaction.
Had you simply rebased you would have lost the ability to separate the initial working implementation of D from the modifications required to reconcile it with M (and possibly others that predate it). At least, unless you still happen to have a copy of your pre-rebase history lying around but I prefer not to depend on happenstance.
> Had you simply rebased you would have lost the ability to separate the initial working implementation of D from the modifications required to reconcile it with M
I'd say: cleaning that up is an advantage. Why keep that around? It wouldn't be necessary if there was no update on the main branch in the meantime. With rebase you just pretend you started working after that update on main.
For the reason I stated that you quoted right there. Separating the potentially quite large set of changes of the initial (properly working) feature from the (hopefully not too large) set of changes reconciling that feature with the other (apparently incompatible in this example) feature. It provides yet another option for filtering the irrelevant from the relevant thus could prove quite useful at times.
Recall that the entire premise is that there's a bug (the allergy). So at some point a while back something went wrong and the developer didn't notice. Our goal is to pick up the pieces in this not-so-ideal situation.
What's the advantage of "cleaning up" here? Why pretend anything? In this context there shouldn't be a noticeable downside to having a few extra kilobytes of data hanging around. If you feel compelled to "clean up" in this scenario I'd argue that's a sign you should be refactoring your tools to be more ergonomic.
It might be worthwhile to consider the question, why have history in the first place? Why not periodically GC anything other than the N most recent commits behind the head of each branch and tag?
That’s also because multiple concerns are being squeezed into the same exposed output through a common feature. Having one branch that provides a linear logical overview of the project's feature progression is not incompatible with having many other branches with all kinds of messes going back and forth, merging and forking each other and so on.
In my experience, when there is a bug, it’s often quicker to fix it without looking at the past commits, even when a regression occurs. If it’s not obvious from just looking at the current state of the code, asking whoever touched that part last will generally be a better shortcut, because there is so much more in that person's mind than in the whole git history.
Yes, logs and commit history can bring the "aha" insight, and on some rare occasions it’s nice to have git bisect at hand.
Maybe that’s just me, and the pinnacle of best engineers will always trust the source tree as most important source of information and starting point to move forward. :)
> the worst case scenario for a rebase gone wrong is that you delete your local clone and start over. That’s it. Your remote fork still exists.
This is absolute nonsense. You commit your work, and make a "backup" branch pointing at the same commit as your branch. The worst case is you reset back to your backup.
I was working on a local branch, periodically rebasing it to master. All was well, my git history was beautiful etc.
Then down the line I realised something was off. Code that should have been there wasn't. In the end I concluded some automatic commit application while rebasing gobbled up my branch changes. Or frankly, I don't even entirely know what happened (this is my best guess), all I know is, suddenly it wasn't there.
No big deal, right? It's VCS. Just go back in time and get a snapshot of what the repo looked like 2 weeks ago. Ah. Except rebase.
I like a clean linear history as much as the next guy, but in the end I concluded that the only real value of a git repo is telling the truth and keeping the full history of WTF really happened.
You could say I was holding it wrong, that if you just follow this one weird old trick doctors hate, rebase is fine. Maybe. But not rebasing and having a few more squiggles in my git history is a small price to pay for the peace of mind that my code change history is really, really all there.
Nowadays, if something leaves me with a chance that I cannot recreate the repo history at any point in time, I don't bother. Squash commits and keeping the branch around forever are OK in my book, for example. And I always merge with --no-ff. If a commit was never on master, it shouldn't show up in it.
> Just go back in time and get a snapshot of what the repo looked like 2 weeks ago. Ah. Except rebase.
This is false.
Any googling of "git undo rebase" will immediately point out that the git reflog stores all rebase history for convenient undoing.
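As a concrete sketch of that undo (scratch repo, placeholder names): the branch's own reflog records where it pointed before the rebase, so `feature@{1}` is the pre-rebase tip right after the rebase finishes.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
before=$(git rev-parse HEAD)

git checkout -q main
echo "update" > main.txt
git add main.txt
git commit -q -m "Update main"

git checkout -q feature
git rebase -q main

# The branch reflog lists every position the branch has had.
git reflog show feature

# feature@{0} is the rebased tip; feature@{1} is where it was before.
git reset -q --hard 'feature@{1}'
git log --oneline feature
```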
Shockingly, git, being a VCS, has version control for the... versions of things you create in it, no matter whether via merge or rebase or cherry-pick or whatever. You can of course undo all of that.
Up to a point - they are garbage collected, right?
And anyway, I don't want to dig this deep in git internals. I just want my true history.
Another way of looking at it is that given real history, you can always represent it more cleanly. But without it you can never really piece together what happened.
The reflog is not a git internal -- it is your local repository's "true history", including all operations that you ran.
The `git log` history that you push is just that curated specific view into what you did that you wish to share with others outside of your own local repository.
The reflog is to git what Ctrl+Z is to Microsoft Word. Saying you don't want to use the reflog to undo a rebase is a bit like saying you don't want to use Ctrl+Z to undo mistakes in Word.
(Of course the reflog is a bit more powerful of an undo tool than Ctrl+Z, as the reflog is append-only, so undoing something doesn't lose you the newer state, you can "undo the undo", while in Word, pressing Ctrl+Z and then typing something loses the tail of the history you undid.)
Indeed, like for Word, the undo history expires after a configurable time. The default is 90 days for reachable changes and 30 days for unreachable changes, which is usually enough to notice whether one messed up one's history and lost work. You can also set it to never expire.
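For reference, the two settings involved are `gc.reflogExpire` and `gc.reflogExpireUnreachable`. A sketch of turning expiry off for a single repository (shown on a scratch repo; add `--global` to apply it everywhere, which is a choice you may or may not want):

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q

# Keep reflog entries forever in this repository.
git config gc.reflogExpire never
git config gc.reflogExpireUnreachable never

# Confirm what is now configured.
git config --get gc.reflogExpire
git config --get gc.reflogExpireUnreachable
```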
It is fine for people to prefer merge over rebase histories to share the history of parallel work (if in turn they can live with the many drawbacks of not having linear history).
But it is misleading to suggest that rebase is more likely to lose work from interacting with it. Git is /designed/ to not lose any of your work on the history -- no matter the operation -- via the reflog.
But it's at best much harder to find stuff in the reflog than to simply use git's history browsing tools. "What's the state of my never-rebased branch at time X" is a trivial question to answer. Undoing a rebase, at best, involves some hard resets or juggling commit hashes.
None of it is impossible, but IMHO it's a lot of excitement of the wrong kind for essentially no reward.
> I always use VS Code for this step. Its merge conflict UI is the clearest I’ve found: it shows “Accept Current Change,” “Accept Incoming Change,” “Accept Both Changes,” and “Compare Changes” buttons right above each conflict.
I still get confused by vscode’s changing the terms used by Git. «Current» vs «incoming» are not clear, and can be understood to mean two different things.
- Is “current” what is on the branch I am rebasing on? Or is it my code? (It’s my code)
- Is “incoming” the code I’m adding to the repo? Or is it what i am rebasing on to? (Again, the latter is correct)
I find that many tools are trying to make Git easier to understand, but changing the terms is not so helpful. Since different tools seldom change to the same words, it just clutters any attempts to search for coherent information.
Git's "ours"/"theirs" terminology is often confusing to newcomers, especially when from a certain (incorrect, but fairly common) point of view their meaning may appear to be swapped between merge and rebase. I think in an attempt to make the terminology less confusing UIs tend to reinvent it, but they always fail miserably, ending up with the same problem, just with slightly different words.
This constant reinvention makes the situation even worse, because now the terminology is not only confusing, but also inconsistent across different tools.
We use SVN at work and it's a nightmare there too, "mine" and "theirs" and whatnot. I frequently end up looking at historical versions just to verify which is which.
If I have a merge conflict I typically have to be very conscious about what was done in both versions, to make sure the combination works.
I wish for "working copy" and "from commit 1234 (branch xyz)" or something informative, rather than confusing catch-all terms.
Using SmartSVN which makes life a fair bit better but still keeps this confusing terminology.
We'll be migrating to Git this year though so.
For reference, the codebase is over 20 years old, and includes binary dependencies like libraries. Makes it easy to compile old versions when needed, not so easy on the repository size...
I think even presenting them as options makes it even more confusing to newcomers. Usually I find that neither is correct and there's a change on both sides I need to manually merge (so I don't even pay attention to the terminology), but I've seen co-workers just blindly choose their changes because it's familiar looking then get confused when it doesn't work right.
For merges, current is the branch you are on. For rebases, it helps to see them as a series of cherry-picks, so current would be the branch you would be on while doing the cherry-pick equivalent to this step of the rebase.
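You can check this for yourself with a deliberately conflicting rebase. The sketch below uses a scratch repo with a made-up `greeting.txt`. In a conflicted index, stage 2 is git's "ours" and stage 3 is "theirs", and during a rebase "ours" turns out to be the branch being rebased onto, not your own work.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "original" > greeting.txt
git add greeting.txt
git commit -q -m "Initial commit"

# Both branches change the same line.
git checkout -q -b feature
echo "from feature" > greeting.txt
git commit -q -am "Feature edit"

git checkout -q main
echo "from main" > greeting.txt
git commit -q -am "Main edit"

# Rebase my work onto main; it stops at the conflict.
git checkout -q feature
git rebase main >/dev/null 2>&1 || true

ours=$(git show :2:greeting.txt)     # stage 2 = "ours"
theirs=$(git show :3:greeting.txt)   # stage 3 = "theirs"
echo "ours:   $ours"
echo "theirs: $theirs"

git rebase --abort
```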
When things get messy I use Sublime Merge with two tabs, one with the code that's open in VS Code and one with the same project but different branch/commit.
It works well on Linux. I've managed to make it work with Windows + WSL but I don't recommend it.
In about 12 years of using git (jj user now) I almost never rebased through the CLI, but I found shuffling branches around in a GUI pretty intuitive. I liked GitUp[0], which gave me undo way before jj existed.
The common view that a Git GUI is a crutch is very wrong, even pernicious. To me it is the CLI that is a disruptive mediation, whereas in a GUI you can see and manipulate the DAG directly.
Obligatory jj plug: before jj, I would have agreed with the top comment[1] that rebasing was mostly unnecessary, even though I was doing it in GitUp pretty frequently — I didn't think of it as rebasing because it was so natural. Now that I use jj I see that the cost-benefit analysis around git rebase is dominated by the fact that both rebasing and conflict resolution in git are a pain in the ass, which means the benefit has to be very high to compensate. In jj they cost much less, so the neatness benefit can be quite small and still be worth it. Add on the fact that Claude Code can handle it all for you and the cost is down to zero.
git rebase appears to be a random number generator to me. I've got two PRs in flight, I realize that one branch requires something I did in the other branch. I go merge branch one, then rebase branch two off main. Half the time; success, get on with my day. The other half of the time; anarchy, madness, dogs and cats living together.
Not once have I ever debugged a problem that benefited from rebase vs merge. Fundamentally, I do not debug off git history. Not once has git history helped debug outside of looking at the blame + offending PR and diff.
Can someone tell me when they were fixing a problem and they were glad that they rebased? Bc I can't.
My preference for rebasing comes from delivering stacked PRs: when you're working on a chain of individually reviewable changes, every commit is a clean, atomic, deliverable patch. git-format-patch works well with this model. GitHub is a pain to use this way but you can do it with some extra scripts and setting a custom "base" branch.
The reason in that scenario to prefer rebasing over "merging in master" is that every merge from master into the head of your stack is a stake in the ground: you can't push changes to parent commits anymore. But the whole point of stacked diffs is that I want to be able to identify different issues while I work, which belong to different changes. I want to clean things up as I go, without bothering reviewers with irrelevant changes. "Oh this README could use a rewrite; let me fix that and push it all the way up the chain into its own little commit," or "Actually now that I'm here, let me update dependencies and ensure we're on latest before I apply my changes". IME, an ideal PR is 90% refactors and "prefactors" which don't change semantics, all the way up to "implemented functionality behind a feature flag", and 10% actual changes which change the semantics. Having an editable history that you can "keep bringing with you" is indispensable.
Debugging history isn't really related. Other than that this workflow allows you to create a history of very small, easily testable, easily reviewable, easily revertible commits, which makes debugging easier. But that's a downstream effect.
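One common way to get that "editable history" is fixup commits plus an autosquash rebase. The sketch below is just one possible mechanic, not necessarily the scripts mentioned above. The branch name `stack` and the file contents are invented, and `GIT_SEQUENCE_EDITOR=true` is only there to accept the generated todo list without opening an editor.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# A small stack: a docs cleanup first, then the actual change.
git checkout -q -b stack
echo "docs v1" > README.md
git add README.md
git commit -q -m "Rewrite README"
readme_commit=$(git rev-parse HEAD)

echo "code" > feature.txt
git add feature.txt
git commit -q -m "Implement feature behind flag"

# Later: spot a README problem and record the fix against the earlier commit.
echo "docs v2" > README.md
git commit -q -a --fixup="$readme_commit"

# Fold the fixup into the commit it targets, keeping the stack at two commits.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash main

git log --oneline main..stack
```

After the rebase the README fix lives inside "Rewrite README" itself, so a reviewer of the feature commit never sees it.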
Git won, which is why I've been using it for more than 10 years, but that doesn't mean it was ever the best. It was just the most popular, and so the rest of the ecosystem makes it worth accepting the flaws (code review tools and CI systems both have much better git support, and these are two critical things that will work against you if you use anything else).
I think the question was about situations where you were glad to rebase, when you could have merged instead
All the commits for your feature get popped on top the commits you brought in from main. When you are putting together your PR you can more easily squash your commits together and fix up your commit history before putting it out for review.
It is a preference thing for sure but I fall into the atomic, self contained, commits camp and rebase workflows make that much cleaner in my opinion. I have worked with both on large teams and I like rebase more but each have their own tradeoffs
especially since every developer has a different idea of what a commit should be, with there being no clear right answer
Are you saying that you've never used git bisect? If that's the case, I think you're missing out.
If the contributor count is high enough (or you're otherwise in a role for which "contribution" is primarily adjusting others' code), or the behaviors that get reported in bugs are specific and testable, then bisect is invaluable.
If you're in a project where buggy behavior wasn't introduced so much as grew (e.g. the behavior evolved A -> B -> C -> D -> E over time and a bug is reported due to undesirable interactions between released/valuable features in A, C, and E), then bisecting to find "when did this start" won't tell you much that is useful. If you often have to write bespoke test scripts to run in bisect (e.g. because "test for presence of bug" is a process that involves restarting/orchestrating lots of services and/or debugging by interacting with a GUI), then you have to balance the time spent writing those with the time it'd take for you to figure out the causal commit by hand. If you're in a project where you're personally familiar with roughly what was released when, or where the release process/community is well-connected, it's often better to promote practices like "ask in Slack/the mailing list whether anyone has made changes to ___ recently, whoever pipes up will help you debug" rather than "everyone should be really good at bisect". Those aren't mutually exclusive, but they both do take work to install in a community and thus have an opportunity cost.
This and many other perennial discussions about Git (including TFA) have a common cause: people assume that criticisms/recommendations for how to use Git as a release coordinator/member of a disconnected team of volunteers apply to people who use Git who are members of small, tightly-coupled teams of collaborators (e.g. working on closed-source software).
It is a tragedy that more people don't know about it.
In this case, rebasing is nice because our changes stay in a contiguous block at the top (vs merging which would interleave them), so it's easy for me and others to see exactly where our fork diverges.
I like to keep a linear history mainly so I don't have to think very hard about tools like that.
To answer your question directly, if somewhat glibly, I’m glad I rebased every time I go looking for something in the history because I don’t have to think about the history as a graph. It’s easier.
More to your point, there are times when blame on a line does not show the culprit. If you move code, or do anything else to that line, then you have to keep searching. Sometimes it’s easier to look at the entire patch history of a file. If there is a way to repeatedly/recursively blame on a line, that’s cool and I’d love to know about it.
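One option that may be close to what you're asking for is `git log -L`, which follows a line range back through history instead of stopping at the last commit that touched it. A sketch on a scratch repo (the file `config.txt` and its contents are made up):

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'alpha\nbeta\n' > config.txt
git add config.txt
git commit -q -m "Add config"

printf 'alpha\nbeta = 2\n' > config.txt
git commit -q -am "Give beta a value"

# The line moves down by one AND changes in this commit.
printf 'header\nalpha\nbeta = 3\n' > config.txt
git commit -q -am "Add header and bump beta"

# Follow the line matching /^beta/ through every commit that changed it.
git log -L '/^beta/,+1:config.txt' --format='commit-subject: %s'
```

Each entry in the output comes with the diff for just that line range, which is roughly "blame, then blame again on the previous version" done for you.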
I now manage two junior engineers and I insist that they squash and rebase their work. I’ve seen what happens if they don’t. The merges get tangled and crazy, they include stuff from other branches they didn’t mean to, etc. The squash/rebase flow has been a way to make them responsible for what they put into the history, in a way that is simple enough that they got up to speed and own it.
Rebasing on main loses provenance.
If you want a clean history, do it in the PR, before merging it. That way the PR is the single unit of work.
Well if I have a diff of the PR with just the changes, then the PR is already a "unit of work," regardless of merge or rebase, right?
You can write a test (outside of source control) and run `git bisect good` on a good commit and `git bisect bad` on bad one and it'll do a binary search (it's up to you to rerun your test each time and tell git whether that's a good or a bad commit). Rather quickly, it'll point you to the commit that caused the regression.
If you rebase, that commit will be part of a block of commits all from the same author, all correlated with the same feature (and likely in the same PR). Now you know who you need to talk to about it. If you merge, you can still start with the author of the commit that git bisect found, but it's more likely to be interleaved with nearby commits in such a way that when it was under development, it had a different predecessor. That's a recipe for bugs that get found later than they otherwise would've.
If you're not using git history to debug, you're probably not really aware of which problems would've turned out differently if the history was handled differently. If you do, you'll catch cases where one author or another would've caught the bug before merging to main, had they bothered to rebase, but instead the bug was only visible after both authors thought they were done.
not true. You can use
where script-command is a ... script that will test the result of the build.
Besides testing for a perf slow down, any other use cases for git bisect + rebase?
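For the bisect discussion above, a minimal sketch of letting git rerun the test itself via `git bisect run`, on the assumption that this is the kind of command meant. The repo, the `check.sh` script and the "BUG" marker are all hypothetical; a real check would build and test the project.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"
git config commit.gpgsign false

# Toy history of 8 commits; commit 5 introduces the regression.
for i in 1 2 3 4 5 6 7 8; do
  if [ "$i" -eq 5 ]; then echo "BUG" >> app.txt; else echo "change $i" >> app.txt; fi
  git add app.txt
  git commit -q -m "commit $i"
done
good=$(git rev-list --max-parents=0 HEAD)   # the root commit is known to be good

# Untracked test script: exit 0 means "good", exit 1 means "bad".
cat > check.sh <<'EOF'
#!/bin/sh
! grep -q BUG app.txt
EOF
chmod +x check.sh

git bisect start HEAD "$good" >/dev/null    # bad commit first, then good commit
git bisect run ./check.sh >/dev/null        # git checks out commits and runs the script
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"
```

Exit code 125 from the script tells bisect to skip a commit, which helps with commits that never compiled.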
This is especially true if you have multiple repos and builds from each one, such that you can't just checkout the commit for build X.Y.Z and easily check if the code contains that or not (you'd have to track through dependency builds, checkout those other dependencies, possibly repeat for multiple levels). If the date of a commit always reflects the date it made it into the common branch, a quick git log can tell you the basic info a lot of the time.
https://lottia.net/notes/0013-git-jujutsu-miniature.html
- Reshuffle commits into a more logical order.
- Edit commit subjects if I notice a mistake.
- Squash (merge) commits. Often, for whatever reason pieces of a fix end up in separate commits and it's useful to collect and merge them.
I'd like to make every commit perfect the first time but I haven't managed to do that yet. git-rebase really helps me clean things up before pushing or merging a branch.
In fact, I've been using Jujutsu for ~2 years as a drop-in and nobody complained (outside of the 8 small PRs chained together). Git is great as a backend, but Jujutsu shines as a frontend.
[0]: https://www.jj-vcs.dev/latest/
Pre-jujutsu, I never rebased unless my team required it. Now I do it all the time.
Pre-jj, I never had linear history, unless the team required it. Now most of my projects have linear history.
A better UI makes a huge difference.
My peers are my reviewers...
`jj` is the only tool that makes me use `rebase` by choice. Before, I saw it as a punishment imposed by my team's wishes :)
Besides, Magit rebasing is also pretty sweet.
[0]: https://github.com/bolivier/jj-mode.el
Understanding of local versus origin branch is also missing or mystical to a lot of people and it’s what gives you confidence to mess around and find things out
Replaying commits one-by-one is like a history quiz. It forces me to remember what was going on a week ago when I did commit #23 out of 45. I'm grateful that git stores that history for me when I need it, but I don't want it to force me to interact with the history. I've long since expelled it from my brain, so that I can focus on the current state of the codebase. "5 commits ago, did you mean to do that, or can we take this other change?" I don't care, I don't want to think about it.
Of course, this issue can be reduced by the "squash first, then rebase" approach. Or judicious use of "git commit --amend --no-edit" to reduce the number of commits in my branch, therefore making the rebase less of a hassle. That's fine. But what if I didn't do that? I don't want my tools to judge me for my workflow. A user-friendly tool should non-judgmentally accommodate whatever convenient workflow I adopted in the past.
If Git says, "oops, you screwed up by creating 50 lazy commits, now you need to put in 20 minutes figuring out how to cleverly combine them into 3 commits, before you can pull from main!" then I'm going to respond, "screw you, I will do the next-best easier alternative". I don't have time for the judgement.
You can also just squash them into 1, which will always work with no effort.
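A rough sketch of "squash first, then rebase", assuming a branch called `feature` forked from `main`; the file names and commit messages are invented. The soft reset keeps the final tree staged, so nothing has to be combined commit by commit.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"
git config commit.gpgsign false

echo base > file.txt
git add file.txt && git commit -q -m "base"

# Five quick work-in-progress commits on a feature branch.
git checkout -q -b feature
for i in 1 2 3 4 5; do
  echo "wip $i" >> feature.txt
  git add feature.txt
  git commit -q -m "wip $i"
done

# Meanwhile main moves on.
git checkout -q main
echo upstream >> file.txt
git commit -q -am "upstream change"
git checkout -q feature

# Collapse the branch: move the branch pointer back to the fork point,
# keep all changes staged, and record them as one commit.
git reset -q --soft "$(git merge-base main HEAD)"
git commit -q -m "feature: all work in one commit"

# The rebase now replays a single commit, so conflicts are resolved once.
git rebase -q main
commits_on_branch=$(git rev-list --count main..HEAD)
echo "commits on feature after squash + rebase: $commits_on_branch"
```

The individual WIP commits are still reachable through the reflog for a while if they turn out to matter.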
Sometimes it's ok to work like this, but asking git not to be judgmental is like saying your Roomba should accommodate you by never asking you to empty its dust bag.
I always do long lived feature branches, and rarely have issues. When I hear people complain about it, I question their workflow/competence.
Lots of commits is good. The thing I liked about mercurial is you could squash, while still keeping the individual commits. And this is also why I like jj - you get to keep the individual commits while eliminating the noise it produces.
Lots of commits isn't inherently bad. Git is.
I had a branch that lived for more than a year, ended up with 800+ commits on it. I rebased along the way, and predictably the final merge was smooth and easy.
I rebase often myself, but I don’t understand the logic here.
2) small conflicts when rebasing the long lived branch on the main branch
if instead I delayed any rebasing until the long lived branch was done, I'd have no idea of the scale of the conflicts, and the task could be very, very different.
Granted, in some cases there would be no or very few conflicts, and then both approaches (long-lived branch with or without rebases along the way) would be similar.
I don’t think the tool is judgmental. It’s finicky. It requires more from its user than most tools do. Including bending over to make your workflow compliant with its needs.
I've had failures while git bisecting, hitting commits that clearly never compiled, because I'm probably the first person to ever check them out.
e.g. I'm currently working on a substantial framework upgrade to a project - I've pulled every dependency/blocker out that could be done on its own and made separate PRs for them, but I'm still left with a number of logically independent commits that by their nature will not compile on their own. I could squash e.g. "Update core framework", "Fix for new syntax rules" and "Update to async methods without locking", but I don't know that reviewers and future code readers are better served by that.
Where you have two repositories, one "polished" where every commit always passes, and another for messier dev history.
If you have expensive e2e tests, then you might want to keep a 'latest' tag on main that's only updated when those pass.
Sometimes people look sort of "superstitious" to me about Git. I believe this is caused by learning Git through web front-ends such as Github, GitLab, Gitea etc., that don't tell you the entire truth; desktop GUI clients also let the users only see Git through their own, more-or-less narrow "window".
TBH, sometimes Git can behave in ways you don't expect, like seeing conflicts when you thought there wouldn't be (but up to now never things like choosing the "wrong" version when doing merges, something I did fear when I started using it a ~decade ago).
However one usually finds an explanation after the fact. Something I've learned is that Git is usually right, and forcing it to do things is a good recipe to mess things up badly.
I tend to rebase my unpushed local changes on top of upstream changes. That's why rebase exists. So you can rewrite your changes on top of upstream changes and keep life simple for consumers of your changes when they get merged. It's a courtesy to them. When merging upstream changes gets complicated (lots of conflicts), falling back to merging gives you more flexibility to fix things.
The resulting pull requests might get a bit ugly if you merge a lot. One solution is squash merging when you finally merge your pull request. This has as the downside that you lose a lot of history and context. The other solution is to just accept that not all change is linear and that there's nothing wrong with merging. I tend to bias to that.
If your changes are substantial, conflict resolution caused by your changes tends to be a lot easier for others if they get lots of small commits, a few of which may conflict, rather than one enormous one that has lots of conflicts. That's a good reason to avoid squash merges. Interactive rebasing is something I find too tedious to bother with usually. But some people really like those. But that can be a good middle ground.
It's not that one is better than the other. It's really about how you collaborate with others. These tools exist because in large OSS projects, like Linux, where they have to deal with a lot of contributions, they want to give contributors the tools they need to provide very clean, easy to merge contributions. That includes things like rewriting history for clarity and ensuring the history is nice and linear.
I think it should be possible to assign different instances of the repository different "roles" and have the tooling assist with that. For example. A "clean" instance that will only ever contain fully working commits and can be used in conjunction with production and debugging. And various "local" instances - per feature, per developer, or per something else - that might be duplicated across any number of devices.
You can DIY this using raw git with tags, a bit of overhead, and discipline. Or the github "pull" model facilitates it well. But either you're doing extra work or you're using an external service. It would be nice if instead it was natively supported.
This might seem silly and unnecessary but consider how you handle security sensitive branches or company internal (proprietary) versus FOSS releases. In the latter case consider the difficulty of collaborating with the community across the divide.
This is one way to see things and work and git supports that workflow. Higher-level tooling tailored for this view (like GitHub) is plentiful.
> There's no reason a local copy should have the exact same implementation as a repository
...Except to also support the many git users who are different from you and in different context. Bending gits API to your preferences would make it less useful, harder to use, or not even suitable at all for many others.
> git made a wrong turn in this, let's just admit it.
Nope. I prefer my VCS decentralized and flexible, thank you very much. SVN and Perforce are still there for you.
Besides, it's objectively wrong calling it "a wrong turn" if you consider the context in which git was born and got early traction: Sharing patches over e-mail. That is what git was built for. Had it been built your way (first-class concepts coupled to p2p email), your workflow would most likely not be supported and GitHub would not exist.
If you are really as old as you imply, you are showing your lack of history more than your age.
However, even better for me (and my team) is squash on PR resolve.
I know a lot of people want to maintain the history of each PR, but you won't need it in your VCS.
You should always be able to roll back main to a real state. Having incremental commits between two working stages creates more confusion during incidents.
If you need to consult the work history of transient commits, that can live in your code review software with all the other metadata (such as review comments and diagrams/figures) that never make it into source control.
I strongly disagree. Losing this discourages swarming on issues and makes bisect worse.
> You should always be able to roll back main to a real state. Having incremental commits between two working stages creates more confusion during incidents.
If you only use merge commits this shouldn't be any more difficult. You just need to make sure you specify that you want to use the first parent when doing reverts.
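To make the first-parent point concrete, a small sketch with made-up branch and file names: the feature lands as one merge commit, `git log --first-parent` shows main as a sequence of landed changes, and `git revert -m 1` undoes the whole feature by naming parent 1 (main) as the mainline.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"
git config commit.gpgsign false

echo base > base.txt
git add base.txt && git commit -q -m "base"

git checkout -q -b feature
echo one > feature.txt
git add feature.txt && git commit -q -m "feature part 1"
echo two >> feature.txt
git commit -q -am "feature part 2"

git checkout -q main
echo other > other.txt
git add other.txt && git commit -q -m "unrelated main work"
git merge -q --no-ff -m "Merge feature" feature

# Roll the feature back in one step; -m 1 keeps the first parent's side.
git revert --no-edit -m 1 HEAD >/dev/null

# Main as a list of landed changes, hiding the branch's internal commits.
first_parent_log=$(git log --first-parent --format=%s)
printf '%s\n' "$first_parent_log"
```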
Well there's your problem. Why are you assuming there are non-working commits in the history with a merge based workflow? If you really need to make an incremental commit at a point where the build is broken you can always squash prior to merge. There's no reason to conflate "non-working commits" and "merge based workflow".
Why go out of the way to obfuscate the pathway the development process took? Depending on the complexity of the task the merge operation itself can introduce its own bugs as incompatible changes to the source get reconciled. It's useful to be able to examine each finished feature in isolation and then again after the merge.
> with all the other metadata (such as review comments and diagrams/figures) that never make it into source control.
I hate that all of that is omitted. It can be invaluable when debugging. More generally I personally think the tools we have are still extremely subpar compared to what they could be.
Having worked on a maintenance team for years, this is just wrong. You don't know what someone will or won't need in the future. Those individual commits have had extra context that have been a massive help for me all sorts of times.
I'm fine with manually squashing individual "fix typo"-style commits, but just squashing the entire branch removes too much.
If those commits were ready for production, they would have been merged. ;)
Don't put a commit on main unless I can roll back to it.
In the same vein, the fact that commits don't record which branch they were created on as metadata is a real pain point. It's always a mess to find which commits were made for which feature or bugfix in a gitflow process...
I'll probably look into automatically adding a suffix with the current branch name to my commit messages, but that will only work for me, not for other contributors...
Also (and especially) it make it way easier to revert a single feature if all the relevant commits to that feature are already grouped.
For your issue about not knowing which branch the commits are from: that's why I love merge commits and the tree representation (I personally use 'tig', but git log also has a tree representation and GUI tools always have it too).
I also prefer Fossil to Git whenever possible, especially for small or personal projects.
[0] https://fossil-scm.org/home/doc/trunk/www/rebaseharm.md
From your link. The actual issue that people ought to be discussing in this comment section imo.
Why do we advocate destroying information/data about the dev process when in reality we need to solve a UI/display issue?
The amount of times in the last 15ish years I've solved something by looking back at the history and piecing together what happened (eg. refactor from A to B as part of a PR, then tweak B to eventually become C before getting it merged, but where there are important details that only resulted because of B, and you don't realize they are important until 2 years later) is high enough that I consider it very poor practice to remove the intermediate commits that actually track the software development process.
But the git authors are adamant that there's no convention for linearity, and somehow extended that to why there shouldn't be a "theirs" merge strategy to mirror "ours" (writing it out it makes even less sense, since "theirs" is what you'd want in a first-parent-linear repo, not "ours").
Except perhaps crappy gui options in GitHub. I really wish they added that option as a button.
https://docs.github.com/en/get-started/using-git/about-git-r...
> Warning - Because changing your commit history can make things difficult for everyone else using the repository, it's considered bad practice to rebase commits when you've already pushed to a repository.
A similar warning is in Atlassian docs.
Branches that people are expected to track (i.e. pull from or merge into theirs regularly) should never rebase/force-push.
Branches that are short-lived or only exist to represent some state can do so quite often.
- the web tooling must react properly to this (as GH does mostly)
- comments done at the commit level are complicated to track
- and given the reach of tools like GH, people shooting themselves in the foot with this (even experienced ones) most likely generate a decent amount of support work for the teams behind these tools
Or just do a merge and move on with your life.
lol if 1k words is "not easy" for you, i think you have bigger problems than merge vs rebase.
The best method to stop being terrified of destructive operations in git when I first learned it was literally "cp -r $original-repo $new-test-repo && go-to-town". Don't know what will happen when you run `git checkout -- $file` or whatever? Copy the entire directory, run the command, look at what happens, then decide if you want to run that in your "real" repository.
Sounds stupid maybe, but if it works, it works. Been using git for something like a decade now, and I'm no longer afraid of destructive git operations :)
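The copy-first approach can be sketched like this; the directory names and file contents are placeholders. The destructive command only ever runs in the copy, and the real working tree keeps its uncommitted edit.

```shell
set -eu
work=$(mktemp -d)
mkdir "$work/real-repo" && cd "$work/real-repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"
git config commit.gpgsign false

echo "committed line" > notes.txt
git add notes.txt && git commit -q -m "add notes"
echo "uncommitted edit" >> notes.txt        # work that is not in git yet

# Copy the whole directory, .git included, and experiment only in the copy.
cp -R "$work/real-repo" "$work/scratch-repo"
git -C "$work/scratch-repo" checkout -- notes.txt   # discards the edit in the copy

real_lines=$(grep -c '' "$work/real-repo/notes.txt")
scratch_lines=$(grep -c '' "$work/scratch-repo/notes.txt")
echo "real repo: $real_lines lines, scratch copy: $scratch_lines lines"

# Done looking; throw the copy away.
rm -rf "$work/scratch-repo"
```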
And still one step further, just create a new branch to deal with the rebase/merge.
Yes there are many UX pain points in using git, but it also has the great benefits of extremely cheap and fast branching to experiment.
I guess it's actually more of a mental "divider" than anything, it tends to relax people more when they can literally see that their old stuff is still there, and I think git branches can "scare" people in that way.
Granted, this is about people very new to git, not people who understand what is/isn't destructive, and just because a file isn't on disk doesn't mean git doesn't know exactly what it is.
I've been using git almost exclusively since 2012 and feel very comfortable with everything it does and where the sharp edges are. Despite that, I still regularly use the cp -r method when doing something even remotely risky. The reason being, that I don't want to have to spend time unwinding git if I mess something up. I have the understanding and capability of doing so, but it's way easier to just cp -r and then rm -rf && cp -r again if I encounter something unexpected.
Two examples situations where I do this:
1. If I'm rebasing or merging with commits that have a moderate to high risk of merge conflicts that could be complicated. I might get 75% through and then hit that one commit where there's a dozen spots of merge conflict and it isn't straightforwardly clear which one I want (usually because I didn't write them). It's usually a lot easier to just rm -rf the copy and start over in a clean cp -r after looking through the PR details or asking the person who wrote the code, etc.
2. If there are uncommitted files in the repo that I don't want to lose. I routinely slap personal helper scripts or Makefiles or other things on top of repos to ease my workflow, and those don't ever get committed. If they are non-trivial then I usually try to keep a copy of them somewhere else in case I need to restore, but I'm not always super disciplined about that. The cp -r method helps a lot.
There are more scenarios but those are the big two that come to mind.
I think I've seen someone code a more user-friendly `git undo` front end for it.
TL;DR: people feel safer when they can see that their original work is safe. While just making a new branch and playing around there is safe in 99% of cases, people are more willing to experiment when you isolate what they want to keep.
Rebase is a super power but there are a few ground rules to follow that can make it go a lot better. Doing things across many smaller commits can make rebase less painful downstream. One of the most important things is to learn that sometimes a rebase is not really feasible. This isn't a sign that your tools are lacking. This is a sign that you've perhaps deviated so far that you need to reevaluate your organization of labor.
Also, since you can choose to keep the fossil repo in a separate directory, that's an additional space saver.
[0] https://www3.fossil-scm.org/home/help/undo
The article discusses why contributors should rebase their feature branches (pull request).
The reason they give is for clean git history on main.
The more important reason is ensure the PR branch actually works if merged into current main. If I add my change onto main, does it then build, pass all tests, etc? What if my PR branch is old, and new commits have been added onto main that I don't have in my PR branch? Then I can merge and break main. That's why you need to update your PR branch to include the newer commits from main (and the "update" could be a rebase or a merge from main or possibly something else).
The downside of requiring contributors to rebase their PR branch is (1) people are confused about rebase and (2) if your repository has many contributors and frequent merges into main, then contributors will need to frequently rebase their PR branch, and each rebase their PR checks need to re-run, which can be time consuming.
My preference with Github is to squash merge into main[1] to keep clean git history on main. And to use merge queue[2], which effectively creates a temp branch of main+PR, runs your CI checks, and then the PR merge succeeds into main only if checks pass on the temp branch. This approach keeps super clean history on main, where every commit includes a specific PR number, and more importantly minimizes friction for contributors by reducing frequent PR rebases on large/busy repos. And it ensures main is never broken (as far as your CI checks can catch issues). There's also basically no downside for very small repos either.
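A local approximation of the squash-merge plus merge-queue idea, not how GitHub implements it: build "main + PR" on a temporary branch, run a check there, and only then move main. Branch names, the commit message and the `grep` stand-in for CI are assumptions for illustration.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"
git config commit.gpgsign false

echo base > base.txt
git add base.txt && git commit -q -m "base"

# The PR branch, with three commits.
git checkout -q -b feature
for i in 1 2 3; do
  echo "step $i" >> feature.txt
  git add feature.txt
  git commit -q -m "feature step $i"
done

# main has moved since the PR branch was cut.
git checkout -q main
echo newer > main.txt
git add main.txt && git commit -q -m "newer main work"

# Candidate = current main + squashed PR, on a temporary branch.
git checkout -q -b merge-candidate main
git merge -q --squash feature >/dev/null
git commit -q -m "Add feature (squashed PR)"

# Stand-in for the CI run; a real pipeline would build and test here.
if grep -q "step 3" feature.txt; then ci=pass; else ci=fail; fi

git checkout -q main
if [ "$ci" = pass ]; then
  git merge -q --ff-only merge-candidate   # main only advances if checks passed
fi
git branch -q -D merge-candidate
git log --format='%h %s'
```

Because the check runs against main plus the PR rather than the PR branch alone, contributors do not have to keep rebasing just to keep CI meaningful.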
1. https://docs.github.com/en/repositories/configuring-branches...
2. https://docs.github.com/en/repositories/configuring-branches...
As an alternative, just create a new branch! `git branch savepoint-pre-rebase`. That's all. This is extremely cheap (just copy a reference to a commit) and you are free to play all you want.
You are a little more paranoid? `git switch -c test-rebase` and work over the new branch.
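The savepoint idea, sketched end to end in a throwaway repo (everything except the `savepoint-pre-rebase` name is invented): mark the tip, rebase, and if the result looks wrong, point the branch back at the savepoint.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"
git config commit.gpgsign false

echo base > base.txt
git add base.txt && git commit -q -m "base"
git checkout -q -b feature
echo work > feature.txt
git add feature.txt && git commit -q -m "feature work"
git checkout -q main
echo newer > main.txt
git add main.txt && git commit -q -m "newer main work"
git checkout -q feature

# A branch is just a named pointer to a commit, so this costs almost nothing.
git branch savepoint-pre-rebase

git rebase -q main
rebased=$(git rev-parse HEAD)

# Not happy with the rebase? Move the branch back to the savepoint.
git reset -q --hard savepoint-pre-rebase
restored=$(git rev-parse HEAD)
echo "rebased tip: $rebased"
echo "restored tip: $restored"
```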
Also incremental rebasing with mergify/git-imerge/git-mergify-rebase/etc is really helpful for long-lived branches that aren't merged upstream.
https://github.com/brooksdavis/mergify
https://github.com/mhagger/git-imerge
https://github.com/CTSRD-CHERI/git-mergify-rebase
https://gist.github.com/nicowilliams/ea2fa2b445c2db50d2ee650...
I also love git-absorb for automatic fixups of a commit stack.
https://github.com/tummychow/git-absorb
Wouldn't it be enough to simply back up the branch (eg, git checkout -b current-branch-backup)? Or is there still a way to mess up the backup as well?
The "local backup branch" is not really needed either because you can still reference `origin/your-branch` even after you messed up a rebase of `your-branch` locally.
Even if you force-pushed and overwrote `origin/your-branch` it's most likely still possible to get back to the original state of things using `git reflog`.
At work though it is still encouraged to rebase, and I have sometimes forgotten to squash and then had to abort, or just suck it up and resolve conflicts from my many local commits.
Rebase only makes sense if you're making huge PRs where you need to break them down into smaller commits for them to make sense.
If you keep your PRs small, squashing it works well enough, and is far less work and more consistent in teams.
Expecting your team to carefully group their commits and have good commit messages for each is a lot of unnecessary extra work.
Don't erase history. Branch to a feature branch, develop in as many commits as you need, then merge to main, always creating a merge commit. Oftentimes, those commit messages that you're erasing with a squash are the most useful documentation in the entire project.
Linear history is like reality: One past and many potential futures. With non-linear history, your past depends on "where you are".
Say I'm at commit P (for present). I got married at commit M and got a dog at commit D. So I got married first and got a Dog later, right? But if I go back in time to commit D where I got the dog, our marriage is not in my past anymore?! Now my wife is sneezing all the time. Maybe she has a dog allergy. I go back in time to commit D but can't reproduce the issue. Guess the dog can't be the problem.
No. In one reality, you got married with no dog, and in another reality you got a dog and didn't marry. Then you merged those two realities into P.
Going "back in time to commit D" is already incorrect phrasing, because you're implying linear history where one does not exist. It's more like you're switching to an alternate past.
Can't that also happen with a rebase? Isn't it an (all too easy to make) error any time you have conflicting changes from two different branches that you have to resolve? Or have I misunderstood your scenario?
Had you simply rebased you would have lost the ability to separate the initial working implementation of D from the modifications required to reconcile it with M (and possibly others that predate it). At least, unless you still happen to have a copy of your pre-rebase history lying around but I prefer not to depend on happenstance.
I'd say: cleaning that up is an advantage. Why keep that around? It wouldn't be necessary if there was no update on the main branch in the meantime. With rebase you just pretend you started working after that update on main.
Recall that the entire premise is that there's a bug (the allergy). So at some point a while back something went wrong and the developer didn't notice. Our goal is to pick up the pieces in this not-so-ideal situation.
What's the advantage of "cleaning up" here? Why pretend anything? In this context there shouldn't be a noticeable downside to having a few extra kilobytes of data hanging around. If you feel compelled to "clean up" in this scenario I'd argue that's a sign you should be refactoring your tools to be more ergonomic.
It might be worthwhile to consider the question, why have history in the first place? Why not periodically GC anything other than the N most recent commits behind the head of each branch and tag?
In my experience, when there is a bug, it’s often quicker to fix it without having a look at the past commits, even when a regression occurs. If it’s not obvious just looking at the current state of the code, asking whoever touched that part last will generally give a better shortcut because there is so much more in the person's mind than in the whole git history.
Yes, logs and commit history can bring the "aha" insight, and on some rare occasions it’s nice to have git bisect at hand.
Maybe that’s just me, and the pinnacle of best engineers will always trust the source tree as most important source of information and starting point to move forward. :)
This is absolute nonsense. You commit your work, and make a "backup" branch pointing at the same commit as your branch. The worst case is you reset back to your backup.
I was working on a local branch, periodically rebasing it to master. All was well, my git history was beautiful etc.
Then down the line I realised something was off. Code that should have been there wasn't. In the end I concluded some automatic commit application while rebasing gobbled up my branch changes. Or frankly, I don't even entirely know what happened (this is my best guess), all I know is, suddenly it wasn't there.
No big deal, right? It's VCS. Just go back in time and get a snapshot of what the repo looked like 2 weeks ago. Ah. Except rebase.
I like a clean linear history as much as the next guy, but in the end I concluded that the only real value of a git repo is telling the truth and keeping the full history of WTF really happened.
You could say I was holding it wrong, that if you just follow this one weird old trick doctor hate, rebase is fine. Maybe. But not rebasing and having a few more squiggles in my git history is a small price to pay for the peace of mind that my code change history is really, really all there.
Nowadays, if something leaves me with a chance that I cannot recreate the repo history at any point in time, I don't bother. Squash commits and keeping the branch around forever are OK in my book, for example. And I always merge with --no-ff. If a commit was never on master, it shouldn't show up in it.
This is false.
Any googling of "git undo rebase" will immediately point out that the git reflog stores all rebase history for convenient undoing.
Shockingly, git, being a VCS, has version control for the... versions of things you create in it, no matter whether via merge or rebase or cherry-pick or whatever. You can of course undo all of that.
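For reference, undoing a rebase through the reflog looks roughly like this. The repo is a throwaway and the names are made up; `feature@{1}` means "where the `feature` branch pointed one update ago", which right after a rebase is the pre-rebase tip.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"
git config commit.gpgsign false

echo base > base.txt
git add base.txt && git commit -q -m "base"
git checkout -q -b feature
echo work > feature.txt
git add feature.txt && git commit -q -m "feature work"
git checkout -q main
echo newer > main.txt
git add main.txt && git commit -q -m "newer main work"
git checkout -q feature

before=$(git rev-parse HEAD)
git rebase -q main                     # rewrites the feature commit onto main

# The branch's own reflog lists every position it has pointed at.
git reflog show feature

# Put the branch back where it was before the rebase.
git reset -q --hard 'feature@{1}'
after=$(git rev-parse HEAD)
echo "before: $before"
echo "after:  $after"
```

`ORIG_HEAD` is also set by rebase and works for the immediate undo; the reflog is what still helps several operations later.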
And anyway, I don't want to dig this deep in git internals. I just want my true history.
Another way of looking at it is that given real history, you can always represent it more cleanly. But without it you can never really piece together what happened.
The `git log` history that you push is just that curated specific view into what you did that you wish to share with others outside of your own local repository.
The reflog is to git what Ctrl+Z is to Microsoft Word. Saying you don't want to use the reflog to undo a rebase is a bit like saying you don't want to use Ctrl+Z to undo mistakes in Word.
(Of course the reflog is a bit more powerful of an undo tool than Ctrl+Z, as the reflog is append-only, so undoing something doesn't lose you the newer state, you can "undo the undo", while in Word, pressing Ctrl+Z and then typing something loses the tail of the history you undid.)
Indeed, like for Word, the undo history expires after a configurable time. The default is 90 days for reachable changes and 30 days for unreachable changes, which is usually enough to notice whether one messed up one's history and lost work. You can also set it to never expire.
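The expiry windows mentioned above map to two config keys, `gc.reflogExpire` and `gc.reflogExpireUnreachable`. A minimal sketch of turning expiry off for a single repository; whether that is a good idea for a large, busy repo is a separate question.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q

# Per-repository settings; add --global to apply them to every repo.
git config gc.reflogExpire never             # entries reachable from the current tip
git config gc.reflogExpireUnreachable never  # entries not reachable from the current tip

reachable=$(git config --get gc.reflogExpire)
unreachable=$(git config --get gc.reflogExpireUnreachable)
echo "reflogExpire=$reachable reflogExpireUnreachable=$unreachable"
```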
It is fine for people to prefer merge over rebase histories to share the history of parallel work (if in turn they can live with the many drawbacks of not having linear history).
But it is misleading to suggest that rebase is more likely to lose work from interacting with it. Git is /designed/ to not lose any of your work on the history -- no matter the operation -- via the reflog.
None of it is impossible, but IMHO it's a lot of excitement of the wrong kind for essentially no reward.
Yes, but only because of reflog.
I still get confused by vscode’s changing the terms used by Git. «Current» vs «incoming» are not clear, and can be understood to mean two different things.
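- Is “current” what is on the branch I am rebasing on? Or is it my code? (During a rebase it’s the branch I am rebasing onto; during a merge it’s my code.)
- Is “incoming” the code I’m adding to the repo? Or is it what I am rebasing on to? (During a rebase it’s my own commit being replayed; during a merge it’s the branch being merged in.)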
I find that many tools are trying to make Git easier to understand, but changing the terms is not so helpful. Since different tools seldom change to the same words, it just clutters any attempts to search for coherent information.
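One way to check which side is which is to provoke a conflict and read the raw markers. In this sketch (file contents and branch names are invented) the block after `<<<<<<<`, which editors such as VS Code label "Current", holds the upstream text during a rebase, and the block before `>>>>>>>` ("Incoming") holds the commit being replayed, i.e. my own change.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"
git config commit.gpgsign false
git config merge.conflictStyle merge   # plain two-sided markers

echo "original" > greeting.txt
git add greeting.txt && git commit -q -m "base"
git checkout -q -b feature
echo "my feature text" > greeting.txt
git commit -q -am "feature change"
git checkout -q main
echo "upstream text" > greeting.txt
git commit -q -am "upstream change"
git checkout -q feature

# The rebase stops with a conflict, so ignore its non-zero exit status.
git rebase main >/dev/null 2>&1 || true
cat greeting.txt

# Text between <<<<<<< and =======, then between ======= and >>>>>>>.
head_side=$(sed -n '/^<<<<<<< /,/^=======/p' greeting.txt | sed '1d;$d')
other_side=$(sed -n '/^=======/,/^>>>>>>> /p' greeting.txt | sed '1d;$d')
echo "first block:  $head_side"
echo "second block: $other_side"

git rebase --abort
```

Running the same setup with `git merge main` from `feature` flips the roles, which is a large part of why the labels are confusing.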
This constant reinvention makes the situation even worse, because now the terminology is not only confusing, but also inconsistent across different tools.
If I have a merge conflict I typically have to be very conscious about what was done in both versions, to make sure the combination works.
I wish for "working copy" and "from commit 1234 (branch xyz)" or something informative, rather than confusing catch-all terms.
We'll be migrating to Git this year though so.
For reference, the codebase is over 20 years old, and includes binary dependencies like libraries. Makes it easy to compile old versions when needed, not so easy on the repository size...
The common view that a Git GUI is a crutch is very wrong, even pernicious. To me it is the CLI that is a disruptive mediation, whereas in a GUI you can see and manipulate the DAG directly.
Obligatory jj plug: before jj, I would have agreed with the top comment[1] that rebasing was mostly unnecessary, even though I was doing it in GitUp pretty frequently — I didn't think of it as rebasing because it was so natural. Now that I use jj I see that the cost-benefit analysis around git rebase is dominated by the fact that both rebasing and conflict resolution in git are a pain in the ass, which means the benefit has to be very high to compensate. In jj they cost much less, so the neatness benefit can be quite small and still be worth it. Add on the fact that Claude Code can handle it all for you and the cost is down to zero.
[0]: https://gitup.co/
[1]: https://news.ycombinator.com/item?id=46602056
I remain terrified.