this post was submitted on 18 Jul 2024
85 points (100.0% liked)

Programming

  • Facebook does not use Git due to scale issues with their large monorepo, instead opting for Mercurial.
  • Mercurial may be a better option for large monorepos, but Git has made improvements to support them better.
  • Despite some drawbacks, Git usage remains dominant with 93.87% share, due to familiarity, additional tools, and industry trends.
top 30 comments
[–] masterspace@lemmy.ca 63 points 4 months ago* (last edited 4 months ago)

Facebook uses Mercurial, but when people praise their developer tooling it's not just that. They use their own CLI built on top of Mercurial that cleans up its errors and commands, it all runs on their own virtual filesystem (EdenFS), their dev testing happens in a customized version of Chromium, they sync code through their own in-house equivalent of GitHub, and all of it connects super nicely into their own customized version of VS Codium.

[–] MajorHavoc@programming.dev 20 points 4 months ago

I'm pleased to report that git has made significant strides, and git submodule can now be easily used to achieve a mono-repo-like level of painful jankiness.
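
For the uninitiated, the jank in question looks roughly like this (repo URLs and paths made up):

```
# Wire a shared library into the repo as a submodule
git submodule add https://example.com/shared-lib.git libs/shared-lib
git commit -m "Add shared-lib as a submodule"

# Every fresh clone needs an extra step, or the directory sits there empty
git clone https://example.com/app.git
cd app
git submodule update --init --recursive

# Bumping the library means committing a new pointer in the outer repo
cd libs/shared-lib && git pull origin main && cd ../..
git add libs/shared-lib
git commit -m "Bump shared-lib"
```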

[–] ace@lemmy.ananace.dev 10 points 4 months ago (1 children)

Mercurial does have a few things going for it, though for most use-cases it's behind Git in almost all metrics.

I really do like the fact that it keeps a commit number counter, it's a lot easier to know if "commit 405572" is newer than "commit 405488" after all, instead of Git's "commit ea43f56" vs "commit ab446f1". (Though Git does have the describe format, which helps somewhat in this regard. E.g. "0.95b-4204-g1e97859fb" being the 4204th commit after tag 0.95b)
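
For illustration, roughly what the two look like side by side (output values made up):

```
# git: nearest tag, number of commits since that tag, short hash
git describe --tags
# e.g. 0.95b-4204-g1e97859fb  ->  4204 commits after tag 0.95b, at commit 1e97859fb

# Mercurial's local revision numbers, for comparison
hg log -r . --template '{rev}:{node|short}\n'
# e.g. 405572:ab446f1c2d3e
```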

[–] SkyNTP@lemmy.ml 12 points 4 months ago (1 children)

I suspect rebasing makes sequential commit IDs not really work in practice.

[–] wewbull@feddit.uk 7 points 4 months ago (2 children)

Rebasing updates the commit ids. It's fine. Commit IDs are only local anyway.

One thing that makes Mercurial better for rebase-based flows is obsolescence markers. The old versions of the commits still exist after a rebase and are marked as being made obsolete by the new commits. This means somebody you've shared those old commits with isn't left in hyperspace when they fetch your new commits; the history of what happened is shared too.
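
Rough sketch of what that looks like in practice, assuming the evolve extension is enabled (that's what most people use to get the full obsolescence workflow; revision numbers made up):

```
# In .hg/hgrc or ~/.hgrc (assumes evolve is installed):
#   [extensions]
#   rebase =
#   evolve =

# Rebase a series; the old commits aren't deleted, they're marked obsolete
hg rebase -s 405488 -d default

# Inspect the obsolescence history of a commit: what it became and how
hg obslog -r 405572
```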

[–] AnActOfCreation@programming.dev 4 points 4 months ago (1 children)

Commit IDs are only local anyway.

What do you mean by that?

[–] wewbull@feddit.uk 2 points 4 months ago (1 children)

You and I both clone a repo with ten changes in it. We each make a new commit. Both systems will call it commit 11. If I pull your change into my repo your 11 becomes my 12.

The sequential change IDs are only consistent locally.
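
To make that concrete (numbers and hash invented):

```
# In my clone:
hg log -r 11 --template '{rev} {node|short} {desc|firstline}\n'
# 11 3f2a9c1b7d42 my new commit

# In your clone, after pulling my change, the local number can differ,
# but the hash is identical everywhere:
hg log -r 3f2a9c1b7d42 --template '{rev} {node|short}\n'
# 12 3f2a9c1b7d42
```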

[–] AnActOfCreation@programming.dev 1 points 4 months ago (1 children)

Got it! Are they renumbered chronologically? Like if my 11 was created before your 11, would yours be the one that's renumbered?

[–] wewbull@feddit.uk 2 points 3 months ago

No. They are not renumbered. Your 11 is always the same commit. It's consistent locally (which is what I mean by "local only"), otherwise they'd change under your feet. You just can't share the numbers with others and expect the same results. You have to use the hash for that.

[–] FizzyOrange@programming.dev 3 points 4 months ago (1 children)

That's exactly the same in git. The old commits are still there, they just don't show up in git log because nothing points to them.
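
Sketch of how you can still get at them, as long as they were ever on one of your refs (hash made up):

```
# The pre-rebase commits are still in the object database,
# and the reflog still points at them
git reflog
# e.g. a1b2c3d HEAD@{1}: rebase (finish): returning to refs/heads/feature

# You can inspect or resurrect an "orphaned" commit by hash
git show a1b2c3d
git branch rescue a1b2c3d
```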

[–] aport@programming.dev 1 points 4 months ago (1 children)

Old, unreachable commits will be garbage collected.

[–] FizzyOrange@programming.dev 1 points 4 months ago (1 children)

Does that not happen with Mercurial? If not that seems like a point against it.

[–] aport@programming.dev 1 points 4 months ago (1 children)

I'm confused, the behavior you just said was "exactly the same in git" is now a problem for Mercurial?

[–] FizzyOrange@programming.dev 1 points 4 months ago (1 children)

I thought it was exactly the same based on the description.

[–] wewbull@feddit.uk 1 points 4 months ago* (last edited 4 months ago)

No, the old commit is always there, marked as obsolete with the information of what it became. No holes in history. (Assuming you use the obsolescence markers.)

[–] collapse_already@lemmy.ml 10 points 4 months ago (1 children)

I use git daily and still wonder why I had fewer merge issues on a larger team in the 1990s with command-line RCS on Solaris. Maybe we were just more disciplined then. I know we were less likely to work on the same file concurrently. I feel like I spend more time fighting the tools than I ever used to. Some of that is because of the dumb decisions that were made on our project a decade or more ago.

[–] EatATaco@lemm.ee 6 points 4 months ago (1 children)

I know we were less likely to work on the same file concurrently.

I mean, isn't that when merge conflicts happen? Isn't that your answer?

[–] collapse_already@lemmy.ml 3 points 4 months ago (1 children)

I was trying to say that tools were better about letting us know that another developer was modifying the same file as us, so we would collaborate in advance of creating the conflict.

[–] EatATaco@lemm.ee 3 points 4 months ago

I gotcha, I misunderstood

[–] Mikina@programming.dev 6 points 4 months ago

My best VCS experience so far was when working with Plastic SCM. I like how it can track merges, the code review workflow is also nice, and in general it was pretty nice to work with.

Fuck Unity, who paywalled it into unusability, though. Another amazing product that was bought and then killed by Unity's absurd monetization, same as Parsec.

[–] MonkderDritte@feddit.de 4 points 4 months ago (1 children)

[–] FizzyOrange@programming.dev 11 points 4 months ago (1 children)

That brings more problems. Despite the scaling challenges, monorepos are clearly the way to go for company code in most cases.

Unfortunately my company heavily uses submodules and it is a complete mess: people duplicating work all over the place, updates in submodules breaking their super-modules because testing becomes intractable, tons of duplicate submodules because of transitive dependencies, and cross-repo changes becoming extremely difficult.

[–] bellsDoSing@lemm.ee 3 points 4 months ago (1 children)

But if not submodules, how can one share code between (mono-)repos that rely on the same common "module" / library / etc.? Is it a matter of not letting submodule usage get out of hand, sticking to an upper limit of submodules, or are submodules to be avoided entirely for monorepos of a certain scale because there's a better option?

[–] nous@programming.dev 3 points 4 months ago (1 children)

You don't share code between monorepos; the whole point of a monorepo is that you only have one repo where all the code goes. Want to share a library? Just start using it, since it's just in a different directory.

Submodules are a poor way to share code between lots of small separate repos. IMO they should never be used as I have never seen them work well.

If you don't want a monorepo, then have your repos publish code to artifact stores/registries that can be reused by other projects. But IMO that just adds more complexities and problems than having everything in a single repo does.
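
A hypothetical layout, just to illustrate the "it's just in a different directory" point (all names made up):

```
# Monorepo where the shared library is consumed by path,
# not as a submodule or a published package:
#
#   libs/shared-lib/     <- the common code
#   services/backend/    <- imports it via its in-repo path
#   apps/frontend/       <- same
#
# One atomic commit can change the library and every consumer together:
git add libs/shared-lib services/backend apps/frontend
git commit -m "Change shared-lib API and update all callers"
```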

[–] bellsDoSing@lemm.ee 2 points 4 months ago (1 children)

So AFAIU, if a company had:

  • frontend
  • backend
  • desktop apps
  • mobile apps

... and all those apps shared some smaller, self-developed libraries / components with the frontend and/or backend, then the "no submodules, but one big monorepo" approach would be to just put all those apps into that monorepo as well and simply reference whatever shared code there might be via relative paths, effectively tracking "latest", or maybe some distinct "stable version folders" (not sure if that's a thing).

Anyway, I certainly never thought to go that far, because having an app that's "mostly independent" from a codebase perspective be in its own repo seemed beneficial. But yeah, it seems to me this is a matter of scale, and at some point the cost of not having everything in a monorepo becomes too great.

Thanks!

[–] FizzyOrange@programming.dev 2 points 4 months ago (1 children)

Yeah exactly that. Conceptually it's far superior to manyrepos. But it does have downsides:

  • git will be slower, and it doesn't really have great support for this way of working. I mean it provides raw commands for partial checkouts... but you're kind of on your own.
  • You can't realistically view a git log --graph any more since there will be just way too many commits. Though tbf you can get to that state without a monorepo if you have a big project and work with numskulls who make 50 commits for a small MR and don't squash.

Also it's not really a downside since you should be doing this anyway, but you need to use a build tool that sandboxes dependencies so it can guarantee there are no missing edges in your dependency graph (Bazel, Buck, Pants, Please, Landlock Make, etc.). Otherwise you will be constantly breaking master when things aren't checked in CI that should be.
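
On the "raw commands for partial checkouts" point, this is roughly what git gives you out of the box (repo URL and paths made up):

```
# Partial clone + sparse checkout: fetch blobs lazily and only materialize
# the directories you actually work on
git clone --filter=blob:none --sparse https://example.com/monorepo.git
cd monorepo
git sparse-checkout set libs/shared-lib apps/frontend

# Widen the checkout later when you need another project
git sparse-checkout add services/backend
```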

[–] bellsDoSing@lemm.ee 1 points 3 months ago

True, git itself can't prevent people from creating a mess of a commit graph.

TBH, I've never encountered most of the build systems mentioned here. But this makes it clearer that one can't reason about how viable a "one big monorepo only" approach might be by just considering the capabilities of current git while coming from a "manyrepo" mindset. Likely that was the pitfall I fell into coming into this discussion.

[–] greysemanticist@lemmy.one 4 points 4 months ago (1 children)

jujutsu is a fresh take on git: you describe the work you're about to do with jj new -m 'message'. Do the work. Anything not previously ignored in .gitignore is ready to commit with jj ci. You don't have to git add anything. No futzing with stashes to switch or refocus work. Need that file back? jj restore FILENAME.
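
Rough sketch of the flow (filenames made up; assumes a repo already set up for jj, e.g. via jj git clone):

```
# Start a new change and describe the intent up front
jj new -m "Add retry logic to the uploader"

# ...edit files; there's no staging area, the working copy is the change
jj st                      # see what changed

# Put one file back the way the parent change had it
jj restore src/uploader.rs

# Wrap up the change; jj ci drops you into an editor to tweak the description
jj ci
```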

[–] AnActOfCreation@programming.dev 7 points 4 months ago (1 children)

It's very optimistic to think people will be able to describe what they're going to do before they do it. I find things rarely go exactly as planned, and my commit messages usually include some nuance about my changes that I didn't anticipate.

[–] greysemanticist@lemmy.one 2 points 3 months ago

This is true. But at jj ci you're plonked into an editor and can change the description.