Monorepos are great... but only if you can invest in the tooling needed to handle them at scale, and most companies can't invest in that the way Google can. Hyrum Wright-class tooling experts don't grow on trees.
You don't need Google-scale tooling to work with a monorepo until you are actually at Google scale. Gluing together a bunch of separate repos isn't exactly free either. See, for example, the complicated disaster Amazon has with Brazil.
In the limit, there are only two options:
1. All code lives in one repo
2. Every function/class/entity lives in its own repo
with a third state in between
3. You accept code duplication
This compromise state where some code duplication is (maybe implicitly) acceptable is what most people have in mind with a poly-repo.
The problem though is that (3) is not a stable equilibrium.
Most engineers have such a kneejerk reaction against code duplication that (3) is practically untenable. Even if your engineers are more reasonable, (3) style compromise means they constantly have to decide "should this code from package A be duplicated in package B, or split off into a new smaller package C, which A and B depend on". People will never agree on the right answer, which generates discussion and wastes engineering time. In my experience, the trend is almost never to combine repos, but always to generate more and more repos.
The limiting case of a monorepo (which is basically its natural state) is far more palatable than the limiting case of a poly-repo.
I don't understand why this was downvoted. Your list of three states is important to the debate. I never saw it that way. Another, more hostile way to put it: "What is a better or worse alternative and why?" Pretty much everything fits into one of those three states -- with warts.
This mostly seems like a problem for pure library code. If some bit of logic is only needed by a single independently-released service, then there's no reason not to put it in that service's repo.
I completely agree, and I think (2) is partly the forcing function behind the push for "serverless functions" as a unit of computing instead of some larger unit.
> You don't need Google-scale tooling to work with a monorepo until you are actually at Google scale.
I really don't see how that would work for most companies in practice. Most of the off-the-shelf tooling used by companies with hundreds or thousands of developers assumes polyrepos. It's good we're seeing simpler alternatives to Bazel, but that's just one piece of the puzzle.
I've made this argument before, but you can run a 1,000-engineer company in a monorepo with the tools and services that exist today. Between improvements to Bazel (and alternatives) and adjacent tooling like build caching/target diffs, core Git scalability, merge queues, and other services, you can just plug things together over a few days/as needed and it will just work.
All of the stuff that you can't do easily yet (a virtual filesystem for the repo, remote builds) just isn't relevant enough at this scale.
Using Bazel is a nontrivial amount of effort (most of the open-source rules don't really work in a standard way, because Google itself doesn't work in a standard way).
I guess with a 1,000-engineer company you can afford a substantial build team.
This is actually quite a lot better these days as the tooling adapts to integrate. Go has always been the gold standard, but Java/Kotlin works very well, and JS/TS are much improved by rules_js.
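For a sense of what "works in a standard way" looks like, here's a minimal BUILD file using the rules_go rules mentioned above. The target name, source files, import path, and visibility here are all made up for illustration:

```python
# BUILD.bazel -- a minimal monorepo package using rules_go (illustrative).
load("@io_bazel_rules_go//go:def.bzl", "go_library", "go_test")

go_library(
    name = "payments",
    srcs = ["payments.go"],
    importpath = "example.com/corp/payments",
    # Restrict which parts of the monorepo may depend on this library.
    visibility = ["//services:__subpackages__"],
)

go_test(
    name = "payments_test",
    srcs = ["payments_test.go"],
    # Embed the library under test so the test shares its package.
    embed = [":payments"],
)
```

The visibility attribute is one of the monorepo payoffs: dependency boundaries are declared and enforced by the build system rather than by repo boundaries.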
You can get better tools now though, like Turborepo or NX. They don't require the same level of investment as Bazel, although they don't always offer the same hermetic-build guarantees; for most teams that's "good enough".
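To make the investment difference concrete, a basic Turborepo setup is roughly one config file. A sketch (task names and output globs are assumptions about your project; newer Turborepo versions use a `tasks` key instead of `pipeline`):

```jsonc
// turbo.json -- illustrative task graph for a JS/TS monorepo.
{
  "$schema": "https://turbo.build/schema.json",
  "pipeline": {
    "build": {
      // Build each package's workspace dependencies first;
      // cache the listed outputs so unchanged packages are skipped.
      "dependsOn": ["^build"],
      "outputs": ["dist/**"]
    },
    "test": {
      // Run tests against the package's own fresh build.
      "dependsOn": ["build"]
    }
  }
}
```

The caching is input-hash based rather than hermetic in Bazel's sense, which is the trade-off the comment above is pointing at.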
A good article to reference when this topic gets raised: http://yosefk.com/blog/dont-ask-if-a-monorepo-is-good-for-yo...