Monorepo vs multi-repo
Most Google engineers operate in a monorepo called google3 with phenomenal internal tooling. It was a joy to work there in the early 2010s, and I’ve heard that it keeps getting better. Google invests heavily in custom tooling for code search, file ownership, version control, testing, commit queue, code review, linting, etc. The code must flow.
So naturally in my jobs outside of Google I have been inclined to follow the monorepo approach and recapture that magic. But over time I’ve come to appreciate that most tooling outside of Google is built for multi-repo projects, and it’s best to embrace rather than fight the tooling. Here are some examples:
- Many tools expect a single configuration file at the project root. Examples include Relay, Jest, ESLint, etc. If you have web code, mobile code, and backend code in the same repo, this can get a bit messy.
- Git operations start to slow down on large repos. This includes both remote operations (push, pull) and local operations (commit).
- JetBrains IDE autocompletion scales well but behaves best when the open files are well-scoped to, say, one backend service or one frontend.
- User permissions on GitHub can be set at the repo level but not at a sub-repo level.
- CI/CD and flaky test management can become unmanageable in large repositories without custom tooling.
- Code ownership and dependency management become unwieldy in large codebases.
For future projects, I’m inclined towards:
- One frontend codebase
- One backend codebase per network-isolated backend service (but limit the number of services)
- A shared library codebase (for TypeScript frontend/backend codebases)
The separate codebases are also a nice reminder about breaking changes – backend changes that add new API features should be merged and deployed before the frontend changes that use those features are merged.
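As a rough illustration of that ordering, here is a minimal sketch of how a type in the shared TypeScript library might carry a new API field. Every name in it (`UserProfile`, `displayName`, `renderGreeting`) is hypothetical and not taken from any real project. The new field is marked optional, so frontend code compiled against the updated type still works when an older backend omits it.

```typescript
// Hypothetical types for the shared library codebase. In a real package
// these would be exported and imported by both frontend and backend.
interface UserProfile {
  id: string;
  email: string;
  // Assumed new field added by a later backend release. It stays optional
  // until every deployed backend is known to return it.
  displayName?: string;
}

// Frontend-side helper that tolerates both old and new backend responses:
// it falls back to the email when displayName is absent.
function renderGreeting(profile: UserProfile): string {
  const name = profile.displayName ?? profile.email;
  return `Hello, ${name}`;
}
```

Once the backend change has been deployed everywhere, the field could be made required in a follow-up change to the shared library, and the fallback dropped from the frontend.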