> I've worked in large polyrepo environments. By the time you get big enough that you have internal libraries that depend on other internal libraries, debugging becomes too much like solving a murder mystery. In particular, on more than one occasion I had a stacktrace that was impossible with the code that should have been running. A properly-configured monorepo basically makes that problem disappear.
On the contrary, a monorepo makes it impossible, because you can't ever check out what was actually deployed. If what was running at the time was two different versions of the same internal library in service A and service B, that sucks, but if you have separate checkouts for service A and service B it sucks less than trying to look at two different versions of parts of the same monorepo at the same time.
There is no source of truth for "what was deployed at time T" except the orchestration system responsible for the deployment environment. There is no relationship between source code revision and deployed artifacts.
Hopefully you have a tag in your VCS for each released/deployed version. (The fact that tags are repository-global is another argument for aligning your repository boundaries with the scope of what you deploy).
Why not? I’m doing it right now. The infrastructure is versioned just like the app and I can say with certainty that we are on app version X and infra version Y.
I even have a nice little db/graph of what versions were in service at what times so I can give you timestamp -> all app and infra versions for the last several years.
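For anyone curious, the core of such a timestamp -> versions lookup is tiny. A minimal sketch (illustrative only; assumes deploy records are kept sorted by the time each deploy went green, and the history/field names are made up):

```python
from bisect import bisect_right

# Hypothetical deployment history: each entry records the moment a deploy
# went green and the full set of versions in effect from then on.
HISTORY = [
    (1700000000, {"app": "1.4.0", "infra": "2.1.0"}),
    (1700500000, {"app": "1.5.0", "infra": "2.1.0"}),
    (1701000000, {"app": "1.5.1", "infra": "2.2.0"}),
]

def versions_at(timestamp):
    """Return the app/infra versions that were live at the given timestamp."""
    times = [t for t, _ in HISTORY]
    i = bisect_right(times, timestamp)  # last deploy at or before timestamp
    if i == 0:
        raise ValueError("timestamp predates recorded history")
    return HISTORY[i - 1][1]
```

The only real requirement is that you record a complete version snapshot at each green deploy, not just the delta.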
Unless your infrastructure is a single deployable artifact, its "version" is a function of all of the versions of all of the running services. You can define a version that establishes specific versions of each service, but that's an intent, not a fact -- it doesn't mean that's what's actually running.
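To make "a function of all of the versions of all of the running services" concrete: you can derive a single fingerprint from whatever is actually observed running, and compare it to the fingerprint of what you intended to deploy. A rough sketch (the function name and format are assumptions, not anyone's actual tooling):

```python
import hashlib

def composite_version(running):
    """Derive one fingerprint from the service versions actually running.

    Two environments get the same fingerprint iff every service is on the
    same version; ordering of the input dict doesn't matter.
    """
    canonical = ",".join(f"{svc}={ver}" for svc, ver in sorted(running.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]
```

Intent vs. fact then becomes a one-line check: fingerprint the desired state, fingerprint the observed state, and alert when they diverge.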
Am I missing some nuance here? Yes, the infra version is an amalgamation of the pinned versions of all the underlying services. But once the deploy goes green I know exactly what's running, down to the exact commit hashes everywhere. And during the deploy I know that each service is on either version n-1 or n.
The kinds of failures you're describing require throwing away all assumptions and accepting that everything from Terraform to the compiler could be broken, which is too paranoid to be practically useful or actionable.
If a deploy fails, I assume the new state is undefined and throw it away, having never switched over to it. If the deploy passes, I now have the next known-good state.
Oh, this implies you're deploying your entire infrastructure, from provisioned resources up to application services, with a single Terraform command, and managed by a single state file. That's fine and works up to a certain scale. It's not the context I thought we were working in. Normally multi-service architectures are used in order to allow services to be deployed independently and without this form of central locking.
If what was deployed was foo version x and bar version y, it's a lot easier to debug by checking out tag x in the foo repo and tag y in the bar repo than achieving the same thing in a monorepo.
I'm not sure I understand how that scenario would arise with a monorepo. The whole point of a monorepo is that everything changes together, so if you have a shared internal library, every service should be using the same version of that library at all times.
And every service deploys instantly whenever anything changes?
(I actually use that as my rule of thumb for where repository splits should happen: things that are deployed together should go in the same repo, things that deploy on different cycles should go in different repos)
Not necessarily instantly, but our CD is fast enough that changes are in production 5-10 minutes after hitting master.
But what's more valuable is that our artifacts are tagged with the commit hash that produced them, which is then emitted with every log event, so you can go straight from a log event to a checked-out copy of every relevant bit of code for that service.
Admittedly this doesn't totally guarantee you'll never have to worry about multiple monorepo revisions when debugging an interaction between services, but I haven't found that to come up much in practice.
Edit: I should also clarify, a change to any internal library in our monorepo will cause all services that consume that library to be redeployed.
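In case it's useful, the commit-hash-on-every-log-event pattern is a few lines with Python's standard logging module. A hedged sketch, assuming the build pipeline bakes the producing commit into the artifact as an environment variable (`GIT_COMMIT` here is an assumption, not a standard):

```python
import logging
import os

# Assumed: the build bakes the producing commit into the artifact,
# e.g. as an environment variable set at image build time.
COMMIT = os.environ.get("GIT_COMMIT", "unknown")

class CommitFilter(logging.Filter):
    """Attach the commit hash to every log record."""
    def filter(self, record):
        record.commit = COMMIT
        return True

handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(asctime)s %(commit)s %(levelname)s %(message)s")
)
logger = logging.getLogger("service")
logger.addHandler(handler)
logger.addFilter(CommitFilter())
logger.setLevel(logging.INFO)

logger.info("request handled")  # every line now carries the commit hash
```

From there, going from a log event to the exact source is just `git checkout <commit>`.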
> What do you do with libraries shared between different deployment targets?
The short answer is "make an awkward compromise". If it's a library that mostly belongs to A but is used by B, it can live in A (though this means you may sometimes have to release A with changes made purely for the sake of B); if it's a genuinely shared library that might be changed for the sake of either A or B, I generally put it in a third repo of its own, which means a two-step release process. The way to mitigate that pain is to make sure the library can be tested on its own, without needing A or B. As for the case where a library is shared between two independent components A and B but tightly coupled to both, such that it can't really be tested on its own: all I can suggest is to avoid it.
That's a great test, and I think an argument for a monorepo for most companies. Unless you work on products that are hermetically sealed from each other, there are very likely going to be tight dependencies between them. Your various frontends and backends are going to want to share data models for the data they exchange, for example. You don't really want multiple versions of those models to exist across your deployments, at least not long-term.
I think it's maybe an argument for a single repo per (two-pizza) team. Beyond that, you really don't want your components to be that tightly coupled together (e.g. you need each team to be able to control their own release cycles independently of each other). Conway's law works both ways.
If they have independent release cycles, they shouldn't be tightly coupled (sharing models etc. beyond a specific, narrowly-scoped, and carefully versioned API layer), and in that case there is little benefit and nontrivial cost to having them be in a monorepo.
Not GP, but I use versioned packages (npm, NuGet, etc.) for that. They're published just like an open source project would be, ideally using semantic versioning or matching the version of a parent project (in cases where, e.g., we produce a client library from the same repo as the main service).
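For what it's worth, the semver compatibility rule consumers rely on is mechanical enough to sketch in a few lines. A minimal illustration of caret-style matching (the way npm's `^` ranges behave for versions >= 1.0.0; function names are made up, and real ranges have more cases like pre-release tags):

```python
def parse_semver(v):
    """Split 'MAJOR.MINOR.PATCH' into a comparable (int, int, int) tuple."""
    major, minor, patch = v.split(".")
    return (int(major), int(minor), int(patch))

def compatible(installed, required):
    """Caret-style check: same major version, and installed >= required.

    Under semver, bumping minor/patch must be backwards-compatible,
    while a major bump signals a breaking change.
    """
    i, r = parse_semver(installed), parse_semver(required)
    return i[0] == r[0] and i >= r
```

This is why publishing shared models as a versioned package works: consumers pin a major version and pick up compatible updates on their own release cycle.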