Code review comment for ~pelpsi/launchpad:duplicate_ci_jobs

Colin Watson (cjwatson) wrote:

I see that you had to make quite extensive changes to account for one of my earlier comments asking if the structure of `stages` was right. For the record, it's totally fine to push back on me if I ask this sort of question and dealing with it seems to have massive fallout. Looking at this, I think we need to step back and think about it a bit more.

The problem with what we've ended up with here is that the format of the data structure in `CIBuild.stages` (as opposed to whatever intermediate objects we use in the process of constructing that) has changed, and I don't think that can be the right solution to this problem. That data structure is passed to the builder, so changing its format will almost certainly break the ability to run CI jobs on the Launchpad build farm.

The structure that ends up in `CIBuild.stages` is more or less a cleaned-up version of `pipeline` from https://lpci.readthedocs.io/en/latest/configuration.html. It's supposed to be a list of stages, each stage being a list of jobs, and each job being a tuple of (job_name, job_index). Within a stage, each job is executed (the specification allows for this to be in parallel, but in practice right now it's in series); all jobs in the stage are run, but the stage fails if any of those jobs fail. Each stage is executed in sequence, with subsequent stages only executing if previous stages succeeded. We need to preserve something compatible with this, as it's part of the build dispatch protocol; and we need to make sure that the structure continues to reflect what's in the `pipeline` configuration we've read, even if we're filtering it by series.
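To make those semantics concrete, here's a minimal sketch of what the dispatch protocol expects a consumer of `stages` to do. This is illustrative only, not Launchpad or `lpci` code; `run_pipeline` and `run_job` are hypothetical names.

```python
def run_pipeline(stages, run_job):
    """Execute a CI pipeline's stages in sequence.

    `stages` is a list of stages, each stage a list of
    (job_name, job_index) tuples.  `run_job` is a callable taking
    (job_name, job_index) and returning True on success.
    """
    for stage in stages:
        # All jobs in the stage are run, even if an earlier one in the
        # same stage fails ...
        results = [run_job(name, index) for name, index in stage]
        # ... but the stage as a whole fails if any job failed, and
        # subsequent stages are then skipped.
        if not all(results):
            return False
    return True
```

(In practice the jobs within a stage may run in parallel; the specification allows it, even though right now they run in series.)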

But there's another problem here. As mentioned above, the intended semantics are that jobs in later pipeline stages aren't run if earlier stages fail, and it's not really obvious how that works with multi-series pipelines given the current arrangements. Consider this hypothetical pipeline to build an artifact once and then test it on multiple series (which might make sense for artifacts like snaps that are generally supposed to be installable on multiple series, but that might have non-trivial interactions with the host system):

  pipeline:
    - build-snap
    - test-snap
  jobs:
    build-snap:
      series: focal
      architectures: [amd64, arm64]
      build-snaps: [snapcraft]
      run: snapcraft
    test-snap:
      matrix:
        - series: bionic
          architectures: [amd64]
        - series: focal
          architectures: [amd64, arm64]
        - series: jammy
          architectures: [amd64, arm64]
      run: ...

This needs to run `build-snap` on focal first, for each of amd64 and arm64; and then if and only if that succeeds, it needs to run `test-snap` on each of bionic (only on amd64, for the sake of a more interesting example), focal, and jammy. If we dispatch entirely separate jobs to the build farm that only contain the CI jobs for each of those releases, this isn't going to work as intended.

What I think `CIBuild.requestBuildsForRefs` needs to do here is to work out a list of build farm jobs to request, each of which will run an appropriate subset of the pipeline depending on the architecture. Each build farm job needs a compatible-enough `DistroArchSeries` (note that this doesn't have to match the series of the CI jobs it'll be running - it just specifies the environment in which `lpci` is executed, and it's then `lpci`'s job to execute the CI jobs themselves in containers of the right series): the latest of the specified series for the given architecture seems a reasonable choice, though the exact choice doesn't matter too much. It would then request that build farm job with a list of stages filtered by architecture. So the example I gave above would request two build farm jobs, with jammy/amd64 and jammy/arm64 as their respective `DistroArchSeries`. The amd64 build farm job would have this as its `stages`:

  [[("build-snap", 0)], [("test-snap", 0), ("test-snap", 1), ("test-snap", 2)]]

... while the arm64 build farm job would have this as its `stages`:

  [[("build-snap", 0)], [("test-snap", 1), ("test-snap", 2)]]
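The per-architecture filtering that produces those two lists could look roughly like this. Again, this is a sketch rather than Launchpad code: `stages_for_architecture`, the `pipeline` list, and the `jobs` mapping (job name to its matrix entries in order, each carrying its architectures) are all hypothetical names standing in for whatever intermediate objects we actually build while parsing the configuration.

```python
def stages_for_architecture(pipeline, jobs, architecture):
    """Filter a parsed pipeline down to one architecture.

    `pipeline` is the ordered list of stage names; `jobs` maps each
    name to its list of job definitions (one per matrix entry).
    Returns the `stages` structure to dispatch: a list of stages, each
    a list of (job_name, job_index) tuples, preserving matrix indices.
    """
    stages = []
    for job_name in pipeline:
        stage = [
            (job_name, index)
            for index, job in enumerate(jobs[job_name])
            if architecture in job["architectures"]
        ]
        # Stages with no jobs for this architecture are dropped.
        if stage:
            stages.append(stage)
    return stages
```

Note that the matrix indices are preserved rather than renumbered, so each dispatched tuple still identifies the same entry in the original configuration.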

Given this model, there are going to be some pipeline specifications that we can't execute right now. For example, one that asks for a build to run on amd64 and then to be tested on multiple architectures (let's say it's building a pure-Python wheel) is conceptually reasonable, but we just don't have a way to orchestrate the sequencing of the necessary build farm jobs right now. We'll probably need to figure this out at some point, but for the moment I think it would be fine (though not ideal) for such cases to log an error and not request any builds.
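One possible way to detect such cases, so that we can log an error instead of silently requesting broken builds: if, for some architecture, an earlier stage has no jobs but a later stage does, then that architecture's pipeline would run jobs without the stages they depend on having run. This is just a sketch of that conservative check, with hypothetical names and the same assumed `pipeline`/`jobs` shapes as above.

```python
def check_orchestratable(pipeline, jobs, architectures):
    """Return False if any architecture's filtered pipeline would run
    a later stage without some earlier stage having run.
    """
    for architecture in architectures:
        seen_gap = False
        for job_name in pipeline:
            has_jobs = any(
                architecture in job["architectures"]
                for job in jobs[job_name]
            )
            if not has_jobs:
                # This stage is entirely missing on this architecture.
                seen_gap = True
            elif seen_gap:
                # A later stage would run without its earlier stages
                # on this architecture; we can't sequence that across
                # build farm jobs yet.
                return False
    return True
```

For the pure-Python wheel example (build on amd64 only, test everywhere), the non-amd64 pipelines would have an empty build stage followed by a non-empty test stage, so this check would reject them.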

This is complicated stuff even to try to explain! Feel free to ask me if you need me to clarify anything here.

review: Needs Fixing
