IMPALA-3211: Toolchain binary publishing improvements

      Description

      Impala is currently built against native libraries from the native-toolchain. The native-toolchain is built for the supported operating systems, and binaries are published to S3 so that they can be downloaded when building Impala. S3 hosts the compiled binaries of the supported libraries for all versions (including patch versions) on all supported OSes.

      However, there are a few issues with the way this works today:

      1. Our Jenkins job to build/publish the toolchain will overwrite existing published binaries with the most recent build. In theory, recompiling the same version of the same library shouldn't result in different bits, but it's possible. When we build a library at a specific version and patch level, we should keep that artifact forever; we can introduce a build version (e.g. library.version.patch_version.build_version) if necessary. We should be able to know exactly which bits were used for any given Impala build (see the sketch after this list).
      2. S3 (the native-toolchain bucket) is the only place the artifacts live, i.e. they aren't backed up anywhere. A mistake could rm all published binaries, and we would have to rebuild everything, possibly producing slightly different binaries (see #1). We should have a strategy for backing up or checking in the artifacts.
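
      A minimal sketch of what a non-clobbering publish step could look like, assuming boto3 and a hypothetical bucket/key layout; the actual Jenkins job and paths may differ:

      import boto3
      from botocore.exceptions import ClientError

      s3 = boto3.client("s3")
      BUCKET = "native-toolchain"  # hypothetical bucket name

      def publish_artifact(local_path, library, version, patch, build_id):
          # Keep every build forever: include the build version in the key and
          # refuse to overwrite anything that already exists.
          key = f"{library}/{version}.{patch}/{build_id}/{library}-{version}.{patch}.tar.gz"
          try:
              s3.head_object(Bucket=BUCKET, Key=key)
              raise RuntimeError("refusing to overwrite existing artifact: " + key)
          except ClientError as e:
              if e.response["Error"]["Code"] != "404":
                  raise
          s3.upload_file(local_path, BUCKET, key)
          return key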

          Activity

          caseyc casey added a comment -

          #2 shouldn't be needed. Rebuilding from scratch should be fine; that needs to work for non-standard OSes anyway. Or am I missing something?

          Another improvement: we currently need to run each change twice, first without publishing to check that the build works, then again to publish the final artifacts. We should publish only if the builds for all OSes pass, e.g. via a staging area in S3 that gets merged into the main area when the overall job succeeds.
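
          One possible shape for that staging/promote step; this is only a sketch, with hypothetical bucket and prefix names, and it assumes the overall job can tell whether every per-OS build passed:

          import boto3

          s3 = boto3.client("s3")
          BUCKET = "native-toolchain"  # hypothetical bucket name

          def promote_staged_build(build_id, os_build_results):
              # Copy staged artifacts into the main prefix only once every OS build
              # in the matrix has passed; otherwise leave them in staging.
              if not all(os_build_results.values()):
                  raise RuntimeError("not promoting: some OS builds failed")
              staging_prefix = f"staging/{build_id}/"
              final_prefix = f"build/{build_id}/"
              paginator = s3.get_paginator("list_objects_v2")
              for page in paginator.paginate(Bucket=BUCKET, Prefix=staging_prefix):
                  for obj in page.get("Contents", []):
                      dest_key = final_prefix + obj["Key"][len(staging_prefix):]
                      s3.copy({"Bucket": BUCKET, "Key": obj["Key"]}, BUCKET, dest_key)

          The overall job would then call something like promote_staged_build(build_id, {"centos6": True, "ubuntu1404": True}) only after every per-OS build reports success.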

          mikesbrown Michael Brown added a comment -

          Regarding 2:
          https://aws.amazon.com/s3/faqs/#data-protection
          https://docs.aws.amazon.com/AmazonS3/latest/UG/enable-bucket-versioning.html
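
          For reference, bucket versioning can be enabled with a single API call; a boto3 sketch, with the bucket name assumed:

          import boto3

          s3 = boto3.client("s3")

          # With versioning enabled, overwritten or deleted objects remain
          # recoverable as older object versions (bucket name is hypothetical).
          s3.put_bucket_versioning(
              Bucket="native-toolchain",
              VersioningConfiguration={"Status": "Enabled"},
          )
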
          mjacobs Matthew Jacobs added a comment -

          It's unlikely, but it is possible that there could be subtle differences the next time we build, e.g. small environmental changes could affect the seeding of an RNG, which could result in different coloring choices by the compiler. That's basically a bug in the compiler (Dan had seen this before, where filename length resulted in different coloring choices), but it's possible and we should guard against it.

          Agreed about the 2-pass issue.

          caseyc casey added a comment -

          Even so, what's to say that the second, slightly different version is worse than the first? If it's random, why would someone say that the first random build is better than all others? Anyhow, making a backup isn't harmful; if someone wants to do that it's fine with me. Just saying it doesn't seem necessary.

          dhecht Dan Hecht added a comment -

          It's not a matter of which is better. The ability to reproduce product builds at any point in time is important, and if the toolchain binaries change, that isn't guaranteed.

          tarmstrong Tim Armstrong added a comment -

          IMPALA-3211: provide toolchain build id for bootstrapping

          Testing:
          Ran a private build, which succeeded.

          Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb
          Reviewed-on: http://gerrit.cloudera.org:8080/4771
          Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
          Tested-by: Internal Jenkins

          ---------------------------------------------
          Versioning of build artifacts

          Previously publishing a new version of toolchain artifacts clobbered the
          previous version. This is problematic when changing build options since
          the new artifacts may not work in all circumstances. E.g. if an
          older version of Impala depends on some particular detail of how the
          older artifacts were built, we have no way of avoiding breakage.

          Now a unique build ID is generated (including the git hash and the
          jenkins job ID) and all build artifacts are uploaded into a directory
          based on this unique id.

          Also switches to building only the current artifacts by default (all
          historical artifacts can be built with BUILD_HISTORICAL) to speed up
          builds and reduce resource requirements.

          Change-Id: Ifb8774a7bd4bcae0c135684078f5ce89a28f6bc2

          M README.md
          M buildall.sh
          M functions.sh
          M init.sh
          4 files changed, 84 insertions, 32 deletions

          Approvals:
          Matthew Jacobs: Looks good to me, approved
          Tim Armstrong: Looks good to me, approved; Verified


          To view, visit http://gerrit.cloudera.org:8080/4742
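
          A minimal sketch of how a unique build ID like the one described above might be composed; the names below are illustrative, not the actual functions.sh implementation:

          import os
          import subprocess
          from datetime import datetime

          def toolchain_build_id():
              # Combine a timestamp, the Jenkins build number (if any) and the
              # native-toolchain git hash into a unique, human-readable build id.
              git_hash = subprocess.check_output(
                  ["git", "rev-parse", "--short", "HEAD"]).decode().strip()
              job_id = os.environ.get("BUILD_NUMBER", "local")
              timestamp = datetime.utcnow().strftime("%Y%m%d")
              return f"{timestamp}-{job_id}-{git_hash}"

          # Artifacts would then land under a per-build prefix such as
          # s3://native-toolchain/build/<build_id>/<library>-<version>/...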

          tarmstrong Tim Armstrong added a comment -

          WRT #2, we also store artifacts in an internal artifactory at Cloudera, so we have them in two places.

          It also helps that we have the native-toolchain commit hash baked into the version ID - even if we lose the bits, we can most likely recreate an equivalent set of bits.

          mjacobs Matthew Jacobs added a comment -

          Thanks, Tim!

          "WRT #2, we also store artifacts in an internal artifactory at Cloudera, so we have them in two places."

          When do toolchain artifacts get backed up? Is this part of the release process?

          tarmstrong Tim Armstrong added a comment -

          I'm unclear what the backup model is, just that it's uploaded to artifactory with a version ID: https://github.com/cloudera/native-toolchain/blob/master/functions.sh#L351

          We should probably confirm the retention/backup situation but that seems like a Cloudera-internal issue.


            People

            • Assignee: tarmstrong Tim Armstrong
            • Reporter: mjacobs Matthew Jacobs