This ticket lists and tracks the steps required to finally enable the integration of HAWQ into Bigtop.
All relevant resources are linked below, and here's the overview of the remaining steps and the overall status of the integration work.
- the biggest issue was and remains the use of libthrift, which isn't packaged, provided, or supported by anyone. Right now, the Bigtop-HAWQ integration branch uses my own pre-built version of the library, hosted here. However, this is clearly insecure and has to be solved either by HAWQ adding this dependency as source, or by convincing the Bigtop community that hosting the libthrift library is beneficial for the community at large
- overall, the packaging code is complete and is pushed to the Bigtop branch (see link below). Considering that the work was completed about 5 weeks ago and was aimed at the state of trunk back in March, there might be some minor changes which would require additional tweaks
- libhdfs library code (if already included into the HAWQ project) might require additional changes to the packaging code, so that the library can be produced and properly set up in the installation phase
- Bigtop CI has jobs to create CentOS and Ubuntu packages (linked from BIGTOP-2320 below)
- smoke tests need to be created (as per BIGTOP-2322), but that seems to be a minor undertaking once the rest of the work is finished
- packaging tests need to be integrated into the Bigtop stack (BIGTOP-2324)
- deployment code is completed. However, it needs to be extended to properly support cluster roles and to be linked to the main site.pp recipe
- because real-life deployment cannot rely on in-house python wrappers using passwordless-ssh, the lifecycle management and initial bootstrap are done by calling directly into the HAWQ scripts that provide such functionality. It is possible that some of these interfaces were updated in the last 6 weeks, so additional testing would be needed.
- it should be the responsibility of HAWQ to provide a concise way of initializing a master, segment, and so on without the need for passwordless-ssh, which is suboptimal and won't be accepted by the Bigtop community as it breaks the deployment model
- toolchain code is completed in the Bigtop branch. This will allow HAWQ to be built in the standard Bigtop container available to the CI and 3rd party users
- toolchain code needs to be rebased on top of current Bigtop master, and any conflicts will have to be resolved
- once the integration is finished, Bigtop slave images will have to be updated to enable automatic CI runs