Bigtop / BIGTOP-2296

Provide a way to build Docker container with functional stack

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: docker, general
    • Labels: None

      Description

      It would be great to have a way for a user to run a container with a fully functional Bigtop stack in it (for some definition of fullness).

        Activity

        Konstantin Boudnik added a comment -

        Initial stab at it. Need to figure out how to do the second run of the puppet apply...
        Konstantin Boudnik added a comment - edited

        That seems to do the trick. We can improve on it with runtime configuration for the set of components, etc. It is important to run the container with the -h bigtop1.docker parameter; otherwise it won't configure the head node.

        Thoughts?
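        For illustration, the invocation being described might look like the following; the image name is an assumption, only the -h flag comes from the comment above:

          # hypothetical run of the pre-built image; -h sets the hostname
          # that puppet keys off to configure the head node
          docker run -d -h bigtop1.docker bigtop/hadoop-stack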
        Hide
        evans_ye Evans Ye added a comment - - edited

        Hi Konstantin Boudnik, so happy to see you proposed this.
        If you'd like to let user run a one node cluster, then docker -h might be the way out.
        If you'd like to let user run a multi-node cluster, docker-compose would be better.

        Speaking of this, I've been thinking about how to speed up the cluster deployment for a long time since the day I got asked on the Apache Big Data 2015.
        And here's what I have right now. The following design let user to specify components and burn images with big data stack they'd like to have:

        1. Define a set of component that user would like to be installed in the image. The set can be defined in config.yaml
        2. Specify that burned image name, in_memory_stack, in config.yaml as well
        3. Burn! Expose something like the following to the users:
          ./docker-hadoop -C config.yaml --burn
          # Or wrapped in gradle
          ./gradlew -Pconfig=config.yaml burn-docker-image
          
        4. What burns do is to simply do yum or apt install of those pre-defined component.
        5. Run a multi-node cluster as usual will components pre-installed image and, dada, you get a cluster instantly.

        How does this sound? If I didn't make it clear. Please point it out and I'll try to describe more.

        Show
        evans_ye Evans Ye added a comment - - edited Hi Konstantin Boudnik , so happy to see you proposed this. If you'd like to let user run a one node cluster, then docker -h might be the way out. If you'd like to let user run a multi-node cluster, docker-compose would be better. Speaking of this, I've been thinking about how to speed up the cluster deployment for a long time since the day I got asked on the Apache Big Data 2015. And here's what I have right now. The following design let user to specify components and burn images with big data stack they'd like to have: Define a set of component that user would like to be installed in the image. The set can be defined in config.yaml Specify that burned image name, in_memory_stack , in config.yaml as well Burn! Expose something like the following to the users: ./docker-hadoop -C config.yaml --burn # Or wrapped in gradle ./gradlew -Pconfig=config.yaml burn-docker-image What burns do is to simply do yum or apt install of those pre-defined component. Run a multi-node cluster as usual will components pre-installed image and, dada, you get a cluster instantly. How does this sound? If I didn't make it clear. Please point it out and I'll try to describe more.
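        A sketch of what that config.yaml might contain; the key names (image_name, components) are illustrative assumptions, not an existing Bigtop format:

          # hypothetical config.yaml for the proposed burn step
          image_name: in_memory_stack   # name for the burned image (step 2)
          components:                   # packages to pre-install at burn time (step 1)
            - hadoop
            - yarn
            - spark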
        Konstantin Boudnik added a comment -

        All you're saying is very true. But perhaps because of insufficient knowledge of the matter, I still don't see how we can avoid the configuration step when the pre-burned nodes are brought up. In the description above I am referring to #5. Our Puppet relies on the host names to configure the packages (and to install them as well), so we'll have to do the second step proposed in my patch. Am I making much sense? It's late and I am tired.
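        A minimal sketch of that second configuration pass; the container name and puppet code paths are assumptions, only the puppet apply step itself comes from the discussion:

          # hypothetical: re-run puppet inside the already-running container so the
          # components get configured against the actual host name
          docker exec bigtop1 puppet apply \
              --modulepath=/bigtop-puppet/modules \
              /bigtop-puppet/manifests/site.pp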
        Evans Ye added a comment -

        Under the Bigtop Provisioner framework, the configuration files are auto-generated and mounted inside the containers; this way, the pre-burned images just serve the purpose of skipping the puppet package installation.
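        In docker terms, the mounting described might look like this; the host path and image name are illustrative:

          # hypothetical: mount the auto-generated configuration into the container
          docker run -d -h bigtop1.docker \
              -v $(pwd)/config/hadoop:/etc/hadoop/conf \
              bigtop/hadoop-stack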
        Konstantin Boudnik added a comment -

        I guess I am missing something. The Provisioner creates containers dynamically, right? Even if we have all the packages pre-installed in the image, configuration file generation is still done by puppet apply. No? Sorry if I am being overly obtuse.
        Evans Ye added a comment -

        For configurations not parameterized by our Puppet, YES, you're right. To handle those, users need to change the puppet configuration template file and then snapshot the image. For the parameterized config (hieradata), we can just supply a customized one or auto-generate a default at runtime (see the sketch after this comment).

        If we have BIGTOP-1693, then any customized configuration can be provided afterward without being hardcoded inside the image. Do you think this makes sense?
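        A sketch of runtime hieradata that could be supplied or auto-generated; the keys shown are assumptions modeled on Bigtop's puppet hieradata, not taken from this ticket:

          # hypothetical site.yaml supplied at cluster start, overriding baked-in defaults
          bigtop::hadoop_head_node: bigtop1.docker
          hadoop_cluster_node::cluster_components:
            - hdfs
            - yarn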
        Konstantin Boudnik added a comment -

        I guess you're right. We might get a more universal image, usable for different use cases.
        So, shall we drop this ticket then?
        Evans Ye added a comment -

        Actually, if this patch works and you think it's useful, I would say get it in. We can always start with something and iterate on it. Besides, I might not have enough time to develop that feature I planned recently.
        Konstantin Boudnik added a comment -

        Updated the patch to point to the latest 1.1.0 release repo
        Konstantin Boudnik added a comment -

        I will commit this now: it works, and it lets a user quickly create a docker image with a fully functional, yet basic, hadoop cluster. It is also easy to layer more components on top of it if needed (see the sketch after this comment). If we go on to develop a better approach for swarming the services in the cluster, we can always rework/scrap this one.
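        As an illustration of the layering mentioned above, a hypothetical Dockerfile adding one more component on top of the committed image; the base image tag and package name are assumptions:

          # hypothetical: extend the base image with an extra component
          FROM bigtop/sandbox:1.1.0       # illustrative base image name
          RUN yum install -y spark-core   # use apt-get on Debian-based images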
        Konstantin Boudnik added a comment -

        Pushed to the master.

          People

          • Assignee: Konstantin Boudnik
          • Reporter: Konstantin Boudnik
          • Votes: 0
          • Watchers: 2
