Details

    • Type: Sub-task
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: Server
    • Labels:

      Description

      Implement the foundation for Service Discovery and Topology Generation.

      • Define simple descriptor format (YAML, JSON, etc...)
      • Local simple descriptor and shared provider configuration discovery
        • Monitor conf/shared-providers, conf/descriptors similar to the way conf/topologies is currently monitored.
      • Ambari service discovery (REST API interactions and model construction)
        • Configuration
          • How to plug-in discovery implementations
          • How to configure authentication (credentials/trust) with the service registries
      • Topology assembly from simple descriptor and discovery details
      • Topology deployment
      1. KNOX-1014.patch
        215 kB
        Phil Zampino
      2. KNOX-1014-002.patch
        203 kB
        Phil Zampino

        Issue Links

          Activity

          pzampino Phil Zampino added a comment - - edited

          KNOX-1014.patch (attached) provides the foundation for service discovery and topology generation.

          After applying the patch, if you have access to an Ambari cluster, you can try the following:

          1. Locate or create an Ambari cluster
          2. Extract the <gateway> element and its contents from conf/topologies/sandbox.xml, and save it in conf/shared-providers/sandbox-providers.xml
          3. Provision the username/password for your Ambari instance
            bin/knoxcli.sh create-alias AMBARI_USERNAME --value AMBARI_PASSWORD
          4. Create a simple descriptor (JSON):
            Replace YOUR_AMBARI_HOST (e.g., c6401.ambari.apache.org), AMBARI_USERNAME, and YOUR_CLUSTER_NAME with the appropriate values
            {
              "discovery-type":"AMBARI",
              "discovery-address":"http://YOUR_AMBARI_HOST:8080",
              "discovery-user":"AMBARI_USERNAME",
              "provider-config-ref":"sandbox-providers.xml",
              "cluster":"YOUR_CLUSTER_NAME",
              "services":[
                {"name":"NAMENODE"},
                {"name":"JOBTRACKER"},
                {"name":"WEBHDFS"},
                {"name":"WEBHCAT"},
                {"name":"OOZIE"},
                {"name":"WEBHBASE"},
                {"name":"HIVE"},
                {"name":"RESOURCEMANAGER"},
                {"name":"AMBARI", "urls":["http://YOUR_AMBARI_HOST:8080"]},
                {"name":"AMBARIUI", "urls":["http://YOUR_AMBARI_HOST:8080"]}
              ]
            }
            
          5. Place this simple descriptor in conf/descriptors/YOUR_DESCRIPTOR_NAME.json
          6. Allow the TopologyService to notice the simple descriptor, generate and deploy the full topology.
          7. Check conf/topologies, and notice YOUR_DESCRIPTOR_NAME.xml there
          8. Review the contents of conf/topologies/YOUR_DESCRIPTOR_NAME.xml to see that the service URLs have been populated from the Ambari cluster details
          9. Verify the WEBHDFS service URL was correctly discovered
            curl -ivku guest:guest-password https://localhost:8443/gateway/YOUR_DESCRIPTOR_NAME/webhdfs/v1/tmp?op=LISTSTATUS
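The descriptor from step 4 can also be generated rather than typed by hand. The following is a minimal sketch, assuming only the keys shown in the JSON above; the `build_descriptor` helper and its parameters are hypothetical and are not part of the patch.

```python
import json


def build_descriptor(ambari_host, username, cluster, services, explicit_urls=None):
    """Return a simple descriptor dict using the keys shown in step 4.

    `explicit_urls` (hypothetical parameter) maps a service name to a list of
    URLs for services that declare their own URLs, such as AMBARI and AMBARIUI.
    """
    explicit_urls = explicit_urls or {}
    entries = []
    for name in services:
        entry = {"name": name}
        if name in explicit_urls:
            entry["urls"] = list(explicit_urls[name])
        entries.append(entry)
    return {
        "discovery-type": "AMBARI",
        "discovery-address": f"http://{ambari_host}:8080",
        "discovery-user": username,
        "provider-config-ref": "sandbox-providers.xml",
        "cluster": cluster,
        "services": entries,
    }


descriptor = build_descriptor(
    "c6401.ambari.apache.org",
    "AMBARI_USERNAME",
    "YOUR_CLUSTER_NAME",
    ["NAMENODE", "WEBHDFS", "AMBARI"],
    explicit_urls={"AMBARI": ["http://c6401.ambari.apache.org:8080"]},
)

# Serialize; the result would be saved as conf/descriptors/YOUR_DESCRIPTOR_NAME.json
text = json.dumps(descriptor, indent=2)
```

Round-tripping the output through `json.loads` is a cheap way to catch quoting mistakes in the `urls` arrays before the file is dropped into conf/descriptors.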

          Now, you can try adding/modifying/removing combinations of provider configurations and descriptors to see the results.

          • If you modify a descriptor, it will update the full topology so the changes are reflected.
          • If you modify a shared provider configuration, it will update any referencing descriptors, which will in turn update the associated full topology files.
          • If you delete a simple descriptor, the associated topology will be removed/undeployed.
          • If you delete a topology file, the associated simple descriptor will also be deleted.
          • If you delete a shared provider configuration, any referencing descriptors (and associated topology files) will also be deleted, and the topologies will be undeployed.
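The relationships in the list above form a small dependency chain: shared provider configuration, then referencing descriptors, then generated topologies. A rough sketch of how the affected files could be worked out is shown below. It assumes descriptors are available as parsed dicts keyed by file name and that a topology takes its descriptor's base name with an .xml extension (as in steps 5 and 7); the function names are hypothetical and do not come from the patch.

```python
def referencing_descriptors(provider_config, descriptors):
    """Return the names of descriptors whose provider-config-ref matches."""
    return sorted(
        name
        for name, descriptor in descriptors.items()
        if descriptor.get("provider-config-ref") == provider_config
    )


def affected_topologies(provider_config, descriptors):
    """Map each referencing descriptor file to its generated topology file."""
    return [
        name.rsplit(".", 1)[0] + ".xml"
        for name in referencing_descriptors(provider_config, descriptors)
    ]


descriptors = {
    "cluster-a.json": {"provider-config-ref": "sandbox-providers.xml"},
    "cluster-b.json": {"provider-config-ref": "other-providers.xml"},
    "cluster-c.json": {"provider-config-ref": "sandbox-providers.xml"},
}

touched = affected_topologies("sandbox-providers.xml", descriptors)
```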
          lmccay Larry McCay added a comment -

          Phil Zampino - this is looking really good!

          If you delete a shared provider configuration, any referencing descriptors (and associated topology files) will also be deleted, and the topologies will be undeployed.

          This should probably be changed such that you cannot delete a provider config through the admin API that is actively referenced by topologies.
          If someone deletes it at the OS level from the filesystem, let's leave the referencing topologies and let them fail to deploy. Let's make sure that the logged deployment or topology generation failures have meaningful messages.

          pzampino Phil Zampino added a comment -

          Updated patch with review comments addressed

          pzampino Phil Zampino added a comment -

          Larry McCay I went back and forth on that behavior with respect to shared provider configuration deletions. I've modified it such that deleting a shared provider config that is referenced by one or more simple descriptors no longer results in those descriptors being deleted. One nice thing about this change is that one is able to recover from an unintentional shared provider configuration deletion. If one of these is deleted, the simple descriptor processing will fail, and the topology generation/redeployment will be aborted. This means 1) your topologies will still be running and working as they were, and 2) you can recover the shared provider configuration by copying it from a referencing topology in conf/topologies.
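The revised deletion behavior can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the dicts stand in for conf/shared-providers and conf/topologies, and `process_descriptor` and `DescriptorProcessingError` are hypothetical names, not the classes in the patch.

```python
class DescriptorProcessingError(Exception):
    """Raised when a descriptor cannot be turned into a topology."""


def process_descriptor(name, descriptor, provider_configs, topologies):
    """Regenerate a topology, aborting if the provider config is missing."""
    ref = descriptor["provider-config-ref"]
    if ref not in provider_configs:
        # Abort before touching the existing topology, so it keeps running.
        raise DescriptorProcessingError(f"{name}: missing shared provider config {ref}")
    topologies[name] = {
        "gateway": provider_configs[ref],
        "services": [service["name"] for service in descriptor["services"]],
    }


provider_configs = {"sandbox-providers.xml": "<gateway>provider settings</gateway>"}
topologies = {}
descriptor = {
    "provider-config-ref": "sandbox-providers.xml",
    "services": [{"name": "WEBHDFS"}],
}
process_descriptor("mycluster", descriptor, provider_configs, topologies)

# Accidentally delete the shared provider config, then modify the descriptor.
del provider_configs["sandbox-providers.xml"]
descriptor["services"].append({"name": "HIVE"})
try:
    process_descriptor("mycluster", descriptor, provider_configs, topologies)
    aborted = False
except DescriptorProcessingError:
    aborted = True
services_after_failure = list(topologies["mycluster"]["services"])

# Recover the provider config from the still-deployed topology and retry.
provider_configs["sandbox-providers.xml"] = topologies["mycluster"]["gateway"]
process_descriptor("mycluster", descriptor, provider_configs, topologies)
```

The point of the model is that the failed run leaves the previously generated topology untouched, which is what makes the recovery step possible.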

          I've also enabled support for YAML descriptors. While this would not be a good wire format for the API, it is a good config file format to support hand-editing descriptors. The associated APIs will use the JSON format, but the YAML will be a convenience for local manual deployments.

          I've updated the attached patch file with both of these changes.

          lmccay Larry McCay added a comment -

          Phil Zampino - that completely makes sense.
          Thank you.

          I will review this within the next day or so.

          pzampino Phil Zampino added a comment - - edited

          Updated the attached patch to move the Ambari ServiceDiscovery implementation into its own module.

          lmccay Larry McCay added a comment -

          Phil Zampino -

          This is looking really good!

          A couple of things from as far as I have gotten in the review process - I may have more later:

          1. I notice a lot of hardcoded service names and some mappings - these would be better done more dynamically so as not to require a compile to add support for new services (can be a follow up JIRA)
          2. I notice a credential lookup from the alias service - this is a great idea but I think we should probably use a property name sort of key rather than user name. Too easy to have multiple passwords for the same user for different contexts. I see this as more of a config item stored in a credential store than a user based thing. Might be able to be convinced otherwise though.
          3. Question: if a service is in the simple descriptor but unknown to Ambari, what happens to the materialized topology?
          4. Question: if a service is in the simple descriptor but just not installed in the deployment, what happens to the materialized topology?

          pzampino Phil Zampino added a comment -

          Larry McCay

          1. This had crossed my mind, and I even implemented something along these lines in the POC, but I wanted to give that some more thought. I agree that a follow-up JIRA to address this would be appropriate.

          2. Are you suggesting something like... knoxcli.sh create-alias ambari-discovery-credentials --value 'admin:admin' , and then have a discovery-credentials-alias in the descriptor referencing that alias, rather than discovery-user?

          3. The topology will not be generated if we can't determine a URL for a declared service.

          4. This is effectively the same as #3, at least in terms of the resulting behavior.
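The all-or-nothing rule in answers 3 and 4 can be sketched as follows. This is an illustration rather than the patch's code: `resolve_service_urls` is a hypothetical name, `discovered` stands in for whatever the discovery step returned, and letting explicit descriptor URLs take precedence over discovered ones is an assumption.

```python
def resolve_service_urls(declared_services, discovered):
    """Return {service name: urls}, or None if any declared service has no URL.

    A None result models "the topology will not be generated".
    """
    resolved = {}
    for service in declared_services:
        # Assumption: URLs given in the descriptor win over discovered ones.
        urls = service.get("urls") or discovered.get(service["name"])
        if not urls:
            return None
        resolved[service["name"]] = list(urls)
    return resolved


discovered = {"WEBHDFS": ["http://nn.example.com:50070/webhdfs"]}

ok = resolve_service_urls(
    [{"name": "WEBHDFS"}, {"name": "AMBARI", "urls": ["http://ambari.example.com:8080"]}],
    discovered,
)

# HIVE is declared but neither discovered nor given explicit URLs.
missing = resolve_service_urls([{"name": "WEBHDFS"}, {"name": "HIVE"}], discovered)
```

Here the host names are placeholders. A service that is unknown to the registry and one that is simply not installed both end up with no URL, so they lead to the same result.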

          lmccay Larry McCay added a comment -

          2. Are you suggesting something like... knoxcli.sh create-alias ambari-discovery-credentials --value 'admin:admin' , and then have a discovery-credentials-alias in the descriptor referencing that alias, rather than discovery-user?

          I may have misinterpreted what I saw in the patch. I thought that I saw the user name itself as the key to credential store.
          For instance, if the default user is admin:admin (which we really also should not codify), I thought that I saw code that would look up the credential for 'admin'.

          pzampino Phil Zampino added a comment -

          Larry McCay I don't think there is any misinterpretation. It is implemented that way in the patch (username as key, pwd as value). In my question, I'm asking for confirmation that I've understood what you have in mind.

          lmccay Larry McCay added a comment -

          haha - okay.

          Yes, I think it would be better to use something like ambari.discovery.password - username may not need to be an alias but if you did I could see ambari.discovery.username.

          pzampino Phil Zampino added a comment - - edited

          Yes, I think it would be better to use something like ambari.discovery.password - username may not need to be an alias but if you did I could see ambari.discovery.username

          I can see your point about having an alias for the password that is not the username. I'm also thinking, for flexibility, that the descriptor should specify the alias for retrieving the password, rather than requiring ambari.discovery.password. So, the descriptor would have:

          "discovery-user" : "myusername",
          "discovery-pwd-alias" : "my.discovery.password.alias"
          

          What do you think about that?

          And I also agree that the hard-coded default username/password should be removed from the code.

          pzampino Phil Zampino added a comment -

          Updated the attached patch with modified treatment of credentials.

          If no username is specified in a descriptor (discovery-user), then the default ambari.discovery.user alias is checked for a username.

          If no password alias is specified as the discovery-pwd-alias value in the descriptor, then the default ambari.discovery.password alias is checked for the password. If the descriptor specifies an alias as the value of the discovery-pwd-alias property, then that alias will be applied at discovery time.

          Hard-coded username and password have been removed.
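The lookup order described above can be summarized in a short sketch. The alias names and descriptor properties are the ones named in this comment; the `resolve_credentials` function and the plain dict standing in for the alias service are hypothetical simplifications.

```python
DEFAULT_USER_ALIAS = "ambari.discovery.user"
DEFAULT_PWD_ALIAS = "ambari.discovery.password"


def resolve_credentials(descriptor, aliases):
    """Return (username, password) following the fallback order above.

    `aliases` is a dict standing in for the alias service.
    """
    # Username: descriptor value first, then the default user alias.
    user = descriptor.get("discovery-user") or aliases.get(DEFAULT_USER_ALIAS)
    # Password: descriptor-specified alias first, then the default password alias.
    pwd_alias = descriptor.get("discovery-pwd-alias") or DEFAULT_PWD_ALIAS
    return user, aliases.get(pwd_alias)


aliases = {
    "ambari.discovery.user": "default-user",
    "ambari.discovery.password": "default-secret",
    "my.discovery.password.alias": "custom-secret",
}

explicit = resolve_credentials(
    {"discovery-user": "myusername", "discovery-pwd-alias": "my.discovery.password.alias"},
    aliases,
)
defaults = resolve_credentials({}, aliases)
```

With this order, a descriptor that names neither property falls back entirely to the two default aliases, and no username or password needs to be compiled into the code.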


            People

            • Assignee:
              pzampino Phil Zampino
              Reporter:
              pzampino Phil Zampino
            • Votes:
              0
              Watchers:
              2

              Dates

              • Created:
                Updated:

                Development