Uploaded image for project: 'Accumulo'
  1. Accumulo
  2. ACCUMULO-4600

Shell does not fall back to accumulo-site.xml when on classpath

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.3, 1.8.2, 2.0.0
    • Component/s: shell
    • Labels:
      None

      Description

      When inspecting 1.7.3-rc1 for the VOTE, I did the following steps:

      • Unpack bin-tarball
      • Copy 3gb native example confs
      • Set instance.volumes in accumulo-site.xml to hdfs://localhost:8020/accumulo173rc1
      • export ACCUMULO_HOME="$(pwd)"
      • ./bin/accumulo init
      • ./bin/start-all.sh
      • ./bin/accumulo shell -u root

      The shell failed to connect stating that no tservers were running. By turning on the debug option to the shell, I could see that the wrong HDFS directory was being used to find the Accumulo instance ID, /accumulo instead of /accumulo173rc1.

      This appears to be because of ClientContext#convertClientConfig(Configuration) and Shell#getZooInstance(String, String, ClientConfiguration. The client configuration is empty, therefore, all values end up being pulled from the DefaultConfiguration instance instead of the accumulo-site.xml which is on the classpath.

        Issue Links

          Activity

          Hide
          elserj Josh Elser added a comment -

          Looks like ACCUMULO-4505 broke this.

          FYI Michael Miller, Christopher Tubbs.

          Show
          elserj Josh Elser added a comment - Looks like ACCUMULO-4505 broke this. FYI Michael Miller , Christopher Tubbs .
          Hide
          milleruntime Michael Miller added a comment -

          Josh Elser Thoughts on remedies?

          Show
          milleruntime Michael Miller added a comment - Josh Elser Thoughts on remedies?
          Hide
          elserj Josh Elser added a comment -

          Thoughts on remedies?

          The first (trivial) idea I had was that this method should not use a DefaultConfiguration instance, instead a SiteConfiguration. I assume if we reverted your change, we would end up breaking your fix. So, we would have to break the ClientContext call into some different logic – perhaps passing in an AccumuloConfiguration instance instead of blindly choosing one in the method itself.

          However, I don't know if there are more edge cases lying in wait...

          Show
          elserj Josh Elser added a comment - Thoughts on remedies? The first (trivial) idea I had was that this method should not use a DefaultConfiguration instance, instead a SiteConfiguration . I assume if we reverted your change, we would end up breaking your fix. So, we would have to break the ClientContext call into some different logic – perhaps passing in an AccumuloConfiguration instance instead of blindly choosing one in the method itself. However, I don't know if there are more edge cases lying in wait...
          Hide
          ctubbsii Christopher Tubbs added a comment - - edited

          Ugh. I really hate this special shell behavior. We don't expect this of any other client.

          Show
          ctubbsii Christopher Tubbs added a comment - - edited Ugh. I really hate this special shell behavior. We don't expect this of any other client.
          Hide
          ctubbsii Christopher Tubbs added a comment -

          I think maybe another solution might work: create a SiteConfiguration.getInstanceOrDefault(...) method specifically for this shell use case. We have to be careful not to override the client config if it's available, but then sanely fall back on the site config, but only if that site config is accessible.

          Show
          ctubbsii Christopher Tubbs added a comment - I think maybe another solution might work: create a SiteConfiguration.getInstanceOrDefault(...) method specifically for this shell use case. We have to be careful not to override the client config if it's available, but then sanely fall back on the site config, but only if that site config is accessible.
          Hide
          milleruntime Michael Miller added a comment - - edited

          I made the (breaking) fix in 1.8.1 as well. Did we miss this issue with the last release or does it not apply in the 1.8 branch?

          Show
          milleruntime Michael Miller added a comment - - edited I made the (breaking) fix in 1.8.1 as well. Did we miss this issue with the last release or does it not apply in the 1.8 branch?
          Hide
          elserj Josh Elser added a comment -

          I made the (breaking) fix in 1.8.1 as well. Did we miss this issue with the last release or does it not apply in the 1.8 branch?

          I'm guessing it was missed (I don't think I made time to test that out), but I have not yet confirmed. Steps above should apply..

          Show
          elserj Josh Elser added a comment - I made the (breaking) fix in 1.8.1 as well. Did we miss this issue with the last release or does it not apply in the 1.8 branch? I'm guessing it was missed (I don't think I made time to test that out), but I have not yet confirmed. Steps above should apply..
          Hide
          elserj Josh Elser added a comment -

          Ugh. I really hate this special shell behavior. We don't expect this of any other client.

          Compatibility is hard

          Show
          elserj Josh Elser added a comment - Ugh. I really hate this special shell behavior. We don't expect this of any other client. Compatibility is hard
          Hide
          milleruntime Michael Miller added a comment -

          I think I got burned by this comment:
          https://github.com/apache/accumulo/blob/1.7/shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java#L329

          What about just making ShellOptionsJC read from accumulo-site like the comment claims its doing? I'd feel more comfortable modifying ShellOptionsJC instead of messing with any of the *Configuration or ClientContext objects.

          Show
          milleruntime Michael Miller added a comment - I think I got burned by this comment: https://github.com/apache/accumulo/blob/1.7/shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java#L329 What about just making ShellOptionsJC read from accumulo-site like the comment claims its doing? I'd feel more comfortable modifying ShellOptionsJC instead of messing with any of the *Configuration or ClientContext objects.
          Hide
          ctubbsii Christopher Tubbs added a comment -

          I think the problem in ACCUMULO-4505 was that when ShellOptionsJC tried to call the SiteConfiguration method, it failed, because SiteConfiguration assumes the file is readable. If SiteConfiguration can have a more tolerant method that doesn't cause a hard failure, and ShellOptionsJC called that... I think that would make the most sense.

          Show
          ctubbsii Christopher Tubbs added a comment - I think the problem in ACCUMULO-4505 was that when ShellOptionsJC tried to call the SiteConfiguration method, it failed, because SiteConfiguration assumes the file is readable. If SiteConfiguration can have a more tolerant method that doesn't cause a hard failure, and ShellOptionsJC called that... I think that would make the most sense.
          Hide
          elserj Josh Elser added a comment -

          What about just making ShellOptionsJC read from accumulo-site like the comment claims its doing?

          I don't understand what you mean by this. Are you implying that we should also have instance.volumes (and maybe the old instance.dfs.* properties also automagically pulled via the ClientConfiguration?

          Show
          elserj Josh Elser added a comment - What about just making ShellOptionsJC read from accumulo-site like the comment claims its doing? I don't understand what you mean by this. Are you implying that we should also have instance.volumes (and maybe the old instance.dfs.* properties also automagically pulled via the ClientConfiguration?
          Hide
          milleruntime Michael Miller added a comment -

          After some debugging, the Shell is reading from accumulo-site.xml just fine. I think the problem is the conversion from SiteConfiguration to ClientConfiguration... mainly the fact that there is no INSTANCE_VOLUMES ClientProperty.

          Show
          milleruntime Michael Miller added a comment - After some debugging, the Shell is reading from accumulo-site.xml just fine. I think the problem is the conversion from SiteConfiguration to ClientConfiguration... mainly the fact that there is no INSTANCE_VOLUMES ClientProperty.
          Hide
          elserj Josh Elser added a comment -

          I don't think it makes sense to have an instance.volumes client property, explicitly. Like Christopher pointed out, this is an edge case, not a normal one. Users shouldn't have to know where in hdfs Accumulo is writing files.

          Show
          elserj Josh Elser added a comment - I don't think it makes sense to have an instance.volumes client property, explicitly. Like Christopher pointed out, this is an edge case, not a normal one. Users shouldn't have to know where in hdfs Accumulo is writing files.
          Hide
          ctubbsii Christopher Tubbs added a comment -

          I agree. Clients don't need to know where instance volumes are, in general. Unfortunately, in this case, that's exactly what makes the shell different.

          I think the problem is in Shell.getZooInstance(...). There, it is trying to read the instance ID from HDFS, but it's using the client config to determine the volumes to inspect, rather than the site config. In ShellOptionsJC, we handle a similar case by wrapping the client config with a SiteConfiguration so we can fall back to the ZK hosts in the site file when its missing from the ClientConfiguration, but we're not doing that here for the fallback volumes when the instance name is missing from the ClientConfiguration so we can get the instanceID.

          Show
          ctubbsii Christopher Tubbs added a comment - I agree. Clients don't need to know where instance volumes are, in general. Unfortunately, in this case, that's exactly what makes the shell different. I think the problem is in Shell.getZooInstance(...) . There, it is trying to read the instance ID from HDFS, but it's using the client config to determine the volumes to inspect, rather than the site config. In ShellOptionsJC , we handle a similar case by wrapping the client config with a SiteConfiguration so we can fall back to the ZK hosts in the site file when its missing from the ClientConfiguration , but we're not doing that here for the fallback volumes when the instance name is missing from the ClientConfiguration so we can get the instanceID.
          Hide
          milleruntime Michael Miller added a comment -

          I think I have a solution, Mr. Turner helped set me straight. I will open a pull request momentarily. Uno momento por favor

          Show
          milleruntime Michael Miller added a comment - I think I have a solution, Mr. Turner helped set me straight. I will open a pull request momentarily. Uno momento por favor
          Hide
          ctubbsii Christopher Tubbs added a comment -

          It took me SO much longer than it should have to figure out who "Mr. Turner" was.
          I assume he means Keith Turner

          Show
          ctubbsii Christopher Tubbs added a comment - It took me SO much longer than it should have to figure out who "Mr. Turner" was. I assume he means Keith Turner
          Hide
          elserj Josh Elser added a comment -

          It took me SO much longer than it should have to figure out who "Mr. Turner" was.

          Nah, probably that Dick Turner guy.

          Show
          elserj Josh Elser added a comment - It took me SO much longer than it should have to figure out who "Mr. Turner" was. Nah, probably that Dick Turner guy.
          Hide
          phrocker marco polo added a comment -

          Who is Tick Durner?

          Show
          phrocker marco polo added a comment - Who is Tick Durner?
          Hide
          ctubbsii Christopher Tubbs added a comment -

          Looks like this was pushed to 1.7 branch by Mike. I've merged it forward, assigned this issue to him, and closed it. If this isn't done, please reopen.

          Show
          ctubbsii Christopher Tubbs added a comment - Looks like this was pushed to 1.7 branch by Mike. I've merged it forward, assigned this issue to him, and closed it. If this isn't done, please reopen.

            People

            • Assignee:
              milleruntime Michael Miller
              Reporter:
              elserj Josh Elser
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 1h 50m
                1h 50m

                  Development