Accumulo / ACCUMULO-3176

Add ability to create a table with user specified initial properties

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: None
    • Labels: None

      Description

      This change would allow table properties to be set before the default tablet is created. Instead of just adding a new create method, a NewTableConfiguration class could be created and passed in, and the other create methods deprecated.
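      A rough sketch of how the proposed API might read (the NewTableConfiguration class and its setProperties method are taken from this proposal, not existing API; the property key shown is only illustrative):

```java
// Hand the properties to create() so the default tablet is never
// created without them (per this proposal; hypothetical API).
Map<String,String> props = new HashMap<>();
props.put("table.file.replication", "3");
conn.tableOperations().create("foo", new NewTableConfiguration().setProperties(props));
```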

        Issue Links

        There are no Sub-Tasks for this issue.

          Activity

          Josh Elser added a comment -

          (pulling this comment from ACCUMULO-3089 since it's relevant here)

          Sean Busbey wrote:
          I moved ACCUMULO-3176 to be a subtask so it would be clearer what I considered covered by my objections (this ticket and its subtasks) ...

          Sean Busbey can you clarify your objections on this specific issue for me? I'd actually like to see this done (perhaps with some more follow-on now that I think about it). I believe it lays the groundwork to squash race conditions WRT delays in ZooKeeper watchers firing in systems with more than one tabletserver, most notably with configuration of iterators/combiners.

          For example, the following code is subject to a race between the necessary tservers seeing the configuration update from ZK and the scan running.

          Connector conn = getConnector();
          conn.tableOperations().create("foo");
          conn.tableOperations().attachIterator("foo", new IteratorSetting(50, MyIterator.class));
          BatchWriter bw = conn.createBatchWriter("foo", new BatchWriterConfig());
          List<Mutation> mutations = getMutations();
          bw.addMutations(mutations);
          bw.close();
          // createScanner requires an Authorizations argument in this API
          List<Entry<Key,Value>> data = Lists.newArrayList(conn.createScanner("foo", Authorizations.EMPTY));
          

          If the properties (in this case, iterator configuration) are set when the table is created, there isn't a race condition. There is no opportunity for a tserver to read the table configuration before the iterators are configured and perform some operation before the ZK Watcher fires to update the cached table configuration (ZooCache) on this tserver.
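          Under the proposal, the racy snippet above could collapse into a single create call. The property keys below reflect how a scan-scope iterator is expressed as per-table properties; the class name is a placeholder:

```java
// All iterator configuration exists before any tserver ever loads
// this table's config, so there is no ZooCache race to lose.
Map<String,String> props = new HashMap<>();
props.put("table.iterator.scan.myiter", "50,com.example.MyIterator");
conn.tableOperations().create("foo", new NewTableConfiguration().setProperties(props));
// ...then write with a BatchWriter and scan as before...
```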

          Sean Busbey added a comment -

          My objection is that it changes the API for table creation unnecessarily.

          So long as table properties are configurable after creation, you've just pushed off some of the race conditions around e.g. setting iterators. Why not instead add the ability to wait for a set of properties to have been seen by all tservers? That solves more use cases and doesn't change the table creation API.

          Josh Elser added a comment - edited

          Thanks for the details.

          my objection is it changes the api for table creation unnecessarily.

          Specifically the deprecation of the previous calls? Or the addition of any new create table method?

          Why not instead add the ability to wait for a set of properties to have been seen by all tservers?

          Because that really sucks for large numbers of tservers. Your single client now needs to communicate to every single server in the instance and ask them to confirm that they've seen a given property. The more realistic way to do it would be to introduce a new RPC call that forces all tservers to drop their ZooCache entries. However, the end result is that you still have to add new API calls to support this.

          Sean Busbey added a comment -

          Why not instead add the ability to wait for a set of properties to have been seen by all tservers?

          Because that really sucks for large numbers of tservers. Your single client now needs to communicate to every single server in the instance and ask them to confirm that they've seen a given property. The more realistic way to do it would be to introduce a new RPC call that forces all tservers to drop their ZooCache entries. However, the end result is that you still have to add new API calls to support this.

          Couldn't the RPC for setting configuration on a table tell all the tservers to drop cached entries related to said configuration change? That avoids adding a new client RPC and solves the general race condition on table properties.

          Josh Elser added a comment -

          Couldn't the RPC for setting configuration on a table tell all the tservers to drop cached entries related to said configuration change

          Can you be more specific on how you see that working? I can't think of a simple way that wouldn't still be a broadcast to all servers. We could probably hide something inside the implementation of setProperty on TableOperationsImpl, but someone will still have to reach out and talk to each tabletserver (whether client, master, or a tserver). A FATE operation could probably be used to wait for it to happen, but the master would just be doing the RPC then. The other option would be using ZooKeeper to broadcast that, but we've just looped back onto the same problem we were trying to solve at that point. We could potentially have the master delegate to other tservers to tell other tservers to drop their ZooCache for this table (akin to how bulk-loading delegation works), but that's rather complex and scary.

          Trying to make sure that all possible servers see an update seems more complex to me than just allowing properties to be provided at table creation time before the servers even know about that table.

          Keith Turner added a comment -

          Why not instead add the ability to wait for a set of properties to have been seen by all tservers? That solves more use cases and doesn't change the table creation api.

          Even if set props waits, that still does not solve the following problem:

          1. Thread 1 creates table foo
          2. Thread 2 writes some bad data to table foo and it succeeds
          3. Thread 1 adds a constraint to table foo using a new synchronous set props method
          4. Thread 1 writes some bad data to table foo and it fails

          It seems like create table w/ props would avoid this problem:

          1. Thread 1 creates table foo w/ constraint
          2. Thread 2 writes some bad data to table foo and it fails
          3. Thread 1 writes some bad data to table foo and it fails
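          Keith's second sequence, sketched against the proposed API (table.constraint.1 is the conventional per-table constraint property key; NewTableConfiguration is the proposed class and MyConstraint a placeholder):

```java
// Thread 1: the constraint exists from the instant the table does,
// so Thread 2's bad write in step 2 is rejected rather than accepted.
Map<String,String> props = new HashMap<>();
props.put("table.constraint.1", "com.example.MyConstraint");
conn.tableOperations().create("foo", new NewTableConfiguration().setProperties(props));
```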
          Sean Busbey added a comment -

          Unless we disallow non-creation-time table properties, creation time properties still aren't a real solution.

          What if we:

          1) require tables to be offline before setting properties
          2) tie the ZK cache to a lock that we can delete to indicate that a refresh is needed

          That way you can't have a thread writing bad data in.

          Sean Busbey added a comment -

          Or perhaps, instead of #2, we force a refresh of properties related to a table when it transitions from offline to online?

          Josh Elser added a comment -

          Unless we disallow non-creation-time table properties, creation time properties still aren't a real solution.

          Do you mean that the update of a table property at some point in time (after it exists) is subject to the same race condition? That's definitely true. I'd speculate that the issue is possibly less heinous for iterators/combiners/constraints because it's likely that you only set those at table creation, but that's subjective.

          we force a refresh of properties related to a table when it transitions from offline to online?

          It's not clear to me how this would work either. You could probably invalidate the table properties when a tablet is loaded on a tserver, which would encompass the offline->online transition. I think that could work. I don't like the story of "just offline and re-online your table to make sure configuration propagates". That's really nasty.

          Are we at least in agreement that allowing properties at table creation does mitigate the potential race condition? We would definitely need something more heavy-handed to properly fix this (or we'd have to do away with ZooCache...), but that is probably out of scope for this issue specifically.

          Sean Busbey added a comment -

          Are we at least in agreement that allowing properties at table creation does mitigate the potential race condition? We would definitely need something more heavy-handed to properly fix this (or we'd have to do away with ZooCache...), but that is probably out of scope for this issue specifically.

          No, because it only solves one race condition. It leaves people prone to needing to redo a table entirely because they didn't realize they had to set the property at creation time. Or worse, it leaves them unaware that they are in this state.

          It's not clear to me how this would work either. You could probably invalidate the table properties when a tablet is loaded on a tserver, which would encompass the offline->online transition. I think that could work. I don't like the story of "just offline and re-online your table to make sure configuration propagates". That's really nasty.

          It's not nasty if you can only set properties when it's offline. That would mean the story is "In order to ensure a consistent view of per-table configuration properties you can only alter them when the table is offline. When the table is brought online each tablet server will properly reflect the updated property."

          If I forgot to set something at creation time, I can follow a procedure to ensure I have a consistent set of properties. That might get more complicated if I need better assurances (like doing an offline MR job to verify the extant data is consistent with some constraint I want to add), but at least there's a usable path that isn't "you have to create a new table and set the properties then."
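          The offline-only procedure Sean describes might look like this with today's TableOperations calls; note that nothing currently enforces the offline requirement or guarantees a refresh on the online transition, which is the proposed new behavior (MyConstraint is a placeholder class name):

```java
// Take the table offline, waiting until all tablets are unloaded...
conn.tableOperations().offline("foo", true);
// ...change properties while no tserver can serve stale config...
conn.tableOperations().setProperty("foo", "table.constraint.1", "com.example.MyConstraint");
// ...then bring it back; under the proposal each tserver would
// refresh the table's properties as it loads tablets.
conn.tableOperations().online("foo", true);
```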

          Josh Elser added a comment -

          No, because it only solves one race condition.

          Sorry, I don't think I was clear. I meant to say it "helps mitigate", not implying that it solves the entire problem. More specifically, that the proposed changes here would help solve part of the problem. There is undeniably a bigger problem that needs solving.

          It's not nasty if you can only set properties when it's offline.

          "nasty" was too colloquial on my part. I meant it as: only allowing table properties to change is a very drastic change WRT current user operations/expectations. It's certainly the least impactful code-change to solve the problem (read-as: zero code changes), but I would rather solve the larger issue in some way that doesn't require such a change. I'm not entirely sure of how we could do it though.

          Sean Busbey added a comment -

          I meant it as: only allowing table properties to change [offline] is a very drastic change WRT current user operations/expectations. It's certainly the least impactful code-change to solve the problem (read-as: zero code changes), but I would rather solve the larger issue in some way that doesn't require such a change. I'm not entirely sure of how we could do it though.

          I agree it will be a big change in behavior. We can do it in a major version and offer an unsafe option that preserves the current behavior if admins want that for some reason. We should include with that option a clear example of what kind of inconsistencies one should expect.

          Keith Turner added a comment -

          It's a bit clunky, but users can currently get a good deal of consistency w/ clone table. The API call to clone a table allows a user to set per-table props before the clone is created. So if a user really wants to ensure a particular per-table config is seen by all tablets at the same time, they could do the following.

          1. clone table T1 to T1_tmp w/ new per table props
          2. delete table T1
          3. rename T1_tmp to T1

          Of course this would be disruptive to anything using T1.
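          The clone-based workaround, sketched with the existing TableOperations API (the property key and value are illustrative):

```java
// Clone T1, applying the new per-table props atomically to the clone.
Map<String,String> propsToSet = new HashMap<>();
propsToSet.put("table.constraint.1", "com.example.MyConstraint");
conn.tableOperations().clone("T1", "T1_tmp", true, propsToSet, Collections.<String>emptySet());
// Swap the clone into place; anything using T1 is disrupted here.
conn.tableOperations().delete("T1");
conn.tableOperations().rename("T1_tmp", "T1");
```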

          This discussion made me wonder what the javadocs for setProperty say. Currently they say the following:

          Sets a property on a table. Note that it may take a short period of time (a second) to propagate the change everywhere.

          Maybe this should be expanded to state that tablets may see the config change at different times, and to state explicitly that it's an async operation.

          Christopher Tubbs added a comment -

          There are 3 +1s for this patch (including mine and those on the review), which improves the table creation API to make it possible to set initial properties at table creation. I don't see any objections, other than Sean Busbey's suggestions for alternate solutions to some use cases, which this change does not preclude. As such, it is my intention to rebase and apply this patch today to 1.7, since it has been idle for more than a month. Objections should be represented as a formal veto, and brought to a vote, in accordance with our bylaws.

          I'm also going to move it to a top-level issue, because, while it was conceived of in the context of ACCUMULO-3089 and might benefit it, it is a completely independent improvement.

          The proposals for other improvements, represented by some of the above comments, should be made into their own issues if anybody intends to pursue, or consider pursuing, those proposals.

          Sean Busbey added a comment -

          Christopher Tubbs, my objection to this change and intention to veto was clearly stated on ACCUMULO-3089.

          If we are going to need a formal vote, please do not push the patch until after we go through the necessary consensus vote. The bylaws are ambiguous about what a "code change" is for veto (e.g. either this patch as it is here / rb or that patch once it is in the git repository). I ask that you not exploit this loophole to land code before attempting to bypass my veto.

          Eric Newton added a comment -

          Sean Busbey, call a vote to either not commit or to revert the change.

          Christopher Tubbs added a comment -

          I initiated a vote on the dev@ list.

          Christopher Tubbs added a comment -

          Resolved on mailing list


  People

  • Assignee: Jenna Huston
  • Reporter: Jenna Huston
  • Votes: 1
  • Watchers: 5


  Time Tracking

  • Original Estimate: Not Specified
  • Remaining Estimate: 0h
  • Time Spent: 1h
