Accumulo / ACCUMULO-2268

Use conditional mutations to update metadata table

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: None
    • Labels: None

      Description

      For correctness, Accumulo requires that only one tablet server at a time serve a tablet. To enforce this constraint, Accumulo uses ZooKeeper locks. It is assumed that when a tablet server's lock disappears, the tablet server will kill itself. Therefore, a tablet that is assigned to a dead tablet server can be safely reassigned. However, tablet servers sometimes continue to operate for a period of time after losing their locks. Sometimes this is caused by bugs in Accumulo, sometimes by Java GC pauses or swapping (and the tserver does die), and sometimes by problems with ZooKeeper (for example, the ZooKeeper thread that reports a lost lock dies).

      Conditional mutations were added in Accumulo 1.6. Making all tablet metadata updates use conditional mutations could limit the damage that multiply-assigned tablets can do.

      For example, after a minor compaction, the metadata update mutation could require the tablet location to be the current tserver. This would prevent a zombie tserver from adding an extraneous file to the metadata table for a tablet.
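
The check described above can be sketched in plain Java. This is only an in-memory illustration of the compare-then-write idea, not Accumulo code: the class `TabletMetadataSketch`, its methods, and the server and file names are all hypothetical. A real change would presumably go through the conditional mutation client API added in 1.6 (`ConditionalWriter`, `ConditionalMutation`, `Condition`) rather than a synchronized in-memory object.

```java
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for one tablet's metadata row. */
class TabletMetadataSketch {
    private String location;                       // tserver currently recorded as hosting the tablet
    private final Set<String> files = new TreeSet<>();

    TabletMetadataSketch(String initialLocation) {
        this.location = initialLocation;
    }

    /** Unconditional reassignment, e.g. after the old tserver's lock is lost. */
    synchronized void assign(String tserver) {
        location = tserver;
    }

    /**
     * Adds a file only if the recorded location still equals the caller.
     * The check and the write happen atomically, which is the property a
     * conditional mutation is meant to provide.
     */
    synchronized boolean addFileIfLocationIs(String expectedTserver, String file) {
        if (!Objects.equals(location, expectedTserver)) {
            return false; // condition failed: caller no longer owns the tablet
        }
        files.add(file);
        return true;
    }

    /** Returns a copy of the file list. */
    synchronized Set<String> files() {
        return new TreeSet<>(files);
    }

    public static void main(String[] args) {
        TabletMetadataSketch row = new TabletMetadataSketch("tserver1:9997");

        // tserver1 still owns the tablet, so its minor compaction file is recorded.
        System.out.println(row.addFileIfLocationIs("tserver1:9997", "F0001.rf")); // prints "true"

        // The tablet is reassigned while tserver1 keeps running as a zombie.
        row.assign("tserver2:9997");

        // The zombie's update is rejected because the location no longer matches.
        System.out.println(row.addFileIfLocationIs("tserver1:9997", "F0002.rf")); // prints "false"
        System.out.println(row.files()); // prints "[F0001.rf]"
    }
}
```

The point of the sketch is that the zombie's write fails on its own, without the zombie having to notice that it lost its lock.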

      Christopher Tubbs has discussed refactoring all metadata code so that it is more modular and works with ZooKeeper (for the root tablet) and the metadata table using the same API. This solution could depend on that. It may also be useful to make the root tablet operate more like a regular tablet and store its list of files in ZooKeeper. Then, with the right abstraction layer, the root tablet could benefit from these changes.

        Issue Links

          Activity

          Keith Turner created issue -
          Keith Turner made changes -
          Field Original Value New Value
          Link This issue relates to ACCUMULO-2261 [ ACCUMULO-2261 ]
          Keith Turner made changes -
          Description: wording revised (earlier draft of the current description above)
          Keith Turner made changes -
          Description: wording revised (earlier draft of the current description above)
          Christopher Tubbs made changes -
          Summary Use conditinal mutations to update metadata table Use conditional mutations to update metadata table
          Christopher Tubbs made changes -
          Link This issue relates to ACCUMULO-2272 [ ACCUMULO-2272 ]
          Eric Newton made changes -
          Description: spelling and grammar corrected (result is the current description above)

            People

            • Assignee: Unassigned
            • Reporter: Keith Turner
            • Votes: 0
            • Watchers: 3
