Accumulo / ACCUMULO-549

Create infrastructure that supports debugging.

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Before a release, extensive testing is done, usually by running tests such as continuous ingest and random walk on a cluster for extended periods of time. When these tests find bugs, tracking an issue down can take a lot of time. To make tracking these issues down easier, the write-ahead logs are archived. These walog archives make it possible to answer questions about a tablet's history, because everything ever written to the metadata table is there. It would be nice to always have this capability on an Accumulo system, and to have it be easy to use. Spelunking around in the write-ahead logs is not an easy task.

      It would be nice if Accumulo could answer questions like the following.

      • Where has a tablet been assigned?
      • What compactions has a tablet done?
      • What split or merge created a tablet?

      These questions can currently be answered with walogs and log4j logs, but it's painful.

        Activity

        Christopher Tubbs made changes -
        Fix Version/s 1.5.0 [ 12318645 ]
        Gavin made changes -
        Field: Workflow
        Original Value: no-reopen-closed, patch-avail [ 12663450 ]
        New Value: patch-available, re-open possible [ 12671816 ]
        Keith Turner added a comment -

        Regarding the first point Eric made, I was thinking of walogs and was slightly confused. I asked Eric about this; he was thinking of indexing the log4j debug logs in parallel from each machine in the cluster.

        Keith Turner added a comment -

        Eric, I think offering a macroscopic view of the trace data would be very useful in addition to the current microscopic view. However, I think this warrants a separate ticket?

        Eric Newton added a comment -

        I would add:

        1. ability to consume the logs into an Accumulo instance so we can query and search 10G of logs with greater parallelism and speed
        2. statistical analysis of traces, which will probably require the Accumulo instance to be up and running
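
        As a rough illustration of the second point, the sketch below groups trace spans by operation and computes summary statistics. It is only a sketch under stated assumptions: the `(operation, duration_ms)` record shape and the operation names are hypothetical, and it works on an in-memory list rather than on a running Accumulo instance.

```python
from collections import defaultdict
from statistics import mean, median


def summarize(spans):
    """Group (operation, duration_ms) pairs and compute per-operation stats."""
    by_op = defaultdict(list)
    for op, duration_ms in spans:
        by_op[op].append(duration_ms)
    return {
        op: {
            "count": len(durations),
            "mean": mean(durations),
            "median": median(durations),
            "max": max(durations),
        }
        for op, durations in by_op.items()
    }


# Hypothetical spans; operation names and durations are made up for illustration.
spans = [
    ("minorCompaction", 120),
    ("minorCompaction", 80),
    ("minorCompaction", 400),
    ("scan", 15),
    ("scan", 25),
]
summary = summarize(spans)
for op, stats in sorted(summary.items()):
    print(op, stats)
```

        A summary like this is one possible form of a macroscopic view: an operation whose maximum is far above its median stands out without anyone reading individual traces.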
        Keith Turner added a comment -

        I am thinking one possible way to accomplish this is with the following components.

        1. Mechanism that continuously collects and stores !METADATA mutations
          • ACCUMULO-212 or ACCUMULO-378 are possible ways to accomplish this
          • Need to ensure that latest mutations are available when needed.
          • Need to provide an easy way of copying mutations from a production system to a developer's machine for analysis.
          • Mutations must be accessible even when Accumulo is not functioning properly.
        2. Utility to index !METADATA table mutations into Accumulo tables
        3. Utility to query indexes
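
        Components 2 and 3 could look roughly like the sketch below. It is a toy built on assumptions: `MetadataEvent` is a hypothetical, simplified stand-in for an archived !METADATA mutation, the event kinds and tablet identifiers are invented for illustration, and the index is an in-memory dictionary rather than Accumulo tables.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataEvent:
    """Hypothetical, simplified stand-in for one archived metadata mutation."""
    timestamp: int  # time at which the mutation was written
    tablet: str     # hypothetical tablet identifier
    kind: str       # illustrative: "assign", "compact", "split" or "merge"
    detail: str     # e.g. a server name, or the parent tablet of a split


def build_index(events):
    """Index events by (tablet, kind); each bucket is sorted by timestamp."""
    index = defaultdict(list)
    for ev in events:
        index[(ev.tablet, ev.kind)].append(ev)
    for bucket in index.values():
        bucket.sort(key=lambda ev: ev.timestamp)
    return index


def history(index, tablet, kind):
    """Return the time-ordered details for one tablet and one event kind."""
    return [ev.detail for ev in index.get((tablet, kind), [])]


# Made-up events, deliberately out of order to show that the index sorts them.
events = [
    MetadataEvent(30, "t1;m", "assign", "server-b"),
    MetadataEvent(10, "t1;m", "assign", "server-a"),
    MetadataEvent(20, "t1;m", "compact", "major"),
    MetadataEvent(5, "t1;m", "split", "parent=t1;z"),
]
index = build_index(events)
print(history(index, "t1;m", "assign"))   # where has the tablet been assigned
print(history(index, "t1;m", "compact"))  # what compactions it has done
print(history(index, "t1;m", "split"))    # what split created it
```

        Keying the index by tablet and event kind maps directly onto the three questions in the description. Because the index is built from collected mutations rather than from a live system, it can be rebuilt on a developer's machine even when the production instance is not functioning properly.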
        Keith Turner created issue -

          People

          • Assignee:
            Unassigned
            Reporter:
            Keith Turner
          • Votes:
            0
            Watchers:
            3
