Accumulo / ACCUMULO-2866

Default WAL size should be based on HDFS Block Size

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: tserver
    • Labels:

      Description

      HBase automatically defaults its WAL size to 0.95 * HDFS block size. This makes a lot of sense from a resource management perspective, and we should do the same.
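
To make the proposal concrete, here is a minimal sketch of the arithmetic. The 0.95 multiplier is the figure from the description; the class name, method name, and the 128 MiB block size are assumptions chosen for illustration and are not taken from HBase or Accumulo code.

```java
// Sketch only: derives a WAL size limit from an HDFS block size.
// The 0.95 multiplier is from the issue description; everything else is hypothetical.
final class WalSizeFromBlockSize {
    static final double MULTIPLIER = 0.95;

    /** Returns the WAL size limit in bytes for the given HDFS block size. */
    static long walSizeFor(long hdfsBlockSizeBytes) {
        if (hdfsBlockSizeBytes <= 0) {
            throw new IllegalArgumentException("block size must be positive");
        }
        // Truncate toward zero so the limit never exceeds the scaled block size.
        return (long) (hdfsBlockSizeBytes * MULTIPLIER);
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // assumed example: 128 MiB
        System.out.println("WAL size limit (bytes): " + walSizeFor(blockSize));
    }
}
```

With this approach the log rolls shortly before it would spill into a second block.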

        Activity

        Mike Drob created issue -
        Josh Elser added a comment -

        One nice thing about setting a very large block size on the WALs (which is really about 1.05 times the configured value) is that when insufficient DFS space exists, we fail quickly. Granted, it fails in a really obscure way (it looks more like a permission error, IIRC), but it still fails immediately instead of some time later.

        I'd be curious to see some basic before/after numbers too. Off the top of my head, I don't know whether something in the DFS pipeline would suffer from us doing this.

        Eric Newton added a comment -

        Can you explain what exactly "makes sense"?

        Mike Drob added a comment -

        Not sure if you intended to suggest it or not, but I would be ok with using this JIRA to set the WAL block size instead of changing the size of the WAL itself.

        Mike Drob added a comment -

        Eric Newton - The WAL is not splittable and will only be processed by at most one server at a time. Keeping it in a single block 1) alleviates extra NameNode (NN) pressure that we could be causing and 2) avoids some potential network overhead. Maybe I'm overstating the benefits here, but as far as defaults go, it makes more sense to me than just picking a number out of the air.
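
The NameNode argument comes down to how many blocks one log file occupies. The sketch below counts blocks with a ceiling division; the class and method names and the example sizes (a 1 GiB WAL, a 128 MiB block) are assumptions for illustration, not values read from any cluster.

```java
// Sketch only: counts how many HDFS blocks a file of a given size occupies.
// All names and sizes here are hypothetical examples.
final class WalBlockCount {
    /** Ceiling division: blocks needed to hold fileSizeBytes. */
    static long blocksFor(long fileSizeBytes, long blockSizeBytes) {
        if (fileSizeBytes < 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and block size positive");
        }
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long walSize = 1L << 30;            // assumed 1 GiB WAL
        long smallBlock = 128L << 20;       // assumed 128 MiB block
        long largeBlock = walSize + (walSize / 10); // a block somewhat larger than the WAL
        System.out.println("blocks with 128 MiB block size: " + blocksFor(walSize, smallBlock));
        System.out.println("blocks with oversized block:    " + blocksFor(walSize, largeBlock));
    }
}
```

Every extra block is one more entry the NameNode has to track for the file, which is the pressure a single-block WAL avoids.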

        Eric Newton added a comment -

        I'm just trying to understand what would be better than what we do now.

        Presently, Accumulo sets the WAL block size to be slightly larger than the max size of the WAL (defaulting to 1G).
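
As a rough illustration of "slightly larger than the max size", the sketch below derives a block size from a maximum WAL size plus some headroom. The 1.05 headroom factor is an assumption loosely based on the figure mentioned earlier in this thread; the thread does not state the exact multiplier Accumulo uses, and the class and method names are hypothetical.

```java
// Sketch only: picks a block size a little larger than the maximum WAL size,
// so the whole log fits in one block. The headroom factor is an assumption.
final class WalBlockSize {
    /** Returns maxWalSizeBytes scaled by headroom, rounded up to a whole byte. */
    static long blockSizeFor(long maxWalSizeBytes, double headroom) {
        if (maxWalSizeBytes <= 0 || headroom < 1.0) {
            throw new IllegalArgumentException("need a positive size and headroom >= 1.0");
        }
        return (long) Math.ceil(maxWalSizeBytes * headroom);
    }

    public static void main(String[] args) {
        long maxWal = 1L << 30; // 1G default max WAL size, per the comment above
        long blockSize = blockSizeFor(maxWal, 1.05); // assumed headroom factor
        System.out.println("max WAL bytes: " + maxWal);
        System.out.println("block bytes:   " + blockSize);
    }
}
```

Because the block is always larger than the log can grow, the WAL never crosses a block boundary.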

        Eric Newton added a comment -

        The WAL should be a single block. A recovery file might be made into multiple smaller blocks... is that what you're thinking about?

        Mike Drob added a comment -

        I was under the impression that recovery files were WAL files?

        Eric Newton added a comment -

        In the normal case, WALs are written out and never re-read. But if a tablet is lost, the contents of memory have to be recovered. The master coordinates getting the WAL sorted into a recovery file. The tservers use the sorted recovery file to reload the data that was in memory but not yet flushed to disk.

        WAL file = not sorted, rarely read
        recovery file = sorted, and read by all recovering tablets/servers.
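
The distinction can be sketched as follows: a WAL holds entries in arrival order, and recovery produces a copy sorted so each tablet can find its own entries together. This is a toy model, not Accumulo's LogSorter; the `Entry` type, its fields, and the sort key (tablet name, then sequence number) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model only: an unsorted "WAL" list is copied and sorted into a "recovery" list.
final class RecoverySortSketch {
    /** Hypothetical log entry; real WAL entries carry more than this. */
    static final class Entry {
        final String tablet;
        final long seq;
        final String mutation;

        Entry(String tablet, long seq, String mutation) {
            this.tablet = tablet;
            this.seq = seq;
            this.mutation = mutation;
        }
    }

    /** Returns a sorted copy (by tablet, then sequence); the input list is left as written. */
    static List<Entry> sortForRecovery(List<Entry> wal) {
        List<Entry> sorted = new ArrayList<>(wal);
        sorted.sort(Comparator.comparing((Entry e) -> e.tablet)
                .thenComparingLong(e -> e.seq));
        return sorted;
    }

    public static void main(String[] args) {
        List<Entry> wal = new ArrayList<>();
        wal.add(new Entry("tabletB", 1, "put x"));
        wal.add(new Entry("tabletA", 2, "put y"));
        wal.add(new Entry("tabletB", 3, "put z"));
        wal.add(new Entry("tabletA", 4, "delete y"));
        for (Entry e : sortForRecovery(wal)) {
            System.out.println(e.tablet + " " + e.seq + " " + e.mutation);
        }
    }
}
```

In this model the WAL is written once and read sequentially at most once, while the sorted copy is what many recovering tablets would read.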

        Josh Elser added a comment -

        Mike Drob, do you still have issues here? Glancing at the LogSorter code, it appears that the recovery files already use the default HDFS block size.

        Mike Drob made changes -
        Field            Original Value        New Value
        Resolution                             Not a Problem [ 8 ]
        Fix Version/s    1.7.0 [ 12324607 ]
        Status           Open [ 1 ]            Resolved [ 5 ]

        Transition         Time In Source Status   Execution Times   Last Executer   Last Execution Date
        Open -> Resolved   3d 21h 38m              1                 Mike Drob       10/Jun/14 15:30

          People

          • Assignee: Unassigned
          • Reporter: Mike Drob
          • Votes: 0
          • Watchers: 2
