Accumulo / ACCUMULO-2613

Take advantage of HDFS caching to improve MTTR

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels:

      Description

      Hadoop 2.3.0 added HDFS caching.

      We should use this for small internal-use tables (like !METADATA), and we should probably add a configurable option to use it for other tables, with a stern warning that it should only be enabled on small tables that will be frequently used.
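
As a rough illustration of the proposal above, the sketch below builds the `hdfs cacheadmin` invocations that would pin one table directory, and refuses when the table exceeds a size limit. It is a sketch under stated assumptions, not Accumulo code: the class `HdfsCachePlan`, the pool name, the 1 GiB threshold, and the `/accumulo/tables/!0` path for the metadata table are all assumptions made for illustration. Only the `-addPool` and `-addDirective -path ... -pool ...` subcommands come from Hadoop's cache administration CLI.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Accumulo or Hadoop. */
public class HdfsCachePlan {

    // Assumed cutoff for a "small" table; the issue does not specify one.
    static final long MAX_CACHED_TABLE_BYTES = 1L << 30; // 1 GiB

    /**
     * Returns the `hdfs cacheadmin` command lines that would pin a table
     * directory, or an empty list when the table is too large to cache.
     */
    static List<String> commandsFor(String pool, String tableDir, long tableBytes) {
        List<String> commands = new ArrayList<>();
        if (tableBytes > MAX_CACHED_TABLE_BYTES) {
            return commands; // refuse: caching is meant for small, hot tables only
        }
        // Create the cache pool, then add a directive covering the table directory.
        commands.add("hdfs cacheadmin -addPool " + pool);
        commands.add("hdfs cacheadmin -addDirective -path " + tableDir + " -pool " + pool);
        return commands;
    }

    public static void main(String[] args) {
        // Assumed layout: metadata table files live under /accumulo/tables/!0.
        for (String cmd : commandsFor("accumulo-meta", "/accumulo/tables/!0", 64L << 20)) {
            System.out.println(cmd);
        }
    }
}
```

Returning an empty list for oversized tables is one way to back the "stern warning" with an actual guard; a real option would more likely log the refusal and take the limit from configuration.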

        Activity

        ecn Eric Newton added a comment -

        I don't want to seem argumentative, because I really don't know if using this cache for the WAL is a good idea, or not. But I can think of some issues:

        • Hopefully, in your clusters, recovery is an unusual operation.
        • The WAL has to be written to disk to survive power loss, making it a bad candidate for RAM-only storage.
        • Others have purposefully turned off caching of WAL data to make memory available for other things, since reading WALs at all is unusual.

        We already know we can improve recovery time by reducing the largest WAL size, parallelizing read/sort, and computing a more optimal leaseRecovery timeout. I would strongly suggest a more in-depth look into recovery before even experimenting with HDFS caching.

        busbey Sean Busbey added a comment -

        Even then, say we pinned all the WALs. How much space is that likely to be? Less than 10G per node? That's not too bad on a hardware cluster.

        ecn Eric Newton added a comment -

        It is not unusual to have a metadata tablet on half the nodes of a cluster.

        busbey Sean Busbey added a comment -

        Edited the title to make my intentions clearer.

        busbey Sean Busbey added a comment -

        Caching them in the tserver doesn't help when the tserver goes down.

        vines John Vines added a comment -

        How much gain would we get from caching metadata table files when we're already caching them in the tserver?

        busbey Sean Busbey added a comment -

        That would be preferable, but couldn't a tablet server just recognize when it was adding a !METADATA related entry to the WAL and then request that it be pinned?

        ecn Eric Newton added a comment -

        To pin WALs to the metadata table, we would first have to separate the WAL for the metadata tablets (table-specific WALs).

        mdrob Mike Drob added a comment -

        Previous discussion here: http://markmail.org/message/egw2pp6du3xgdkqb

        busbey Sean Busbey added a comment -

        Actually, to really drive down recovery time, would we also need to ask to pin WALs for the !METADATA table? I think we would.


          People

          • Assignee: Unassigned
          • Reporter: busbey Sean Busbey
          • Votes: 0
          • Watchers: 4
