Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      The attached patch implements a prefetch function for HFile (v3) blocks, enabled by a column family or regionserver property. The goal is to warm the blockcache with all the data and index blocks of (presumably also in-memory) table data as soon after region open as is reasonable, without counting those block loads as cache misses. This is great for fast reads and for keeping the cache hit ratio high. The IO impact can be tuned against the time until all data blocks are in cache. It works a bit like CompactSplitThread and makes some effort not to stampede.

      I have been using this to set up various experiments and thought I'd polish it up a bit and throw it out there. If the data to be preloaded will not fit in the blockcache, or if it is large as a percentage of the blockcache, this is not a good idea: it will just blow out the cache and trigger a lot of useless GC activity. Might be useful as an expert tuning option though. Or not.
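
      As a rough illustration of the caveat about cache size, a guard along the following lines could decide whether preloading is worthwhile. This is a sketch only and not code from the patch: the class name `PrefetchGuard`, the method `shouldPrefetch`, and the idea of a configurable maximum fraction are all assumptions made for illustration.

```java
/** Hypothetical guard; none of these names come from the attached patch. */
final class PrefetchGuard {
    private PrefetchGuard() {}

    /**
     * Returns true when the data to preload occupies at most maxFraction
     * of the block cache capacity, so preloading should not evict
     * everything else.
     */
    static boolean shouldPrefetch(long dataBytes, long cacheCapacityBytes, double maxFraction) {
        if (dataBytes < 0 || cacheCapacityBytes <= 0 || maxFraction <= 0.0 || maxFraction > 1.0) {
            throw new IllegalArgumentException("invalid sizes or fraction");
        }
        // Largest amount of data we are willing to pull into the cache.
        long budgetBytes = (long) (cacheCapacityBytes * maxFraction);
        return dataBytes <= budgetBytes;
    }
}
```

      With a fraction of 0.5, for example, a table is only preloaded if it would take up no more than half of the cache; the right value would depend on the workload.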

      1. 9857.patch
        61 kB
        Andrew Purtell
      2. 9857.patch
        61 kB
        Andrew Purtell

        Activity

        Lars Hofhansl added a comment -

        How does this generally stack up against client triggered prefetching? I.e. the client would schedule the next partial scan ahead of time.

        Andrew Purtell added a comment -

        The reads would not count as cache misses. There would be no RPCs. Only HFileScanner is involved; we are just loading blocks, not looking at any keys.
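
        To make the "would not count as cache misses" point concrete, here is a minimal sketch of a cache with a separate prefetch path that populates blocks without touching the hit/miss counters. It is not the HBase block cache API: `StatsAwareCache` and its methods are hypothetical, and the sketch is deliberately single-threaded.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical toy cache keyed by block offset; not thread-safe. */
final class StatsAwareCache {
    private final Map<Long, byte[]> blocks = new HashMap<>();
    private long hits;
    private long misses;

    /** Normal read path: a lookup updates the hit or miss counter. */
    byte[] get(long offset, Function<Long, byte[]> loader) {
        byte[] block = blocks.get(offset);
        if (block != null) {
            hits++;
            return block;
        }
        misses++;
        block = loader.apply(offset);
        blocks.put(offset, block);
        return block;
    }

    /** Prefetch path: loads the block if absent and leaves the counters alone. */
    void prefetch(long offset, Function<Long, byte[]> loader) {
        blocks.computeIfAbsent(offset, loader);
    }

    long hits() {
        return hits;
    }

    long misses() {
        return misses;
    }
}
```

        A block brought in through `prefetch` is later served by `get` as a hit, so warming the cache improves the reported hit ratio instead of depressing it.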

        Nick Dimiduk added a comment -

        This is a nice feature. I skimmed the patch, don't see why it's limited to HFileV3. Can it be made a general feature?

        I think it could be smart about loading the blocks, load either sequentially or over a random distribution until the cache is full. The "until full" part seems tricky as eviction detection isn't very straight-forward, or at least hasn't been so thus far in my work on HBASE-9806.

        Andrew Purtell added a comment -

        Thanks for looking at the patch Nick Dimiduk.

        > don't see why it's limited to HFileV3. Can it be made a general feature

        I put the preload logic into the v3 reader because v3 is 'experimental'. Could trivially go into the v2 reader instead.

        > I think it could be smart about loading the blocks, load either sequentially or over a random distribution until the cache is full

        Files to be preloaded are queued and scheduled to be handled by a small threadpool. When a thread picks up work for a file, the blocks are loaded sequentially using a non-pread scanner from offset 0 to the end of the index.

        By random did you mean randomly select work from the file queue?

        > The "until full" part seems tricky as eviction detection isn't very straight-forward

        Right. If we had it, I could make use of it.
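
        The scheduling described in this comment (files queued, a small thread pool, blocks loaded sequentially per file) might look roughly like the following. This is a sketch only, not code from the patch: `PrefetchSketch`, `FileToPrefetch`, `readBlock`, and the `pauseMillis` throttle are hypothetical names, and the block read is simulated with an in-memory byte array rather than a real non-pread HFile scanner.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical prefetch scheduler; none of these names come from the patch. */
final class PrefetchSketch {
    /** A file is modeled only by its name and number of blocks. */
    static final class FileToPrefetch {
        final String name;
        final int blockCount;

        FileToPrefetch(String name, int blockCount) {
            this.name = name;
            this.blockCount = blockCount;
        }
    }

    private final ExecutorService pool;
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    private final long pauseMillis;

    PrefetchSketch(int threads, long pauseMillis) {
        // The fixed pool's internal queue holds files waiting for a free thread.
        this.pool = Executors.newFixedThreadPool(threads);
        this.pauseMillis = pauseMillis;
    }

    /** Queues one file; a pool thread later loads its blocks in order. */
    void submit(final FileToPrefetch file) {
        pool.execute(() -> {
            for (int i = 0; i < file.blockCount; i++) {
                cache.put(file.name + "#" + i, readBlock(file, i));
                if (!pause()) {
                    return; // interrupted, stop prefetching this file
                }
            }
        });
    }

    /** Optional sleep between blocks, trading warm-up time for lower IO impact. */
    private boolean pause() {
        if (pauseMillis <= 0) {
            return true;
        }
        try {
            Thread.sleep(pauseMillis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Simulated block read; a real reader would scan the file sequentially. */
    private static byte[] readBlock(FileToPrefetch file, int index) {
        return new byte[16];
    }

    /** Stops accepting work and waits for queued files to finish. */
    boolean shutdownAndWait(long timeoutSeconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int cachedBlocks() {
        return cache.size();
    }
}
```

        Keeping the pool small bounds how many files are read concurrently, and the per-block pause is one simple way to avoid a stampede right after region open.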

        Andrew Purtell added a comment -

        Rebase on latest trunk

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12614106/9857.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 26 new or modified tests.

        +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        -1 javadoc. The javadoc tool appears to have generated 3 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        -1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).

        +1 lineLengths. The patch does not introduce lines longer than 100

        -1 site. The patch appears to cause mvn site goal to fail.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//testReport/
        Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7887//console

        This message is automatically generated.


          People

          • Assignee: Unassigned
          • Reporter: Andrew Purtell
          • Votes: 0
          • Watchers: 9
