HBASE-7868: HFile performance regression between 0.92 and 0.94

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.94.5
    • Fix Version/s: None
    • Component/s: io
    • Labels:
      None

      Description

      According to HFilePerformanceEvaluation, it seems that 0.94 is slower than 0.92.

      Looking at the profiler for the Scan path, it seems that most of the extra time, compared to 0.92, is spent in the metrics dictionary lookup. Elliott Clark pointed out the new per-family/block metrics.

      By commenting out the metrics calls in HFileReaderV2, performance seems to get better, but metrics may not be the only problem.
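      For reference, the benchmark in question can be launched directly from a release directory. A sketch, assuming the hbase tests jar (which contains HFilePerformanceEvaluation) is on the bin/hbase classpath:

      # run the HFile micro-benchmarks referenced in this issue
      bin/hbase org.apache.hadoop.hbase.HFilePerformanceEvaluation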

      1. HFilePerformanceEvaluation.txt
        56 kB
        Jean-Marc Spaggiari
      2. performances.pdf
        22 kB
        Jean-Marc Spaggiari
      3. performances.pdf
        16 kB
        Jean-Marc Spaggiari
      4. FilteredScan.png
        24 kB
        Jean-Marc Spaggiari
      5. hfileperf-graphs.png
        58 kB
        Matteo Bertozzi

        Activity

        stack added a comment -

        Stale

        Jean-Marc Spaggiari added a comment -

        I still have it on my to-do list to build more granular tests to figure out where performance is impacted between 0.92 and 0.94, but I'm lacking the free time to complete that.

        Based on Matteo's tests, it seems to be metrics-related. Were the metrics there in 0.92?

        BTW, I'm fine with removing this JIRA from 0.94 until we have a plan for it.

        Lars Hofhansl added a comment -

        Since there is no patch, I am removing this from 0.94.
        Feel free to re-add.

        Matteo Bertozzi added a comment -

        Jean-Marc Spaggiari Take a look at UniformRandomSmallScan: it starts at 4000ms and keeps increasing with each release, up to 4330ms.

        Jean-Marc Spaggiari added a comment -

        Matteo, can you retry?

        Attached are the logs for HFilePerformanceEvaluation. I have not found any noticeable performance degradation between any of the releases.

        I have not done the charts and only looked at the values (grepped by test name), but it seems to be pretty stable. Anything I missed?
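        For example, something along these lines (a sketch; it assumes each result line in the log contains the benchmark name and a running time in ms, and the benchmark names shown are only illustrative):

        for t in SequentialWriteBenchmark SequentialReadBenchmark UniformRandomSmallScan; do
          echo "== $t =="
          grep "$t" HFilePerformanceEvaluation.txt | grep -o '[0-9]\+ms'
        done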

        Lars Hofhansl added a comment -

        Any suggestions for a fix?
        Moving this to 0.94.7 for now.

        Matt Corgan added a comment -

        I have a decent start on a benchmark that tests many different combinations of inputs like blockSize, encoding, compression, keyLength, commonPrefixLength, valueLength. You can either generate fake test data or provide an existing HFile. It tests scans and seeks and outputs a summary of performance and memory/disk usage at the end so you can find the best settings for your use case.

        It's lurking somewhere in my git repo. I was planning to dig it up at the meetup tomorrow and get it working again. Maybe we can combine all these benchmarks somehow.

        Jean-Marc Spaggiari added a comment -

        Regarding --rows=20: it's because the default value is 10240x1024, which would take forever. 20 is a bit small; I'm using 100 now and the test runs for about 9 minutes, which seems right. I agree that PE should be updated to call SequentialWriteTest first when required, or at least this should be specified in the documentation.

        I never looked at TestHFilePerformance... But I will

        Andrew Purtell added a comment -

        I think PE needs some TLC.

        I've been using TestHFilePerformance for HFile microbenchmarking. It's passable but could use some TLC too. For minicluster "full system tests", I use variations on or extensions to TestMiniClusterLoadXXX.

        Lars Hofhansl added a comment -

        I think PE needs some TLC.

        In its current form it is not that useful (to say it bluntly), and it is 100% not obvious how to use it (I had to look at the source code to figure out what the filterScan test is supposed to do).

        At the very least we should add to the help text that one should seed the table first with a SequentialWriteTest, picking the right number of rows. The HBase wiki seems to imply that SequentialWriteTest is automatically run unless we run it in M/R mode, but it looks like that is not true.

        The help text does say to run FilterScanTest with --rows=20, not entirely sure why.

        Jean-Marc Spaggiari added a comment -

        Hi Lars,

        I see your point! I looked at the table while the test was running and it's empty. I was relying on the PerformanceEvaluation class to create the data it needs for its tests... which is not the case. Now --rows=100 is running way slower, which is more realistic. Thanks for correcting me. I will update my script to make sure SequentialWriteTest is called before all read/scan tests...
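        A minimal sketch of that change to the loop, assuming the standard PE command names (sequentialWrite to seed the table, then the read/scan test against it):

        for i in {1..10}; do
          rm -rf /tmp/*; bin/start-hbase.sh; sleep 60
          # seed the test table first...
          bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=100000 sequentialWrite 1
          # ...then run the read/scan test against the seeded table
          bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=100 filterScan 1
          bin/stop-hbase.sh
        done &>> output.txt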

        Lars Hofhansl added a comment -

        JM, what I am saying is that
        bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=100000 filterScan 1
        opens 100000 scanners and calls next until there are no more rows. That would "never" finish unless each of your scans returns almost no data. You are not testing scan performance with that, but RPC/network performance (if you're testing locally you're testing context-switch performance through the lo interface).

        Do you seed the table by calling SequentialWriteTest before running each read test?
        (I find that I have to do that, and otherwise there is just an empty test table)

        The performance improvement after 0.94.0 that I measure this way is in line with other tests I did before.
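        Concretely, the sequence being described would be something like this (a sketch, assuming the standard PE command names):

        # seed the test table once...
        bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=100000 sequentialWrite 1
        # ...then run the scan test, where --rows is the number of scanners opened
        bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=20 filterScan 1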

        Jean-Marc Spaggiari added a comment -

        Lars Hofhansl Lars, I'm running in standalone mode for those tests, so I don't think the network is really impacting the results. Also, HBase is stopped between each iteration, so the cache is not really helping one iteration versus another. And last, I'm deleting all ZooKeeper and HBase data between each iteration too. I have attached an updated report with more results. I'm currently running each test 10 times and taking the 80th percentile, but I can increase that and run each test 20 or 50 times if that helps. I'm now running randomSeekScan. Results will come over the day.
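        (For illustration, picking the 80th percentile out of 10 timings for one test could look like the sketch below, where times.txt is a hypothetical file holding one result in ms per line.)

        sort -n times.txt | awk '{ v[NR] = $1 } END { print v[int(NR * 0.8)] }'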

        Lars Hofhansl added a comment -

        Looking at the other tests in PerformanceEvaluation, I don't get what it is actually useful for. The RandomReadTest sends single Gets to the server(s), so it is mostly the network RTT. Similarly, scanner caching is set to 30, which also means that for the scan tests we're mostly measuring the network.

        OK... I am less worried now

        Ted Yu That might be a good idea. In the end I am not sure it is worth it, though. For most folks these metrics will (IMHO) be more important than a 2.5% performance gain (and it's only 2.5% if all rows are filtered out at the server, much less percentage-wise if we actually returned data to the client).
        That said, we should try to make collection of these metrics better (lazy or approximate as Andy also suggests).

        Lars Hofhansl added a comment -

        Jean-Marc Spaggiari PerformanceEvaluation --rows=100000 filterScan 1 runs the filterScan test 100000 times, not, as you might expect, a test with 100000 rows.

        When I seed the table with 100000 rows and then run PerformanceEvaluation --rows=20 filterScan 1 it takes ~4100ms in 0.94.0 and ~2900ms in 0.94.5.

        Ted Yu added a comment -

        How about making schemaMetrics dynamically configurable?
        When people want to investigate issues with their cluster, they could enable it without restarting the region server.

        Jean-Marc Spaggiari added a comment -

        I'm using the PerformanceEvaluation class coming with HBase...

        And then simply loop over all my HBase folders.

        #!/bin/bash
        export JAVA_HOME=/usr/local/jdk1.7.0_05/
        # loop over every extracted HBase release directory (hbase-0.92.x, hbase-0.94.x, ...)
        for d in `ls | grep hbase-`; do
        cd $d
        echo `date` Starting tests for `pwd`
        rm -f output.txt
        
        echo Test starting for filterScan
        # 10 iterations; wipe the standalone-mode data in /tmp and restart HBase each time
        for i in {1..10}; do rm -rf /tmp/*; bin/start-hbase.sh; sleep 60; bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=100000 filterScan 1; bin/stop-hbase.sh; done &>> output.txt
        
        echo Test starting for randomRead
        for i in {1..10}; do rm -rf /tmp/*; bin/start-hbase.sh; sleep 60; bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation randomRead 1; bin/stop-hbase.sh; done &>> output.txt
        
        # And so on...
        
        echo Done with tests for `pwd`
        cd ..
        done
        

        I'm running that on a dedicated laptop where there is absolutely nothing else running. The X environment is down; cups, exim, etc. are all down too.

        Lars Hofhansl added a comment -

        I'd be interested in the test code.

        Jean-Marc Spaggiari added a comment -

        Attached is the file with the values. I can also upload the entire spreadsheet with all the formulas if required. Anyway, I will share it when I have all the 0.94.x numbers. I ran those tests from the HBase distributions directly, out of the box: no configuration, no tweaking, nothing, on a 4-core laptop with only 2GB of RAM. I can try the same script and produce the same report on other hardware if that helps.

        Lars Hofhansl added a comment -

        I did some admittedly unscientific benchmarking here: http://hadoop-hbase.blogspot.com/2012/12/hbase-profiling.html

        Lars Hofhansl added a comment - edited

        Hmmm... Interesting. So with each point release 0.94 has become slower? With the biggest drop in 0.94.4?
        This is not what my testing bore out (and we also had our Phoenix team test the releases).

        Are you 100% sure you did not invert the axis?

        Jean-Marc Spaggiari added a comment -

        Rows per second.

        Results are quite interesting. For RandomWriteTest, for example, there is a huge improvement between 0.94.2 and 0.94.3.

        When all the tests are done for 0.94, I will add 0.92 to the list.

        Lars Hofhansl added a comment -

        Did the same scan test with cacheBlocks disabled for the scan. All data is in the OS cache (no actual disk reads). With that I cannot discern any change between commenting out the schemaMetrics stuff and not.

        Jean-Marc Spaggiari What are you plotting? Time or number of ops? I assume it's time, since all of my perf improvements went into 0.94.4.

        Jean-Marc Spaggiari added a comment -

        Attached are the FilteredScan test results from PerformanceTest. There is only 0.94.x for now, but 0.92.x is coming soon. As you can see, there is also some performance impact between 0.94.0 and 0.94.5 (about 1%).

        Lars Hofhansl added a comment -

        Elliott Clark Presumably the overhead of the metric update would be negligible compared to the time/effort it takes to actually load the block. At least that was my assumption when reviewing HBASE-6852. Maybe I assumed incorrectly.

        Lars Hofhansl added a comment -

        Did some scan testing... Scanning 20m KVs through the high-level scan API, such that all KVs are touched but filtered by a Filter at the server and all data is in the block cache (so we can test tight scan performance).

        I do see an improvement when the schemaMetrics calls are commented out in HFileReaderV2.
        The scan time went from ~11.9 to ~11.6, so about a 2.5% improvement.

        As Andy points out, we need to keep these metrics around, and while HBASE-6852 improved things (cache hits are maintained lazily), there might be further improvements we can make.

        Elliott Clark added a comment -

        HBASE-6852 is for the cache hit branch. For cache misses (all of the hfile perf runs have caching off) we still do the full concurrent hashmap lookup. That part is slow.

        We can get some perf back by just following what the cache-hit path does (an array of atomic longs looked up by index). There may still be other things that need to get faster, or better ways; that was just my first thought.

        Also it would be nice if we could turn the per-CF metrics off for people that are very perf conscious.

        Andrew Purtell added a comment -

        Thought this issue might be a dup of HBASE-6852 too.

        As for per-CF metrics, we need something like this for things like HBASE-4147 and HBASE-6572, but perhaps more lazy and/or approximate and/or out of line than current.

        Lars Hofhansl added a comment -

        On my machine I only find about a 4% difference between stock 0.94 and a version with all schemaMetrics calls commented out in HFileReaderV2. My guess is that that will translate to less than 0.5% of actual Scan or Get performance.

        Lars Hofhansl added a comment -

        You're not saying this is caused by HBASE-6852, right?
        HBASE-6852 should make this better, but did not eliminate the overhead completely.

        Lars Hofhansl added a comment -

        Interesting. This did not come up in any of the scan profiling I did in the past.
        But it looks like we can get another boost from fixing this!

        Jean-Marc Spaggiari added a comment -

        I have a script which runs all the PerformanceEvaluation tests for all HBase 0.94.x versions.

        I will download all the 0.92.x versions too and restart it. I will publish what I find. The goal was to PerformanceTest the new releases, but I can still run that for past versions.

        Ted Yu added a comment -

        For sequential write, the average performance of 0.94 without metrics is a bit slower than that of 0.92.

        Matteo Bertozzi added a comment -

        Ted Yu graph 8 is the average of the 7 runs

        These graphs were done this morning on my machines, so I expect some weird runs,
        but you can get the same results on a proper machine.
        (I've started benchmarking this stuff days ago, and I always got the same avg results)

        Ted Yu added a comment - edited

        Looking at the graph, 0.94 without metrics was faster than stock 0.94 in most of the samples.

        Matteo Bertozzi added a comment -

        Jean-Marc Spaggiari The latest one (bench done this morning, after a git pull).
        But since the hotspot is SchemaMetrics and the performance regression can be seen just by looking at HFile, my guess is that it affects all of 0.94.

        Anyway, if you want, you can start looking at the performance pre-HBASE-6852.

        Jean-Marc Spaggiari added a comment -

        Hi Matteo,

        Which 0.94.x version did you use for this test?

        Thanks,

        JM

        Matteo Bertozzi added a comment -

        Attached is a graph extracted from HFilePerformanceEvaluation. Aside from some strange runs, the average seems to say that 0.92 is still faster than 0.94, and that 0.94 with metrics commented out in HFileReaderV2 is way faster than stock 0.94.


          People

          • Assignee: Matteo Bertozzi
          • Reporter: Matteo Bertozzi
          • Votes: 0
          • Watchers: 16
