Accumulo - ACCUMULO-3067

scan performance degrades after compaction

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: tserver
    • Labels: None
    • Environment:

      Macbook Pro 2.6 GHz Intel Core i7, 16GB RAM, SSD, OSX 10.9.4, single tablet server process, single client process

      Description

      I've been running some scan performance tests on 1.6.0, and I'm running into an interesting situation in which query performance starts at a certain level and then degrades by ~15% after an event. The test roughly follows this scenario:

      1. Single tabletserver instance
      2. Load 100M small (~10-byte) key/values into a tablet and let it finish major compacting
      3. Disable the garbage collector (this makes the time to the event longer)
      4. Restart the tabletserver
      5. Repeatedly scan from the beginning to the end of the table in a loop
      6. Something happens on the tablet server, like one of {idle compaction of metadata table, forced flush of metadata table, forced compaction of metadata table, forced flush of trace table}
      7. Observe that scan rates drop by 15-20%
      8. Observe that restarting the scan does not restore the original performance. Performance only recovers after restarting the tablet server.
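      The scan loop in steps 5-8 can be sketched as a small measurement harness. This is illustrative only: a plain in-memory Iterable stands in for the Accumulo Scanner the real test iterates over, and all names and sizes here are invented, not taken from the ticket's test code.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map.Entry;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ScanRateHarness {
    // Consume every key/value, as a full start-to-end table scan would.
    static long scanOnce(Iterable<? extends Entry<String, String>> table) {
        long count = 0;
        for (Entry<String, String> e : table) count++;
        return count;
    }

    public static void main(String[] args) {
        // Stand-in data; the real test iterates an Accumulo Scanner over the
        // 100M-entry table instead of an in-memory list.
        List<SimpleEntry<String, String>> table = IntStream.range(0, 1_000_000)
            .mapToObj(i -> new SimpleEntry<>("row" + i, "v"))
            .collect(Collectors.toList());
        // Step 5: scan repeatedly and report the observed rate per pass, so a
        // sudden drop between passes (the ~15-20% cliff) is visible.
        for (int pass = 1; pass <= 3; pass++) {
            long start = System.nanoTime();
            long n = scanOnce(table);
            double secs = (System.nanoTime() - start) / 1e9;
            System.out.printf("pass %d: %d entries, %.0f entries/sec%n", pass, n, n / secs);
        }
    }
}
```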

      I've been able to prevent this by removing iterators from the iterator tree. Which iterators doesn't seem to matter, but removing a certain number both improves performance significantly and eliminates the degradation problem. The default iterator tree includes:

      • SourceSwitchingIterator
        • VersioningIterator
          • SynchronizedIterator
            • VisibilityFilter
              • ColumnQualifierFilter
                • ColumnFamilySkippingIterator
                  • DeletingIterator
                    • StatsIterator
                      • MultiIterator
                        • MemoryIterator
                        • ProblemReportingIterator
                          • HeapIterator
                            • RFile.LocalityGroupReader

      We can eliminate the weird condition by narrowing the set of iterators to:

      • SourceSwitchingIterator
        • VisibilityFilter
          • ColumnFamilySkippingIterator
            • DeletingIterator
              • StatsIterator
                • MultiIterator
                  • MemoryIterator
                  • ProblemReportingIterator
                    • HeapIterator
                      • RFile.LocalityGroupReader
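      One way to picture why trimming the tree helps at all: each retained layer wraps the one below it, so every entry returned to the client passes through one virtual call per layer. The sketch below is not Accumulo's SortedKeyValueIterator API, just a minimal stand-in showing how stack depth sets the per-entry dispatch count; the interface and class names are invented.

```java
import java.util.Iterator;
import java.util.List;

public class StackedIterators {
    // Minimal stand-in for a server-side key/value iterator interface.
    interface KVIter { boolean hasTop(); String top(); void next(); }

    // Bottom of the stack: reads from a data source (think RFile/memory).
    static class SourceIter implements KVIter {
        private final Iterator<String> src;
        private String cur;
        SourceIter(List<String> data) { src = data.iterator(); next(); }
        public boolean hasTop() { return cur != null; }
        public String top() { return cur; }
        public void next() { cur = src.hasNext() ? src.next() : null; }
    }

    // A pass-through layer, like VisibilityFilter or StatsIterator in spirit:
    // it does nothing here except forward each call one level down.
    static class Wrapper implements KVIter {
        private final KVIter source;
        Wrapper(KVIter source) { this.source = source; }
        public boolean hasTop() { return source.hasTop(); }
        public String top() { return source.top(); }
        public void next() { source.next(); }
    }

    static KVIter buildStack(List<String> data, int depth) {
        KVIter it = new SourceIter(data);
        for (int i = 0; i < depth; i++) it = new Wrapper(it); // one hop per layer
        return it;
    }

    static long drain(KVIter it) {
        long n = 0;
        for (; it.hasTop(); it.next()) n++; // each iteration: depth+1 virtual calls
        return n;
    }

    public static void main(String[] args) {
        // Same data, a deep stack (like the default tree) vs a trimmed one:
        // identical results, but every entry costs twice the dispatch work.
        List<String> data = List.of("a", "b", "c");
        System.out.println(drain(buildStack(data, 10))); // 3
        System.out.println(drain(buildStack(data, 5)));  // 3
    }
}
```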

      There are other combinations that also perform much better than the default. I haven't been able to isolate this problem to a single iterator, despite removing each iterator one at a time.

      Anybody know what might be happening here? Best theory so far: the JVM observes the iterators being used in a different way during a compaction, and some JVM optimization like JIT compilation, branch prediction, or automatic inlining stops happening.
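      For what it's worth, that theory is consistent with how HotSpot treats interface call sites: while a hot call site has only ever seen one or two concrete receiver types, the JIT can inline the target; once a third type flows through the same site (say, a compaction driving the shared iterator code with different concrete sources), the site becomes megamorphic, and the compiled code falls back to full dynamic dispatch and generally does not recover until the code is recompiled, e.g. after a process restart. The sketch below only illustrates the shape of that situation; the class names are invented, and it asserts nothing about timings.

```java
public class MegamorphicSketch {
    interface KeyFilter { boolean accept(int key); }
    static class EvenFilter implements KeyFilter { public boolean accept(int k) { return k % 2 == 0; } }
    static class AllFilter  implements KeyFilter { public boolean accept(int k) { return true; } }
    static class NoneFilter implements KeyFilter { public boolean accept(int k) { return false; } }

    // The shared hot call site: f.accept(i) is where inlining does or does
    // not happen, depending on how many receiver types HotSpot has profiled.
    static long countAccepted(KeyFilter f, int n) {
        long c = 0;
        for (int i = 0; i < n; i++) if (f.accept(i)) c++;
        return c;
    }

    public static void main(String[] args) {
        // Phase 1: scans only ever use one filter type -> the call site is
        // monomorphic and the accept() body can be inlined into the loop.
        System.out.println(countAccepted(new EvenFilter(), 1_000_000)); // 500000
        // The "event": other code (think compaction) drives the same method
        // with two more types, pushing the site past the bimorphic limit.
        countAccepted(new AllFilter(), 1_000_000);
        countAccepted(new NoneFilter(), 1_000_000);
        // Phase 2: the original scans are functionally unchanged, but the
        // now-megamorphic call site typically stays on the slower dispatch
        // path for the life of the process.
        System.out.println(countAccepted(new EvenFilter(), 1_000_000)); // 500000
    }
}
```

Running with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining is one way to check whether inlining decisions at a site like this actually change after the extra types appear.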

      1. jit_log_during_compaction.txt
        5 kB
        Adam Fuchs
      2. accumulo_query_perf_test.tar.gz
        1 kB
        Adam Fuchs
      3. Screen Shot 2014-08-19 at 4.19.37 PM.png
        166 kB
        Adam Fuchs

        Activity

        Adam Fuchs made changes -
        Attachment jit_log_during_compaction.txt [ 12663399 ]
        Adam Fuchs made changes -
        Description edited
        Adam Fuchs made changes -
        Description edited
        Adam Fuchs made changes -
        Attachment accumulo_query_perf_test.tar.gz [ 12662854 ]
        Adam Fuchs made changes -
        Attachment Screen Shot 2014-08-19 at 4.19.37 PM.png [ 12662852 ]
        Adam Fuchs created issue -

          People

          • Assignee: Unassigned
          • Reporter: Adam Fuchs
          • Votes: 0
          • Watchers: 3