ACCUMULO-4391

Source deepcopies cannot be used safely in separate threads in tserver

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.5
    • Fix Version/s: 1.6.6, 1.7.3, 1.8.1, 2.0.0
    • Component/s: core
    • Labels: None

      Description

      We have iterators that create deep copies of the source and use them in separate threads. As it turns out, this is not safe, and we end up with many exceptions, mostly down in the ZlibDecompressor library. Curiously, if you turn on the data cache for the table being scanned, the errors disappear.
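
      For context, here is a minimal sketch of the usage pattern described above. The class and variable names are hypothetical and only SortedKeyValueIterator.deepCopy() is the real Accumulo API; seek/next calls are omitted.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.IteratorEnvironment;
import org.apache.accumulo.core.iterators.SortedKeyValueIterator;

// Hypothetical helper, not part of Accumulo: hands a deep copy of the source
// to each worker thread, which is the pattern that exposed this bug.
class ParallelSourceSketch {
  void scanInParallel(SortedKeyValueIterator<Key,Value> source, IteratorEnvironment env,
      int numThreads) {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    for (int i = 0; i < numThreads; i++) {
      SortedKeyValueIterator<Key,Value> copy = source.deepCopy(env);
      pool.submit(() -> {
        // Each worker would seek() its own copy and iterate with hasTop()/next().
        // The copies still share the underlying FSDataInputStream and pooled
        // decompressors, which is where the races described below show up.
      });
    }
    pool.shutdown();
  }
}
{code}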

      After much hunting it turns out that the real bug is in the BoundedRangeFileInputStream. The read() method therein appropriately synchronizes on the underlying FSDataInputStream; however, the available() method does not. Adding similar synchronization on that stream fixes the issues. On a side note, the available() call is only invoked within the Hadoop CompressionInputStream for use in the getPos() call, and that call does not appear to actually be used, at least in this context.
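
      For illustration, a minimal sketch of the fix described above, using a simplified stand-in for BoundedRangeFileInputStream (class and field names here are not the actual Accumulo code): read() already locks the shared FSDataInputStream, and available() gets the same guard.

{code:java}
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.fs.FSDataInputStream;

// Simplified bounded view over a shared FSDataInputStream.
class BoundedRangeStreamSketch extends InputStream {
  private final FSDataInputStream in; // shared by every deep copy / thread
  private long pos;
  private final long end;

  BoundedRangeStreamSketch(FSDataInputStream in, long offset, long length) {
    this.in = in;
    this.pos = offset;
    this.end = offset + length;
  }

  @Override
  public int read() throws IOException {
    synchronized (in) { // existing pattern: seek + read atomically on the shared stream
      if (pos >= end) {
        return -1;
      }
      in.seek(pos);
      int b = in.read();
      if (b >= 0) {
        pos++;
      }
      return b;
    }
  }

  @Override
  public int available() throws IOException {
    synchronized (in) { // the synchronization that was missing
      int avail = in.available(); // touches the shared stream, hence the races
      if (pos + avail > end) {
        avail = (int) (end - pos);
      }
      return avail;
    }
  }
}
{code}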


          Activity

          Ivan Bella added a comment -

          I have a test case which demonstrates the issue and the fix which I will create a pull request for soon.

          Ivan Bella added a comment -

          https://github.com/apache/accumulo/pull/134

          Keith Turner added a comment -

           Ivan Bella, while looking at the pull request for a bit, I am wondering if you have a sense of what specifically is causing the problem? I have read back over the comments to try to determine this. Do you think one thread is closing an RFile while other threads are reading from deep copies of that RFile? If so, do you think this is resulting in the threads reading from deep copies of a closed RFile using decompressors that were already returned? And that this is resulting in multiple threads using the same decompressor?

          Ivan Bella added a comment -

           Keith Turner There was a slew of different exceptions that would occur, mostly stemming from the same decompressor being used by multiple threads:
           1) Most of them stemmed from using the same decompressor across multiple threads. The root cause was that the same decompressor was being returned to the codec pool multiple times, and later that same decompressor was being handed out to multiple threads, resulting in mass chaos. This is basically solved by ensuring that the decompressor is returned only once in the BCFile finish call (see the sketch after this list).
           2) The second set of issues stemmed from close being called at the same time other threads were reading from the FSDataInputStream. This resulted in the decompressor being returned underneath and subsequently reused by some other read while a read was still in progress. This is solved by the synchronization added in the BoundedRangeFileInputStream. The synchronization within the CacheableBlockFile was added more from examination of the code, to ensure we were not closing the same FSDataInputStream concurrently while reading/closing within the BoundedRangeFileInputStream.
           3) The third problem was the available call being invoked concurrently on the FSDataInputStream while it was being closed in the BoundedRangeFileInputStream. I initially added synchronization there as well; however, after our initial discussions in the pull request it was determined (and verified on our system) that the available call is really only used during initialization of the CompressionInputStream, the results of which feed the getPos call, which is not used in this context. Hence simply avoiding the underlying available call seemed the best course of action.
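
           To illustrate point 1, here is a hedged sketch of the return-only-once idea; the holder class is hypothetical and not the actual BCFile code, while CodecPool and Decompressor are the real Hadoop APIs.

{code:java}
import java.util.concurrent.atomic.AtomicReference;

import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.Decompressor;

// Hypothetical holder: even if finish()/close() is reached more than once,
// the decompressor goes back to the pool at most once.
class DecompressorHolder {
  private final AtomicReference<Decompressor> decompressor;

  DecompressorHolder(Decompressor d) {
    this.decompressor = new AtomicReference<>(d);
  }

  Decompressor get() {
    return decompressor.get();
  }

  void returnToPool() {
    Decompressor d = decompressor.getAndSet(null); // atomically claim ownership
    if (d != null) {
      CodecPool.returnDecompressor(d); // returned exactly once
    }
  }
}
{code}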

          marco polo added a comment -

          Why is that decompressor being shared? Why isn't the thread being given access to its own decompressor on its own block read?

          Ivan Bella added a comment -

          For case #1, the issue is more that the same decompressor was being added back into the pool multiple times. That was most likely done on the same thread. I will have to do some research to determine why close was being called multiple times. My fix still allows that to happen, but the decompressor would only be returned once.
           For case #2, it was the root RFile.Reader being closed on the main thread, or perhaps a cleanup thread, because the root scan was complete or the client was gone, but one of my threads was still reading from it. This is normally not a problem because we no longer care about the results; however, when it results in array creation issues and subsequent memory issues, then we have problems.
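
           As a rough sketch of case #2 (illustrative names, not the actual Accumulo classes), the idea is that close() and the read path share one monitor, so a close from the main or cleanup thread cannot interleave with an in-flight read on a deep copy:

{code:java}
import java.io.IOException;
import java.io.InputStream;

// Illustrative block reader: one lock covers both reads and close, so the
// underlying stream (and its pooled decompressor) cannot be released while
// another thread is in the middle of a read.
class BlockReaderSketch {
  private final Object lock = new Object();
  private final InputStream decompressedStream; // wraps the shared FSDataInputStream
  private boolean closed = false;

  BlockReaderSketch(InputStream decompressedStream) {
    this.decompressedStream = decompressedStream;
  }

  int read(byte[] buf, int off, int len) throws IOException {
    synchronized (lock) {
      if (closed) {
        return -1; // reader was closed out from under us; fail gracefully
      }
      return decompressedStream.read(buf, off, len);
    }
  }

  void close() throws IOException {
    synchronized (lock) { // cannot interleave with an in-flight read
      if (!closed) {
        closed = true;
        decompressedStream.close(); // in the real code this is what releases the decompressor
      }
    }
  }
}
{code}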

          Josh Elser added a comment -

          Thanks for these last two posts, Ivan Bella. They are extremely helpful to understand the big picture.

          Ivan Bella added a comment -

          You are very welcome.

          Christopher Tubbs added a comment -

          Pull request merged, and conflicts resolved in newer branches. Leaving the issue open for now, pending further tests to add. If they don't get added in the next few days (before the 1.6.6 release candidates), I'll close this and create a new JIRA for more tests as a follow-up task.

          Christopher Tubbs added a comment -

          Closing this. I created ACCUMULO-4455 for more tests. Please comment on or amend that issue to suggest specific test cases which could/should be added, so folks don't have to mine them from this discussion or the PR. Thanks


            People

            • Assignee: Ivan Bella
            • Reporter: Ivan Bella
            • Votes: 0
            • Watchers: 4


                Time Tracking

                • Original Estimate: 24h
                • Remaining Estimate: 7h 10m
                • Time Spent: 16h 50m
