Accumulo / ACCUMULO-4391

Source deepcopies cannot be used safely in separate threads in tserver

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.5
    • Fix Version/s: 1.6.6, 1.7.3, 1.8.1, 2.0.0
    • Component/s: core
    • Labels: None

      Description

      We have iterators that create deep copies of the source and use them in separate threads. As it turns out, this is not safe, and we end up with many exceptions, mostly down in the ZlibDecompressor library. Curiously, if you turn on the data cache for the table being scanned, the errors disappear.
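The failure mode above comes down to deep copies sharing state. A minimal sketch (hypothetical `Source` class, not Accumulo's actual iterator API) of why separate copies still contend on one stream:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical model of the scan setup described above: deepCopy() hands
// back a new Source, but every copy shares the same underlying stream
// (as Accumulo deep copies ultimately share one file reader). Any access
// to that shared stream must therefore hold a common lock.
class Source {
    private final InputStream shared; // one stream behind all copies

    Source(InputStream shared) {
        this.shared = shared;
    }

    // A deep copy is independent iterator state over the SAME stream.
    Source deepCopy() {
        return new Source(shared);
    }

    int read() {
        // Safe only because every code path that touches the shared
        // stream locks it; one unlocked path breaks all of them.
        synchronized (shared) {
            try {
                return shared.read();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

Because the copies share the stream, a read through one copy advances the position seen by the others, which is why unsynchronized access from multiple threads corrupts the decompressor state.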

      After much hunting, it turns out that the real bug is in the BoundedRangeFileInputStream. The read() method therein appropriately synchronizes on the underlying FSDataInputStream; however, the available() method does not. Adding the same synchronization to that stream fixes the issue. On a side note, available() is only invoked from Hadoop's CompressionInputStream for use in the getPos() call, and that call does not appear to actually be used, at least in this context.
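The shape of the fix can be sketched as follows. This is a simplified stand-in (plain `InputStream` over a byte range, not the real Hadoop class), but it shows the lock discipline: available() must synchronize on the same shared stream that read() already does.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Simplified model of BoundedRangeFileInputStream: a bounded view over a
// stream that is shared across deep copies. read() already held the lock
// in the original code; the bug was that available() did not.
class BoundedRangeStream extends InputStream {
    private final InputStream underlying; // shared by all deep copies
    private long remaining;               // bytes left in this range

    BoundedRangeStream(InputStream underlying, long length) {
        this.underlying = underlying;
        this.remaining = length;
    }

    @Override
    public int read() {
        synchronized (underlying) { // the original code already did this
            try {
                if (remaining <= 0)
                    return -1;
                int b = underlying.read();
                if (b >= 0)
                    remaining--;
                return b;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    @Override
    public int available() {
        synchronized (underlying) { // the fix: same lock as read()
            try {
                return (int) Math.min(remaining, underlying.available());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

Without the lock in available(), a concurrent read() on another deep copy can move the shared stream's position mid-call, which matches the ZlibDecompressor errors described above.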


              People

              • Assignee: ivan.bella (Ivan Bella)
              • Reporter: ivan.bella (Ivan Bella)
              • Votes: 0
              • Watchers: 4

                Dates

                • Created:
                • Updated:
                • Resolved:

                  Time Tracking

                  • Estimated: 24h
                  • Remaining: 7h 10m
                  • Logged: 16h 50m