  1. Phoenix
  2. PHOENIX-7002

Insufficient logging in Phoenix client when server throws StaleRegionBoundaryCacheException


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.2.0, 5.1.4
    • Component/s: None
    • Labels: None

    Description

      We saw an incident in a production cluster where a Phoenix range scan query returned results outside the range provided by the customer. hbck repair runs were in progress while the query was running. When the query started, there were region holes in the table (we have no way to confirm this), and while the query was still running we ran an hbck repair operation that caused region overlaps (this is confirmed, since the overlaps persisted after the query finished).
      Unfortunately, there were no exceptions, errors, or stack traces on either the client or the server side.
      After a query runs, we log its execution time and the number of exceptions it encountered in a single log line. That line shows that this query encountered StaleRegionBoundaryCacheException 4 times.

      BaseResultIterators contains logic that adjusts the start and end keys of the scan range.
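
To illustrate why scan-boundary adjustment matters for a range query, here is a minimal sketch of clipping a scan to the intersection of the query range and a region's boundaries. It is not the code in BaseResultIterators; the class and method names are hypothetical, and the only convention borrowed from HBase is that an empty key means "unbounded" on that side.

```java
import java.util.Arrays;

// Hypothetical illustration only; not Phoenix's BaseResultIterators logic.
class ScanRangeClipSketch {

    // HBase convention: an empty key means "unbounded" on that side.
    static boolean unbounded(byte[] key) {
        return key.length == 0;
    }

    // The later of two start keys (an unbounded start is the earliest possible).
    static byte[] laterStart(byte[] a, byte[] b) {
        if (unbounded(a)) return b;
        if (unbounded(b)) return a;
        return Arrays.compareUnsigned(a, b) >= 0 ? a : b;
    }

    // The earlier of two stop keys (an unbounded stop is the latest possible).
    static byte[] earlierStop(byte[] a, byte[] b) {
        if (unbounded(a)) return b;
        if (unbounded(b)) return a;
        return Arrays.compareUnsigned(a, b) <= 0 ? a : b;
    }

    // Returns {start, stop} for the part of the region that lies inside the query range.
    static byte[][] clip(byte[] queryStart, byte[] queryStop,
                         byte[] regionStart, byte[] regionStop) {
        return new byte[][] {
            laterStart(queryStart, regionStart),
            earlierStop(queryStop, regionStop)
        };
    }
}
```

If a scan's keys were instead taken from region boundaries that are stale or overlapping, without being intersected with the query range, rows outside the requested range could be returned. Whether that is what happened in this incident is not known.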

      Without knowing the client's view of hbase:meta or the exceptions it encountered, it is very difficult to debug why this happened.

      At the very least, we should log every exception encountered on the Phoenix client side.
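
The kind of client-side logging requested here could look roughly like the sketch below: each stale-boundary failure is logged with the attempt number and the scan's key range before the scan is retried. This is an assumption-laden illustration, not the fix that was committed. `StaleBoundarySketchException` is a hypothetical unchecked stand-in for Phoenix's StaleRegionBoundaryCacheException, and `ScanRetryLoggingSketch`, `runWithLogging`, and `LOGGED` are invented names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for StaleRegionBoundaryCacheException, defined here
// only so that the sketch compiles on its own.
class StaleBoundarySketchException extends RuntimeException {
    StaleBoundarySketchException(String message) {
        super(message);
    }
}

class ScanRetryLoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(ScanRetryLoggingSketch.class.getName());

    // Copy of every logged message, kept so the behaviour can be inspected.
    static final List<String> LOGGED = new ArrayList<>();

    // Runs the scan, logging every stale-boundary failure before retrying.
    // Rethrows the last failure once maxAttempts has been reached.
    static <T> T runWithLogging(Supplier<T> scan, byte[] startKey,
                                byte[] stopKey, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                return scan.get();
            } catch (StaleBoundarySketchException e) {
                String msg = String.format(
                    "Stale region boundary on attempt %d of %d, scan range [%s, %s)",
                    attempt, maxAttempts,
                    Arrays.toString(startKey), Arrays.toString(stopKey));
                LOGGED.add(msg);
                // Passing the exception keeps the stack trace in the log.
                LOG.log(Level.WARNING, msg, e);
                if (attempt >= maxAttempts) {
                    throw e;
                }
            }
        }
    }
}
```

With logging like this, a query that hit the exception 4 times would leave 4 warning entries, each showing the key range in use at that moment, rather than only a count in the end-of-query summary line.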

            People

              Assignee: Divneet Kaur (divneet18)
              Reporter: Rushabh Shah (shahrs87)
              Votes: 0
              Watchers: 4
