  1. Phoenix
  2. PHOENIX-4018

HashJoin may produce nulls for LHS table columns

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Duplicate
    • 4.11.0

    Description

      The problem: HashJoinRegionScanner methods (nextRow, for example) use the same scanner context that was created in RSRpcServices, and that context carries limits (e.g. a 2 MB size limit). Suppose we have a 3 MB region and the only key that matches the join condition is located at the end of the region. As HashJoinRegionScanner#nextRow iterates through the region rows, once the 2 MB limit is reached every region scanner nextRow call returns a single cell and the scanner context is left in the SIZE_LIMIT_REACHED_MID_ROW state. Nothing checks for that state, so the single cell is treated as a complete row in which every column except one is null.

      How to fix it:
      1. For the region scanner we can pass a NoLimitScannerContext, so we never get a partial result.
      2. We need to update the scanner context that we got from RSRpcServices with the real data, based on the size of the results we are going to return.
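
The failure mode and the two-part fix can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: it does not use the HBase or Phoenix APIs, and `Context`, `Scanner`, `join`, and the "one cell costs one unit" size accounting are hypothetical stand-ins for ScannerContext, the region scanner, and HashJoinRegionScanner#nextRow. It is not the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PartialRowSketch {
    // Hypothetical stand-in for a scanner context: a size budget shared across calls.
    static class Context {
        final long sizeLimit;      // Long.MAX_VALUE models a "no limit" context
        long sizeProgress = 0;     // every cell costs 1 unit in this toy model
        boolean midRow = false;    // models SIZE_LIMIT_REACHED_MID_ROW; the naive path never reads it
        Context(long sizeLimit) { this.sizeLimit = sizeLimit; }
    }

    // Hypothetical region scanner: each row is an array of cells, first cell is the key.
    static class Scanner {
        private final List<String[]> rows;
        private int row = 0, col = 0;
        Scanner(List<String[]> rows) { this.rows = rows; }

        // Appends cells of the current row to 'out'. Once the budget is used up it
        // stops after a single cell and flags midRow. Returns true if more data remains.
        boolean nextRow(List<String> out, Context ctx) {
            ctx.midRow = false;
            if (row >= rows.size()) return false;
            String[] cells = rows.get(row);
            while (col < cells.length) {
                out.add(cells[col++]);
                ctx.sizeProgress++;
                if (ctx.sizeProgress >= ctx.sizeLimit && col < cells.length) {
                    ctx.midRow = true;   // partial row handed back to the caller
                    return true;
                }
            }
            row++;
            col = 0;
            return row < rows.size();
        }
    }

    // Toy join: keep rows whose key equals matchKey, padded to 'width' columns.
    // useNoLimit=false reuses the caller's limited context (the buggy behaviour);
    // useNoLimit=true scans with an unlimited context (fix 1) and charges the
    // caller's context only for what is actually returned (fix 2).
    static List<List<String>> join(List<String[]> region, String matchKey, int width,
                                   Context callerCtx, boolean useNoLimit) {
        Context inner = useNoLimit ? new Context(Long.MAX_VALUE) : callerCtx;
        Scanner scanner = new Scanner(region);
        List<List<String>> result = new ArrayList<>();
        boolean more = true;
        while (more) {
            List<String> cells = new ArrayList<>();
            more = scanner.nextRow(cells, inner);
            if (cells.isEmpty() || !cells.get(0).equals(matchKey)) continue;
            if (useNoLimit) callerCtx.sizeProgress += cells.size();
            while (cells.size() < width) cells.add(null); // missing columns surface as nulls
            result.add(cells);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> region = Arrays.asList(
            new String[] {"k1", "a1", "b1"},
            new String[] {"k2", "a2", "b2"},
            new String[] {"k3", "a3", "b3"});  // the only matching key is at the end
        System.out.println(join(region, "k3", 3, new Context(4), false)); // [[k3, null, null]]
        System.out.println(join(region, "k3", 3, new Context(4), true));  // [[k3, a3, b3]]
    }
}
```

In this model the limit is hit while scanning the second row, so by the time the scan reaches the matching key the scanner is handing back one cell per call, and the naive path emits the key with nulls for the remaining columns. With the unlimited inner context the row arrives whole, and the caller's context is advanced by the size of the returned row only.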

      Attachments

        1. PHOENIX-4018-1.patch
          5 kB
          Sergey Soldatov

            People

              sergey.soldatov Sergey Soldatov