HIVE-21460

ACID: Load data followed by a select * query results in incorrect results

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.1.1, 4.0.0
    • Fix Version/s: 3.1.1, 4.0.0
    • Component/s: Transactions
    • Labels:
      None

      Description

      This affects current master as well. I created an ORC file that spans multiple stripes, ran a simple select *, and got an incorrect row count compared with the result of select count. Because the loaded file was not written by Hive ACID, it has no ROW_ID in the file; instead, ROW_ID is applied on read by discovering the min/max bounds of each split, which are then used to calculate ROW_ID.rowId for each row of that split. The problem seems to be that after split generation and the min/max rowId computation for each split, Hive reads only the last split.
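
      The symptom can be illustrated with a minimal sketch. This is not Hive's actual code: it assumes one split per stripe and that rowId is simply the cumulative row offset within the file, and the stripe row counts and function names are hypothetical.

```python
from itertools import accumulate


def split_row_id_bounds(stripe_row_counts):
    """Return an inclusive (min_row_id, max_row_id) pair per split.

    Assumption for illustration: one split per stripe, with rowId equal to
    the row's cumulative offset in the file.
    """
    ends = list(accumulate(stripe_row_counts))
    starts = [0] + ends[:-1]
    return [(start, end - 1) for start, end in zip(starts, ends)]


def row_ids(bounds):
    """Enumerate the synthetic rowIds covered by the given splits."""
    return [rid for lo, hi in bounds for rid in range(lo, hi + 1)]


# Hypothetical file with three stripes.
stripe_row_counts = [5000, 5000, 2500]
bounds = split_row_id_bounds(stripe_row_counts)

# Expected behaviour: every split is read, so all rows are returned.
all_rows = row_ids(bounds)

# Reported behaviour: only the last split is read, so rows are missing.
last_only = row_ids(bounds[-1:])

print(len(all_rows), len(last_only))
```

      Under these assumptions, reading every split yields the full row count, while reading only the last split yields just the rows of the final stripe, which matches the mismatch between the two queries described above.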

        Attachments

        1. HIVE-21460.1.patch
          0.9 kB
          Vaibhav Gumashta

            People

            • Assignee: Vaibhav Gumashta (vgumashta)
            • Reporter: Brian Goerlitz (bgoerlitz)
            • Votes: 0
            • Watchers: 9
