  1. Hive
  2. HIVE-17296

Acid tests with multiple splits

Details

    • Type: Test
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: Transactions
    • Labels: None

    Description

      Data files in an Acid table are ORC files, which may have multiple stripes.
      Such files in base/ or delta/ (and original files left over from a non-acid to acid conversion) are split by OrcInputFormat into multiple (stripe-sized) chunks.
      There is additional logic in OrcRawRecordMerger (discoverKeyBounds/discoverOriginalKeyBounds) that is not exercised by any E2E test, since none of them has enough data to generate multiple stripes in a single file.
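
      To make the multi-split case concrete, here is a minimal, self-contained sketch of how key bounds for one stripe-aligned split could be derived from per-stripe metadata. It is a simplified model, not the Hive implementation: the class KeyBoundsSketch, the method discoverBounds, the NONE sentinel and the use of a single long as the row key are all assumptions made for illustration (real Acid keys are composite record identifiers), and the sketch assumes every split starts on a stripe boundary.

```java
/** Hypothetical sketch, not Hive code: key bounds for one split of a multi-stripe file. */
final class KeyBoundsSketch {

    /** Sentinel meaning "no bound on this side". */
    static final long NONE = Long.MIN_VALUE;

    /**
     * @param stripeOffsets byte offset at which each stripe starts, ascending
     * @param lastKeys      last row key stored in each stripe, same order
     * @param offset        first byte of the split
     * @param maxOffset     first byte past the end of the split
     * @return {minKeyExclusive, maxKeyInclusive}, with NONE where unbounded
     */
    static long[] discoverBounds(long[] stripeOffsets, long[] lastKeys,
                                 long offset, long maxOffset) {
        int firstStripe = 0;    // stripes that start before the split
        int stripeCount = 0;    // stripes that start inside the split
        boolean isTail = true;  // stays true if the split reaches the last stripe
        for (long stripeOffset : stripeOffsets) {
            if (offset > stripeOffset) {
                firstStripe++;
            } else if (maxOffset > stripeOffset) {
                stripeCount++;
            } else {
                isTail = false;
                break;
            }
        }
        if (stripeCount == 0) {
            throw new IllegalArgumentException("split covers no stripe");
        }
        // Rows at or below the previous stripe's last key belong to an earlier split.
        long minKey = firstStripe == 0 ? NONE : lastKeys[firstStripe - 1];
        // Rows above this split's last stripe key belong to a later split.
        long maxKey = isTail ? NONE : lastKeys[firstStripe + stripeCount - 1];
        return new long[] {minKey, maxKey};
    }
}
```

      With a single-stripe file the loop never leaves the tail case and both bounds stay open, which is why tests with small data never reach the bounded branches; a file with at least three stripes is needed to produce a split bounded on both sides.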

      testRecordReaderOldBaseAndDelta/testRecordReaderNewBaseAndDelta/testOriginalReaderPair
      in TestOrcRawRecordMerger have some logic to test this, but it really needs E2E tests.

      With ORC-228 it will be possible to write such tests.

            People

              Assignee: Unassigned
              Reporter: Eugene Koifman (ekoifman)
              Votes: 0
              Watchers: 0
