HIVE-19588

Several invocations of file listing when creating VectorizedOrcAcidRowBatchReader

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0, 3.0.1, 4.0.0
    • Component/s: Transactions
    • Labels:
      None
      Description

      We appear to perform a file listing several times when creating a single instance of VectorizedOrcAcidRowBatchReader.
      AcidUtils.parseBaseOrDeltaBucketFilename() does a full file listing (when there are files with the bucket_* prefix) just to pick a single file from a path and determine whether it has the ACID schema (added as part of HIVE-18190).
      A full file listing is also performed in the following places:
      1) when populating ColumnizedDeleteEventRegistry
      2) when populating SortMergedDeleteEventRegistry
      3) twice in computeOffsetAndBucket()
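
      The repeated listings described above can be illustrated with a minimal Java sketch of one generic way to avoid them: memoizing the listing per directory so that every consumer involved in building one reader shares a single underlying call. This is not the HIVE-19588 patch and does not use Hive or Hadoop APIs; the class CachedDirectoryLister, its delegate function, and the call counter are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Hive): memoizes directory listings so that
 * several consumers constructing one reader share a single listing call.
 */
final class CachedDirectoryLister {
    // The expensive listing call, e.g. a wrapper around a file system listing.
    private final Function<String, List<String>> delegate;
    // Directory path -> immutable snapshot of its listing.
    private final Map<String, List<String>> cache = new HashMap<>();
    // Number of times the expensive delegate was actually invoked.
    private int delegateCalls = 0;

    CachedDirectoryLister(Function<String, List<String>> delegate) {
        this.delegate = delegate;
    }

    /** Returns the listing for dir, invoking the delegate only on the first request. */
    List<String> list(String dir) {
        List<String> cached = cache.get(dir);
        if (cached == null) {
            delegateCalls++;
            cached = Collections.unmodifiableList(new ArrayList<>(delegate.apply(dir)));
            cache.put(dir, cached);
        }
        return cached;
    }

    int delegateCalls() {
        return delegateCalls;
    }
}
```

      With a helper of this shape, a caller would create one instance per reader and pass it to each component that needs the directory contents, so that repeated requests for the same path trigger only one underlying listing. The sketch is not thread-safe and never invalidates entries, which is only reasonable if the cache lives no longer than the construction of a single reader.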

       

      Attaching the profiles that Gopal Vijayaraghavan took while debugging.

        Attachments

        1. Screen Shot 2018-05-16 at 2.23.25 PM.png
          151 kB
          Prasanth Jayachandran
        2. HIVE-19588.4.patch
          20 kB
          Prasanth Jayachandran
        3. HIVE-19588.3.patch
          20 kB
          Prasanth Jayachandran
        4. HIVE-19588.2.patch
          20 kB
          Prasanth Jayachandran
        5. HIVE-19588.1.patch
          16 kB
          Prasanth Jayachandran

              People

              • Assignee: Prasanth Jayachandran (prasanth_j)
              • Reporter: Nita Dembla (ndembla)
              • Votes: 0
              • Watchers: 5
