HIVE-10114: Split strategies for ORC

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.0
    • Component/s: None
    • Labels: None

    Description

      ORC split generation does not have clearly defined strategies for different scenarios (many small ORC files, few small ORC files, many large files, etc.). A few strategies already exist, such as storing the file footer in the ORC split or making an entire file a single ORC split. This JIRA aims to make split generation simpler, to support different strategies for various use cases (BI, ETL, ACID, etc.), and to lay the foundation for HIVE-7428.
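
      To make the idea of per-scenario strategies concrete, here is a minimal sketch of how a split generator might pick a strategy from the file-size profile of its input. It is an illustration only, not the logic in the attached patches: the names `Strategy` and `choose_strategy`, the 256 MB default, and the file-count threshold are all assumptions made for this example.

```python
from enum import Enum
from typing import Sequence


class Strategy(Enum):
    # Hypothetical names, chosen for this sketch only.
    FILE_PER_SPLIT = "file_per_split"  # whole file becomes one split; no footer read up front
    FOOTER_BASED = "footer_based"      # read the file footer and split within the file


def choose_strategy(file_sizes: Sequence[int],
                    max_split_bytes: int = 256 * 1024 * 1024,
                    many_files_threshold: int = 10) -> Strategy:
    """Pick a split strategy from the sizes (in bytes) of the input files.

    Assumed heuristic: large files, or only a handful of files, justify the
    cost of reading footers; many small files do not.
    """
    if not file_sizes:
        return Strategy.FILE_PER_SPLIT
    average_size = sum(file_sizes) / len(file_sizes)
    if average_size > max_split_bytes or len(file_sizes) <= many_files_threshold:
        return Strategy.FOOTER_BASED
    return Strategy.FILE_PER_SPLIT


MB = 1024 * 1024
many_small = choose_strategy([1 * MB] * 1000)   # many small ORC files
few_small = choose_strategy([1 * MB] * 3)       # few small ORC files
many_large = choose_strategy([1024 * MB] * 100) # many large files
```

      Under these assumed thresholds, the many-small-files case maps to one split per file, while the few-small-files and many-large-files cases map to footer-based splitting.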

      Attachments

        1. HIVE-10114.5.patch (50 kB, Prasanth Jayachandran)
        2. HIVE-10114.4.patch (50 kB, Prasanth Jayachandran)
        3. HIVE-10114.3.patch (50 kB, Prasanth Jayachandran)
        4. HIVE-10114.2.patch (51 kB, Prasanth Jayachandran)
        5. HIVE-10114.1.patch (50 kB, Prasanth Jayachandran)

          People

            Assignee: Prasanth Jayachandran (prasanth_j)
            Reporter: Prasanth Jayachandran (prasanth_j)
            Votes: 0
            Watchers: 7

            Dates

              Created:
              Updated:
              Resolved:
