  1. Hive
  2. HIVE-4838

Refactor MapJoin HashMap code to improve testability and readability


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: None
    • Labels: None

    Description

      MapJoin is an essential component for high-performance joins in Hive, and the current code has done great service for many years. However, the code is showing its age and currently suffers from the following issues:

      • Uses static state via the MapJoinMetaData class to pass serialization metadata to the Key, Row classes.
      • The API of a logical "Table Container" is not defined, and therefore it's unclear which APIs HashMapWrapper
        needs to expose. Additionally, HashMapWrapper has many unused public methods.
      • HashMapWrapper contains logic to serialize, test memory bounds, and implement the table container. Ideally these logical units could be separated.
      • HashTableSinkObjectCtx has unused fields and unused methods.
      • CommonJoinOperator and its children use ArrayList on the left-hand side of declarations when only List is required.
      • There are unused classes (MRU, DCLLItem) and classes that duplicate functionality (MapJoinSingleKey and MapJoinDoubleKeys).
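
A minimal Java sketch of the direction these points suggest, under stated assumptions: every name below (TableContainer, SerializationContext, HashMapTableContainer) is hypothetical and chosen for illustration, not taken from the attached patch. It shows a narrow container interface, metadata passed through the constructor instead of read from static state, and List used as the declared type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical narrow API for a logical table container: only what callers need. */
interface TableContainer<K, V> {
    void put(K key, V row);
    List<V> get(K key);   // declared as List, not ArrayList
    int size();           // number of distinct keys
}

/** Hypothetical metadata object handed to the container, replacing a static lookup. */
final class SerializationContext {
    private final String serdeClassName;

    SerializationContext(String serdeClassName) {
        this.serdeClassName = serdeClassName;
    }

    String serdeClassName() {
        return serdeClassName;
    }
}

/** In-memory container only; serialization and memory checks would live in other classes. */
final class HashMapTableContainer<K, V> implements TableContainer<K, V> {
    private final SerializationContext context;
    private final Map<K, List<V>> rows = new HashMap<>();

    HashMapTableContainer(SerializationContext context) {
        this.context = context;
    }

    SerializationContext context() {
        return context;
    }

    @Override
    public void put(K key, V row) {
        // Append the row to the list for this key, creating the list on first use.
        rows.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
    }

    @Override
    public List<V> get(K key) {
        List<V> matches = rows.get(key);
        return matches == null
                ? Collections.<V>emptyList()
                : Collections.unmodifiableList(matches);
    }

    @Override
    public int size() {
        return rows.size();
    }
}
```

Because the context is an ordinary constructor argument, a unit test can build a container with whatever metadata it needs and no global setup, which is the testability gain the title refers to.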

      Attachments

        1. HIVE-4838.patch
          147 kB
          Brock Noland
        2. HIVE-4838.patch
          137 kB
          Brock Noland
        3. HIVE-4838.patch
          137 kB
          Brock Noland
        4. HIVE-4838.patch
          135 kB
          Brock Noland
        5. HIVE-4838.patch
          135 kB
          Brock Noland
        6. HIVE-4838.patch
          136 kB
          Brock Noland
        7. HIVE-4838.patch
          136 kB
          Brock Noland

          People

            Assignee: Brock Noland
            Reporter: Brock Noland
            Votes: 0
            Watchers: 5
