Description
HiveKey should be used as the key type because it carries the hash code used for partitioning. BytesWritable partitions correctly in simple cases, but more complicated ones, such as joins and bucketed tables, need HiveKey.hashCode.
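The difference described above can be illustrated with a minimal, self-contained sketch. It does not use Hive's classes: `BytesKey` and `HashCarryingKey` are hypothetical stand-ins for a plain byte-array key and a key that carries an explicitly assigned hash code, and `partitionFor` is an assumed hash-modulo partitioner. The scenario, where the hash is derived from only one column (for example a bucketing column) while the serialized key holds more bytes, is likewise an assumption made for illustration.

```java
import java.util.Arrays;

public class KeyHashPartitionSketch {

    /** Hypothetical stand-in for a plain byte-array key: the hash covers every byte. */
    static class BytesKey {
        final byte[] bytes;

        BytesKey(byte[] bytes) {
            this.bytes = bytes;
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(bytes);
        }
    }

    /** Hypothetical stand-in for a key that carries a hash code assigned by the caller. */
    static class HashCarryingKey extends BytesKey {
        private final int assignedHash;

        HashCarryingKey(byte[] bytes, int assignedHash) {
            super(bytes);
            this.assignedHash = assignedHash;
        }

        @Override
        public int hashCode() {
            return assignedHash;
        }
    }

    /** Assumed hash-modulo partitioner; the mask keeps the result non-negative. */
    static int partitionFor(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 4;

        // Two rows share the (assumed) bucketing column value 7 but differ in a second column.
        byte[] row1 = {7, 1};
        byte[] row2 = {7, 2};

        // Hashing all serialized bytes ignores which column should drive partitioning.
        System.out.println("bytes-hashed:  "
                + partitionFor(new BytesKey(row1), numPartitions) + " vs "
                + partitionFor(new BytesKey(row2), numPartitions));

        // Carrying a hash computed from the bucketing column alone keeps the rows together.
        int bucketHash = Integer.hashCode(7);
        System.out.println("hash-carrying: "
                + partitionFor(new HashCarryingKey(row1, bucketHash), numPartitions) + " vs "
                + partitionFor(new HashCarryingKey(row2, bucketHash), numPartitions));
    }
}
```

In this sketch the two byte-hashed keys land in different partitions, while the two hash-carrying keys share one, which is the property that joins and bucketed inserts rely on.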
Attachments
Issue Links
- is depended upon by
  - HIVE-7856 Enable parallelism in Reduce Side Join [Spark Branch] (Resolved)
  - HIVE-7956 When inserting into a bucketed table, all data goes to a single bucket [Spark Branch] (Resolved)
- is related to
  - HIVE-8098 The spark golden file for union_remove_25 is different from MR version [Spark Branch] (Open)
- relates to
  - HIVE-8035 Add SORT_QUERY_RESULTS for test that doesn't guarantee order (Closed)
- links to