[HADOOP-12217] hashCode in DoubleWritable returns same value for many numbers - ASF JIRA

Details

Type: Bug
Status: Patch Available
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.18.0, 0.18.1, 0.18.2, 0.18.3, 0.19.0, 0.19.1, 0.20.0, 0.20.1, 0.20.2, 0.20.203.0, 0.20.204.0, 0.20.205.0, 1.0.0, 1.0.1, 1.0.2, 1.0.3, 1.0.4, 1.1.0, 1.1.1, 1.2.0, 0.21.0, 0.22.0, 0.23.0, 0.23.1, 0.23.3, 2.0.0-alpha, 2.0.1-alpha, 2.0.2-alpha, 0.23.4, 2.0.3-alpha, 0.23.5, 0.23.6, 1.1.2, 0.23.7, 2.1.0-beta, 2.0.4-alpha, 0.23.8, 1.2.1, 2.0.5-alpha, 0.23.9, 0.23.10, 0.23.11, 2.1.1-beta, 2.0.6-alpha, 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.4.1, 2.5.1, 2.5.2, 2.6.0, 2.7.0, 2.7.1
Fix Version/s: None
Component/s: io
Labels:
- easyfix

Description

Because DoubleWritable.hashCode() is incorrect, using DoubleWritables as the keys in a HashMap results in abysmal performance, due to hash code collisions.
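To illustrate the kind of collision described above, here is a minimal sketch. It assumes the problematic hashCode() simply truncates the 64-bit result of Double.doubleToLongBits() to an int; that detail is my understanding and is not stated in this ticket. The class DoubleHashSketch and its method names are hypothetical and exist only for this illustration. The folding variant mirrors what java.lang.Double.hashCode() does.

```java
import java.util.HashSet;
import java.util.Set;

public final class DoubleHashSketch {

    // ASSUMPTION: models a hashCode() that keeps only the low 32 bits of the
    // IEEE 754 bit pattern. Small whole numbers have all-zero low bits, so
    // they all collapse to the same hash.
    static int truncatingHash(double value) {
        return (int) Double.doubleToLongBits(value);
    }

    // Folds the high 32 bits into the low 32 bits, the same formula used by
    // java.lang.Double.hashCode().
    static int foldingHash(double value) {
        long bits = Double.doubleToLongBits(value);
        return (int) (bits ^ (bits >>> 32));
    }

    // Counts how many distinct hash values the doubles 1.0 .. n produce.
    static int distinctHashes(int n, boolean fold) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 1; i <= n; i++) {
            seen.add(fold ? foldingHash(i) : truncatingHash(i));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("truncating: " + distinctHashes(1000, false));
        System.out.println("folding:    " + distinctHashes(1000, true));
    }
}
```

Under that assumption, every key from 1.0 to 1000.0 lands in the same HashMap bucket with the truncating variant, while the folding variant spreads them across distinct hash values.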

I discovered this while testing the latest version of Hive, where certain mapjoin queries were exceedingly slow.

Evidently, Hive has its own wrapper/subclass around Hadoop's DoubleWritable that used to override hashCode() with a correct implementation, but for some reason that code was recently removed, so it now uses the incorrect hashCode() method inherited from Hadoop's DoubleWritable.

It appears that this bug has been there since DoubleWritable was created (wow!), so I can understand if fixing it is impractical due to the possibility of breaking things downstream, but off the top of my head I can't think of anything that should break.

Searching JIRA, I found several related tickets, which may be useful for some historical perspective: ~~HADOOP-3061~~, ~~HADOOP-3243~~, ~~HIVE-511~~, ~~HIVE-1629~~, ~~HIVE-7041~~

Attachments

HADOOP-12217.1.patch
11/Jul/15 17:52
0.7 kB
Steve Scaffidi

Issue Links

relates to

HIVE-11502 Map side aggregation is extremely slow

Closed

People

Assignee: Unassigned

Reporter: Steve Scaffidi

Votes: 0

Watchers: 3

Dates

Created: 10/Jul/15 17:46

Updated: 10/Aug/15 08:20