[SPARK-1994] Aggregates return incorrect results on first execution - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: 1.0.0
Fix Version/s: 1.0.1, 1.1.0
Component/s: SQL
Labels: None
Target Version/s: 1.0.1, 1.1.0

Description

Aaron Davidson has a full reproduction. He found a case where the first run of a query returns corrupted results but the second run does not. The same does not occur when reading from HDFS a second time...

sql("SELECT lang, COUNT(*) AS cnt FROM tweetTable GROUP BY lang ORDER BY cnt DESC").collect.foreach(println)
[bg,16636]
[16266,16266]
[16223,16223]
[16161,16161]
[16047,16047]
[lt,11405]
[hu,11380]
[el,10845]
[da,10289]
[fi,10261]
[9897,9897]
[9765,9765]
[9751,9751]
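
In the output above, several rows show the same value in both columns (for example [16266,16266]), while others carry a language code as expected. A minimal sketch of how such rows could be flagged, assuming the results have already been collected into (lang, cnt) pairs; `SuspectRows` and `suspect` are hypothetical names for illustration and are not part of Spark, and the sample data is a subset of the rows printed above:

```scala
// Hypothetical helper, not part of Spark: flags rows whose grouping key
// is identical to the printed count, as in [16266,16266] above.
object SuspectRows {
  // A subset of the (lang, cnt) rows copied from the output in this report.
  val firstRun: Seq[(String, Long)] = Seq(
    ("bg", 16636L),
    ("16266", 16266L),
    ("16223", 16223L),
    ("lt", 11405L),
    ("9897", 9897L)
  )

  // Keep only rows where the lang column equals the count rendered as text.
  def suspect(rows: Seq[(String, Long)]): Seq[(String, Long)] =
    rows.filter { case (lang, cnt) => lang == cnt.toString }

  def main(args: Array[String]): Unit =
    suspect(firstRun).foreach(println)
}
```

Running the same check against the results of a second execution of the query would be one way to confirm that only the first run is affected.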

People

Assignee: Michael Armbrust
Reporter: Michael Armbrust
Votes: 0
Watchers: 2

Dates

Created: 02/Jun/14 20:37
Updated: 15/Jul/14 06:21
Resolved: 08/Jun/14 07:02
